Feature flags and OTA code updates both get pitched the same way: "fix production without waiting for App Store review." Both deliver on that pitch, but for entirely different failure modes. Teams that conflate them end up insured against the wrong incident.

The difference: a feature flag switches between code paths that already shipped in the app's binary, while an OTA code update delivers new code to installed apps. A flag can only select behavior someone wrapped in advance; an OTA update can fix code nobody thought to protect. Everything else — the costs, the failure modes, the org questions — falls out of that one distinction.

| | Feature flags | OTA code updates |
|---|---|---|
| What changes | Which shipped path runs | The code itself |
| Takes effect | Next config refresh (seconds to hours) | Next app launch (payload download) |
| Coverage | Only paths wrapped in advance | Anything in the updatable layer |
| Apple policy surface | Minimal: both paths went through review | Conditional (interpreted-code rules) |
| Ongoing cost | Flag debt, dead branches in the binary | SDK + interpreter + update pipeline |

When to use feature flags (and where they fail)

A feature flag is a remote boolean (or variant assignment) the app checks at runtime. Wrap the new checkout flow in one, ship it dark, turn it on for 5% of users, watch the metrics, ramp up. Something breaks? Flip it off, and every device is back on the old path at its next config refresh: seconds on a streaming SDK, up to hours on default polling setups (Firebase Remote Config's default minimum fetch interval is 12 hours unless you enable real-time updates).

That's a great insurance policy, with properties OTA can't match:

  • Fast, fleet-wide, and boring. No payload, no new code on the device, nothing for an interpreter to run. Toggling a flag is about the lowest-risk intervention in mobile — the residual risk being that a flip puts old binaries into state combinations you never tested together.

  • It doubles as a product tool. Gradual ramps, A/B tests, entitlement gates — the same machinery.

  • Almost review-proof. Both code paths went through App Review; the flag chooses between reviewed behaviors. One caveat: Apple's Guideline 2.3.1 bars hidden or dormant features and expects new functionality to be accessible to review — so "ship it dark, reveal it later" belongs in your review notes, not concealed from them.
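
The ramp-and-kill-switch mechanics described above can be sketched in a few lines. This is a minimal illustration, not any vendor's SDK: `FlagState`, `bucketFor`, `isEnabled`, and `checkoutPath` are hypothetical names, and the simple multiply-by-31 string hash stands in for the stronger hashes real platforms use.

```typescript
// Hypothetical flag record, as a config service might deliver it.
interface FlagState {
  key: string;
  enabled: boolean;       // master kill switch
  rolloutPercent: number; // 0-100, share of users on the new path
}

// Simple deterministic string hash (multiply-by-31). It is stable across
// launches and devices, so a user keeps the same bucket as the ramp widens.
function stableHash(input: string): number {
  let hash = 0;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 31 + input.charCodeAt(i)) >>> 0; // keep within 32 bits
  }
  return hash;
}

// Map a (flag, user) pair to a bucket in [0, 100).
function bucketFor(flagKey: string, userId: string): number {
  return stableHash(flagKey + ":" + userId) % 100;
}

function isEnabled(flag: FlagState, userId: string): boolean {
  if (!flag.enabled) return false; // the kill switch always wins
  return bucketFor(flag.key, userId) < flag.rolloutPercent;
}

// Both paths already ship in the binary; the flag only picks one.
function checkoutPath(flag: FlagState, userId: string): "new" | "legacy" {
  return isEnabled(flag, userId) ? "new" : "legacy";
}
```

Because the bucket depends only on the flag key and user ID, widening `rolloutPercent` from 5 to 50 keeps the original 5% on the new path, and setting `enabled` to false sends everyone back to the legacy path at their next config refresh.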

And its constraints, which are structural rather than fixable:

  • The escape hatch must predate the fire. A flag covers exactly the paths someone thought to wrap. The bug that actually hits production has a habit of living in code nobody flagged — the date parser, the migration, the "trivial" refactor.

  • "Off" needs somewhere to go. A kill switch is only useful if disabling the path leaves a working app. If the old path was deleted, or the bug sits in shared code under both paths, the flag flips and nothing improves.

  • Feature flag debt is real. Every flag is a fork in your codebase: dead branches that ship in the binary forever (binary size, attack surface, test matrix) until someone does the cleanup nobody is excited about. Post-mortems are full of stale feature flag incidents — most famously Knight Capital's $440M morning: 45 minutes of trading on August 1, 2012, where a repurposed flag re-activated eight-year-old dead code on the one server an incomplete deploy had missed.
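
One way to keep flag debt visible is to give every flag an owner and an expiry date when it is created, then surface any flag that outlives it. The sketch below assumes a hypothetical in-repo flag registry; `FlagRecord`, `findStaleFlags`, and the field names are illustrative, not part of any flag platform.

```typescript
// Hypothetical registry entry, checked in next to the code that reads the flag.
interface FlagRecord {
  key: string;
  owner: string;       // team responsible for cleanup
  expiresAt: string;   // ISO date by which the flag should be removed
  permanent?: boolean; // long-lived kill switches are exempt from expiry
}

// Return every non-permanent flag whose expiry date has passed.
function findStaleFlags(registry: FlagRecord[], now: Date): FlagRecord[] {
  return registry.filter(function (flag) {
    if (flag.permanent) return false;
    return new Date(flag.expiresAt).getTime() < now.getTime();
  });
}

// Format a short report line per stale flag, e.g. for a CI log.
function describeStaleFlags(stale: FlagRecord[]): string[] {
  return stale.map(function (flag) {
    return flag.key + " (owner: " + flag.owner + ", expired " + flag.expiresAt + ")";
  });
}
```

Run in CI with the current date, a non-empty result can fail the build or open a cleanup ticket for the owning team, which turns "someone should delete that flag" into a tracked task.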

Feature flags vs remote config

Remote config is the same idea one level down — values instead of branches (copy, timeouts, layout parameters). Same strengths, same structural limit: it parameterizes code that exists; it creates nothing.
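
Because remote config only parameterizes code that already exists, the client can treat every remote value as untrusted input: fall back to the compiled-in default when a value is missing or the wrong type, and clamp numbers to a range the code was tested with. A sketch, with a hypothetical key name and hypothetical bounds:

```typescript
// Raw key/value payload as fetched from a config service (shape assumed).
type RemoteValues = Record<string, unknown>;

interface NumberSetting {
  key: string;
  fallback: number; // value compiled into the binary
  min: number;      // lowest value the code was tested with
  max: number;      // highest value the code was tested with
}

// Read a numeric setting, falling back on bad input and clamping to range.
function readNumber(remote: RemoteValues, setting: NumberSetting): number {
  const raw = remote[setting.key];
  if (typeof raw !== "number" || !isFinite(raw)) {
    return setting.fallback;
  }
  return Math.min(setting.max, Math.max(setting.min, raw));
}

// Hypothetical setting: a network timeout tunable between 1 and 60 seconds.
const REQUEST_TIMEOUT_MS: NumberSetting = {
  key: "request_timeout_ms",
  fallback: 10000,
  min: 1000,
  max: 60000
};
```

The clamp is what keeps a mistyped dashboard value from becoming an outage of its own: the remote value can tune the timeout, but it cannot push the code outside the range someone actually exercised.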

What OTA updates actually buy you

An OTA code update ships different code to installed apps: a new JavaScript bundle (EAS Update, for React Native), a Dart patch (Shorebird, which interprets only the changed functions and leaves the rest running as the binary's native build), or a compiled-Swift WebAssembly module (Patch, for native iOS). The app's signed binary doesn't change; an interpreter executes the downloaded code. All these tools share that architecture, though not identically: JS bundles can execute in the OS-provided JavaScriptCore engine (React Native apps built on Hermes ship that engine inside the binary instead), while the Dart and WASM approaches always ship their own interpreter inside the binary. The distinction is worth knowing when you read Apple's rules.

What that buys:

  • It fixes bugs that have no alternate path. Can a flag fix a crash in unwrapped code? No — but an OTA update can: the null-dereference in the parser, the off-by-one in the discount calculation. You write the fix, you ship it, devices pick it up at next launch — no review cycle, and an adoption curve measured in hours of app-opens instead of weeks of store updates. (The worst case still exists: a crash early enough in launch can prevent the update check from ever running.)

  • Coverage is by capability, not by foresight. Anything in the updatable layer is fixable, whether or not anyone predicted it would break. That inverts the flag model: insurance against unknown future bugs instead of known risky launches.

  • It carries its own rollout machinery. The major tools do staged percentage rollouts, channels, and rollback — flag-like operational control, applied to code.
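
The rollout controls in the last bullet can be pictured as a resolution step: given a device's channel, the runtime version of its binary, and a stable rollout bucket, pick the newest update that applies. This is a sketch under assumed names (`UpdateRecord`, `DeviceInfo`, `resolveUpdate`); real tools such as EAS Update and Shorebird define their own manifest formats and APIs.

```typescript
// Hypothetical server-side record for one published update.
interface UpdateRecord {
  id: string;
  channel: string;        // e.g. "production" or "staging"
  runtimeVersion: string; // must match the native layer of the binary
  publishedAt: number;    // epoch milliseconds
  rolloutPercent: number; // 0-100
  rolledBack: boolean;
}

interface DeviceInfo {
  channel: string;
  runtimeVersion: string;
  bucket: number; // stable value in [0, 100) for this device
}

// Return the update this device should run, or null to stay on the
// code bundled in the store binary.
function resolveUpdate(updates: UpdateRecord[], device: DeviceInfo): UpdateRecord | null {
  const eligible = updates
    .filter(function (u) { return u.channel === device.channel; })
    .filter(function (u) { return u.runtimeVersion === device.runtimeVersion; })
    .filter(function (u) { return !u.rolledBack; })
    .filter(function (u) { return device.bucket < u.rolloutPercent; })
    .sort(function (a, b) { return b.publishedAt - a.publishedAt; }); // newest first
  return eligible[0] ?? null;
}
```

In this model a rollback is just a field change: marking the newest update as rolled back makes the next resolution return the previous eligible update, or the bundled code if none remains.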

And its constraints, equally structural:

  • A boundary you must respect. Each ecosystem's updatable layer has a hard edge — native modules for React Native, native plugins for Flutter, OS-API-touching code for native Swift. Bugs below the line still need a store release.

  • A policy surface. Downloaded interpreted code is conditionally permitted by Apple's rules, and the conditions (covered here) bind the developer — a boundary Apple has enforced before, most visibly the 2017 sweep that ended JSPatch-style native hot-patching. Flags simply don't have this dimension at the same scale.

  • More moving parts. An SDK in the binary, an interpreter executing your fix, an update pipeline to trust. The tools mitigate with integrity verification and automatic fallback to the bundled code, but "we ship code outside the release train" is a process change, not just a dependency.
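
The mitigation named in the last bullet, integrity verification plus automatic fallback to the bundled code, roughly amounts to a decision made at launch. The sketch below assumes a hypothetical `DownloadedUpdate` record; the plain hash comparison stands in for whatever signature scheme a given tool actually uses, and the crash counter with its threshold is an assumption about how a fallback could be triggered, not a description of any specific SDK.

```typescript
// Hypothetical record the app persists about the update it last downloaded.
interface DownloadedUpdate {
  id: string;
  expectedHash: string;   // taken from the update manifest
  actualHash: string;     // computed over the payload on disk
  failedLaunches: number; // consecutive launches that crashed on this update
}

type LaunchChoice =
  | { source: "bundled"; reason: string }
  | { source: "update"; updateId: string };

// Assumed threshold: give an update two chances before abandoning it.
const MAX_FAILED_LAUNCHES = 2;

function chooseCodeToLaunch(update: DownloadedUpdate | null): LaunchChoice {
  if (update === null) {
    return { source: "bundled", reason: "no update downloaded" };
  }
  if (update.actualHash !== update.expectedHash) {
    return { source: "bundled", reason: "integrity check failed" };
  }
  if (update.failedLaunches >= MAX_FAILED_LAUNCHES) {
    return { source: "bundled", reason: "update crashed on launch" };
  }
  return { source: "update", updateId: update.id };
}
```

The bundled code is always the floor: whatever goes wrong with a payload, the worst outcome in this model is the behavior that went through review.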

Feature flags vs OTA updates: which to use when

| Situation | Reach for |
|---|---|
| Launching something risky next sprint | Flag it before it ships |
| Tuning copy, thresholds, layout values | Remote config |
| Running an experiment | Flags/experimentation platform |
| Crash in unflagged code, store build live | OTA update, or expedited review + adoption wait |
| Bug in native/OS-API code | Store release; nothing else can touch it |
| "Turn it off NOW" on a flagged path | The flag; nothing beats it on its own turf |

The asymmetry worth internalizing: flags are cheap insurance against the risks you can name, OTA is heavier insurance against the ones you can't. A flag costs an if statement and some debt; it pays out fast but only on pre-declared paths. An OTA lane costs an SDK and a policy review; it pays out on the long tail of bugs nobody saw coming — which, if your post-mortems look like most teams', is where the worst incidents actually live.

That's also why this isn't really a "versus." Plenty of teams end up running both, by time horizon: flags for the risks ahead of you, OTA for the ones behind you, and a store release cadence that neither one is trying to replace.
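
The incident rows of the decision table above can be read as a small procedure. The sketch below encodes them; the `Incident` fields and `Remedy` labels are hypothetical names chosen for illustration, and a real incident will rarely classify this cleanly.

```typescript
type Remedy = "flip-flag" | "remote-config" | "ota-update" | "store-release";

// Hypothetical triage facts about a production incident.
interface Incident {
  behindFlag: boolean;       // broken path is wrapped, and "off" leaves a working app
  configOnly: boolean;       // fixable by changing a remote value
  inUpdatableLayer: boolean; // bug sits above the native/OS-API boundary
  otaLaneInstalled: boolean; // an OTA SDK already ships in the live binary
}

// Prefer the cheapest intervention that can actually reach the bug.
function chooseRemedy(incident: Incident): Remedy {
  if (incident.behindFlag) return "flip-flag";
  if (incident.configOnly) return "remote-config";
  if (incident.inUpdatableLayer && incident.otaLaneInstalled) return "ota-update";
  return "store-release"; // expedited review, then the adoption wait
}
```

The ordering mirrors the cost argument: a flag flip is tried first because it is the cheapest lever, and the store release is the fallthrough because it is the one path that can reach every layer.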

