Francisco José García Navarro

Published: June 8, 2026

I've reviewed AI-written iOS code: what actually breaks in production


"What fails when an AI writes your iOS app, why it compiles anyway, and how to audit it before you ship. The view of an architect who's spent years with his hands in other people's code."

Key takeaways

AI gets you to 70% of the solution; the remaining 30% (error handling, concurrency, security, architectural fit) is where production lives.
Vibe-coded code looks finished: it compiles, passes tests and runs in the simulator. It fails when the network drops, memory runs out, or there are thousands of users.
The four fragile zones: architecture with no layers, silenced concurrency, mismanaged SwiftUI state, and unencrypted secrets.
Before your users comes App Review: compiling doesn't mean Apple will approve it.

Let me start with a confession that isn't taboo around here: I use AI to code every single day. I started with ChatGPT and switched to Claude Code when it launched, without looking back. On my personal projects it saves me hours, gets me unstuck and writes the repetitive code I don't want to write. I'm not the type who'll tell you AI is bad. It's a brilliant tool.

But there's a huge difference between using an AI to code and letting an AI code for you. And that difference doesn't show on day one. It shows three months in, when the app that compiled, launched and "worked" turns into a wall nobody can move.

I've seen it several times now. A founder comes to me, usually someone non-technical with a solid product idea, who has built their iOS app with an AI. They have an MVP that works in the demo. They want to scale: add features, bring in more people, support more users. And they can't. The code resists them. Every change breaks three things. Nobody understands why the app does what it does. They've hit the ceiling, and the ceiling was far lower than it looked.

If you're a Tech Lead about to inherit an iOS codebase generated with AI, or you've already inherited one, here's what you'll actually find. No theory. Concrete patterns that break in production, ordered by where they do the most damage.

The productivity mirage: why vibe-coded code looks finished and isn't

Andrej Karpathy coined "vibe coding" in early 2025: programming by going with the flow, forgetting the code even exists. He said it, importantly, for throwaway weekend projects. The problem starts when that same flow (prompt it, accept it, move on) is used for something headed to production that will have to be maintained.

Addy Osmani, who leads developer experience on Google Chrome, summed it up better than anyone with what he calls the 70% problem: AI gets you to 70% of the solution with astonishing ease. The last 30% (edge cases, error handling, integration with production systems, security, API keys) is exactly where production lives. On iOS I'd add three more that weigh especially heavily: accessibility, concurrency safety and architectural fit. In my experience, for a senior engineer, closing that 30% by reviewing someone else's code is often slower than writing it from scratch would have been.

That 70% is exactly what fools the non-technical founder. The app compiles. It passes the tests the AI itself wrote. It does what it's supposed to in the simulator. Every signal that person knows how to read says "done". But the missing 30% doesn't error out in the demo: it errors out when the network drops, when the device runs out of memory, when there are 10,000 users instead of one, when two threads touch the same data at once. The mirage isn't that the code is badly written. It's that it looks complete without being so.

Martin Fowler has an image I find perfect for this: AI produces "the average of the internet", an aggregation of patterns from millions of repos. It doesn't produce code that fits your architecture or your conventions. It produces the most probable thing. And the most probable thing, on iOS, tends to be code shaped like UIKit in a SwiftUI project, MVC in an MVVM codebase, and singletons everywhere.

Let's get to the four zones where this takes shape.

From here on I go into technical detail, because I'd rather show you what breaks than ask you to take my word for it. If you're a founder or product lead and you don't write Swift, you don't need to follow every code example: take away the idea of each section and, if you like, skip straight to "When an external audit is worth it", where I translate all of this into signals you'll recognise without opening Xcode: every change breaks things that seemed unrelated, nobody on your team can explain why the app behaves the way it does, or you handle sensitive data and nobody has reviewed the security. If you're technical, stay with me.

Zone	What the AI delivers	What fails in production
Architecture	The view talks straight to the network, no layers	Impossible to test and scale; crash on the first timeout
Concurrency & memory	`nonisolated(unsafe)` and `@MainActor` to silence Swift 6	Live data races, irreproducible crashes, memory leaks
SwiftUI state	`@ObservedObject` created inside the view	State that's lost at random when the view is recreated
Security	Tokens in `UserDefaults`, hardcoded secrets	Sensitive data in plain text, legal exposure under GDPR

Architecture: the first thing the AI ignores

The number one pattern I find: the view talks straight to the network. No service layer, no repository, no separation. The View does the URLSession, parses the JSON, and puts the business logic in the body. It works. It's impossible to test, impossible to reuse and impossible to maintain.

// What the AI writes when you ask for "a screen that shows the orders"
struct OrdersView: View {
    @State private var orders: [Order] = []

    var body: some View {
        List(orders) { order in
            Text(order.title)
        }
        .task {
            // Networking, parsing and logic, all inside the view
            let url = URL(string: "https://api.example.com/orders")!
            let (data, _) = try! await URLSession.shared.data(from: url)
            orders = try! JSONDecoder().decode([Order].self, from: data)
        }
    }
}

Notice the two try! as well. The AI strips out error handling because it's what makes the code "work" faster. In the demo the network never fails. In production, the first timeout is a crash.
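For contrast, here is a minimal sketch of the same screen's data path with a seam between the view and the network. It assumes a hypothetical `Order` model (the real one isn't shown here), and `OrdersRepository`, `RemoteOrdersRepository`, `OrdersLoadState` and `loadOrders` are illustrative names, not a prescribed architecture.

```swift
import Foundation

// Hypothetical model; the real Order type is not shown in this article.
struct Order: Codable, Identifiable, Equatable {
    let id: Int
    let title: String
}

// The seam that makes the screen testable: the UI depends on this, not on URLSession.
protocol OrdersRepository {
    func fetchOrders() async throws -> [Order]
}

// Production implementation: networking and parsing live here, outside the view.
struct RemoteOrdersRepository: OrdersRepository {
    let endpoint: URL
    var session: URLSession = .shared

    func fetchOrders() async throws -> [Order] {
        let (data, _) = try await session.data(from: endpoint)
        return try JSONDecoder().decode([Order].self, from: data)
    }
}

// The outcomes the screen has to render, including the one the demo never shows.
enum OrdersLoadState: Equatable {
    case loaded([Order])
    case failed(message: String)
}

// Errors become state instead of crashes: no try! anywhere.
func loadOrders(from repository: any OrdersRepository) async -> OrdersLoadState {
    do {
        return .loaded(try await repository.fetchOrders())
    } catch {
        return .failed(message: error.localizedDescription)
    }
}
```

The view then only switches over `OrdersLoadState`, and a test can pass in a stub repository that throws `URLError(.timedOut)` to exercise the failure path without touching the network.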

On top of this come things Paul Hudson documents very well in his round-up of what to fix in AI-generated Swift code: dumping dozens of types into a single file (a guaranteed way to get endless build times), overusing GeometryReader with fixed frames where they don't belong, splitting views into computed properties instead of separate views (which breaks the smart invalidation of @Observable), and using onTapGesture where a Button belongs, a textbook accessibility failure because VoiceOver doesn't treat it as an interactive element.
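A small sketch of two of those fixes together: the row is its own view type rather than a computed property, and the tappable element is a real Button. `FavoriteRow` and its parameters are hypothetical names chosen for illustration.

```swift
import SwiftUI

// A separate view type, so SwiftUI can invalidate it independently of its parent.
struct FavoriteRow: View {
    let title: String
    let toggleFavorite: () -> Void

    var body: some View {
        HStack {
            Text(title)
            Spacer()
            // A real control instead of an image with onTapGesture:
            // VoiceOver announces it as a button with the label "Favorite".
            Button(action: toggleFavorite) {
                Label("Favorite", systemImage: "star")
            }
            .labelStyle(.iconOnly)
        }
    }
}
```

The icon-only label style keeps the visual result of a bare star icon while preserving the text label for assistive technologies.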

And a classic of the vibe-coded projects I inherit: the AI creates a new networking class every time it needs one, instead of reusing the client that already existed. You end up with four different ways of making an HTTP request in the same app. Each with its own error handling, or none at all.
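One way to converge those four variants is a single client with one error vocabulary. This is a sketch under assumptions: `APIClient`, `APIError` and the `get` method are illustrative names, and a real project would probably add authentication, retries and request building on top.

```swift
import Foundation

// One error vocabulary for the whole app instead of four ad hoc ones.
enum APIError: Error, Equatable {
    case invalidResponse
    case httpStatus(Int)
    case decoding
}

// Illustrative shared client; the name and shape are assumptions.
struct APIClient {
    let baseURL: URL
    var session: URLSession = .shared
    var decoder = JSONDecoder()

    // Pure status check, kept separate so it can be tested without a network.
    static func validate(_ response: URLResponse) throws {
        guard let http = response as? HTTPURLResponse else {
            throw APIError.invalidResponse
        }
        guard (200..<300).contains(http.statusCode) else {
            throw APIError.httpStatus(http.statusCode)
        }
    }

    // Every feature goes through this one method; transport errors
    // (timeouts, no connection) propagate as the URLError the session throws.
    func get<T: Decodable>(_ path: String, as type: T.Type = T.self) async throws -> T {
        let url = baseURL.appendingPathComponent(path)
        let (data, response) = try await session.data(from: url)
        try Self.validate(response)
        do {
            return try decoder.decode(T.self, from: data)
        } catch {
            throw APIError.decoding
        }
    }
}
```

With this in place, a grep for `URLSession` outside the client becomes a quick audit check: every hit is a candidate for consolidation.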

None of this shows up in the demo. All of it shows up when you try to grow. This is where the founder gets stuck and where they need a senior iOS engineer who integrates into their team to rebuild the foundations without throwing away the product.

Concurrency and memory in Swift: the failures you don't see until production

This is, by far, the most fragile zone. And it's specific to Swift, because Swift 6's strict concurrency (you'll know this if you've seen what's new in Swift 6.2) forces decisions that AIs get wrong with worrying frequency.

Paul Hudson describes it almost like a mechanical tic: when the AI hits a concurrency problem, you see DispatchQueue.main.async appear an unreasonable number of times, resurrected from pre-async/await days. And Task.sleep(nanoseconds:) where the modern Task.sleep(for:) should go.
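As a rough illustration of the modern equivalents, here is a debounced search sketch. `SearchModel` and its `search` stand-in are hypothetical, and `Task.sleep(for:)` assumes a deployment target of iOS 16 or later.

```swift
import Foundation

// Hypothetical UI-facing model; it is main-actor isolated because it holds UI state.
@MainActor
final class SearchModel {
    private(set) var results: [String] = []
    private var searchTask: Task<Void, Never>?

    func queryChanged(to query: String) {
        // Cancel the previous keystroke's pending search.
        searchTask?.cancel()
        searchTask = Task {
            // Duration-based sleep instead of Task.sleep(nanoseconds:).
            // It throws on cancellation, which is what makes the debounce work.
            do {
                try await Task.sleep(for: .milliseconds(300))
            } catch {
                return
            }
            let found = await Self.search(query)
            guard !Task.isCancelled else { return }
            // This task inherits the main actor: no DispatchQueue.main.async needed.
            results = found
        }
    }

    // Stand-in for real work such as a network call.
    nonisolated static func search(_ query: String) async -> [String] {
        ["\(query) 1", "\(query) 2"]
    }
}
```

The isolation is declared once on the type, so the compiler checks every access to `results` instead of relying on each call site remembering to hop queues.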

But the pattern that worries me most, the one that genuinely makes me raise an eyebrow in an audit, is this:

// The AI hits a Swift 6 data race error and "solves" it like this
nonisolated(unsafe) static var shared = AppState()

nonisolated(unsafe) and @preconcurrency are legitimate escape hatches for very specific cases. But the AI uses them to silence the compiler error, not to fix the problem the compiler was flagging. And here's the key detail you need to understand as a Tech Lead: Swift 6's strict concurrency is your safety net. It detects real data races at compile time, before they reach production. When the AI drops in a nonisolated(unsafe) so it compiles, it isn't solving anything: it's cutting the safety net and leaving the data race alive, waiting to surface as an impossible-to-reproduce crash the moment there's real concurrency.
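What fixing it can look like, as a sketch: give the shared state an owner that serialises access. The contents of `AppState` here (`launchCount`, `recordLaunch`) are invented for illustration, and an actor is only one option; isolating the type to the main actor is another when the state is purely UI-facing.

```swift
// Instead of `nonisolated(unsafe) static var shared`, the state gets an owner.
// An actor serialises access, so the compiler's complaint is resolved, not muted.
actor AppState {
    static let shared = AppState()

    private(set) var launchCount = 0

    func recordLaunch() {
        launchCount += 1
    }
}

// 1,000 concurrent writers and no data race: each call awaits its turn on the actor.
func simulateConcurrentLaunches() async -> Int {
    await withTaskGroup(of: Void.self) { group in
        for _ in 0..<1_000 {
            group.addTask { await AppState.shared.recordLaunch() }
        }
    }
    return await AppState.shared.launchCount
}
```

The price is that callers now have to `await`, which is exactly the design pressure the compiler error was trying to apply.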

Same with @MainActor: you'll see it slapped on entire classes "to make the compiler shut up", forcing work onto the main thread that has no business being there. And the classic retain cycles in closures keep showing up, with self captured strongly where [weak self] belonged, along with their corresponding memory leaks.
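The retain cycle in its smallest form, as a sketch with a hypothetical `ImageLoader`: the object stores a closure, and the closure needs the object.

```swift
final class ImageLoader {
    var onFinish: (() -> Void)?
    private(set) var finishedCount = 0

    func start() {
        // The loader stores the closure and the closure uses the loader.
        // A strong capture of self here would create loader -> closure -> loader,
        // and the loader would never be deallocated.
        onFinish = { [weak self] in
            self?.finishedCount += 1
        }
    }
}
```

In an audit, the quick check is any closure stored in a property (or handed to a long-lived object such as a timer or notification observer) that mentions `self` without a capture list.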

Donny Wals, who has written the book on Swift concurrency and consults on Swift 6 migrations, sums it up bluntly: an AI with good guardrails generates reasonable code for you, sensible layouts, decent flows. But it doesn't know what it feels like to use your app in the real world. It can generate slop very fast. And you don't want to build slop.

And for those who don't code: these are the crashes your team can't reproduce, the ones that come and go in the hands of real users and end up as one-star reviews with nobody able to point to the cause. It's not an engineering detail: it's support on fire and users walking away.

State management in SwiftUI: the pattern the AI gets wrong

If there's one mistake the AI makes over and over in SwiftUI, it's confusing when a view owns an object and when it merely observes it. Sounds trivial. It isn't.

// The mistake that sails through code review
struct WatchlistView: View {
    @ObservedObject private var viewModel = WatchlistViewModel()  // ⚠️
    // ...
}

Antoine van der Lee (SwiftLee) explains it clearly: it's unsafe to create an @ObservedObject inside a view, because SwiftUI can create and recreate views at any time. This is where @StateObject belongs, the one that guarantees the object survives the view's rebuilds. With @ObservedObject, the view model is recreated from scratch at unpredictable moments and the state is lost at random.
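A sketch of the ownership split, with a hypothetical `WatchlistViewModel` body and an invented child view `WatchlistList`: the view that creates the object owns it, and the view that receives it only observes it.

```swift
import SwiftUI

// Hypothetical view model standing in for the one in the snippet above.
final class WatchlistViewModel: ObservableObject {
    @Published var symbols: [String] = []
}

// The view that creates the object owns it: @StateObject keeps one instance
// alive across rebuilds of this view.
struct WatchlistView: View {
    @StateObject private var viewModel = WatchlistViewModel()

    var body: some View {
        WatchlistList(viewModel: viewModel)
    }
}

// A view that receives the object only observes it: @ObservedObject is
// correct here because the parent manages the lifetime.
struct WatchlistList: View {
    @ObservedObject var viewModel: WatchlistViewModel

    var body: some View {
        List(viewModel.symbols, id: \.self) { symbol in
            Text(symbol)
        }
    }
}
```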

And here's the dangerous part, the thing that makes it a time bomb: it works during light testing. It works in the demo. And then it fails at random in production. It's subtle enough to pass a code review without anyone noticing.

The second pattern is treating the @Observable macro (iOS 17+) as a drop-in replacement for ObservableObject. It isn't. Jesse Squires documented a real memory-leak case precisely because of this: an @Observable model is held with @State rather than @StateObject, and @State evaluates its initial value every time the view struct is initialised, whereas @StateObject takes an autoclosure and runs it only once. The model's initialiser therefore fires on every view rebuild. The result was model instances piling up in memory indefinitely, all alive, all observing system notifications and competing to write to UserDefaults. On reopening the app, it loaded essentially random data. Migrating to @Observable "by brute force" without understanding the difference in initialisation isn't a refactor: it's a bug waiting to happen.
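One possible workaround, sketched under assumptions: defer creation until the view appears, so the initialiser runs once per view lifetime. `SettingsModel` and `SettingsView` are hypothetical, and this is not necessarily the fix used in the case described above; creating the model higher up (for example in the App) and passing it down is another option.

```swift
import SwiftUI
import Observation

// Hypothetical model with a side effect in init, standing in for the
// observer registration described above.
@Observable
final class SettingsModel {
    var theme = "system"
    init() { print("SettingsModel.init") }
}

struct SettingsView: View {
    // `@State private var model = SettingsModel()` would run init every time
    // SwiftUI re-creates this struct, even though only the first instance is kept.
    @State private var model: SettingsModel?

    var body: some View {
        Group {
            if let model {
                Text("Theme: \(model.theme)")
            } else {
                ProgressView()
            }
        }
        // Runs when the view appears; the nil check keeps it to a single creation.
        .task {
            if model == nil { model = SettingsModel() }
        }
    }
}
```

A cheap way to detect the original problem in an inherited codebase is the same `print` (or a breakpoint) in the model's initialiser: if it fires while you simply interact with the parent screen, the model is being rebuilt.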

In practice, this is a user losing what they were doing, settings that reset on their own, data showing up wrong when the app reopens. Intermittent failures, impossible to reproduce on the first try, that erode trust just as you start having real users.

Security and data: what slips through when nobody reviews

The AI optimises for the thing to work, and the shortest route to working is almost never the secure route. What I find, in order of frequency:

Hardcoded API keys and secrets in the source code. Extractable from the compiled .ipa by anyone with five minutes and curiosity.
Tokens in UserDefaults instead of the Keychain. UserDefaults is a plain property list with no encryption of its own, and it can be read straight from a device backup. Your user's session token, in plain text, in a backup.

// What the AI writes
UserDefaults.standard.set(authToken, forKey: "authToken")  // ⚠️ unencrypted

// What it should be: Keychain, with the right accessibility
// kSecAttrAccessibleWhenUnlockedThisDeviceOnly, not kSecAttrAccessibleAlways

No certificate pinning, and ATS exceptions (NSAllowsArbitraryLoads) added to make a development endpoint work and never removed.
PII in the logs, written with print or NSLog, or passed to os_log with values explicitly marked public (os_log redacts dynamic strings by default), where anyone with the device connected can read it.
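For the token case, a minimal Keychain sketch using the Security framework directly. `TokenStore` and the `service` and `account` strings are placeholders, and the error handling is reduced to booleans to keep it short; a production wrapper would surface the `OSStatus`.

```swift
import Foundation
import Security

// Minimal Keychain wrapper; the service and account values are placeholders.
enum TokenStore {
    static let service = "com.example.app.auth"
    static let account = "session-token"

    private static var baseQuery: [String: Any] {
        [
            kSecClass as String: kSecClassGenericPassword,
            kSecAttrService as String: service,
            kSecAttrAccount as String: account
        ]
    }

    @discardableResult
    static func save(_ token: String) -> Bool {
        // Remove any previous value so SecItemAdd does not fail with errSecDuplicateItem.
        SecItemDelete(baseQuery as CFDictionary)

        var attributes = baseQuery
        attributes[kSecValueData as String] = Data(token.utf8)
        // Readable only while the device is unlocked, and never restored onto another device.
        attributes[kSecAttrAccessible as String] = kSecAttrAccessibleWhenUnlockedThisDeviceOnly
        return SecItemAdd(attributes as CFDictionary, nil) == errSecSuccess
    }

    static func load() -> String? {
        var query = baseQuery
        query[kSecReturnData as String] = true
        query[kSecMatchLimit as String] = kSecMatchLimitOne

        var result: CFTypeRef?
        guard SecItemCopyMatching(query as CFDictionary, &result) == errSecSuccess,
              let data = result as? Data else { return nil }
        return String(data: data, encoding: .utf8)
    }

    @discardableResult
    static func delete() -> Bool {
        let status = SecItemDelete(baseQuery as CFDictionary)
        return status == errSecSuccess || status == errSecItemNotFound
    }
}
```

Usage is `TokenStore.save(authToken)` at login, `TokenStore.load()` when building a request, and `TokenStore.delete()` at logout. When migrating an existing app, remember to remove the old value from UserDefaults as well.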

There's no serious quantitative study specific to iOS on this, and I'd rather tell you that than invent a figure. But the analyses on AI-generated code in general are consistent: more writing speed, and proportionally more security findings. The phrase that captures it best is from an Apiiro analysis of enterprise repos: AI fixes the typos and creates the time bombs. In an app that handles European users' personal data, that isn't technical debt: it's direct legal exposure under the GDPR.

Before production comes App Review: what Apple rejects even when it compiles

There's a layer of the mirage that arrives even before your users touch the app: App Review. Something compiling and launching in the simulator doesn't mean Apple will approve it.

A related case I've also come across: apps Apple rejects outright. In my experience, the rejections I've seen came from apps that weren't native (web wrappers, basically) and clashed with Guideline 4.2 (Minimum Functionality). Apple's criterion is clear: an app has to offer something more than a repackaged web page. If the experience isn't different enough from opening Safari, it's out. And note, this doesn't mean you can't use web technology or cross-platform frameworks; it means the result has to feel like a genuinely native app, with navigation, notifications, offline behaviour and real integration into the system.

AI-generated native code doesn't usually fall foul of 4.2, because it is native, but it can fall into rejections for other, equally avoidable reasons, and this is where it connects with everything above. The AI optimises for "works in the simulator", not "passes App Review". That translates into use of private or already-deprecated APIs the reviewer spots, interface patterns that ignore the Human Interface Guidelines (section 4.0, Design, of the guidelines), and a complete absence of accessibility. Remember the onTapGesture instead of Button we talked about: as well as breaking VoiceOver, it's the kind of detail that can cost you a rejection.

The underlying pattern is the same as ever: the AI hands you something that looks finished, and "finished" for Apple includes a pile of things the AI doesn't know it has to check.

How to audit AI-generated iOS code before you ship

If you're going to inherit a vibe-coded codebase, this is the minimum run-through I do before calling anything good. It works just as well for validating your own AI-generated code.

Look for the compiler silencers. A grep for nonisolated(unsafe) and @preconcurrency. Every occurrence is a red flag: someone (or something) made Swift 6's strict concurrency go quiet instead of solving the problem. Treat them as priority one.
Audit secret storage. Search every UserDefaults call for anything that smells of a token, session or credential. Search for hardcoded Bearer, API_KEY, sk_, Authorization. Search for NSAllowsArbitraryLoads.
Review state ownership. Every @ObservedObject initialised inside a view (= SomeViewModel()) is almost always a misplaced @StateObject. Every @Observable migration done without thinking about initialisation is suspect.
Look at the layers. Does the view call the network directly? Is there business logic in the body? How many different ways of making an HTTP request coexist in the project?
Count the error handling. Every try! and every force unwrap (!) on a production path is a crash waiting its turn. The AI adds them because they shorten the path to "it works".
Check what the AI chose NOT to write. Empty state, network failure, slow network, background relaunch, maxed-out Dynamic Type, the VoiceOver path, RTL languages. That's Osmani's 30%, and it's where production lives.
Turn on Swift 6 strict concurrency in CI and treat every diagnostic as information, not noise to silence.
Run it through the App Review filter. Look for private or deprecated APIs, check that the interface follows the Human Interface Guidelines and that the app offers real native functionality. Review basic accessibility: VoiceOver, Dynamic Type, correct interactive elements. It's cheaper to find it yourself than to get Apple's rejection.
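The grep-style steps in this run-through can be sketched as a small scanner. This is a heuristic under assumptions: the flag list is a starting point rather than a complete one, `Finding` and `scan` are illustrative names, and every hit still needs a human to decide whether it is a real problem.

```swift
import Foundation

struct Finding: Equatable {
    let line: Int
    let flag: String
}

// Literal substrings worth a human look; a heuristic list, not a complete one.
let redFlags = [
    "nonisolated(unsafe)",
    "@preconcurrency",
    "try!",
    "UserDefaults",
    "NSAllowsArbitraryLoads",
    "DispatchQueue.main.async"
]

// Returns one finding per (line, flag) match, with 1-based line numbers.
func scan(_ source: String) -> [Finding] {
    var findings: [Finding] = []
    for (index, line) in source.components(separatedBy: "\n").enumerated() {
        for flag in redFlags where line.contains(flag) {
            findings.append(Finding(line: index + 1, flag: flag))
        }
        // An @ObservedObject initialised in place is usually a misplaced @StateObject.
        if line.contains("@ObservedObject"), line.contains("=") {
            findings.append(Finding(line: index + 1, flag: "@ObservedObject initialised in view"))
        }
    }
    return findings
}
```

Run over each .swift file's contents (and Info.plist for NSAllowsArbitraryLoads), the output gives a first map of where to start reading, ordered by file and line.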

The golden rule, which I borrow from Simon Willison: don't ship code to production you can't explain to someone else. If nobody on the team can explain what a piece does and why, it isn't finished, however well it compiles.

When an external audit is worth it

You don't always need one. If you have a senior iOS team with the bandwidth to review thoroughly, do it in-house. AI, well steered and with genuine senior review, is an extraordinary accelerator. It is for me every day.

It's worth bringing in someone external when some of these signals show up: you inherited an app built by a non-technical profile and you can't scale it; Apple has rejected your app and you're not sure how to bring it into compliance; every change breaks things that seemed unrelated; nobody on the team can explain why the app behaves the way it does; the app handles sensitive data or payments and nobody has reviewed the security; or you're rushing towards production and you need a second senior opinion before the cost of mistakes spirals.

What an AI does in minutes can take months to fix if nobody looked at the code while it was being written. Today's saving is, far too often, the technical debt of three months from now. Not because the tool is bad, but because productivity without review is an illusion that charges interest.

Frequently asked questions

Why does AI-generated iOS code compile but fail in production?

AI gets you easily to 70% of the solution: the app compiles, passes the tests and runs in the simulator. The remaining 30% (edge cases, error handling, concurrency, security and architectural fit) is exactly where production lives, and it doesn't show its face until the network drops, memory runs out or there are thousands of users at once.

What is nonisolated(unsafe) and why is it a warning sign?

It's an escape hatch that silences Swift 6's strict concurrency errors. The AI uses it to make the code compile, not to solve the problem. When it appears, it almost always means there's a real, unfixed data race waiting to surface as an impossible-to-reproduce crash.

Is it safe to store session tokens in UserDefaults on iOS?

No. UserDefaults isn't encrypted and is read straight from a device backup. Tokens, credentials and secrets should be stored in the Keychain with the right accessibility. Storing a session token in UserDefaults leaves sensitive data in plain text.

How do you audit AI-generated iOS code before shipping?

Look for compiler silencers, review secret storage, check state ownership, see whether the view calls the network directly, count the force unwraps on production paths, and verify what the AI chose not to write: empty states, network failures and accessibility.

When is an external iOS code audit worth it?

When you inherit an app built by a non-technical profile and you can't scale it, when Apple has rejected it, when every change breaks unrelated things, when nobody on the team can explain why the app behaves the way it does, or when it handles sensitive data or payments and nobody has reviewed the security.

Have you inherited an iOS app built with AI and can't scale it, or do you want to validate your code before shipping? At AtalayaSoft we do AI-generated iOS code audits from real-world architecture experience on apps with millions of users. We tell you what's broken, what's urgent and how to fix it without throwing away your product.

Let's talk →

About the author

Francisco José García Navarro

Francisco José García Navarro is the co-founder and Senior iOS Architect at AtalayaSoft, with over 25 years in software development and 11+ in native iOS. Throughout his career he has worked with high-profile clients such as Zara (Inditex), Banco Santander, AXA, El País, National Geographic, Fox International Channels, and the Thyssen-Bornemisza Museum.