Preparing Your
iOS Codebase for AI Agents
Documentation, tooling, and skills for agent-friendly development
Hesham Salman
SAY: "AI coding agents are remarkably capable. They can read your code, write new features, run tests, and fix their own mistakes. But there's a catch — they're only as good as the context you give them. Drop an agent into a bare codebase with no guidance and it will confidently write code that doesn't match your project at all. This talk is about fixing that."
PAUSE briefly, then advance to the next slide.
TARGET TIME: ~30 seconds on this slide.
The Problem
What happens when an agent has no context
Uses XCTest instead of Swift Testing
Calls xcodebuild with wrong flags
Ignores your TCA architecture entirely
Puts files in the wrong directories
The agent isn't dumb.
It just doesn't have your context.
SAY: "Let me paint you a picture. You point an agent at your iOS project and ask it to write a feature."
[CLICK → fragment 1]
SAY: "First thing it does: writes tests in XCTest. Your team moved to Swift Testing six months ago. All the existing tests use the new framework. But the agent defaulted to what it knows from training data."
[CLICK → fragment 2]
SAY: "Then it tries to build. It shells out to xcodebuild with generic flags. Misses the correct scheme, the right simulator, the derived data path your CI depends on. The build fails with a cryptic error."
[CLICK → fragment 3]
SAY: "It writes the feature code as a plain MVVM view model. Your entire app is built on TCA — The Composable Architecture. Reducers, effects, the whole thing. The agent has no idea."
[CLICK → fragment 4]
SAY: "And it drops the new files in the root of the project instead of the correct module directory."
[CLICK → fragment 5]
SAY: "Here's the thing — the agent isn't dumb. It's actually quite capable. It just doesn't have the context that every developer on your team carries in their head. Your conventions, your architecture decisions, your build setup. None of that is in the code itself."
TARGET TIME: ~2 minutes total by end of this slide.
The Three Layers
Each layer builds on the previous
1. Documentation
AGENTS.md hierarchy. The operating contract.
2. Tooling
Makefile + worktrees. The executable interface.
3. Skills
Executable methodology. The how-to guides.
SAY: "The solution comes in three layers."
[CLICK → fragment 1]
SAY: "First, documentation. Specifically, a hierarchy of AGENTS.md files that give the agent its operating contract — naming conventions, architecture decisions, testing philosophy."
[CLICK → fragment 2]
SAY: "Second, tooling. A Makefile that wraps your entire build system into simple commands the agent can actually run. No GUI required."
[CLICK → fragment 3]
SAY: "Third, skills. These are step-by-step workflows that encode your team's methodology. Not just what to do, but how to do it."
SAY: "Each layer builds on the previous one. Documentation tells the agent what matters. Tooling gives it the ability to act. Skills tie it all together into repeatable workflows. Let's go through each one."
TARGET TIME: ~2.5 minutes total.
Hierarchical AGENTS.md
Three levels of context, from broad to specific
Root
The team handbook. Naming conventions, VCS workflow, testing philosophy, troubleshooting guides.
Subsystem
The iOS playbook. Build commands, architecture decisions, banned patterns, platform-specific conventions.
Module
Tribal knowledge. Gotchas only someone working in this directory would know. The things you'd tell a new teammate.
Key principle: The subsystem refines, not replaces, the root.
SAY: "AGENTS.md files work in a hierarchy — just like how your codebase is organized."
[CLICK → fragment 1]
SAY: "At the top is the root AGENTS.md. This is your team handbook. It covers everything that applies everywhere: naming conventions, version control workflow, how to run tests, general troubleshooting steps. Every agent session starts by reading this file."
[CLICK → fragment 2]
SAY: "Below that is the subsystem level. For iOS, this is where you document your build commands, your architecture — TCA, MVVM, whatever you use — banned patterns like force unwraps or singletons, and platform-specific conventions. This file lives in your iOS directory."
[CLICK → fragment 3]
SAY: "At the bottom is the module level. These are small, focused files that capture the kind of knowledge you'd share with a new teammate sitting next to you. 'Hey, this module has a weird dependency ordering issue.' 'The mock setup here requires this specific pattern.' Tribal knowledge, written down."
[CLICK → fragment 4]
SAY: "The key principle: each level refines, not replaces, the parent. The subsystem doesn't restate everything in the root. It adds iOS-specific detail. The module doesn't restate the subsystem. It adds directory-specific gotchas. This keeps files short and avoids contradictions."
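As a sketch, a subsystem-level AGENTS.md that follows the "refines, not replaces" principle might look like this. The file contents are illustrative, not the actual files from the talk:

```markdown
# iOS — AGENTS.md (subsystem)
<!-- Refines the root AGENTS.md; does not restate it. -->

## Build & test
- Build: `make build`. Test: `make test`. (VCS workflow: see root AGENTS.md.)

## Architecture
- All features use TCA (Reducer + Store + Effects). No ad-hoc MVVM view models.

## Banned patterns
- Force unwraps (`!`), singletons, direct `xcodebuild` or `tuist` calls.
```

Note what is absent: nothing here repeats the root-level conventions, and each line is a rule the agent cannot derive from the code itself.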
TARGET TIME: ~4.5 minutes total.
Keep It Lean
More docs ≠ better docs
Shorter docs = more effective.
Every redundant line pushes actual code out of the context window.
SAY: "Here's the counterintuitive lesson. Our first AGENTS.md draft was 800 lines. We were thorough — documented everything. And the agent's performance was mediocre."
SAY: "We cut it to 415 lines. Agent performance improved immediately. Why? Because context windows are finite. Every redundant line of documentation pushes out actual source code the agent needs to see."
[CLICK → fragment 1]
SAY: "The takeaway: shorter documentation is more effective documentation."
TARGET TIME: ~5.5 minutes total.
How to Trim
Four rules for keeping docs lean
1. Remove anything derivable from code
2. Remove anything already in the root
3. Extract reference material, keep the rule inline
4. Replace directory trees with tables
Treat docs like code.
An accessibility section went from 120 lines to 2 lines plus a link. Same information, 98% less context consumed.
[CLICK → fragment 1]
SAY: "Rule one: remove anything the agent can derive from reading the code. Don't document your class hierarchy — the agent can see it."
[CLICK → fragment 2]
SAY: "Rule two: remove anything that's already covered in the root. The subsystem inherits the root. Don't repeat it."
[CLICK → fragment 3]
SAY: "Rule three: extract reference material into separate files and keep just the rule inline. Our accessibility section was 120 lines of examples. We replaced it with two lines: 'All interactive elements must have accessibility labels. See docs/accessibility.md for examples.'"
[CLICK → fragment 4]
SAY: "Rule four: replace ASCII directory trees with compact tables. Same information, half the lines."
[CLICK → fragment 5]
SAY: "Treat your AGENTS.md like you'd treat code — refactor it, keep it lean."
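Rule four in practice might look like this; the module names and purposes are hypothetical:

```markdown
<!-- Instead of a 10+ line ASCII directory tree: -->
| Path              | Purpose                |
|-------------------|------------------------|
| Features/Login    | Auth UI and reducer    |
| Features/Checkout | Purchase flow          |
| Core/Networking   | API client and mocks   |
```

The same map in tree form costs roughly twice the lines once branch characters and nesting are included.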
TARGET TIME: ~6.5 minutes total.
Give Your Agents a Makefile
Agents can't press Cmd+B
make build # Build the Prod variant
make test # Run all tests
make test FILTER=… # Filter to specific tests
make format # Run SwiftFormat
make modules # List all SPM modules
make setup # First-time project setup
Each command encodes the correct flags, simulators, and sequencing.
Miss one flag and you get a cryptic error 3 minutes into a build.
The Makefile IS the institutional knowledge.
Which scheme, which simulator, which derived data path, which Swift flags. All encoded once, used everywhere.
SAY: "Here's a fundamental problem: agents can't use Xcode. They can't click buttons, they can't use the GUI, they can't press Cmd+B. They need a CLI interface. And for most iOS projects, that CLI interface doesn't exist."
SAY: "The solution is a Makefile. Six commands that cover everything an agent needs to do."
SAY: "make build — builds the production variant with the correct scheme, simulator, and derived data path. make test — runs all tests with the right flags. make test with a FILTER argument — targets specific test suites. make format — runs SwiftFormat so the agent's code matches your style. make modules — lists every SPM module in the project, so the agent knows what exists. make setup — handles first-time setup like resolving packages."
[CLICK → fragment 1]
SAY: "This is more important than it sounds. Each command encodes dozens of decisions: the right Xcode scheme, the right simulator device, the correct derived data path, the right Swift compiler flags. Get one wrong and you get an inscrutable error three minutes into a build. The agent wastes time debugging infrastructure instead of writing features."
[CLICK → fragment 2]
SAY: "The Makefile is institutional knowledge made executable. Instead of documenting all the flags in a README that nobody reads, you encode them once and they work every time."
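A minimal sketch of what such a Makefile might encode. The scheme, destination, and paths are placeholder values (the talk doesn't show the real ones), but the xcodebuild flags are the standard ones:

```makefile
# Placeholder values; substitute your real scheme and simulator.
SCHEME       ?= App-Prod
DESTINATION  ?= platform=iOS Simulator,name=iPhone 16
DERIVED_DATA ?= $(PWD)/.build/DerivedData

# Recipe lines must be indented with tabs.
build:
	xcodebuild build -scheme "$(SCHEME)" \
	  -destination "$(DESTINATION)" \
	  -derivedDataPath "$(DERIVED_DATA)"

test:
	xcodebuild test -scheme "$(SCHEME)" \
	  -destination "$(DESTINATION)" \
	  -derivedDataPath "$(DERIVED_DATA)" \
	  $(if $(FILTER),-only-testing:$(FILTER),)
```

The `$(if $(FILTER),…)` conditional is what makes `make test FILTER=MyModuleTests` work with the same target as plain `make test`.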
TARGET TIME: ~7.5 minutes total.
Block Direct Access
Wrapping isn't enough. You need a blocklist.
## IMPORTANT: Always Use Makefile Commands
**Do NOT call these tools directly:**
- `xcodebuild` → use `make build` or `make test`
- `tuist build` → use `make build`
- `tuist test` → use `make test`
- `tuist install` → use `make install`
Without blocklist: Agent calls xcodebuild directly → wrong flags, wrong simulator → cryptic error, 5 minutes wasted debugging.
With blocklist: Agent uses make build → correct flags every time → just works.
The blocklist makes the Makefile the path of least resistance.
Don't rely on the agent choosing the right tool. Remove the wrong tools from its vocabulary.
SAY: "We learned this the hard way. Wrapping your build system in a Makefile isn't enough — you also need to tell the agent NOT to use the underlying tools directly."
SAY: "Without a blocklist, agents will reach for xcodebuild. It's in their training data. They know the API. And they'll use it with generic flags that are wrong for your project."
[CLICK → fragment 1]
SAY: "The difference is dramatic. Without a blocklist, the agent calls xcodebuild with wrong flags, gets a cryptic error, and spends five minutes trying to debug it. With the blocklist, it uses make build and everything works on the first try."
[CLICK → fragment 2]
SAY: "The blocklist makes the Makefile the path of least resistance. You're not fighting the agent's defaults — you're redirecting them."
TARGET TIME: ~8.5 minutes total.
Worktree-Aware Tooling
Parallel agents need isolated environments
# Dynamic simulator per worktree
SIMULATOR = $(shell ./Tools/simulator-clone.sh get \
2>/dev/null || echo "$(BASE_SIMULATOR)")
# Project-local DerivedData
DERIVED_DATA = $(PWD)/.build/DerivedData
1. Each worktree gets its own cloned simulator
2. DerivedData stays project-local, no cross-contamination
3. make build works identically in every worktree
Invisible to the agent.
It runs make build and it just works. The isolation is handled by the Makefile, not by the agent.
SAY: "When you run multiple agents in parallel — and you will — they fight over shared resources. Two agents trying to use the same simulator at the same time will cause failures. Two agents writing to the same DerivedData directory will corrupt each other's builds."
SAY: "The solution is worktree-aware tooling. Each git worktree gets its own isolated environment."
[CLICK → fragment 1]
SAY: "First, each worktree gets its own cloned simulator. The script detects whether you're in a worktree and either creates or reuses a simulator clone. No conflicts."
[CLICK → fragment 2]
SAY: "Second, DerivedData is project-local. It lives inside the worktree directory, not in the shared ~/Library path. No cross-contamination between parallel builds."
[CLICK → fragment 3]
SAY: "Third, make build works identically in every worktree. Same command, same behavior, different isolated environment."
[CLICK → fragment 4]
SAY: "The best part: the agent doesn't know any of this is happening. It runs make build. The Makefile handles the rest. The isolation is infrastructure, not something the agent needs to think about."
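The naming side of that isolation can be sketched in a few lines of shell. Everything here is an assumption about how a script like Tools/simulator-clone.sh might work; the real one also has to create and look up the clone via `xcrun simctl`:

```shell
#!/bin/sh
# Sketch: derive a per-worktree simulator name so parallel agents never
# share a device. BASE_SIMULATOR and the naming scheme are assumptions.
BASE_SIMULATOR="iPhone 16"

worktree_sim_name() {
  # Suffix the base simulator with the worktree directory's basename,
  # giving each worktree a stable, unique device name.
  printf '%s (%s)\n' "$BASE_SIMULATOR" "$(basename "$1")"
}

worktree_sim_name "/repos/app-login-worktree"   # → iPhone 16 (app-login-worktree)
# The real script would then clone on first use:
#   xcrun simctl clone "$BASE_SIMULATOR" "$(worktree_sim_name "$PWD")"
```

Because the name is a pure function of the worktree path, every `make build` in the same worktree resolves to the same device, with no state to track.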
TARGET TIME: ~9.5 minutes total.
Skills: Executable Methodology
Documentation tells agents WHAT. Skills tell them HOW.
1. Research
Read the implementation and understand the domain
2. Analyze gaps
Find coverage gaps and untested paths
3. Find examples
Locate similar tests in the codebase as templates
4. Write tests
Follow templates, use DI, match project conventions
5. Verify
Run make test and fix any failures
6. Review
Check quality, naming, and coverage completeness
Without the skill: agents use XCTest instead of Swift Testing, skip DI, and reinvent mock setup that already exists.
SAY: "Documentation tells agents what your conventions are. Skills tell them how to apply those conventions step by step. Here's our test-writing skill."
[CLICK → fragment 1: steps 1-2 appear]
SAY: "Steps 1 and 2: the agent researches the implementation and analyzes coverage gaps. It reads the code it's about to test and figures out what's missing — untested branches, error paths, edge cases."
[CLICK → fragment 2: steps 3-4 appear]
SAY: "Steps 3 and 4: before writing anything, the agent finds similar tests in the codebase. These become templates. Then it writes the tests following those templates — using your DI patterns, your mock setup, your naming conventions."
[CLICK → fragment 3: steps 5-6 appear]
SAY: "Steps 5 and 6: the agent runs the tests with make test, fixes any failures, and does a quality review. Does the naming match conventions? Is coverage complete?"
[CLICK → fragment 4]
SAY: "Without this skill, agents default to what they know from training data. They use XCTest instead of Swift Testing. They skip dependency injection. They create their own mock objects instead of using the ones your team already built. The skill makes the right approach the easy approach."
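As a sketch, the six steps might be encoded in a skill file something like this; the name and exact wording are illustrative:

```markdown
# Skill: write-tests
Use Swift Testing (`@Test`, `#expect`), never XCTest.

1. Research: read the implementation under test; note its dependencies.
2. Analyze gaps: list untested branches, error paths, and edge cases.
3. Find examples: locate similar tests in this repo; use them as templates.
4. Write tests: follow the templates; inject dependencies; reuse existing mocks.
5. Verify: run `make test FILTER=<suite>`; fix failures before continuing.
6. Review: check naming, coverage, and convention compliance.
```

The skill references the Makefile rather than raw tools, so the tooling layer and the skill layer reinforce each other.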
TARGET TIME: ~11 minutes total.
Design System Compliance
Make the design system the only practical way to style UI
@Environment(\.appTheme) private var theme
Text("Hello")
.themeForeground(.primary)
.themePadding(.md)
.themeCornerRadius(.lg)
1. Token protocols define every visual property: colors, spacing, radii, typography, shadows
2. Convenience modifiers accept semantic levels, not raw values
3. TOKENS.md documents every token with correct/incorrect examples
4. Lint rules flag violations; make build runs them as errors, not warnings
5. Preview + snapshot tests verify the output visually
Three layers of enforcement.
The API makes the right approach easy. Lint rules make the wrong approach a build failure. Previews catch everything else.
SAY: "Here's a subtler problem. Agents can produce UI that compiles and looks plausible but doesn't match your design system. Hardcoded colors, arbitrary spacing, corner radii that are close but not right. These slip through code review because they render fine at first glance."
SAY: "Our approach: make the design system the only practical way to style UI. The architecture does the enforcement, not documentation."
[CLICK → fragment 1: code block appears]
SAY: "Every view accesses the design system through a single environment injection point. Then convenience modifiers like themeForeground, themePadding, and themeCornerRadius accept semantic levels — primary, md, lg — not raw values. There is no Color.blue or .padding(16) in the vocabulary."
[CLICK → fragment 2]
SAY: "The foundation is a protocol layer that defines every visual property as a token: colors, spacing, radii, typography, shadows, stroke widths. A concrete implementation maps each token to your asset catalog values."
[CLICK → fragment 3]
SAY: "The convenience modifiers are the key. They accept semantic levels, not numbers. An agent reaching for inline styles would have to actively fight the API to do it wrong."
[CLICK → fragment 4]
SAY: "A TOKENS.md file in the module serves as the agent's design reference. Every token category documented with correct and incorrect code examples."
[CLICK → fragment 5]
SAY: "Here's the safety net. Custom SwiftLint rules flag direct use of Color, inline padding with literal values, anything that bypasses the token system. In Xcode these are warnings. But make build passes RUN_SWIFTLINT=1 with warnings as errors. So if an agent tries to use Color.blue instead of themeForeground, the build fails. And remember — the blocklist forces agents through the Makefile. They can't sidestep the lint rules by calling xcodebuild directly."
[CLICK → fragment 6]
SAY: "And preview builds plus snapshot tests across multiple configurations — light, dark, large text, accessibility sizes — catch anything that looks wrong even when it passes lint."
[CLICK → fragment 7: takeaway appears]
SAY: "Three layers of enforcement. The API makes the right approach easy. Lint rules make the wrong approach a build failure. And previews catch the rest. Each layer catches what the previous one misses."
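A sketch of what one such SwiftLint rule could look like; the rule name, regex, and message are illustrative, not the talk's actual configuration:

```yaml
# .swiftlint.yml (excerpt)
custom_rules:
  no_raw_colors:
    name: "Use theme tokens"
    regex: '\bColor\.(blue|red|green|black|white|gray)\b'
    message: "Use .themeForeground(_:) and theme tokens, not raw Color values."
    severity: warning   # promoted to an error by running SwiftLint with --strict in make build
```

One regex rule per banned pattern (raw colors, literal padding, hardcoded corner radii) keeps each violation message specific enough for an agent to self-correct.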
TARGET TIME: ~13 minutes total.
Visual Verification
Compiling is not the same as working
1. Screenshot
Capture the screen to see what's rendered
2. View hierarchy
Snapshot the element tree for structure analysis
3. Capture logs
Check for runtime errors and warnings not visible in the UI
<!-- Simplified view hierarchy snapshot -->
<AXElement type="button" label="Submit" enabled="true">
<AXElement type="text" value="Submit Order" />
</AXElement>
<AXElement type="image" label="" enabled="true" /> ⚠️ Missing label!
Side effect: missing accessibility labels fail the verification.
The agent reviews evidence, not just compilation.
Observe, don't tap. Built on XcodeBuildMCP.
SAY: "Here's a gap most teams don't think about: just because code compiles and tests pass doesn't mean the UI is correct. An agent can add a button that's behind a navigation bar, or a label that's truncated. It compiles fine. Tests pass. But it's broken."
[CLICK → fragment 1]
SAY: "Step one: screenshot the simulator screen. The agent can literally see what was rendered."
[CLICK → fragment 2]
SAY: "Step two: snapshot the view hierarchy. This gives the agent the element tree — every view, its type, its position, its accessibility properties."
[CLICK → fragment 3]
SAY: "Step three: capture logs. Runtime warnings, constraint violations, anything the system is complaining about."
[CLICK → fragment 4: code block + accessibility callout appear together]
SAY: "Here's what a simplified view hierarchy snapshot looks like. The agent can see there's a button with a proper accessibility label, and an image with a missing label — that's a bug it can catch and fix. And here's a great side effect: missing accessibility labels automatically fail the verification. Accessibility isn't a separate audit anymore — it's built into the agent's workflow."
[CLICK → fragment 5: takeaway appears]
SAY: "The key idea: the agent reviews evidence, not just compilation results. It observes but doesn't tap — it's looking at the output, not interacting with it. This is built on XcodeBuildMCP, which provides the bridge between the agent and the simulator."
TARGET TIME: ~14.5 minutes total.
Self-Maintaining Docs
The feedback loop that keeps documentation honest
"AGENTS.md files are living documents. Update them when you discover undocumented conventions, encounter patterns not covered by existing guidance, or find that current instructions lead to mistakes."
The guardrail: Every change must leave the document shorter or more useful.
Docs stay current because the entities reading them are also updating them.
Unlike human documentation that gets written once and slowly drifts, agent documentation evolves with every session.
SAY: "This is the most underrated part of the whole system. Documentation that doesn't evolve is documentation that lies. And agent documentation gets read at the start of every single session — so stale docs cause failures immediately, not six months later."
[CLICK → fragment 1]
SAY: "We include this instruction directly in the AGENTS.md itself. The agents are told: if you discover something undocumented, if existing instructions led you astray, update the docs. The documentation becomes a living document maintained by its own readers."
[CLICK → fragment 2]
SAY: "But we need a guardrail. Without one, docs grow forever. So we add this constraint: every change must leave the document shorter or more useful. You can add a new rule, but you should also look for something to remove or consolidate. This keeps the docs from bloating over time."
[CLICK → fragment 3]
SAY: "The result is a feedback loop. Agents read the docs, discover gaps, and fill them in. The next agent session benefits from the update. Over time, the documentation converges on exactly what agents need to know — no more, no less."
TARGET TIME: ~15 minutes total.
Takeaways
Context > capability.
Agents are only as good as the context you give them. An agent with good docs, a Makefile,
and skills outperforms a "smarter" agent that's flying blind.
Make the right approach the easy approach.
Blocklists, skills, templates. Don't fight the agent's defaults; redirect them.
When the correct workflow is also the path of least resistance, the agent follows it naturally.
Documentation is infrastructure.
Unlike human docs that get read once, agent docs get read at the start of every session.
It's the highest-leverage investment you can make in your agent workflow.
SAY: "Let me leave you with three takeaways."
[CLICK → fragment 1: first takeaway appears]
SAY: "First: context beats capability. A well-equipped agent with good documentation, a solid Makefile, and clear skills will outperform a more capable agent that's flying blind. Invest in context."
[CLICK → fragment 2: second takeaway appears]
SAY: "Second: make the right approach the easy approach. Blocklists remove wrong choices. Skills encode right choices. Templates make right choices fast. You're not fighting the agent — you're shaping its environment so the right thing happens naturally."
[CLICK → fragment 3: third takeaway appears]
SAY: "Third: documentation is infrastructure. This is the mental shift. Human documentation gets written once and maybe read twice. Agent documentation gets read at the start of every single session. Every improvement to your AGENTS.md pays dividends on every future task. It's the highest-leverage investment you can make."
PAUSE.
SAY: "Thank you."
TARGET TIME: ~16 minutes total.
Read more on Sunday Swift
Thank you