The Hidden Cost of Apple’s Diffable Data Sources
When Apple introduced UICollectionViewDiffableDataSource, it solved a real problem: you describe your data as an immutable snapshot, hand it to the data source, and it figures out what changed and animates the difference. No more manual performBatchUpdates calls, no more index math, no more crashes from inconsistent state. The model is good.
But as I started using it in production (feeds with thousands of items, settings screens with dynamic sections, catalog views backed by network responses), I kept hitting the same friction points. Building a snapshot with 10,000 items takes over a millisecond. Querying itemIdentifiers a hundred times takes 46 milliseconds. NSDiffableDataSourceSnapshot is backed by an Objective-C class (NSDiffableDataSourceSnapshotReference) bridged to Swift, so you pay for message dispatch, reference counting, and bridging overhead on every operation. It’s not Sendable, so building a snapshot off the main thread means working around the type system rather than with it.
None of these are dealbreakers in isolation. But they compound, especially in SwiftUI or TCA apps where the time spent resolving a snapshot is enough to trigger an observation cascade. What starts as a slow snapshot apply becomes a chain of state updates and view invalidations that snowball into visible hangs. I wanted to keep the snapshot model while fixing the performance and concurrency story underneath it. Lists is the result: a pure-Swift diffable data source framework that’s API-compatible with Apple’s types but built on a different foundation.
Architecture: Two Libraries, One Goal
Lists is split into two libraries with a deliberate separation of concerns:
ListKit is the engine. It contains the O(n) Heckel diff algorithm, the custom CollectionViewDiffableDataSource, and the snapshot types (DiffableDataSourceSnapshot and DiffableDataSourceSectionSnapshot). If you want full control over your collection view data source, you use ListKit directly. It’s a drop-in replacement for Apple’s types with the same API shape.
Lists is the convenience layer built on top of ListKit. It provides the CellViewModel protocol, pre-built list configurations (SimpleList, GroupedList, OutlineList), result builder DSLs for declarative snapshot construction, mixed cell type support via AnyItem, and first-class SwiftUI wrappers. If you want to go from zero to a working list with minimal code, this is where you start.
This separation matters. ListKit has no opinions about how you structure your cells or manage your data. Lists has strong opinions about both. By layering them, you can adopt the performance wins without buying into the convenience abstractions, or use both together.
API Design: CellViewModel and the Data-Driven Model
The core abstraction in Lists is CellViewModel:
public protocol CellViewModel: Hashable, Sendable {
    associatedtype Cell: UICollectionViewCell
    @MainActor func configure(_ cell: Cell)
}
This is a deliberately small protocol. A CellViewModel knows two things: what cell class it configures, and how to configure it. That’s it. The framework handles registration, dequeuing, and reuse automatically. You never call register(_:forCellWithReuseIdentifier:).
Here’s what a real implementation looks like:
struct ContactItem: CellViewModel, Identifiable {
    let id: String
    let name: String
    let subtitle: String

    func configure(_ cell: UICollectionViewListCell) {
        var content = cell.defaultContentConfiguration()
        content.text = name
        content.secondaryText = subtitle
        cell.contentConfiguration = content
    }
}
The Identifiable conformance here is doing more than you’d expect. When a CellViewModel also conforms to Identifiable, Lists automatically synthesizes Hashable and Equatable based on id only. This is an intentional design choice: the diff algorithm uses identity to match elements between old and new snapshots, and content changes (like updating a subtitle) should trigger a reconfigureItems() call rather than a delete-and-insert. This matches how UICollectionView is designed to work. Identity-based diffing is faster than content-based diffing and produces better animations.
The trade-off is subtle but important: two items with the same id but different content are considered equal by the diff. If you change a contact’s name, the diff won’t detect it as a change. You need to explicitly call reconfigureItems() for content-only updates. This is documented and intentional, but it’s worth understanding before you adopt the pattern.
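To make the trade-off concrete, here is a minimal sketch in plain Swift with no dependency on Lists. The `Contact` type and the `idsNeedingReconfigure` helper are hypothetical names invented for this illustration, and the identity-only equality is written by hand here, whereas the text above says Lists synthesizes it for you.

```swift
// Hypothetical model with identity-only equality, mirroring what the text
// says Lists synthesizes for Identifiable view models.
struct Contact: Identifiable, Hashable {
    let id: String
    var name: String
    var subtitle: String

    static func == (lhs: Contact, rhs: Contact) -> Bool { lhs.id == rhs.id }
    func hash(into hasher: inout Hasher) { hasher.combine(id) }

    // Separate content comparison, used only to decide what to reconfigure.
    func hasSameContent(as other: Contact) -> Bool {
        name == other.name && subtitle == other.subtitle
    }
}

// Returns the ids whose identity survived but whose content changed.
// (Hypothetical helper; these are the items you would reconfigure.)
func idsNeedingReconfigure(old: [Contact], new: [Contact]) -> [String] {
    let oldByID = Dictionary(old.map { ($0.id, $0) }, uniquingKeysWith: { first, _ in first })
    return new.compactMap { item in
        guard let previous = oldByID[item.id] else { return nil }
        return previous.hasSameContent(as: item) ? nil : item.id
    }
}

let before = [Contact(id: "1", name: "Alice", subtitle: "Engineering")]
let after = [Contact(id: "1", name: "Alice", subtitle: "Design")]

let looksUnchangedToDiff = (before == after)   // identity-only equality
let changedIDs = idsNeedingReconfigure(old: before, new: after)
```

The arrays compare equal even though the subtitle changed, which is exactly why a separate content check has to decide what gets reconfigured.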
Result Builder DSL
Building snapshots imperatively is verbose. You need to create an empty snapshot, append sections, append items to sections, and then apply. The result builder DSL replaces this with declarative syntax:
await dataSource.apply {
    SnapshotSection("favorites") {
        ContactItem(id: "1", name: "Alice", subtitle: "Engineering")
        ContactItem(id: "2", name: "Bob", subtitle: "Design")
    }
    if showRecent {
        SnapshotSection("recent") {
            recentContacts
        }
    }
    for team in teams {
        SnapshotSection(team.id) {
            team.members.map { member in
                ContactItem(id: member.id, name: member.name, subtitle: team.name)
            }
        }
    }
}
This supports if/else, for loops, optional chaining, and array passthrough. The builder produces the same DiffableDataSourceSnapshot you’d build imperatively. It’s syntax sugar, not a different data model.
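For readers who have not written a result builder, the following is a rough sketch of the mechanism that makes `if`, `for`, and array passthrough work. It is not the Lists implementation: `SectionSketch`, `SectionsBuilder`, `buildSections`, and `contactSections` are hypothetical names, and items are plain strings to keep the example self-contained.

```swift
// Hypothetical stand-in for a snapshot section; not the Lists type.
struct SectionSketch: Equatable {
    let id: String
    var items: [String]
}

@resultBuilder
enum SectionsBuilder {
    // Lifts a single section into the builder's component type.
    static func buildExpression(_ section: SectionSketch) -> [SectionSketch] { [section] }
    // Array passthrough: an existing array of sections is accepted as-is.
    static func buildExpression(_ sections: [SectionSketch]) -> [SectionSketch] { sections }
    // Concatenates the statements of one block in source order.
    static func buildBlock(_ parts: [SectionSketch]...) -> [SectionSketch] { parts.flatMap { $0 } }
    // Handles `if` without `else`.
    static func buildOptional(_ part: [SectionSketch]?) -> [SectionSketch] { part ?? [] }
    // Handle the two branches of `if`/`else`.
    static func buildEither(first part: [SectionSketch]) -> [SectionSketch] { part }
    static func buildEither(second part: [SectionSketch]) -> [SectionSketch] { part }
    // Handles `for` loops.
    static func buildArray(_ parts: [[SectionSketch]]) -> [SectionSketch] { parts.flatMap { $0 } }
}

func buildSections(@SectionsBuilder _ content: () -> [SectionSketch]) -> [SectionSketch] {
    content()
}

func contactSections(showRecent: Bool, teams: [String]) -> [SectionSketch] {
    buildSections {
        SectionSketch(id: "favorites", items: ["Alice", "Bob"])
        if showRecent {
            SectionSketch(id: "recent", items: ["Carol"])
        }
        for team in teams {
            SectionSketch(id: team, items: [])
        }
    }
}

let builtIDs = contactSections(showRecent: false, teams: ["ios", "web"]).map(\.id)
```

The builder only produces an ordinary array of values, which is the sense in which the DSL is syntax sugar over the imperative path.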
Mixed Cell Types
Real apps rarely have lists with a single cell type. A settings screen might have toggle cells, navigation cells, and detail cells. A social feed mixes text posts, image posts, and ads.
Lists handles this through AnyItem, a type-erased wrapper:
let dataSource = MixedListDataSource<SectionID>(collectionView: collectionView)
await dataSource.apply {
    MixedSection(.banner) {
        BannerItem(title: "Summer Sale!")
    }
    MixedSection(.products) {
        ProductItem(name: "Laptop", price: "$999")
        ProductItem(name: "Phone", price: "$699")
    }
}
AnyItem uses a few tricks to keep overhead low. Hash values are precomputed at construction time, so hashing an AnyItem feeds a single stored integer to the hasher rather than dispatching through a witness table. Cross-type equality uses ObjectIdentifier as a fast-reject: if two AnyItem wrappers hold different concrete types, equality returns false immediately via a single pointer comparison. The actual equality closure is only invoked for same-type comparisons.
The overhead is measurable (about 1.5x on diffing versus concrete types), but the absolute numbers stay well within a 16ms frame budget even at 10,000 items.
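The two tricks are easier to see in code. This is a hedged sketch of a type-erased wrapper in plain Swift; `ErasedItem`, `Banner`, and `Product` are hypothetical names, and the real AnyItem may well be structured differently (for one thing, this sketch makes no attempt at Sendable).

```swift
// Hypothetical type-erased wrapper illustrating a cached hash and an
// ObjectIdentifier fast-reject; not the actual AnyItem.
struct ErasedItem: Hashable {
    private let base: Any
    private let typeID: ObjectIdentifier        // identifies the concrete type
    private let cachedHash: Int                 // computed once, at construction
    private let isEqualToBase: (Any) -> Bool    // only reached for same-type pairs

    init<T: Hashable>(_ value: T) {
        base = value
        typeID = ObjectIdentifier(T.self)
        cachedHash = value.hashValue
        isEqualToBase = { other in
            guard let typed = other as? T else { return false }
            return typed == value
        }
    }

    static func == (lhs: ErasedItem, rhs: ErasedItem) -> Bool {
        // Fast reject: different concrete types can never be equal.
        guard lhs.typeID == rhs.typeID else { return false }
        return lhs.isEqualToBase(rhs.base)
    }

    func hash(into hasher: inout Hasher) {
        // Feed the stored integer; no dynamic call into the wrapped value.
        hasher.combine(cachedHash)
    }
}

struct Banner: Hashable { let title: String }
struct Product: Hashable { let name: String }

let mixed = Set([
    ErasedItem(Banner(title: "Sale")),
    ErasedItem(Product(name: "Laptop")),
    ErasedItem(Product(name: "Laptop")),   // duplicate of the previous entry
])
let crossTypeEqual = ErasedItem(Banner(title: "x")) == ErasedItem(Product(name: "x"))
```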
Pre-Built Configurations
For common patterns, Lists provides three configurations that handle layout, data source setup, and delegate management:
// Flat list
let list = SimpleList<ContactItem>(appearance: .plain)
list.onSelect = { contact in showDetail(contact) }
await list.setItems(contacts)

// Sectioned list with headers
let grouped = GroupedList<String, ContactItem>(appearance: .insetGrouped)
await grouped.setSections([
    SectionModel(id: "A", items: aContacts, header: "A"),
    SectionModel(id: "B", items: bContacts, header: "B"),
])

// Hierarchical outline
let outline = OutlineList<FileItem>(appearance: .sidebar)
await outline.setItems([
    OutlineItem(item: folder, children: [
        OutlineItem(item: file1, children: []),
        OutlineItem(item: file2, children: []),
    ])
])
Each configuration exposes onSelect, swipe action providers, and context menu providers. These are closure-based APIs that receive your typed model, not an IndexPath. You never have to manually resolve an IndexPath to an item.
Internally, swipe actions are handled by a SwipeActionBridge, a reference-type bridge that solves a classic Swift initialization ordering problem. The layout configuration needs a closure that references the data source, but the data source isn’t available until after super.init(). The bridge is created before super.init(), captured by the layout closure, and populated with the actual data source afterward. This is the kind of plumbing that’s invisible to the caller but required for correctness.
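The ordering problem is general enough to show without UIKit. The sketch below uses hypothetical stand-ins (`ItemBridge`, `LayoutHost`, `ListSketch`) and assumes the superclass needs its closure at init time, the way a layout configuration needs its swipe-actions provider. It illustrates the pattern only and is not the SwipeActionBridge source.

```swift
// Hypothetical reference-type bridge: created empty, filled in after init.
final class ItemBridge<Item> {
    var itemAt: ((Int) -> Item?)?
}

// Stand-in for a superclass that requires its closure during initialization.
class LayoutHost {
    let swipeTitleProvider: (Int) -> String?
    init(swipeTitleProvider: @escaping (Int) -> String?) {
        self.swipeTitleProvider = swipeTitleProvider
    }
}

final class ListSketch: LayoutHost {
    private var items: [String]
    private let bridge: ItemBridge<String>

    init(items: [String]) {
        let bridge = ItemBridge<String>()   // 1. created before super.init()
        self.items = items
        self.bridge = bridge
        super.init(swipeTitleProvider: { [bridge] index in
            // 2. the closure captures the bridge, never `self`
            guard let item = bridge.itemAt?(index) else { return nil }
            return "Delete \(item)"
        })
        // 3. populated once `self` is fully initialized
        bridge.itemAt = { [weak self] index in
            guard let list = self, list.items.indices.contains(index) else { return nil }
            return list.items[index]
        }
    }
}

let list = ListSketch(items: ["Alice", "Bob"])
let titleForBob = list.swipeTitleProvider(1)
let titleOutOfRange = list.swipeTitleProvider(5)
```

Capturing `self` weakly in step 3 keeps the object from retaining itself through its own bridge.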
SwiftUI Integration
Each pre-built configuration has a corresponding SwiftUI wrapper: SimpleListView, GroupedListView, and OutlineListView. These are UIViewRepresentable types driven by @State:
struct ContactListView: View {
    @State var contacts: [ContactItem]

    var body: some View {
        SimpleListView(
            items: contacts,
            onSelect: { contact in print(contact.name) },
            trailingSwipeActionsProvider: { contact in
                UISwipeActionsConfiguration(actions: [
                    UIContextualAction(style: .destructive, title: "Delete") { _, _, done in
                        contacts.removeAll { $0.id == contact.id }
                        done(true)
                    }
                ])
            }
        )
    }
}
There’s also an inline content API that lets you skip CellViewModel entirely for simple cases:
SimpleListView(items: fruits) { fruit in
    Text(fruit)
        .font(.body)
}
This uses InlineCellViewModel under the hood, a type-erased CellViewModel that wraps a @ViewBuilder closure. It’s convenient for prototyping and simple lists, but for production cells with complex configuration, a proper CellViewModel type gives you better reuse and testability.
Pull-to-refresh is supported via an onRefresh async closure that automatically manages the UIRefreshControl lifecycle:
SimpleListView(
    items: contacts,
    onRefresh: {
        contacts = await fetchContacts()
    }
) { contact in
    Text(contact.name)
}
The Heckel Diff: O(n) and Why It Matters
The heart of ListKit’s performance is its implementation of Paul Heckel’s 1978 diff algorithm [1]. This is the same algorithm used by IGListKit, but the implementations diverge significantly in structure and performance.
How the Algorithm Works
The Heckel diff runs in six passes over the old and new arrays:
- Pass 1: Scan the new array, building a symbol table that counts occurrences of each element.
- Pass 2: Scan the old array, updating the symbol table with old-side occurrences.
- Pass 3: Match unique elements, meaning items that appear exactly once in both arrays. These are anchor points for the diff.
- Pass 4: Forward expansion from matched elements. If element i in the old array matched element j in the new array, check if i+1 matches j+1.
- Pass 5: Backward expansion. Same idea, but checking i-1 against j-1.
- Pass 6: Collect results. Categorize everything into deletes, inserts, moves, and matched pairs.
The key insight is that passes 4 and 5 propagate matches outward from the unique anchors. Even if an element appears multiple times (and therefore can’t be uniquely matched in pass 3), it can still be matched by adjacency to a unique neighbor. This makes the algorithm robust to duplicates while maintaining O(n) time complexity.
The result is a DiffResult containing:
- deletes: Indices in the old array that have no counterpart in the new array.
- inserts: Indices in the new array that have no counterpart in the old array.
- moves: Pairs of (from, to) indices for elements that exist in both arrays but changed position.
- matched: Pairs of (old, new) indices for all elements that exist in both arrays.
Every old index appears exactly once in either deletes or matched. Every new index appears exactly once in either inserts or matched. This invariant is critical. Violating it crashes UICollectionView during batch updates.
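The six passes and the result invariant can be sketched in well under a hundred lines of plain Swift. This is an illustrative reconstruction from the description above, not ListKit's source: `Symbol` and `heckelDiff` are hypothetical names, it uses a dictionary-backed symbol table, and it reports a move for every matched pair whose index differs, a simplification that production implementations usually refine.

```swift
struct DiffResult {
    var deletes: [Int] = []
    var inserts: [Int] = []
    var moves: [(from: Int, to: Int)] = []
    var matched: [(old: Int, new: Int)] = []
}

// One shared entry per distinct element (a row of the symbol table).
final class Symbol {
    var oldCount = 0
    var newCount = 0
    var lastOldIndex = -1   // only meaningful when oldCount == 1
}

func heckelDiff<T: Hashable>(old: [T], new: [T]) -> DiffResult {
    var table: [T: Symbol] = [:]
    func symbol(for element: T) -> Symbol {
        if let existing = table[element] { return existing }
        let created = Symbol()
        table[element] = created
        return created
    }

    // Pass 1: count occurrences on the new side.
    var newSymbols: [Symbol] = []
    for element in new {
        let entry = symbol(for: element)
        entry.newCount += 1
        newSymbols.append(entry)
    }
    // Pass 2: count occurrences on the old side and remember positions.
    var oldSymbols: [Symbol] = []
    for (index, element) in old.enumerated() {
        let entry = symbol(for: element)
        entry.oldCount += 1
        entry.lastOldIndex = index
        oldSymbols.append(entry)
    }
    // Pass 3: anchor elements that occur exactly once on both sides.
    var newToOld = [Int?](repeating: nil, count: new.count)
    var oldToNew = [Int?](repeating: nil, count: old.count)
    for (j, entry) in newSymbols.enumerated() where entry.oldCount == 1 && entry.newCount == 1 {
        newToOld[j] = entry.lastOldIndex
        oldToNew[entry.lastOldIndex] = j
    }
    // Pass 4: forward expansion from each match to its unmatched successor.
    for j in new.indices.dropLast() {
        guard let i = newToOld[j], i + 1 < old.count,
              newToOld[j + 1] == nil, oldToNew[i + 1] == nil,
              newSymbols[j + 1] === oldSymbols[i + 1] else { continue }
        newToOld[j + 1] = i + 1
        oldToNew[i + 1] = j + 1
    }
    // Pass 5: backward expansion from each match to its unmatched predecessor.
    for j in new.indices.dropFirst().reversed() {
        guard let i = newToOld[j], i > 0,
              newToOld[j - 1] == nil, oldToNew[i - 1] == nil,
              newSymbols[j - 1] === oldSymbols[i - 1] else { continue }
        newToOld[j - 1] = i - 1
        oldToNew[i - 1] = j - 1
    }
    // Pass 6: collect results.
    var result = DiffResult()
    for (i, j) in oldToNew.enumerated() where j == nil { result.deletes.append(i) }
    for (j, i) in newToOld.enumerated() {
        guard let i = i else { result.inserts.append(j); continue }
        result.matched.append((old: i, new: j))
        if i != j { result.moves.append((from: i, to: j)) }   // simplified move rule
    }
    return result
}

let diff = heckelDiff(old: ["a", "b", "c"], new: ["a", "c", "d"])
let duplicates = heckelDiff(old: ["x", "a", "x"], new: ["a", "x", "x"])
```

In the second call, "x" appears twice on both sides and cannot be anchored in pass 3, yet one occurrence is still matched in pass 4 by adjacency to the unique "a".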
How This Differs from IGListKit
Both ListKit and IGListKit implement the Heckel algorithm. But the similarity ends at the algorithmic level. The implementations are structurally different, and those structural differences drive significant performance gaps.
Language and dispatch overhead. IGListKit is Objective-C++. Every element comparison goes through isEqual: and hash, which are Objective-C message sends. ListKit is pure Swift with Hashable conformance, which compiles down to direct function calls. For 10,000 element comparisons, this alone accounts for measurable overhead.
Data structure choices. ListKit uses ContiguousArray for its internal symbol and entry tables. ContiguousArray guarantees that elements are stored in a contiguous memory block with no bridging overhead to NSArray. This gives better cache locality: when the CPU fetches one element, the next several elements are likely already in the cache line. IGListKit uses NSArray and NSDictionary, which have pointer-chase overhead and potential bridging costs.
Flat vs. two-level diffing. This is the biggest architectural difference. IGListKit diffs a flat array of id<IGListDiffable> objects. If you have sections, you flatten everything into a single array and manage section boundaries yourself through IGListSectionController. ListKit uses a two-level diff: it diffs section identifiers first, then diffs items within each surviving section, and finally reconciles cross-section moves.
This two-level structure has a massive performance implication: unchanged sections skip the item diff entirely. If you have 100 sections and update 3 of them, ListKit runs the Heckel algorithm on 3 sections. IGListKit runs it on the entire flattened array. In the common case of a no-change snapshot apply (e.g., re-applying the same data after a failed network request), ListKit detects array equality per-section and returns in microseconds. IGListKit still walks the full array.
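A sketch of the skip logic, under the assumption that an unchanged section can be detected with a plain array equality check before any diff state is built. `SectionedSketch` and `sectionsNeedingItemDiff` are hypothetical names for this illustration, and the sketch only decides which sections need an item diff; it does not run one.

```swift
// Hypothetical sectioned snapshot: parallel arrays of ids and item lists.
struct SectionedSketch {
    var sectionIDs: [String]
    var items: [[Int]]
}

// Returns the ids of surviving sections whose items need an item-level diff.
// Inserted or removed sections belong to the section-level diff, and
// sections whose item arrays compare equal are skipped entirely.
func sectionsNeedingItemDiff(old: SectionedSketch, new: SectionedSketch) -> [String] {
    var oldPosition: [String: Int] = [:]
    for (index, id) in old.sectionIDs.enumerated() { oldPosition[id] = index }

    var needsDiff: [String] = []
    for (newIndex, id) in new.sectionIDs.enumerated() {
        guard let oldIndex = oldPosition[id] else { continue }   // inserted section
        if old.items[oldIndex] != new.items[newIndex] {          // equality scan, no symbol table
            needsDiff.append(id)
        }
    }
    return needsDiff
}

let oldFeed = SectionedSketch(sectionIDs: ["a", "b", "c"], items: [[1, 2], [3, 4], [5]])
let newFeed = SectionedSketch(sectionIDs: ["a", "b", "c"], items: [[1, 2], [3, 4, 6], [5]])
let dirtySections = sectionsNeedingItemDiff(old: oldFeed, new: newFeed)
let noChange = sectionsNeedingItemDiff(old: oldFeed, new: oldFeed)
```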
Here are the benchmark numbers:
| Operation | IGListKit | ListKit | Speedup |
|---|---|---|---|
| Diff 10k items (50% overlap) | 10.8 ms | 3.9 ms | 2.8x |
| Diff 50k items (50% overlap) | 55.4 ms | 19.6 ms | 2.8x |
| Diff no-change 10k | 9.5 ms | 0.09 ms | 106x |
| Diff shuffle 10k | 9.8 ms | 3.2 ms | 3.1x |
The 106x speedup on no-change is the standout. Applying a snapshot that hasn’t changed is common in reactive architectures where state changes trigger full rebuilds regardless of whether the list data actually changed. Making this case effectively free changes the performance calculus for how aggressively you can re-apply snapshots.
Cross-Section Move Detection
One thing that IGListKit simply can’t do without significant manual bookkeeping is cross-section moves. If an item moves from section A to section B, IGListKit sees a delete from the flattened position and an insert at a different flattened position. It can’t tell the difference between “item moved” and “old item was deleted, new item was inserted.”
ListKit’s SectionedDiff handles this automatically. After computing per-section diffs, it scans for items that were deleted from one section and inserted into another. If the item identities match, it converts the delete+insert pair into a move. This produces correct animations: the cell slides from its old position to its new position rather than fading out and back in.
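The reconciliation step can be sketched as a pairing of leftover deletes and inserts by identity. This is a simplified illustration with string identifiers; `ItemPath`, `CrossSectionResult`, and `reconcileCrossSectionMoves` are hypothetical names, not the SectionedDiff API.

```swift
struct ItemPath: Equatable {
    let section: Int
    let index: Int
}

struct CrossSectionResult {
    var moves: [(id: String, from: ItemPath, to: ItemPath)] = []
    var deletes: [String: ItemPath] = [:]
    var inserts: [String: ItemPath] = [:]
}

// Pairs up per-section deletes and inserts that share an identity and
// turns each pair into a single move. (Hypothetical helper.)
func reconcileCrossSectionMoves(deletes: [String: ItemPath],
                                inserts: [String: ItemPath]) -> CrossSectionResult {
    var result = CrossSectionResult(deletes: deletes, inserts: inserts)
    for id in deletes.keys.sorted() {            // sorted only for deterministic output
        guard let from = deletes[id], let to = inserts[id] else { continue }
        result.moves.append((id: id, from: from, to: to))
        result.deletes[id] = nil                 // no longer a plain delete
        result.inserts[id] = nil                 // no longer a plain insert
    }
    return result
}

let reconciled = reconcileCrossSectionMoves(
    deletes: ["alice": ItemPath(section: 0, index: 2), "bob": ItemPath(section: 0, index: 3)],
    inserts: ["alice": ItemPath(section: 1, index: 0), "carol": ItemPath(section: 1, index: 1)]
)
```

Here "alice" becomes a move from section 0 to section 1, while "bob" stays a delete and "carol" stays an insert.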
Performance: Where the Wins Come From
Beyond the diff algorithm itself, ListKit’s snapshot operations are dramatically faster than Apple’s.
| Operation | ListKit | Apple | Speedup |
|---|---|---|---|
| Build 10k items | 0.002 ms | 1.223 ms | 753x |
| Build 50k items | 0.006 ms | 6.010 ms | 1,045x |
| Build 100 sections x 100 items | 0.060 ms | 3.983 ms | 66x |
| Query itemIdentifiers 100x | 0.051 ms | 46.364 ms | 908x |
| Reload 5k items | 0.099 ms | 1.547 ms | 16x |
These numbers come from three design decisions:
Flat Array Storage
DiffableDataSourceSnapshot stores data in parallel arrays: one ContiguousArray for section identifiers and one ContiguousArray of item arrays, one per section. There are no dictionaries, no hash maps on the critical read path. When UICollectionView asks “what item is at this index path?”, the answer is a double array subscript: O(1) with no hashing.
Apple’s implementation appears to maintain internal hash maps for reverse lookups (item-to-section, item-to-index), which are rebuilt on mutation. This explains the enormous gap on query operations: 908x on itemIdentifiers suggests Apple is doing O(n) work (or at least hash-table-traversal work) on every call.
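As a rough picture of the parallel-array layout, here is a minimal sketch. `FlatSnapshot` and its members are hypothetical names chosen for this example; the only point being made is that an index-path lookup is two array subscripts.

```swift
// Hypothetical flat storage: one array of section ids, one array of
// per-section item arrays, kept in the same order.
struct FlatSnapshot<Section: Hashable, Item: Hashable> {
    private(set) var sectionIDs: ContiguousArray<Section> = []
    private(set) var itemsBySection: ContiguousArray<ContiguousArray<Item>> = []

    mutating func appendSection(_ id: Section, items: [Item]) {
        sectionIDs.append(id)
        itemsBySection.append(ContiguousArray(items))
    }

    // Double subscript: O(1), no hashing involved.
    func item(section: Int, row: Int) -> Item {
        itemsBySection[section][row]
    }

    var numberOfItems: Int {
        itemsBySection.reduce(0) { $0 + $1.count }
    }
}

var flat = FlatSnapshot<String, Int>()
flat.appendSection("favorites", items: [10, 11])
flat.appendSection("recent", items: [20, 21, 22])
let lookedUp = flat.item(section: 1, row: 2)
```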
Lazy Reverse Index
ListKit does maintain a reverse index (_itemToSection map) for operations like sectionIdentifier(containingItem:), but it builds this lazily. The typical lifecycle of a snapshot is: construct it, diff it against the current state, apply it. None of these steps need the reverse index. Only specific mutation operations (like moveItem(before:)) trigger its construction.
This means the build-then-diff path (the hot path in nearly every app) never allocates the reverse index. For 10,000 items, that saves roughly 0.5ms of dictionary construction that was never needed.
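One way to get this behavior, sketched under the assumption that the reverse index is an optional dictionary that is dropped on mutation and rebuilt on first use. `LazyIndexSnapshot` and the `reverseIndexBuilds` counter are hypothetical and exist only to make the laziness observable; ListKit's actual storage may differ.

```swift
struct LazyIndexSnapshot {
    private(set) var sectionIDs: [String] = []
    private(set) var itemsBySection: [[Int]] = []
    private var itemToSection: [Int: Int]? = nil     // nil until first needed
    private(set) var reverseIndexBuilds = 0          // instrumentation for the example

    mutating func appendSection(_ id: String, items: [Int]) {
        sectionIDs.append(id)
        itemsBySection.append(items)
        itemToSection = nil                          // invalidate, do not rebuild
    }

    mutating func sectionIdentifier(containingItem item: Int) -> String? {
        if itemToSection == nil {
            // First query after a mutation: build the reverse index once.
            var index: [Int: Int] = [:]
            for (sectionIndex, members) in itemsBySection.enumerated() {
                for member in members { index[member] = sectionIndex }
            }
            itemToSection = index
            reverseIndexBuilds += 1
        }
        guard let sectionIndex = itemToSection?[item] else { return nil }
        return sectionIDs[sectionIndex]
    }
}

var lazySnapshot = LazyIndexSnapshot()
lazySnapshot.appendSection("a", items: [1, 2])
lazySnapshot.appendSection("b", items: [3])
let buildsBeforeQuery = lazySnapshot.reverseIndexBuilds
let owner = lazySnapshot.sectionIdentifier(containingItem: 3)
_ = lazySnapshot.sectionIdentifier(containingItem: 1)
let buildsAfterQueries = lazySnapshot.reverseIndexBuilds
```

Construction alone never pays for the dictionary, and repeated queries share a single build.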
Value-Type Snapshots
DiffableDataSourceSnapshot is a struct. This means no object allocation or class metadata for the snapshot itself, and straightforward Sendable conformance as long as the section and item identifier types are Sendable. You can build a snapshot on a background thread and await its application on the main thread without any concerns about thread safety.
This is a meaningful difference from Apple’s class-based NSDiffableDataSourceSnapshot. With Apple’s type, copying a snapshot creates a reference (or triggers an opaque internal copy). With ListKit’s type, the semantics are clear: a copy is a copy, mutations don’t alias, and the compiler enforces it.
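The copy semantics are easy to demonstrate with a stand-in type. `ValueSnapshot` and `requireSendable` below are hypothetical; the sketch assumes nothing about ListKit beyond what the text states, namely that the snapshot is a struct whose stored data is Sendable.

```swift
// Hypothetical value-type snapshot with explicit Sendable conformance.
struct ValueSnapshot<Section: Hashable & Sendable, Item: Hashable & Sendable>: Sendable {
    var sectionIDs: [Section] = []
    var itemsBySection: [[Item]] = []
}

// Compiles only if the argument type is Sendable.
func requireSendable<T: Sendable>(_ value: T) -> T { value }

let original = ValueSnapshot<String, Int>(sectionIDs: ["main"], itemsBySection: [[1, 2]])
var copy = requireSendable(original)
copy.itemsBySection[0].append(3)   // mutates the copy only
```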
Real-World Impact: From Saturated to Smooth
Synthetic benchmarks are useful for understanding where performance gains come from, but they don’t tell you what those gains feel like in a real app. I recently dropped ListKit into a production TCA app with a UICollectionView-backed inbox that paginates through hundreds of threads. The results were more dramatic than the microbenchmarks predicted, because of how Apple’s diffable data source interacts with reactive state management.
Here’s the setup: Instruments Time Profiler with hang detection, profiling the same workflow (load inbox, scroll through threads, paginate) on an iPhone 17 Pro Max. The baseline uses Apple’s UICollectionViewDiffableDataSource. The comparison build is a drop-in replacement with ListKit.
| Metric | Apple DiffableDS | ListKit | Change |
|---|---|---|---|
| Hangs/min (≥100ms) | 167.6 | 8.5 | -95% |
| Hang time/min | 35,480 ms | 1,276 ms | -96% |
| Microhangs (≥250ms) | 71 | 0 | Eliminated |
| Worst single hang | 752 ms | 237 ms | -68% |
The baseline is striking: the app spent 59% of the recording duration in a hang state. Not “slow.” Hung. And hangs were accelerating over time as pagination grew the thread list. Each page load added more items to the snapshot, and the cost of diffing and applying grew with it.
ListKit eliminated both the high baseline and the acceleration. Hang time stayed flat regardless of how many items were in the list. And the character of the remaining hangs changed: 91% of ListKit’s events are sub-100ms “potential interaction delays” (median: 45ms), with 64% clustering right at the 33ms detection floor. These are imperceptible. Zero microhangs at the ≥250ms level means the scrolling feels smooth, not just “less bad.”
What’s left in the trace is pure TCA overhead: RootCore._send → FilterProducer.receive → ScopedCore.state.getter dominates at 69% inclusive, and the leaf frames are almost entirely ARC operations (swift_retain, swift_release, objc_msgSend account for ~33% of self time). This is the irreducible cost of copying and propagating TCA’s value-type state tree. ListKit has removed itself from the equation entirely.
Why the Cascade Happens
The hang cause analysis tells the deeper story. In the baseline build, the dominant frame during hangs was TCA’s state propagation (FilterProducer.receive at 72% of hang CPU time), followed by SwiftUI cell re-renders (_UIHostingView at 40%) and Apple’s diff operation (__UIDiffableDataSource _applyDifferencesFromSnapshot at 6%).
That 6% looks small, but it’s misleading. Apple’s diff is the trigger for the cascade. When the diff takes long enough to block the main thread, it delays the completion of the snapshot apply. That delay means the next state update from TCA arrives while the previous one is still being rendered. SwiftUI’s observation system detects the state change and invalidates views, which triggers more cell re-renders, which triggers more state propagation. A slow diff doesn’t just cost its own time; it creates a feedback loop that multiplies the cost of everything downstream.
After switching to ListKit, the Apple diff frames disappeared entirely from the hang profile. _UIHostingView dropped from 9,476 ms/min to 15 ms/min, a 99.8% reduction. The total CPU burned during hangs dropped from 23,571 ms/min to 933 ms/min. The only remaining hang-time CPU was the irreducible cost of TCA’s value-type state copying (ARC operations like swift_retain and swift_release).
Projected at Scale
The acceleration data is what makes this most compelling for production apps. Projecting to a 5-minute session:
| Duration | Apple DiffableDS | ListKit |
|---|---|---|
| 2 min | 64.6s hang time | 2.9s hang time |
| 5 min | 183.9s hang time | 2.9s hang time |
The baseline gets worse over time because the cost of diffing grows with the item count. ListKit stays flat at 2.9 seconds regardless of session length. The per-section identity check means that paginating 50 new items into a list of 500 only diffs the one section that changed, not all of them.
Trade-Offs and What’s Missing
Every design has trade-offs. Here are the ones I’m aware of and deliberate about.
No drag-and-drop in pre-built configurations. SimpleList, GroupedList, and OutlineList don’t support drag reordering. Drag-and-drop requires tight integration with UICollectionViewDragDelegate and UICollectionViewDropDelegate, and abstracting this cleanly without leaking IndexPath into the public API is hard. If you need it, drop down to ListDataSource and implement the delegates yourself. It’s all there; the pre-built configurations just don’t wire it up.
No working range. IGListKit has a concept of “working range” that lets you prepare data for cells outside the visible area. This is useful for prefetching images or pre-computing layouts. Lists doesn’t have this. You can implement equivalent behavior with UICollectionViewDataSourcePrefetching, but it’s not integrated into the framework.
Identifiable auto-conformance is a footgun if you don’t understand it. When a CellViewModel conforms to Identifiable, equality and hashing are based on id only. This is fast and correct for diffing, but it means properties beyond id are invisible to the diff. If you’re used to Equatable comparing all properties, this will surprise you. The documentation calls this out explicitly, but it’s the kind of thing that bites you once before you internalize it.
Result builders slow down the compiler. The SnapshotBuilder and ItemsBuilder DSLs are expressive, but Swift result builders are not cheap to type-check. For very large or deeply nested builder expressions, you may notice increased build times. The imperative API is always available as a fallback.
AnyItem has measurable overhead. Type erasure isn’t free. AnyItem adds about 1.5x overhead to diffing and 3-4x overhead to snapshot construction compared to concrete types. At typical scales (hundreds to low thousands of items), this is well within frame budget. At 50,000+ items with mixed types, you’d want to benchmark your specific use case.
Swift 6 and Strict Concurrency
Lists is built for Swift 6 strict concurrency from the ground up. All snapshot types are Sendable value types. All data sources and configurations are @MainActor. The apply() methods are async.
Concurrency safety in the pre-built configurations is handled through task serialization. Each setItems / setSections call chains onto a stored applyTask, ensuring that concurrent calls are serialized rather than racing:
// Simplified version of the internal implementation
func setItems(_ items: [Item], animatingDifferences: Bool) async {
    let task = Task { [applyTask] in
        _ = await applyTask?.value
        // build and apply snapshot
    }
    applyTask = task
    await task.value
}
This means you can call setItems from multiple Task contexts without worrying about snapshot corruption. The calls will be serialized in order, and the caller awaits until its specific apply is done. No dispatch queues, no locks, no @unchecked Sendable escape hatches.
What’s Next
Lists is at version 0.1.0. The core is stable and tested, with over 100 test cases covering Heckel edge cases, batch update safety, and API equivalence between the DSL and imperative paths. CI runs on every PR, and DocC documentation is published to GitHub Pages.
Near-term priorities include drag-and-drop support for the pre-built configurations, additional ListAccessory cases, and batch update failure recovery. Longer-term, compositional layout integration beyond list layouts is on the radar: grids, carousels, and waterfall layouts that benefit from the same diffing engine.
The documentation site has quick-start guides for both UIKit and SwiftUI, and the source is on GitHub under MIT.
If you’re building anything with UICollectionView and you’ve felt the friction of Apple’s diffable data sources (the verbosity, the performance cliff on large data sets, the lack of Sendable), give Lists a look. It’s the library I wanted to exist when I started building it.
Footnotes
[1] Paul Heckel, “A technique for isolating differences between files,” Communications of the ACM, vol. 21, no. 4, April 1978.