Building Lists: A High-Performance Diffable Data Source Framework

This post is a companion to my talk, ListKit: A Fast, Pure-Swift Diffable Data Source.

The Hidden Cost of Apple’s Diffable Data Sources

When Apple introduced UICollectionViewDiffableDataSource, it solved a real problem: you describe your data as an immutable snapshot, hand it to the data source, and it figures out what changed and animates the difference. No more manual performBatchUpdates calls, no more index math, no more crashes from inconsistent state.

But as I started using it in production (feeds with thousands of items, settings screens with dynamic sections, catalog views backed by network responses), I kept hitting the same friction points. Building a snapshot with 10,000 items takes over a millisecond. Querying itemIdentifiers a hundred times takes 46 milliseconds. NSDiffableDataSourceSnapshot is an Objective-C class bridged to Swift, so you pay for message dispatch, reference counting, and bridging overhead on every operation. It’s not Sendable, so building a snapshot off the main thread means working around the type system rather than with it.

None of these are dealbreakers in isolation. But they compound. In a TCA app I work on, what started as a slow snapshot apply turned into a cascade of state updates and view invalidations that snowballed into visible hangs. More on that later.

I wanted to keep the snapshot model while fixing the performance and concurrency story underneath it. Lists is the result.

Why Heckel, and Why Again

I should back up. I used to contribute to IGListKit, and I worked closely with one of its original authors at GitHub. He’d built an internal variant there that focused on UITableView and automatic cell sizing, things that the open-source IGListKit didn’t prioritize. I got to see firsthand how much the Heckel diff algorithm could do when paired with the right data structure decisions. I also got to see where IGListKit’s Objective-C++ foundation became a ceiling.

The more common choice for collection diffs is the Myers algorithm (which is what CollectionDifference in the Swift standard library uses). Myers is optimal for minimizing edit distance, which makes it great for text diffs. But for UI, you don’t want minimal edits; you want correct animations. Myers doesn’t detect moves. If an item shifts from position 3 to position 7, Myers sees a delete at 3 and an insert at 7. The cell fades out and fades back in instead of sliding. Heckel detects moves natively because its six-pass design matches elements by identity, not position.

So the algorithm was never in question. What I wanted to rethink was everything around it: the data structures, the section model, the language-level overhead, and the concurrency story.

The Architecture

Lists is split into two libraries:

ListKit is the engine: the O(n) Heckel diff, the custom CollectionViewDiffableDataSource, and the snapshot types. It’s a drop-in replacement for Apple’s types with the same API shape. If you want full control, use ListKit directly.

Lists is the convenience layer: a CellViewModel protocol, pre-built list configurations (SimpleList, GroupedList, OutlineList), result builder DSLs, mixed cell type support via AnyItem, and SwiftUI wrappers. The documentation site has quick-start guides for both UIKit and SwiftUI.

I won’t walk through the full API here (that’s what the docs are for), but one design decision is worth calling out because it affects the diff. When a CellViewModel conforms to Identifiable, Lists auto-synthesizes Hashable and Equatable based on id only. Content changes (like updating a subtitle) don’t show up in the diff; instead, they trigger reconfigureItems(). This is deliberate: identity-based diffing is faster and produces better animations than content-based diffing. But it’s also a footgun if you expect Equatable to compare all properties. It bit me once before I internalized the pattern.

How the Heckel Diff Works

The heart of ListKit’s performance is Paul Heckel’s 1978 diff algorithm¹. It runs in six passes over the old and new arrays:

Pass 1: Scan the new array, building a symbol table that counts occurrences of each element.
Pass 2: Scan the old array, updating the symbol table with old-side occurrences.
Pass 3: Match unique elements (items that appear exactly once in both arrays). These are the anchor points.
Pass 4: Forward expansion from matched elements. If element i in the old array matched element j in the new array, check if i+1 matches j+1.
Pass 5: Backward expansion. Same idea, but checking i-1 against j-1.
Pass 6: Collect results. Categorize everything into deletes, inserts, moves, and matched pairs.

The key insight is that passes 4 and 5 propagate matches outward from the unique anchors. Even if an element appears multiple times (and therefore can’t be uniquely matched in pass 3), it can still be matched by adjacency to a unique neighbor. This makes the algorithm robust to duplicates while maintaining O(n) time complexity.

The result is a DiffResult where every old index appears exactly once in either deletes or matched, and every new index appears exactly once in either inserts or matched. This invariant is critical. Violating it crashes UICollectionView during batch updates.

Where ListKit Diverges from IGListKit

Both ListKit and IGListKit implement the Heckel algorithm. But the similarity ends there, and the structural differences are where the performance gaps come from.

Language and dispatch. IGListKit is Objective-C++. Every element comparison goes through isEqual: and hash, which are Objective-C message sends. ListKit is pure Swift with Hashable conformance, which compiles down to direct function calls. For 10,000 element comparisons, this alone is measurable.

Data structures. ListKit uses ContiguousArray for its internal symbol and entry tables, which guarantees a contiguous memory block with no bridging overhead to NSArray. This gives better cache locality: when the CPU fetches one element, the next several are likely already in the cache line. IGListKit uses NSArray and NSDictionary, which have pointer-chase overhead and potential bridging costs.

Flat vs. two-level diffing. This is the big one. IGListKit diffs a flat array of id<IGListDiffable> objects. If you have sections, you flatten everything into a single array and manage section boundaries yourself through IGListSectionController. That was always the part that felt wrong to me. UICollectionView thinks in sections and items; your data source should too.

ListKit uses a two-level diff: it diffs section identifiers first, then diffs items within each surviving section, and reconciles cross-section moves afterward. The performance implication is enormous: unchanged sections skip the item diff entirely. If you have 100 sections and update 3 of them, ListKit runs the Heckel algorithm on 3 sections. IGListKit runs it on the entire flattened array.

Operation	IGListKit	ListKit	Speedup
Diff 10k items (50% overlap)	10.8 ms	3.9 ms	2.8x
Diff 50k items (50% overlap)	55.4 ms	19.6 ms	2.8x
Diff no-change 10k	9.5 ms	0.09 ms	106x
Diff shuffle 10k	9.8 ms	3.2 ms	3.1x

For a straight diff with significant churn, ListKit is about 3x faster. That’s the language and data structure advantage. The dramatic numbers come from the two-level architecture. The 106x on no-change means ListKit detects per-section array equality and returns in microseconds, while IGListKit still walks the entire flat array. In reactive architectures where state changes trigger full snapshot rebuilds regardless of whether the list data actually changed, this case comes up constantly. The largest overhead in any list update will always be UICollectionView itself processing the batch updates; what matters is whether the work you control (diffing, snapshot construction) is fast enough to stay off the main thread’s critical path. That’s where these numbers make the difference.

Cross-Section Moves

One thing that IGListKit’s flat model fundamentally can’t express is a cross-section move. If an item moves from section A to section B, IGListKit sees a delete from one flattened position and an insert at another. It can’t distinguish “item moved” from “old item deleted, new item inserted.”

ListKit’s SectionedDiff handles this automatically. After computing per-section diffs, it scans for items that were deleted from one section and inserted into another with matching identities, and converts them into moves. The cell slides from its old position to its new position rather than fading out and back in.

Snapshot Performance vs. Apple

Beyond the diff itself, ListKit’s snapshot operations are dramatically faster than Apple’s:

Operation	ListKit	Apple	Speedup
Build 10k items	0.002 ms	1.223 ms	753x
Build 50k items	0.006 ms	6.010 ms	1,045x
Build 100 sections x 100 items	0.060 ms	3.983 ms	66x
Query `itemIdentifiers` 100x	0.051 ms	46.364 ms	908x
Reload 5k items	0.099 ms	1.547 ms	16x

These numbers reflect fundamentally different internal data structures. Apple’s NSDiffableDataSourceSnapshot appears to maintain internal hash maps for reverse lookups that are rebuilt on mutation; ListKit appends to ContiguousArrays and defers everything else. The itemIdentifiers gap (908x) suggests Apple is doing O(n) or hash-table-traversal work on every call where ListKit does a direct array return.

Again, the largest cost in a real apply cycle is UICollectionView processing the resulting batch updates, which neither framework controls. But these operations determine how long the main thread is blocked before UIKit even starts animating. In our production app, this was the difference between pinning the main thread and not.

Three design decisions drive these gaps:

Flat array storage. DiffableDataSourceSnapshot stores data in parallel ContiguousArrays. When UICollectionView asks “what item is at this index path?”, the answer is a double array subscript: O(1) with no hashing.

Lazy reverse index. ListKit maintains a reverse index (_itemToSection map) for operations like sectionIdentifier(containingItem:), but builds it lazily. The typical snapshot lifecycle (construct, diff, apply) never touches it. Only specific mutation operations trigger its construction. For 10,000 items, that saves roughly 0.5ms of dictionary construction that was never needed on the hot path.

Value-type snapshots. DiffableDataSourceSnapshot is a struct: no reference counting, no class metadata overhead, automatic Sendable conformance. You can build a snapshot on a background thread and await its application on the main thread with clear ownership semantics. Apple’s NSDiffableDataSourceSnapshot is a class where copying either creates a reference or triggers an opaque internal copy.

What This Looks Like in a Real App

Synthetic benchmarks tell you where wins come from. They don’t tell you what those wins feel like. I dropped ListKit into a production TCA app at work with a UICollectionView-backed inbox that paginates through hundreds of threads. The change was a drop-in replacement of the data source type. The results were more dramatic than the microbenchmarks predicted.

Here’s the setup: Instruments Time Profiler with hang detection, profiling the same workflow (load inbox, scroll through threads, paginate) on an iPhone 17 Pro Max.

Metric	Apple DiffableDS	ListKit	Change
Hangs/min (>=100ms)	167.6	8.5	-95%
Hang time/min	35,480 ms	1,276 ms	-96%
Microhangs (>=250ms)	71	0	Eliminated
Worst single hang	752 ms	237 ms	-68%

The baseline was striking: the app spent 59% of the recording duration in a hang state. Not “slow.” Hung. And hangs were accelerating over time as pagination grew the thread list. Each page load added more items to the snapshot, and the cost of diffing and applying grew with it.

ListKit eliminated both the high baseline and the acceleration. Hang time stayed flat regardless of list size. 91% of ListKit’s remaining events were sub-100ms (median: 45ms), with 64% clustering right at the 33ms detection floor. Zero microhangs at the >=250ms level.

Why the Cascade Happens

The deeper story is in the hang cause analysis. In the baseline, the dominant frame during hangs was TCA’s state propagation (FilterProducer.receive at 72% of hang CPU time), followed by SwiftUI cell re-renders (_UIHostingView at 40%) and Apple’s diff operation (__UIDiffableDataSource _applyDifferencesFromSnapshot at 6%).

That 6% looks small, but it’s misleading. Apple’s diff is the trigger for the cascade. When the diff takes long enough to block the main thread, it delays the completion of the snapshot apply. That delay means the next state update from TCA arrives while the previous one is still being rendered. SwiftUI’s observation system detects the state change and invalidates views, which triggers more cell re-renders, which triggers more state propagation. A slow diff doesn’t just cost its own time; it creates a feedback loop that multiplies the cost of everything downstream.

After switching to ListKit, the Apple diff frames disappeared entirely from the hang profile. _UIHostingView dropped from 9,476 ms/min to 15 ms/min (99.8% reduction). What remained was the irreducible cost of TCA’s value-type state copying: ARC operations like swift_retain and swift_release. ListKit removed itself from the equation entirely.

The Acceleration Problem

The acceleration data is what makes this most compelling for production apps. Projecting to a 5-minute session:

Duration	Apple DiffableDS	ListKit
2 min	64.6s hang time	2.9s hang time
5 min	183.9s hang time	2.9s hang time

The baseline gets worse over time because the cost of diffing grows with item count. ListKit stays flat at 2.9 seconds regardless of session length. The per-section identity check means that paginating 50 new items into a list of 500 only diffs the one section that changed.

Trade-Offs

No drag-and-drop in pre-built configurations. SimpleList, GroupedList, and OutlineList don’t support drag reordering. Abstracting this cleanly without leaking IndexPath into the public API is harder than it sounds. If you need it, drop down to ListDataSource and implement the delegates yourself.

No working range. IGListKit lets you prepare data for cells outside the visible area. Lists doesn’t have this. You can use UICollectionViewDataSourcePrefetching for equivalent behavior, but it’s not integrated into the framework.

AnyItem has measurable overhead. Type erasure adds about 1.5x overhead to diffing and 3-4x to snapshot construction compared to concrete types. At typical scales (hundreds to low thousands of items), this is within frame budget. At 50,000+ items with mixed types, benchmark your specific case.

Result builders slow down the compiler. The SnapshotBuilder DSL is expressive, but Swift result builders are not cheap to type-check. For very large or deeply nested builder expressions, you may notice increased build times. The imperative API is always available as a fallback.

Concurrency

Lists is built for Swift 6 strict concurrency. All snapshot types are Sendable value types. All data sources are @MainActor. The apply() methods are async.

Concurrency safety in the pre-built configurations uses task serialization. Each setItems / setSections call chains onto a stored applyTask, ensuring concurrent calls are serialized rather than racing:

func setItems(_ items: [Item], animatingDifferences: Bool) async {
    let task = Task { [applyTask] in
        _ = await applyTask?.value
        // build and apply snapshot
    }
    applyTask = task
    await task.value
}

No dispatch queues, no locks, no @unchecked Sendable escape hatches.

What’s Next

Lists is at version 0.1.0. Over 100 test cases cover Heckel edge cases, batch update safety, and API equivalence between the DSL and imperative paths. Near-term priorities are drag-and-drop support, additional ListAccessory cases, and batch update failure recovery. Longer-term, compositional layout integration beyond list layouts is on the radar.

The source is on GitHub under MIT. If you’re building anything with UICollectionView and you’ve felt the friction of Apple’s diffable data sources, give it a look.

Paul Heckel, “A technique for isolating differences between files,” Communications of the ACM, vol. 21, no. 4, April 1978. ↩