Compound Engineering: 5x More Iterations by Matching Model Speed to Thinking Speed
@kieranklaassen
Compound AI systems that match model speed to task type can multiply creative output 5x — making model selection strategy as important as model capability for leaders building AI-powered operations.
By Kieran Klaassen
The Insight Nobody Talks About
Everyone debates which AI model is best. The more useful question is: which model is right for this moment in your workflow?
I've been running a compound engineering system for Coras — 27 agents, 21 commands, and 14 skills working in concert to handle everything from triaging GitHub issues to planning features to iterating on UI. When GPT-5.3-Codex-Spark became available, I put it through its paces inside that system. What I found reframed how I think about model selection entirely.
Speed as a Creative Multiplier
Spark isn't the most powerful model I use. But in tasks built around brainstorming and rapid iteration — UI design cycles, feature exploration, quick planning loops — it's the right tool. The reason is simple: iteration speed compounds.
During a recent design sprint on the Coras UI, I ran approximately 10 design iterations in the time a heavier model would have completed 2–3, which works out to roughly three to five times as many. That's not a marginal improvement. That's a fundamentally different creative process. When feedback loops tighten, you think differently. You take more swings. You find better answers.
The heavier models still have their place: deeper reasoning, complex architecture decisions, nuanced code generation. But forcing those models into every step of a workflow is like crawling along a motorway at 20 mph even though your vehicle can go 120. You're paying a cost in time and momentum that rarely shows up on a benchmark but shows up every day in your output.
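To make the arithmetic behind that sprint explicit, here is a small sketch using only the counts quoted above (about 10 fast iterations versus 2–3 with a heavier model). The function name `speedup_range` is my own, not part of any tool, and the counts are approximate observations rather than benchmark results.

```python
def speedup_range(fast_iterations: int, slow_min: int, slow_max: int) -> tuple[float, float]:
    """Return the (low, high) iteration multiplier for the same time window.

    fast_iterations: iterations completed with the faster model.
    slow_min, slow_max: the range of iterations a heavier model would complete.
    """
    if slow_min <= 0 or slow_max < slow_min:
        raise ValueError("expected 0 < slow_min <= slow_max")
    # The low end of the multiplier comes from the heavier model's best case.
    return fast_iterations / slow_max, fast_iterations / slow_min


low, high = speedup_range(10, 2, 3)
print(f"{low:.1f}x to {high:.1f}x more iterations")  # 3.3x to 5.0x more iterations
```

The 5x figure is the upper end of that range, reached when the heavier model manages only two iterations in the same window.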
Upgrading the Production Stack
Separately, I upgraded Coras's email classification and summarization pipeline from Gemini 2.0 Flash to Gemini 2.5 Flash. The results were clear across three dimensions:
- Classification accuracy improved — emails routed more reliably to the right categories
- Summaries got cleaner — less noise, more signal in the output
- Reliability under high demand held up — no degradation when volume spiked
This is live for all Coras users now. It's a reminder that incremental model upgrades within existing pipelines often deliver outsized value with low implementation risk — especially when you're upgrading within a model family where the API interface stays stable.
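One way to check an upgrade like this before shipping it is to run the old and new model over the same labeled sample and compare accuracy. The sketch below is a minimal illustration, not the Coras pipeline: the two classifier functions are toy stand-ins for real model calls, and the four sample emails and their category names are invented purely to make the code runnable.

```python
from typing import Callable, Sequence

Classifier = Callable[[str], str]
LabeledSample = Sequence[tuple[str, str]]  # (email text, expected category)


def accuracy(classify: Classifier, labeled: LabeledSample) -> float:
    """Fraction of labeled emails the classifier routes to the expected category."""
    if not labeled:
        raise ValueError("labeled sample is empty")
    correct = sum(1 for text, expected in labeled if classify(text) == expected)
    return correct / len(labeled)


def accuracy_delta(old: Classifier, new: Classifier, labeled: LabeledSample) -> float:
    """Positive when the new classifier beats the old one on the same sample."""
    return accuracy(new, labeled) - accuracy(old, labeled)


# Hypothetical stand-ins: in a real pipeline each would wrap a model API call.
def old_classifier(text: str) -> str:
    return "newsletter" if "unsubscribe" in text.lower() else "personal"


def new_classifier(text: str) -> str:
    lowered = text.lower()
    if "unsubscribe" in lowered:
        return "newsletter"
    if "invoice" in lowered:
        return "billing"
    return "personal"


# Invented sample data, for illustration only.
SAMPLE = [
    ("Click to unsubscribe from this list", "newsletter"),
    ("Your invoice for March is attached", "billing"),
    ("Lunch on Friday?", "personal"),
    ("Invoice overdue: please pay", "billing"),
]

print(f"accuracy delta: {accuracy_delta(old_classifier, new_classifier, SAMPLE):+.2f}")  # +0.50
```

Because both versions sit behind the same `Classifier` signature, swapping one for the other is a one-line change, which is what keeps the implementation risk low.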
What This Means for Your AI Architecture
If you're building or scaling AI workflows, the compound engineering model — multiple specialised agents, commands, and skills working together — gives you a critical advantage: you can route tasks to the right model for the right moment.
That means:
- Fast models for high-frequency, creative, or exploratory tasks — brainstorming, iteration, triage, drafting
- Powerful models for low-frequency, high-stakes decisions — architecture, complex reasoning, final review
- Regular model upgrades within each role — as new versions release, swap them into existing slots and measure the delta
The goal isn't to find the one best model. It's to build a system intelligent enough to use the right tool at the right time, and fast enough to keep pace with the way you think.
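The routing idea can be sketched in a few lines. This is an illustration under assumptions, not how Coras is actually wired: the task kinds come from the lists above, while the `Tier` enum, the `Task` type, and the slot names in `MODEL_SLOTS` are hypothetical placeholders you would replace with your own model identifiers.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FAST = "fast"
    POWERFUL = "powerful"


# Hypothetical slot names. Upgrading a model means changing one value here.
MODEL_SLOTS = {
    Tier.FAST: "fast-model-slot",
    Tier.POWERFUL: "powerful-model-slot",
}

# High-frequency, exploratory work goes to the fast tier.
FAST_TASKS = {"brainstorming", "iteration", "triage", "drafting"}


@dataclass(frozen=True)
class Task:
    kind: str
    prompt: str


def pick_tier(task: Task) -> Tier:
    """Route by task kind; anything not known to be fast-tier work goes to the powerful tier."""
    if task.kind in FAST_TASKS:
        return Tier.FAST
    return Tier.POWERFUL


def pick_model(task: Task) -> str:
    """Resolve the task's tier to whichever model currently fills that slot."""
    return MODEL_SLOTS[pick_tier(task)]


print(pick_model(Task("triage", "Label this GitHub issue")))       # fast-model-slot
print(pick_model(Task("architecture", "Design the sync layer")))   # powerful-model-slot
```

Defaulting unknown task kinds to the powerful tier is a deliberate choice in this sketch: a misrouted high-stakes task costs more than a slow brainstorm.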
The Compound Effect
Twenty-seven agents. Twenty-one commands. Fourteen skills. None of that complexity matters if the system is slow enough to break your thinking rhythm. Matching model speed to thinking speed isn't a technical detail — it's the difference between AI that accelerates you and AI that you wait for.