KAIzen — What Agile Needs for the AI Era
Source: Dev.to
How a small team at a gaming company went from 32 % flow efficiency to 85 % — by changing what we gave the AI
Our team was running Scrum by the book: two‑week sprints, grooming, planning poker, retros. By every conventional measure we were doing Agile correctly.
Then I measured our flow efficiency – the ratio of active work time to total elapsed time – and it was 32 %. For every hour on the clock we were actively working for about 19 minutes; the rest was waiting (for grooming, clarification, review, alignment on what the story actually meant).
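If you want to measure this yourself, flow efficiency falls out of a ticket's status history. A minimal sketch (the status names, timestamps, and which statuses count as "active" are made-up assumptions, not our actual tracker):

```python
from datetime import datetime

# Statuses where someone is actively working on the item; everything
# else (queues, waiting for review or clarification) counts as waiting.
ACTIVE = {"in_progress", "in_review_fixes"}

# Hypothetical status history for one ticket: (status, entered_at).
history = [
    ("backlog",         datetime(2024, 3, 1, 9, 0)),
    ("in_progress",     datetime(2024, 3, 4, 9, 0)),
    ("waiting_review",  datetime(2024, 3, 5, 9, 0)),
    ("in_review_fixes", datetime(2024, 3, 7, 9, 0)),
    ("done",            datetime(2024, 3, 8, 9, 0)),
]

def flow_efficiency(history):
    """Active work time divided by total elapsed time."""
    total = (history[-1][1] - history[0][1]).total_seconds()
    active = sum(
        (nxt[1] - cur[1]).total_seconds()
        for cur, nxt in zip(history, history[1:])
        if cur[0] in ACTIVE
    )
    return active / total

print(f"{flow_efficiency(history):.0%}")  # prints "29%" – 2 active days out of 7 elapsed
```

The only judgment call is which statuses count as "active"; be consistent about it before and after any process change, or the comparison is meaningless.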
Industry average for software teams is 15‑25 %.
We were above average, but “above average at wasting time” isn’t a metric anyone puts on a slide.
What made it worse was that we’d started using AI coding assistants. The promise was faster delivery. The reality was faster code generation – but the code was often wrong because the input was vague. A user story that says “As a user, I want to receive rewards so that I feel valued” gives a human enough context to ask smart questions; it gives an AI enough context to hallucinate confidently.
AI didn’t just speed up coding. It moved the bottleneck.
The bottleneck was no longer “how fast can we write code?” It became “how precisely can we define what we want?” Our entire process had been optimized for a world where humans were the bottleneck. That world was gone.
I should be honest: I didn’t call what followed KAIzen at the time. We didn’t have names for any of it. We just started changing how we worked. The vocabulary in this post – Blueprint, Runbook – came later, to make the patterns shareable. The work was real; the naming is an afterthought.
The Inspiration — And Why I Needed Something Different
I wasn’t starting from zero. Amazon’s AI‑DLC (AI‑Driven Development Lifecycle) was a major inspiration. AWS had shown that spec‑driven, AI‑augmented development could work at scale. But when I looked at applying it to my team, the cost was high: AI‑DLC replaces your entire development process – new phases, new roles, new artifacts, a brand‑new way of working from the ground up.
We didn’t have that luxury. We were mid‑sprint, mid‑quarter, mid‑delivery. I needed something that could plug into our existing process – not replace it. Where AI‑DLC asks you to change everything, I wanted to change one thing: the quality of our input to AI. Keep our sprints, keep our board, add a layer on top.
I now call this approach KAIzen, from kaizen (改善) – the Japanese philosophy of continuous improvement. Small changes, led by the people who do the work. KAIzen applies that principle with AI as the lever. Not a new methodology. Not a process overhaul. A layer you add on top of whatever Agile process you already run.
Specification as the Primary Lever
The turning point was small. Instead of writing a user story, I wrote a detailed engineering spec for a feature – inputs, outputs, edge cases, constraints, acceptance criteria. I fed it to our AI assistant and the generated code was review‑ready on the first pass.
The previous feature – similar complexity, described as a user story – had taken three rounds of review, two Slack threads, and a sync meeting. Same AI. Same team. The difference was entirely in the input.
The spec is the product now. Not the code. The quality of your specification determines the quality of everything that follows.
I call this a Blueprint – a structured spec precise enough for AI to build against. For complex work you also need a Runbook – an ordered implementation plan derived from the Blueprint. For a small fix, a lightweight Blueprint is enough.
How we generate a Blueprint
- Feature brief – product owner provides goals, context, user needs.
- SpecKit – our custom GitHub Copilot agent drafts the Blueprint (inputs, outputs, edge cases, constraints, acceptance criteria).
- Review & refine – a developer spends real time (often up to two hours for complex features) polishing the draft.
The draft isn’t the artifact – the reviewed Blueprint is. That investment is the point. A precise Blueprint makes the Runbook coherent and the AI‑generated code review‑ready. The agent removes the blank‑page problem and gets you ~70 % of the way there; the developer’s judgment closes the last 30 % – and that’s where quality lives.
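To make "review & refine" concrete, here is a minimal sketch of linting a Blueprint draft for missing sections before a developer invests review time. The section names and the reward example are hypothetical illustrations, not actual SpecKit output:

```python
# Sections every Blueprint must cover before review sign-off
# (names are illustrative – use whatever structure your team converges on).
REQUIRED_SECTIONS = [
    "## Inputs", "## Outputs", "## Edge cases",
    "## Constraints", "## Acceptance criteria",
]

def missing_sections(blueprint_md: str) -> list[str]:
    """Return the required sections a Blueprint draft is still missing."""
    return [s for s in REQUIRED_SECTIONS if s not in blueprint_md]

draft = """# Blueprint: Daily login reward
## Inputs
- player_id, login timestamp (UTC)
## Outputs
- reward event on the rewards queue
## Acceptance criteria
- exactly one reward per player per calendar day
"""

print(missing_sections(draft))  # ['## Edge cases', '## Constraints']
```

A check this dumb won't judge quality – that's the developer's job – but it catches the structural gaps cheaply, before the expensive human review starts.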
Over time something unexpected happened: our product owner started using the same agent to write the feature brief itself, structuring it so the downstream Blueprint would be cleaner. The whole chain tightened:
better brief → better Blueprint → better Runbook → better AI‑generated code → fewer review cycles.
The agent didn’t just help developers; it pulled the entire team toward precision.
What Dissolved
We didn’t decide to stop doing Scrum. We just started writing Blueprints inside our sprints. But several ceremonies dissolved on their own:
| Ceremony | What happened |
|---|---|
| Grooming | Redundant – the Blueprint already answered every question grooming was designed to surface. |
| Estimation | Stopped making sense – spec‑driven work is inherently scoped. |
| Sprint planning | Became pure prioritization: “which Blueprints next?” |
| Kanban vs. Scrum | Irrelevant – we kept the outer loop we needed (stand‑ups, retros, prioritization). The inner loop – spec‑first, AI‑augmented – drove results. |
The core difference from the AI‑DLC approach is that we didn’t need anyone’s permission to start. No process overhaul, no new roles, no org‑wide buy‑in. One team, one Blueprint, one sprint. The layer proved itself through results, not a proposal deck.
The Numbers
| Metric | Before (epic 1) | After (epic 2) | After (epic 3) |
|---|---|---|---|
| Flow efficiency | 32 % | 47 % | 85 % |
| Cycle time | 36 days | 36 days | 13 days |
Three epics, same area, similar complexity – and the improvement speaks for itself.
The active work time barely changed. What collapsed was the waiting – the grooming, clarification, and alignment overhead that was invisible inside sprint velocity.
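You can check the "active time barely changed" claim directly from the table: multiplying each epic's cycle time by its flow efficiency recovers the active days.

```python
# Active work time = cycle time × flow efficiency, per epic (from the table).
epics = {
    "epic 1 (before)": {"cycle_days": 36, "flow_eff": 0.32},
    "epic 3 (after)":  {"cycle_days": 13, "flow_eff": 0.85},
}

for name, m in epics.items():
    active = m["cycle_days"] * m["flow_eff"]
    waiting = m["cycle_days"] - active
    print(f"{name}: ~{active:.1f} active days, ~{waiting:.1f} waiting days")

# epic 1 (before): ~11.5 active days, ~24.5 waiting days
# epic 3 (after):  ~11.1 active days, ~1.9 waiting days
```

Roughly eleven active days either way; the 23 days that vanished were almost entirely waiting.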
The caveats: three epics aren’t proof; they’re a signal worth investigating. The epics weren’t identical in scope, the team was small, and I was coaching directly. I’d rather you hear that from me.
What I Learned
- **The Blueprint is the new bottleneck — but it’s a better bottleneck.** With SpecKit drafting the first pass, the blank‑page problem is gone. The review still takes real time, and it should – that’s where engineering judgment lives. The developer’s job shifts from “write the spec from scratch” to “validate and sharpen the spec,” which is a better use of their expertise.
- **Not everyone wants to write specs.** Resistance collapses after one demonstration. Show a developer AI output from a vague story next to AI output from a good Blueprint. After that, most people write the spec – not because of a process argument, but because it makes their afternoon easier.
- **This is kaizen — continuous improvement, from the ground up.** We changed one thing, measured what happened, and kept improving.
- **Limits of a single team.** Our flow efficiency hit 85 % within our area. Then an initiative spanning Gaming, Rewards, and Sportsbook arrived, and suddenly our speed didn’t matter. We were blocked by another team’s API, debating event schemas in Slack, and sitting in alignment meetings where six people discussed what two could have decided in a DM.
One team improving means nothing if features get stuck at the boundary. That’s Part 2.
Part 2: “KAIzen Across Boundaries” — coming next week.
Want to try it today?
- Pick your next feature.
- Feed your product brief to an AI assistant and ask it to generate a spec – inputs, outputs, edge cases, constraints, acceptance criteria.
- Refine it.
- Build against it.
- Measure your flow efficiency before and after.
One spec. See what happens.
#ai #agile #softwareengineering #productivity