
Burn it down and start again?

The leaders who hear “burn it down” and nod aren’t the ones struggling with AI tools. They’re the ones who’ve been successful enough with AI to see the ceiling. They’ve watched their best engineers become 30% more productive while the team’s overall output stays flat. They’ve seen faster code generation produce more bugs, not fewer. They’ve realised that the problem isn’t the AI. It’s the shape of the container they’re pouring it into.

Why AI doesn’t need a better software delivery lifecycle. It needs a different one.

I’ve been saying “burn it down and start again” a lot lately. In workshops. On calls with engineering leads. In pitch decks to companies spending seven figures on offshore contractors and wondering why their Copilot investment hasn’t changed the maths. It’s a useful provocation. But I’ve started to notice something about when it lands and when it doesn’t, and that gap tells you more about the state of AI in enterprise software than any benchmark or research paper.

The productivity mirage

There’s a number that should haunt every engineering leader investing in AI tooling. In METR’s randomised controlled trial, experienced developers using AI on their own mature codebases were 19% slower than those working without it. Not new graduates fumbling with prompts. Seasoned engineers with years of experience on their own repos, choosing when and how to use AI.

The more disturbing finding is that those same developers estimated AI had made them 20% faster. They were confidently wrong, and wrong in the opposite direction.

Most people cite this study as evidence that AI tools don’t work. I think it shows something more specific and more useful: AI tools don’t work when dropped into processes designed around different constraints. The result tells us less about what AI can do and more about what happens when you force a new technology through a workflow built for a different one.

Consider what actually happened in those sessions. Developers spent significant time crafting prompts, waiting for responses, then reviewing and cleaning up output in codebases where they already knew exactly what to write. The AI was effectively inserting a slow, unreliable middleman into a workflow the developer had already optimised through years of practice. That’s not a tool failure. It’s a design failure. We asked AI to accelerate a process built around a human’s fingers on a keyboard, when AI’s actual strength is something entirely different: operating on explicit specifications within well-defined contexts.

GitClear’s data tells the same story from a different angle. Code churn has nearly doubled since AI tools became widespread. Code duplication is up eightfold. The AI is generating code the way a contractor with no institutional context would, adding new things rather than integrating with what exists. And that’s exactly the situation we’ve put it in. We’re handing an AI a user story written for a human who attended last Tuesday’s standup and absorbed six months of Slack conversation about why the authentication module works the way it does, and we’re surprised when it produces something that technically works but architecturally makes no sense.

AI as an organisational x-ray

Here’s something I didn’t expect when I started this work: AI adoption is the most effective organisational diagnostic I’ve ever seen.

Every dysfunctional pattern an engineering org has been compensating for through tribal knowledge, heroic individual contributors, undocumented workarounds, and meetings that exist solely to transfer context becomes immediately, painfully visible the moment you try to hand that work to an AI agent.

The AI tool can’t attend your standup. It doesn’t have access to your Slack history out of the box. It doesn’t know that the payment module has that strange conditional because of a regulatory change three years ago that nobody documented but everyone “just knows”. The knowledge lives in fragmented conversations, stand-up notes, and Jira tickets, with no structured way to access it.

When you try to get AI to work on a codebase like this, it fails. And in failing, it maps every knowledge gap, every implicit assumption, and every process that only works because Sarah from the platform team has been there since 2019. This is why “just add Copilot” doesn’t work, and it’s also why the solution isn’t “better AI”. The solution is making implicit knowledge explicit: writing it down, structuring it, making your organisation’s collective intelligence machine-readable.

AI exposes a knowledge architecture gap, and most organisations don’t have a knowledge architecture at all. At best, they have a collection of people who remember things.
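To make “codifying it” concrete, here is a hedged sketch of what capturing one of those “everyone just knows” facts might look like as a lightweight architectural decision record. The module behaviour, dates, and helper name are invented for illustration, not taken from any real system:

```markdown
# ADR-0042: Why the payment module branches on settlement region

Status: Accepted
Date: 2022-06-14

## Context
A regional regulatory change requires settlement amounts to be
rounded before tax is applied, but only in one jurisdiction.
This is the reason for the region-specific conditional in the
payment module.

## Decision
Keep the conditional in the payment module, behind a named
helper, rather than forking the settlement path.

## Consequences
- Humans and AI agents can read this file instead of asking
  whoever has been here longest.
- The conditional must not be "simplified away" in refactors.
```

A file like this is equally legible to a new hire and to a coding agent, which is the whole point of the knowledge layer.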

[Image: Organisations with and without a structured knowledge layer]

This reframing changes your entire ROI conversation. The real value of an AI-native transformation is that you finally codified the institutional knowledge that was trapped in people’s heads, and now both humans and AI can use it. That’s valuable regardless of what AI does next. If every AI model disappeared tomorrow, an organisation with a proper knowledge layer would still be dramatically better off than one running on tribal knowledge and hallway conversations.

The specification is the product

There’s a mental model shift that separates the companies getting real results from AI from those still waiting for the productivity numbers to appear.

In a traditional SDLC, the code is the product. Everything upstream (requirements, specs, design docs) is overhead that exists to make the code better.

In an AI-native SDLC, the specification is the product. The code is a downstream artefact that can be generated, regenerated, and thrown away.

This sounds abstract until you see it in practice. When a team shifts to spec-driven development, the quality of their thinking has nowhere to hide. A vague user story that a skilled human developer could interpret through context and intuition becomes a broken input that produces broken output when fed to an AI agent. AI is brutally honest about the quality of your requirements. It can’t read between the lines. It can’t infer what you meant from what you said. If your spec says “improve the user experience,” the AI will do something. It just won’t be what you wanted.

This forces an uncomfortable upgrade in the precision of upstream thinking. Product managers who’ve spent careers writing intentionally vague requirements, because vagueness creates flexibility for the human developer to exercise judgment, now need to write specs that are precise enough for AI consumption while still leaving room for architectural decisions by the senior engineers reviewing the output. That’s a genuinely new skill, and it’s harder than it sounds.
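As a hedged illustration of that skill, with the feature, field, and component names invented for the example: the same request can be written at two very different levels of precision.

```markdown
## Too vague for an agent
"Improve the checkout experience."

## Precise enough to generate from, open enough to review
Feature: surface card-validation errors inline.
- Validate the card number on blur, using the existing Luhn check.
- Show errors inline next to the field, not in a toast.
- Reuse the existing `FormError` component; add no new dependencies.
- Out of scope: any change to the payment provider call.
```

The second version still leaves architectural room (how validation is wired in, how state is managed), but it no longer asks the AI to read minds.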

There is an upside though: once you have those precise specifications, you can version them, diff them, evaluate them, and improve them independently of the code. The spec becomes a testable, iteratable artefact. You can run the same spec through different models and compare output. You can change the spec and regenerate. The feedback loop tightens dramatically, but only if the spec is good enough to be the unit of work.
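The versioning-and-diffing part of that loop needs no special tooling. As a minimal sketch using only the Python standard library (the spec contents here are invented examples; in practice the specs would live as files under version control):

```python
import difflib


def spec_diff(old: str, new: str) -> str:
    """Return a unified diff showing exactly what changed between
    two versions of a specification."""
    return "".join(difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile="spec@v1",
        tofile="spec@v2",
    ))


old = "Validate the card number on submit.\n"
new = "Validate the card number on blur using the Luhn check.\n"
print(spec_diff(old, new))
```

The same spec text can then be fed to different models and their outputs judged against the same acceptance criteria, which is what makes the spec, not the code, the unit of work.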

In the old SDLC, we managed code. In the new one, we manage context. The code is a side effect.

What actually needs burning

So here’s where I’ve landed. The answer to “should you burn it down?” is yes, but be precise about what you’re setting fire to.

Burn the assumption that faster code means faster delivery

The bottleneck in most engineering organisations was never typing speed. It was decision latency, rework from misunderstood requirements, and knowledge transfer overhead. AI accelerates code generation, but code generation was rarely the constraint. If you optimise the wrong bottleneck, you just create a faster path to the same rework cycle. The evidence is already clear. More code, faster, without better context produces more churn, not more value.

Burn the assumption that documentation is overhead

In an AI-native world, documentation is infrastructure. Your CLAUDE.md/Agent.md files, architectural decision records, coding standards, domain glossaries. These aren’t nice-to-haves. They’re the context layer that determines whether your AI agents produce architecturally coherent output or generate code that technically works but structurally degrades your system over time. The organisations that treated documentation as a cost centre are now discovering it’s the most critical input to their most expensive tools.
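As a hedged sketch of what that context layer can look like in practice, a repository-level CLAUDE.md might read something like the following. The specific conventions listed are invented examples, not a prescription:

```markdown
# Project context for coding agents

## Architecture
- Modular monolith; modules communicate via the internal event
  bus, never by importing each other's internals.

## Conventions
- TypeScript strict mode; no `any` in new code.
- Money values are integer minor units, never floats.

## Domain glossary
- "Settlement": the nightly batch transfer of captured payments.

## Decisions
- Read docs/adr/ before touching the payment module's
  conditionals.
```

Note that every line doubles as onboarding material for humans, which is why treating this as overhead was always a false economy.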

Burn the assumption that the developer’s job is to write code

The emerging role of the senior developer looks more like a technical lead managing a team of unreliable but fast junior contributors. You define the approach, set the constraints, review the output, and intervene when things go off track. The core skills shift from implementation speed to specification quality, context engineering, and architectural judgment. It’s actually the job most senior engineers always wanted, with less boilerplate and more design. But it requires a completely different workflow, different tools, and different performance metrics.

Burn the assumption that you can transform incrementally without a clear target

Incremental improvement works when you’re optimising within a paradigm. It fails when you’re shifting between paradigms. You can’t iteratively arrive at spec-driven development from a process that treats user stories as the primary unit of work. You can’t build a knowledge layer one Jira ticket at a time. You need a vision of the target state, even a rough one, and then a deliberate, contained experiment to validate it. Pick one team. Build the new workflow. Measure it. Prove it. Then scale.

Don’t burn the things people undervalue

Equally important is knowing what to protect.

Don’t burn your senior engineers’ domain knowledge. Codify it instead. Their understanding of why things are the way they are is the raw material for your knowledge layer. The companies that laid off experienced engineers to replace them with AI-augmented juniors are already discovering that the juniors don’t know what “good” looks like, and neither does the AI.

Don’t burn your quality culture. Sharpen it instead. AI-generated code needs more rigorous review, not less. The review criteria just shift from “does this work?” to “does this integrate with our system’s architecture, respect our conventions, and avoid introducing the kind of duplication that will cost us in six months?”

Don’t burn your delivery commitments. This transformation has to happen alongside real work, not instead of it. The best approach I’ve seen is embedding it in one team’s actual delivery, so the new process is being tested against real constraints, not theoretical ones.

The software industry is going through a paradigm shift in which the fundamentals of how software gets built are changing. And it’s becoming increasingly clear that the process we built for human-only development is the wrong shape for human-plus-AI (and eventually AI-only) development.

The companies that figure this out first get a compounding advantage. They build the organisational infrastructure that makes these tools effective. The knowledge layer. The specification discipline. The context engineering capability. The evaluation frameworks. These compound over time in ways that tool access alone never will.

Author

  • Aarushi Kansal
    AI Tech Director, UK