There is an old urban legend, immortalised as one of the original Darwin Award nominations, about a man who bolted a JATO unit to a 1967 Chevrolet Impala. JATO stands for Jet Assisted Take Off. It is a solid fuel rocket designed to give heavy military transport aircraft the extra thrust they need to leave a short runway. The story goes that he drove out into the Arizona desert, found a long straight road, and fired it. The car reached speeds in excess of 350 miles per hour within seconds. The brakes melted. The tyres disintegrated. The car became airborne for over a mile and impacted a cliff face at a height of 125 feet, leaving a crater three feet deep in the rock. The remains were not recoverable.
The legend is almost certainly fictional. The lesson it contains is not.
1. Amazon Reached for the Brakes
TechRadar reported this week that Amazon has responded to a series of high-profile outages by mandating that AI-assisted code changes receive sign-off from senior engineers before deployment. Among those outages was a six-hour disruption to its main ecommerce platform, attributed internally to what communications described as “Gen-AI assisted changes.” Amazon SVP Dave Treadwell acknowledged that site availability had “not been good recently.”
The response is understandable. It is also the wrong answer.
Adding human sign-off to AI-generated code is not a governance strategy. It is a reflex. And like most reflexes, it feels right in the moment and solves the wrong problem. The driver reached for the brakes. The brakes had already melted.
2. You Can Only Tune a Go-Kart So Far
Think about what it actually means to optimise an existing organisation for AI. You add tooling. You write policies. You create centres of excellence. You require approvals. Each of these interventions makes you feel like you are responding to the challenge. Some of them even work, up to a point.
But there is a ceiling. Every go-kart has one.
You can tune the engine, lower the chassis, upgrade the tyres and find a better driver. You will go faster. At some point, however, you have extracted everything the vehicle was designed to give. The frame was never engineered for these speeds. The steering geometry was never intended for this kind of load. The braking system was sized for a completely different performance envelope. You are not tuning the vehicle anymore. You are fighting its fundamental architecture.
If you want to break the sound barrier, the Impala is the wrong starting point. It was never designed for this. Starting from it is not a constraint you can engineer around. It is the problem.
Most organisations adopting AI are doing exactly this. They are bolting a JATO unit to an organisational structure built for human-paced software delivery, human-scale code review, and human-readable output. The structure has approval gates built for humans. Governance processes built for humans. Risk frameworks built for humans. Quality assurance functions staffed by humans operating at human speed. And then they fire the rocket.
3. The Sign-Off Illusion
Here is the specific failure mode that Amazonβs response illustrates.
When AI is generating code at scale, the volume and complexity of that output quickly exceeds what any human reviewer can meaningfully evaluate. A senior engineer reviewing an AI-assisted pull request is not really reviewing it. They are scanning it. They are applying pattern recognition. They are looking for things that look wrong, which is a very different cognitive task from understanding what the code actually does and whether it is correct.
This matters enormously when the code is AI-generated. AI-generated code does not fail in the ways human-generated code fails. Human engineers make mistakes that are recognisable to other human engineers. The errors have shapes that experienced reviewers have seen before. AI-generated errors are structurally different. They can be syntactically perfect, pass linting, pass unit tests, and still encode a subtle misunderstanding of the problem domain that only surfaces under specific production conditions. Exactly the conditions that caused a six-hour outage.
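To make that concrete, here is a deliberately simple, invented example of the species: a helper that is syntactically clean, passes the unit test a reviewer would glance at, and still encodes a false domain assumption, that no user emits two events within the same second, which only burst traffic exposes. Nothing here is real code from any real incident.

```python
# Hypothetical illustration: a dedup helper that looks correct, passes its
# unit test, and silently drops data under burst traffic.

def dedupe_events(events):
    """Drop duplicate events. Assumes two distinct events never share
    (user_id, second) -- plausible in review, false under peak load."""
    seen = set()
    out = []
    for e in events:
        key = (e["user_id"], int(e["ts"]))  # truncates sub-second precision
        if key not in seen:
            seen.add(key)
            out.append(e)
    return out

# The happy-path unit test the reviewer sees. It passes.
spaced = [{"user_id": 1, "ts": 10.0}, {"user_id": 1, "ts": 11.0}]
assert len(dedupe_events(spaced)) == 2

# The production condition the reviewer never sees: a burst inside one
# second silently loses a legitimate event.
burst = [{"user_id": 1, "ts": 10.1}, {"user_id": 1, "ts": 10.9}]
assert len(dedupe_events(burst)) == 1  # two real events in, one out
```

The code is syntactically perfect and lint-clean. The error is not in any line; it is in an assumption about the world that no diff view displays.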
Requiring a senior engineer to sign off on a 30,000-line AI-generated pull request is not oversight. It is the performance of oversight. Nobody in that review chain actually understands what the AI has done. They are approving it anyway, because what else can they do? The rocket is already firing. The brakes are ornamental.
4. The Disconnection Risk and the Context Window Problem
4.1 The Atrophy Risk
There is a second failure mode that is slower, quieter and more dangerous than the throughput problem. It is disconnection.
Senior engineers carry something that no AI model currently has. They carry a wide context window built from years of operating the system they are reviewing. Not the code in the PR. The system. They know why the retry logic in the payments service was written the way it was. They know what happens to that message queue at month end under peak load. They know the three things you must never do with that database connection pool, because two of them caused incidents that they personally stayed up until 3am to resolve. That knowledge is not written down anywhere. It lives in the engineer.
When AI writes the code and humans only scan the output, that knowledge stops being exercised. It atrophies. Slowly at first. Then faster. And the organisation does not notice until the moment it needs that knowledge most, which is the moment the system is on fire and nobody in the room can explain why.
4.2 The Context Window
Here is something worth understanding about the difference between how humans and AI reason about code. AI has a token window. It is large and getting larger. But it is still a window over the text of the code itself. It does not have a window over the operational history of the system, the incident reports, the architectural decisions that were made and reversed, the subtle coupling between services that was never documented because everyone who built it already knew.
Humans have that window. A senior engineer reviewing a change to the payments flow is not just reading the diff. They are reading it against a mental model of everything that system has ever done wrong. That mental model is irreplaceable. It is also fragile. Use it or lose it.
When AI generates the code and humans only approve the output, the mental model stops being updated. Engineers drift from their systems. The context window narrows. And when idempotence violations, race conditions and cascading failures eventually surface in the RCA, the people in the room are reading the evidence without the intuition needed to interpret it.
4.3 Compound Complexity
AI makes errors. This is not a criticism. It is a fact that any honest assessment of current AI coding capability has to start from. The errors are not random noise. They are systematic. They reflect misunderstandings of intent, of operational context, of the constraints that exist outside the code itself. And they compound.
A block of AI-generated code that handles retries makes an assumption about idempotence. Another block that handles concurrency makes a different assumption about state. Each block is locally plausible. Each would pass a unit test. At the integration seam, the assumptions conflict, and the failure mode is invisible until the system is under the specific combination of load and timing that exposes it. You do not find this in a code review. You find it in a production incident at 2am.
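A minimal sketch of that seam, with invented names and a toy payment service standing in for the real thing. Block A is a retry wrapper that quietly assumes the wrapped call is idempotent; Block B is a charge handler that keeps state and is not. Each passes its own unit test; together they double-charge.

```python
# Hypothetical sketch of two locally plausible blocks that conflict at the seam.

class FlakyNetwork(Exception):
    pass

class PaymentService:
    """Block B: locally plausible, but charging is not idempotent."""
    def __init__(self):
        self.charges = []
        self._fail_once = False

    def charge(self, order_id, amount):
        self.charges.append((order_id, amount))   # side effect lands first
        if self._fail_once:
            self._fail_once = False
            raise FlakyNetwork("timeout after the charge was recorded")
        return "ok"

def with_retry(fn, attempts=3):
    """Block A: locally plausible, but assumes fn is safe to repeat."""
    def wrapped(*args, **kwargs):
        for i in range(attempts):
            try:
                return fn(*args, **kwargs)
            except FlakyNetwork:
                if i == attempts - 1:
                    raise
    return wrapped

svc = PaymentService()
svc._fail_once = True                      # timeout fires after the side effect
with_retry(svc.charge)("order-42", 99.00)  # the retry "succeeds"
assert len(svc.charges) == 2               # ...and the customer is charged twice
```

Unit tests on either block in isolation pass. The failure only exists in the combination, under a timeout that lands after the side effect, which is exactly the kind of condition a diff review never exercises.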
Defending against compound complexity requires testing that is specifically designed to find the failures that live at integration boundaries. Not unit tests. Not happy path integration tests. Provocative, adversarial, edge to edge tests that assume AI has made plausible errors in every block and attempt to trigger the interactions between them. This test suite has to fire on every checkin. It has to be treated as a first-class engineering product. It is the only defence you have against a system that nobody fully understands being assembled from components that each individually looked fine.
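What such a test looks like in miniature: a fault-injection sweep that assumes every step of an operation can fail, retries naively, and checks the invariant (here, at-most-once processing) after each injection. The harness and processor below are illustrative stand-ins, not a prescription for any particular framework.

```python
# A minimal sketch of the adversarial style described above: inject a fault
# at every possible step, retry, and verify the system invariant each time.

class Inject(Exception):
    pass

def process(batch, ledger, fail_at=None):
    """Toy processor: records each item exactly once -- or should."""
    for i, item in enumerate(batch):
        if i == fail_at:
            raise Inject(f"fault injected before item {i}")
        ledger.append(item)

def adversarial_sweep(batch):
    """Re-run the operation with a fault injected at every step, then retry
    naively. Return the injection points that break at-most-once processing."""
    violations = []
    for fail_at in range(len(batch) + 1):
        ledger = []
        try:
            process(batch, ledger, fail_at=fail_at)
        except Inject:
            process(batch, ledger)          # naive retry: replay the whole batch
        duplicated = sorted(ledger) != sorted(set(ledger))
        incomplete = set(ledger) != set(batch)
        if duplicated or incomplete:
            violations.append(fail_at)
    return violations

# Any mid-batch fault followed by a naive replay duplicates the earlier items.
assert adversarial_sweep(["a", "b", "c"]) == [1, 2]
```

The point is the shape, not the toy: the suite assumes a plausible error in every block and actively tries to trigger the interaction, rather than confirming the happy path.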
4.4 The RCA You Cannot Read
This is where the two failure modes converge. The race conditions that were fixed in 2019, the service that cannot be called twice on the same transaction, the failure modes that live in the gaps between components rather than inside them: all of it sits in the senior engineer's context window, and none of it is written down.
So when that window has narrowed, because engineers have spent months approving output instead of writing and operating code, the root cause analysis becomes a document nobody can read. The team is staring at a cascade it cannot explain, because nobody has been close enough to the system to see it coming.
AI does not see integration risk. It sees the block in front of it. The errors it makes are plausible in isolation and catastrophic in combination. Idempotence violations. Race conditions. Thundering herds triggered by a single timeout. You cannot find these by reading code. You find them with comprehensive, adversarial automated testing that fires on every checkin and is specifically designed to trigger the failures that live at the seams.
4.5 Protect the Context Window
The most valuable thing a senior engineer brings to a code review is not their ability to read code. It is their ability to read code in the context of everything they know about the system it is entering. Those are completely different skills. The first can be replicated. The second cannot, at least not yet, and not cheaply.
AI assembles code from a window over the text in front of it. A senior engineer reviews code from a window over years of operational history, incident reports, architectural regrets and hard-won intuitions about where this particular system fails under pressure. The human context window is wider, deeper and enormously more valuable than it looks from the outside.
It is also the first thing to go when engineers stop being close to their systems. Replace writing with reviewing. Replace reasoning with scanning. Do it long enough and the wide context window collapses into a narrow one. The engineer is still senior in title. They are no longer senior in the way that actually matters at 2am when something has gone wrong and nobody can explain why the cascade started where it did.
Protect the context window. Keep senior engineers close to their systems. Treat their deep operational knowledge as the risk control it actually is, not as background context for an approval process. And build automated testing that is adversarial enough to find the compound errors that AI will inevitably introduce at integration boundaries, because the human intuition that used to catch those errors is the thing you are most at risk of losing.
5. Governance Cannot Run at Rocket Speed
The deeper problem is one of tempo. Human governance processes were designed for human delivery tempos. When a team ships once a fortnight, a review board can function. When a team ships multiple times a day, the review board becomes a queue. When AI agents are generating and deploying code continuously, the review board becomes an illusion. A compliance checkbox that adds latency without adding safety.
This is not a criticism of the people involved. It is a systems problem.
You cannot solve a throughput mismatch by asking the slower component to work harder. You can ask senior engineers to approve more PRs per day. They will try. The quality of each review will degrade proportionally. This is not a failure of diligence. It is mathematics.
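The arithmetic is worth making explicit. With assumed numbers (say, half a senior engineer's day reserved for review), the minutes available per review collapse in direct proportion as AI multiplies PR volume:

```python
# Back-of-the-envelope version of the mismatch. The figures are assumptions
# for illustration, not measurements from any real organisation.

REVIEW_HOURS_PER_DAY = 4          # assumed: half a senior engineer's day

def minutes_per_review(prs_per_day):
    return REVIEW_HOURS_PER_DAY * 60 / prs_per_day

for prs in (4, 16, 64, 256):
    print(f"{prs:>3} PRs/day -> {minutes_per_review(prs):6.1f} min each")
```

Four PRs a day leaves an hour of genuine review each. Two hundred and fifty-six leaves under a minute: a rubber stamp, whatever the policy document says.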
The organisations that understand this are not adding more humans to the approval chain. They are asking a more uncomfortable question. If the output is moving too fast for humans to govern, what can govern it?
6. Only AI Can Stabilise AI
The answer is not comfortable for people who believe that meaningful oversight must be human. But the logic is unavoidable. If AI is your accelerant, humans cannot be your brake. The physics do not work. You need a brake that operates at the same speed as the engine.
The antagonist muscle has to be AI itself. Automated testing at a scale and depth that matches AI-generated output. AI-powered quality assurance that can actually read and reason about what an AI agent has produced. Continuous evaluation frameworks that catch behavioural drift in production before it becomes an outage. Canary deployments and automated rollback systems that do not wait for a human to notice something is wrong.
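A sketch of one such brake: a canary decision rule that compares the canary's error rate to the baseline and rolls back without waiting for a human to notice. The thresholds, names and numbers are assumptions for illustration, not anyone's production policy.

```python
# Minimal canary gate, illustrative only: promote, roll back, or keep waiting
# based on error rates, with no human in the loop.

def canary_decision(baseline_errors, baseline_total,
                    canary_errors, canary_total,
                    max_ratio=2.0, min_samples=500):
    """Return 'promote', 'rollback', or 'wait' for a canary deployment."""
    if canary_total < min_samples:
        return "wait"                       # not enough traffic to judge
    base_rate = baseline_errors / max(baseline_total, 1)
    canary_rate = canary_errors / canary_total
    # Roll back if the canary errors markedly more than the baseline,
    # or errors at all while the baseline is essentially clean.
    if canary_rate > max(base_rate * max_ratio, 0.001):
        return "rollback"
    return "promote"

assert canary_decision(10, 10_000, 3, 100) == "wait"
assert canary_decision(10, 10_000, 50, 1_000) == "rollback"
assert canary_decision(10, 10_000, 1, 1_000) == "promote"
```

The important property is tempo: this check runs on every deployment, in seconds, at the same speed as the engine it is braking.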
None of this replaces human judgment. Humans set the standards, define the acceptance criteria, interpret the metrics and make the strategic decisions. But the execution of quality assurance at the speed of AI-generated delivery has to be automated. There is no alternative that is not either a bottleneck or a fiction.
7. The Real Darwin Award
Amazon is a sophisticated technology organisation, and it will work through this. It has the engineering talent, the operational discipline and the financial resources to find a better answer than mandatory sign-off. The companies that concern me are the ones that do not have those resources and are adopting AI at the same pace without asking any of these questions.
The JATO award goes to the organisation that invests heavily in AI as an accelerant, bolts it to their existing delivery structure, adds a sign-off process to feel responsible, and then discovers eighteen months later that they have a production environment that nobody fully understands, an incident rate that is climbing, and a governance process that never actually worked.
The moment of discovery is the cliff face.
Responsible adoption of AI is not slow adoption. Speed is not the problem. The problem is asymmetry. Investing heavily in the accelerant and almost nothing in the braking system. Every dollar your organisation spends on AI-generated code commits should be matched by investment in automated testing, quality metrics, A/B evaluation frameworks, behavioural monitoring and rollback capability. Not because regulators require it. Because the alternative is a crater.
8. Start From the Right Vehicle
The organisations that will navigate this well are not the ones that slow down AI adoption. They are the ones that redesign the vehicle before they fire the rocket. They ask what an engineering organisation looks like when AI is a first-class participant rather than a tool used by humans. They rebuild their quality assurance function from the ground up with automation at its core. They define what good looks like in machine-readable terms, not just human-readable ones. They treat the testing and evaluation pipeline as a product in its own right, not an afterthought.
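What defining good in machine-readable terms can mean in practice: release criteria encoded as data a pipeline evaluates on every build, rather than prose a reviewer interprets. The metric names and thresholds below are illustrative assumptions, not a recommended standard.

```python
# Illustrative quality gate: "what good looks like" as data, not prose.

GATES = {
    "line_coverage":  ("min", 0.85),
    "mutation_score": ("min", 0.70),
    "p99_latency_ms": ("max", 250),
    "error_rate":     ("max", 0.001),
}

def evaluate_gates(metrics, gates=GATES):
    """Return the list of failed gates; an empty list means the build passes."""
    failures = []
    for name, (direction, threshold) in gates.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: metric missing")
        elif direction == "min" and value < threshold:
            failures.append(f"{name}: {value} < {threshold}")
        elif direction == "max" and value > threshold:
            failures.append(f"{name}: {value} > {threshold}")
    return failures

good = {"line_coverage": 0.91, "mutation_score": 0.74,
        "p99_latency_ms": 180, "error_rate": 0.0004}
assert evaluate_gates(good) == []
bad = dict(good, p99_latency_ms=400)
assert evaluate_gates(bad) == ["p99_latency_ms: 400 > 250"]
```

A gate like this is crude, but it is honest in a way a scanned approval is not: it either fires or it does not, at any volume, at any hour.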
This is harder than adding a sign-off step. It requires accepting that the existing structure was not designed for this and cannot be tuned to cope. It requires building something new rather than patching something old. It requires the kind of uncomfortable organisational honesty that most companies find very difficult.
But the alternative is the Impala. Firing the JATO unit on a vehicle built for a different world. Watching the brakes melt. Hoping that someone in the approval chain noticed something in that 30,000-line PR.
They did not. Nobody could have. The crater is already in the cliff.
9. References
- TechRadar, Craig Hale, 11 March 2026. “Amazon is making even senior engineers get code signed off following multiple recent outages.” https://www.techradar.com/pro/amazon-is-making-even-senior-engineers-get-code-signed-off-following-multiple-recent-outages
- Wikipedia. βJATO Rocket Car.β https://en.wikipedia.org/wiki/JATO_Rocket_Car