In 2016, I was part of a fifty-person AI team at a Japanese e-commerce giant – not a research lab, a production floor. Millions of daily requests. Real models running in production, not on slides. The kind of infrastructure that does not forgive a missing dependency.
What I remember most from that period is not the models. It is what surrounded them: the distributed Java services, the Kafka pipelines, the compliance constraints built for dozens of markets, the monitoring systems that had to catch failures before the models could amplify them. The architecture that made the AI possible in the first place.
Later, at a global clothing brand, I saw a different kind of AI: RFID-tagged products moving through automated warehouses, intelligent logistics extending what the operations team could do rather than racing to replace them overnight. What made it work was not the sophistication of the models. It was the clarity of the boundaries. The system knew exactly where it ended and where human judgment began.
I have been thinking about both of those periods since October 2025.
The Number
Microsoft and Meta have each stated that AI agents are now responsible for generating or managing around thirty percent of live production code. The claim lands with real weight.
But when I hear it, I do not picture the line of code. I picture the system around it.
An agent can write a function. It can write a test suite. It can refactor a module and propose a migration path. What it cannot do is understand the dependency it has never seen – the one buried four services deep, written by a team that no longer exists, the one that only surfaces when a region goes dark.
On October 20, 2025, Amazon’s us-east-1 region suffered a major outage lasting approximately fourteen hours. The root cause: a DNS automation bug tied to DynamoDB’s endpoint within that region. One automation dependency. One propagation path. More than a hundred services affected globally. The cascade erased connectivity and transaction flows across systems that had no obvious link to the source of the failure.
It was not a code failure. It was an architecture failure.
What the Agent Never Knew
Every system has an architecture whether you designed it or not, whether you know it or not.
That is the sentence I keep returning to. The AI agent writing code within a system has no map of the dependency graph it is operating inside. It produces correct code for the requirements it was given. The requirements said nothing about a DNS propagation failure three layers below. They rarely do.
The DynamoDB endpoint failure in us-east-1 cascaded because the systems that depended on it did not fully understand how they depended on it. That knowledge was in the infrastructure, not in the code. An agent generating thirty percent of a codebase built on that endpoint would have had no way to know. Neither, perhaps, did many of the engineers who had built on top of those dependencies for years without ever needing to understand them at that depth – until the outage made the dependency visible by removing it.
Architecture is not the sum of the modules. It is the shape of the decisions: where the boundaries are, which failures are acceptable, how state moves across a distributed system, what happens when a region goes offline. Those decisions live in people and in history. They are not in the codebase the agent can read.
At the Japanese e-commerce giant, every AI component had a defined fallback path. Not because we expected failure constantly, but because at millions of requests per day across global markets, the exceptional case eventually becomes the expected case. That was architecture as a leadership decision – never in a single file, always in the shape of the whole system.
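A defined fallback path can be sketched in a few lines. What follows is a minimal illustration in Python, not the system I worked on: `with_fallback`, `rank_with_model`, `rank_by_popularity`, and the 200 ms budget are hypothetical names and values chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Shared worker pool so a slow model call can be abandoned after a deadline.
_POOL = ThreadPoolExecutor(max_workers=4)


def with_fallback(primary, fallback, timeout_s=0.2):
    """Wrap `primary` so any error or timeout routes the call to `fallback`."""
    def guarded(*args, **kwargs):
        future = _POOL.submit(primary, *args, **kwargs)
        try:
            return future.result(timeout=timeout_s)
        except Exception:
            # Covers exceptions raised by the primary call and the timeout itself.
            future.cancel()  # best effort; a call that is already running keeps going
            return fallback(*args, **kwargs)
    return guarded


def rank_with_model(items):
    # Hypothetical model call; here it simulates an unavailable backend.
    raise RuntimeError("model backend unavailable")


def rank_by_popularity(items):
    # Hypothetical non-AI fallback: a plain sort on a field we already have.
    return sorted(items, key=lambda item: item["sales"], reverse=True)


rank = with_fallback(rank_with_model, rank_by_popularity)
```

The wrapper is the easy part. Deciding that a popularity sort is an acceptable answer when the model is down is the part that belongs to the people who know the business.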
The AWS outage did not expose a single bad agent. It exposed what happens when automation runs ahead of the understanding of the system it is modifying. Legacy systems with deep language constraints, embedded in proprietary frameworks, built around transactional models that have evolved for years – they do not accept agentic coding just because the tooling is ready. Readiness is an architectural question, not a tooling one.
Where Agents Actually Belong
Partial automation is not a compromise. It is the correct design.
Refactoring and dependency mapping are where agents are genuinely strong. They can read a legacy module, document its interfaces, and identify the hot zones for modernisation. This takes human engineers weeks. A well-directed agent returns a better map in hours. The architect still decides what to do with it.
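To make the idea of a dependency map concrete, here is a minimal sketch. It assumes the service graph has already been extracted into a dictionary, which is the hard part in practice. The service names are invented for illustration. The function lists everything a service reaches transitively and how many hops away each dependency sits.

```python
from collections import deque


def transitive_dependencies(graph, service):
    """Return {dependency: depth} for everything `service` reaches, breadth-first."""
    depths = {}
    queue = deque([(service, 0)])
    while queue:
        node, depth = queue.popleft()
        for dep in graph.get(node, ()):
            if dep != service and dep not in depths:
                depths[dep] = depth + 1  # first visit in BFS is the shortest path
                queue.append((dep, depth + 1))
    return depths


# Hypothetical service graph: each key depends directly on the listed services.
GRAPH = {
    "checkout": ["payments", "cart"],
    "payments": ["ledger"],
    "cart": ["inventory"],
    "ledger": ["session-store"],
    "session-store": ["dns-automation"],
}

deps = transitive_dependencies(GRAPH, "checkout")
deepest = max(deps, key=deps.get)  # the dependency furthest from the caller
```

In this toy graph the deepest dependency of `checkout` is `dns-automation`, four hops down. Nothing in the `checkout` source would mention it.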
At the edge of legacy systems, agents can build new wrapper services, new APIs, new test harnesses – while the core stays under human oversight. The boundary between the new and the old is not a tool decision. It is an architectural one. Designing for loads you have not yet met means deciding what the agent touches and what it does not, before the load arrives.
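One way to make that boundary explicit is to write it down as a policy that is checked before an agent's change is accepted. The sketch below is an illustration under assumptions: the path patterns and the `agent_may_modify` helper are hypothetical, and a real policy would live wherever the team already enforces code ownership.

```python
from fnmatch import fnmatchcase

# Hypothetical policy: where an agent may write, and where only humans may.
AGENT_ALLOWED = ["wrappers/*", "api/*", "tests/*"]
HUMAN_ONLY = ["core/*", "api/billing/*"]


def agent_may_modify(path):
    """Deny by default; human-only patterns always win over allowed ones."""
    if any(fnmatchcase(path, pattern) for pattern in HUMAN_ONLY):
        return False
    return any(fnmatchcase(path, pattern) for pattern in AGENT_ALLOWED)


def rejected_paths(changed_paths):
    """Return the files in a proposed change that fall outside the agent's boundary."""
    return [p for p in changed_paths if not agent_may_modify(p)]
```

For example, `rejected_paths(["tests/test_cart.py", "core/ledger/tx.py"])` flags only the file under `core/`. The check is trivial. The two lists at the top are the architectural decision.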
Test automation and observability are another strong fit: agents generate coverage and catch anomalies early. What they cannot define is what to measure – which SLIs matter to the business, which failure modes are worth alerting on, what the runbook says when the alert fires at 2 a.m. Observability is architecture. It is not something a model discovers by reading the source.
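The arithmetic behind an SLI is simple enough for any agent to generate. The sketch below shows an availability SLI and the share of error budget left against a target. The request counts and the 99.9 percent target are hypothetical, and choosing that target is exactly the part the code cannot do.

```python
def availability_sli(good_requests, total_requests):
    """Fraction of requests served successfully in the window."""
    if total_requests == 0:
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(good_requests, total_requests, slo_target):
    """Share of the error budget left: 1.0 is untouched, 0 or below is exhausted."""
    failed = total_requests - good_requests
    allowed_failures = (1.0 - slo_target) * total_requests
    if allowed_failures <= 0:
        # A 100 percent target (or an empty window) leaves no room for failure.
        return 1.0 if failed == 0 else 0.0
    return 1.0 - failed / allowed_failures


# Hypothetical window: one million requests, five hundred failures, 99.9% target.
sli = availability_sli(999_500, 1_000_000)
budget = error_budget_remaining(999_500, 1_000_000, slo_target=0.999)
```

With these numbers about half the budget is spent. Whether that should page someone at 2 a.m. is a decision, not a calculation.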
The model that holds at scale is a duo: humans design and define the boundaries; agents generate code under constraints; humans review and run it. Not a handoff. A collaboration where one side holds the system knowledge and the other holds the throughput.
And underneath all of it: build for multi-region failover, assume dependencies break, run chaos scenarios. The agent that writes code for the happy path is not your resilience strategy. October 2025 was a reminder that even infrastructure designed for planetary scale can encounter a failure no automated playbook predicted.
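A chaos scenario does not have to start large. The sketch below is a paper exercise rather than a real fault injection: it assumes a hypothetical map of services to their direct dependencies, plus a set of (service, dependency) pairs that have a working fallback. It then removes one dependency and reports which services would go down with it.

```python
def impacted_services(graph, fallbacks, failed):
    """Return the services that go down when `failed` is removed.

    graph:     service -> list of direct dependencies
    fallbacks: set of (service, dependency) pairs that can survive that loss
    """
    down = {failed}
    changed = True
    while changed:  # repeat until no new service is pulled down
        changed = False
        for service, deps in graph.items():
            if service in down:
                continue
            if any(d in down and (service, d) not in fallbacks for d in deps):
                down.add(service)
                changed = True
    return down - {failed}


# Hypothetical topology for the exercise.
GRAPH = {
    "checkout": ["payments", "cart"],
    "payments": ["ledger"],
    "ledger": ["regional-db"],
    "cart": ["regional-db"],
    "search": ["catalog"],
}
FALLBACKS = {("cart", "regional-db")}  # cart can serve from a local cache

blast_radius = impacted_services(GRAPH, FALLBACKS, failed="regional-db")
```

Here `cart` survives because someone decided in advance that it should, while `ledger`, `payments`, and `checkout` fall together. Running the same exercise with an empty `FALLBACKS` set shows how much that one decision was carrying.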
The Other Seventy Percent
The companies announcing thirty percent AI-generated code are not wrong about the number. They may be underestimating what the other seventy percent carries.
That seventy percent is not just human-written code. It is human-held knowledge about the systems the code runs inside – the institutional memory that knows where the boundaries are, why that service was written the way it was, and what breaks if you change the timeout threshold. When an agent writes thirty percent without access to that knowledge, the thirty percent becomes permanently dependent on the seventy percent of human understanding still being present to catch the failures.
There is also a question rarely asked in the same breath as the thirty percent claim: if agents are gradually displacing the engineers who understood that code, who accumulates the system knowledge that makes the agents’ output safe? The announcement of thirty percent also says something about the remaining seventy. If the work volume holds constant but the headcount shrinks, the system knowledge each remaining engineer has to carry rises – until the engineers who hold it leave. When they leave, they take the map with them. The code remains. The understanding does not.
I watched AI grow from fifty engineers in a production team in Osaka to a capability that can now fill a third of a company’s commit log. The trajectory is remarkable. What I also saw in both of those earlier environments – long before anyone used the word “agentic” – is that the engineers who understood the whole system were the ones who made the AI worth anything. Not because they wrote more code. Because they knew what the code was operating inside.
Automate slowly, safely. Architect for failure, not just features. Preserve the operational awareness that no model has been trained to carry.