The 97% Bundle Cut: Why AI Agents Need Human Expertise

An AI agent built the blog system for this site. It chose the libraries. It set up the content pipeline. It wrote the markdown loader, the rendering layer, the routing. The code was clean, well-structured, and followed every best practice for a React blog.

97% bundle reduction: from 161kB to 4.97kB

It also shipped a ticking time bomb.

Not a bug. Not a crash. Something worse — an architecture that works perfectly today and degrades silently over time. The kind of problem that no agent will ever flag, because by every metric the agent cares about, the code is correct.

A human caught it. Not because humans are smarter than agents at writing code — they’re not. But because humans carry something agents don’t: the experience of having watched systems fail in slow motion.

What the Agent Built

The blog system used import.meta.glob to load markdown files at build time and react-markdown to render them in the browser. This is the standard approach. Every React blog tutorial teaches it. It’s the first result on Stack Overflow. It’s what an agent trained on millions of codebases would naturally reach for.

And it works. At three posts, the entire blog system adds about 220kB to the JavaScript bundle. Invisible on any modern connection. All tests pass. Lighthouse is green. The agent ships it, reports success, moves on to the next task.

Here’s what the agent doesn’t see: that 220kB will become 440kB at six posts. 2MB at thirty posts. At a hundred posts, every visitor downloads a megabyte of markdown content they’ll never read, plus a full markdown compiler to parse it — even though the output is the same on every page load, for every visitor, forever.

The architecture is correct. The trajectory is catastrophic. And nothing in the feedback loop — not tests, not linting, not type checking, not code review — would ever catch it.

Why the Human Caught It

I’ve seen this pattern before. Not this exact code, but this exact shape of problem. A system that works perfectly at small scale and degrades linearly. A dependency that’s invisible at first and dominant later. An architecture that nobody questions because it shipped on time.

I’ve been the person debugging a 4MB bundle at 2am, tracing it back to a “reasonable” decision made eighteen months ago by someone who’s no longer on the team. I’ve watched performance budgets erode week by week, 10kB at a time, until someone finally notices the Lighthouse score is orange and the fix requires rewriting half the frontend.

That pattern recognition isn’t something you can write in a prompt. It’s not a rule you can encode in a CLAUDE.md file. It’s scar tissue from years of building and maintaining systems at scale.

When I looked at the blog architecture, I didn’t see a bug. I saw a trajectory. And I asked a question the agent would never ask: “What happens when we have a thousand posts?”

The Conversation That Fixed It

This is what agentic engineering actually looks like. Not “human writes prompt, agent writes code.” A conversation where each side contributes what it’s uniquely good at.

Human (me): “This eager loading won’t scale. Why is markdown being parsed at runtime? The content is static.”

That’s the intent. Three sentences. No implementation details. No code. Just a human who recognized a problem and articulated what’s wrong.

Agent: Plans the solution — a build-time script that pre-renders markdown to HTML, generates a lightweight metadata index, and serves everything as static JSON. Content fetched on demand, not baked into the bundle.

That’s the execution. The agent designed the architecture, wrote the build script, refactored the data layer from synchronous to async, added pagination, updated the sitemap generator, and removed the runtime dependencies — all from a directional prompt.

Human (me): “This means raw markdown is exposed as static files. Anyone can curl them.”

That’s judgment. The agent didn’t flag this because it wasn’t asked about security or content exposure. It was solving the performance problem. I noticed the side effect and raised it.

Agent: Adjusts the approach — pre-render to HTML at build time, serve as JSON instead of raw markdown. No source files exposed.

The result:

	Before	After
BlogPost chunk	161 kB (48.9 kB gzip)	4.97 kB (1.73 kB gzip)
Runtime dependencies	react-markdown, remark-gfm, micromark	None
Content loading	All posts baked into JS	One fetch per post
Scales to 1000 posts?	No	Yes

A 97% reduction. Not from a clever optimization. From a human asking the right question and an agent executing the right answer.

The trajectory problem: bundle grows linearly with posts before fix, stays constant after

What the Agent Can’t Do

The agent is better than me at writing code. Faster, more consistent, fewer typos, broader API knowledge. If I described the exact refactor I wanted — “replace import.meta.glob with a build script, use marked for HTML conversion, serve as static JSON” — the agent would have built it perfectly.

But I would never have needed to describe it if the agent had seen the problem. And the agent didn’t see the problem because it can’t:

Project forward from experience. The agent evaluates code against patterns. It doesn’t simulate the future state of a codebase with 100x more content and ask “does this still work?”

Feel architectural friction. A senior engineer looks at eager-loaded content in a JavaScript bundle and feels something is off. That feeling comes from debugging production incidents, not from pattern matching on training data.

Challenge its own defaults. The agent picked react-markdown because it’s the most common solution. Popularity is a strong signal in training data. But popular doesn’t mean right, and “most common” is often “most convenient for a tutorial” rather than “best for production.”

Notice gradual degradation. The system works today. The agent has no mechanism to evaluate “works today but fails in six months.” It optimizes for the present, not the trajectory.

What the Human Can’t Do (Efficiently)

The flip side is equally important. I caught the problem, but I couldn’t have fixed it in the time the agent did.

The refactor touched six files, introduced a new build script, converted synchronous APIs to async, added pagination, updated the sitemap generator to use the new index format, and removed two runtime dependencies. The agent did this in one session. It would have taken me a full day — not because it’s hard, but because the mechanical work of reading files, understanding interfaces, making consistent changes across multiple modules, and testing everything is exactly what agents excel at.

The human-agent split was clean:

Human: “This won’t scale” → “What about exposed source files?” → “Add a CI check for bundle size”
Agent: Plan architecture → implement build script → refactor data layer → update components → remove dependencies → verify build

Three human inputs. Hundreds of lines changed by the agent. That’s the leverage ratio of agentic engineering — not replacing expertise, but amplifying it.

The Lesson Nobody Wants to Hear

There’s a seductive narrative around AI agents: they’ll make expertise obsolete. Anyone can build anything. Just describe what you want.

The blog system proves the opposite. An agent without human oversight would have shipped an architecture that works for a year and becomes a crisis the year after. Not because the agent is bad — because the agent is excellent at building what you ask for and incapable of questioning whether what you asked for is right.

The agents built the blog. A human saved the blog from itself.

That’s not a failure of AI. That’s the whole point of human-agent collaboration. The human carries the intent, the judgment, the experience, the pattern recognition from a decade of watching systems evolve. The agent carries the speed, the precision, the breadth of knowledge, the tirelessness.

Neither is sufficient alone. Together, they caught a problem at three posts that would have been a painful rewrite at three hundred. They fixed it in an hour instead of a week. And they left behind a CI check — a bundle size budget that fails the deploy if the total JavaScript exceeds 150kB gzipped — so the system now watches its own trajectory automatically.

That last part matters. The human recognized the need for a guardrail. The agent implemented it. And now the system protects itself going forward, without either of them needing to remember.

That’s architectural evolution. Human intent, agent execution, permanent protection. The kind of thing that only happens when expertise meets capability. (This same judgment gap — where AI executes and humans must direct — appears in Perfectionism Meets Agentic Migration as well.)