Production Hardening: The Boring Part Nobody Talks About

The receipt scanner side project was feature-complete. It worked on my machine. The OCR pipeline hit 97.78% confidence. The frontend felt snappy. The backend handled edge cases. I was proud of it.

Production readiness checklist: eight things AI won't tell you to add

It was nowhere near production-ready.

This is the gap nobody talks about in the AI acceleration conversation. Everybody celebrates how fast AI builds features. Nobody celebrates the grinding session where you bolt on all the things that keep a public-facing app from being abused, exploited, or melting under real load.

I spent one session doing nothing but production hardening. Here’s what I actually added.

What “Feature Complete” Is Missing

Rate limiting. Every public endpoint was wide open. An attacker — or just a curious person with a script — could hammer the OCR endpoint indefinitely. Added per-IP rate limits, different tiers for authenticated vs. anonymous users, and a reasonable cooldown for repeated failures.

Two-tier CORS configuration. The frontend needed CORS access. But the API also had internal endpoints that should never be reachable from a browser at all. Treating these identically was lazy. Proper setup means separate CORS policies for the public frontend routes and the internal API routes.

Error message sanitization. The default error handling was leaking stack traces in API responses. In development, that’s useful. In production, that’s an invitation. Every unhandled exception was telling the world exactly which library version we were running, which file path threw, and sometimes which database column didn’t exist. Sanitized all of that to generic error codes with internal logging.

Graceful shutdown. The Docker container, when killed, was dying mid-request. No drain period, no connection cleanup, just death. For a Kubernetes deployment, this means dropped requests on every deploy. Added shutdown hooks to stop accepting new work, wait for in-flight requests to complete, then exit cleanly.

Gzip compression. Receipt images and OCR response payloads were going over the wire uncompressed. Obvious in retrospect. Added compression middleware and immediately cut response sizes significantly.

Disposable email blocklist. The app had user registration. Without a blocklist, someone can spin up 500 accounts with throwaway addresses in minutes. Added a blocklist of 121,000 known disposable email domains. Not perfect, but it raises the cost of abuse substantially.

Basic Prometheus monitoring. I had logs. I did not have metrics. There’s a difference. Logs tell you what happened. Metrics tell you whether things are trending the wrong direction before they break. Added standard instrumentation: request counts, latency histograms, error rates, active connections.

OpenAPI documentation. Not strictly a hardening concern, but part of making a service legible to the outside world — including future-me. Generated documentation from the actual route definitions rather than maintaining separate docs that would inevitably drift.

Then Came the Docker Issues

The containerized build broke immediately. ES module resolution failed across 19 files. The app ran fine locally because the local Node version was lenient; the Docker image was running a different version that was strict. Tracked down every file, fixed the module syntax, rebuilt.

Then the demo server ran out of disk space. 14 gigabytes of accumulated Docker image layers, old build artifacts, and log files. The application failed silently because it couldn’t write temp files. Nothing in the logs explained why OCR was returning blank results — just generic I/O errors. Once we traced it back to disk exhaustion, it was obvious. But it took time to diagnose.

The Part That Actually Matters

Here’s what I noticed throughout this entire session: the AI agent handled every single one of these tasks competently. Rate limiting configuration — done. CORS setup — done. Disposable email integration — done. Prometheus metrics — done. Docker fixes — done.

But the agent never once said “hey, you should add rate limiting.” Or “you’re leaking stack traces.” Or “your Docker container doesn’t shut down gracefully.”

I had to know to ask.

Every single item in that list came from me knowing what a production-ready backend looks like. The AI executed. I directed. The engineering judgment — the awareness that these things exist and matter — never transferred from me to the agent.

This is the part of the AI acceleration story that gets elided. The velocity is real. A junior developer without this mental checklist would spend weeks adding what I added in one session. But they wouldn’t add it at all without someone telling them what to look for.

Human direction, AI execution: the judgment gap

What This Means for You

If you’re building with AI agents, maintain your own production-readiness checklist. Not as a bureaucratic exercise, but as a knowledge artifact. The things you’ve learned from past production incidents, from code reviews, from reading postmortems — that knowledge lives in your head. AI doesn’t have it unless you surface it explicitly.

The checklist I now keep covers: authentication and authorization, input validation, rate limiting, error handling, logging and monitoring, graceful shutdown, dependency security, CORS and CSP headers, data backup and recovery.

Each line represents something that broke in production at some point — mine or someone else’s. AI won’t discover these lessons for you. You have to bring them.

The tools are faster than ever. The judgment still costs experience. (The larger context for this — building a complete app in one session and discovering CORS was missing from the first live deployment — is in $187 and 16 Hours.)

What “Feature Complete” Is Missing

Then Came the Docker Issues

The Part That Actually Matters

What This Means for You

What the models think