Three-Tier Memory: How I Taught My AI to Remember

Three-tier memory architecture: STATUS.md, Journals, and Facts

Every AI session starts at zero. Blank slate. The AI you had yesterday — the one that understood your codebase, remembered last week’s decision to avoid a particular pattern, knew why you made that architectural trade-off in session 3 — gone. You spend the first 20 minutes re-explaining context that you already explained, watching the agent make the same tentative first steps it made last time.

This bothered me enough that I built something to fix it. Not a fancy vector database. Not a RAG system. Something much simpler: a three-tier memory architecture that treats AI context like a computer treats data.

The Three Tiers

Tier 1: STATUS.md — RAM

STATUS.md is the agent’s working memory. A single file, overwritten at the start of every session. It contains: current project state, active work items, recent decisions, open questions, and anything else the agent needs to pick up where we left off.

The goal: full context rehydration in under 60 seconds. An agent that reads STATUS.md should be able to answer “where are we?” without asking me anything.

This is like RAM — fast, volatile, always current, always replaced.

Tier 2: Daily Journals — Hard Drive

At the end of each session, the agent writes a journal entry. Append-only, never edited after creation. Each entry records what was shipped, what decisions were made, what was learned, and what’s still unclear.

The journals are the durable record. If STATUS.md is lost or stale, you can reconstruct it from the journals. They’re not meant for daily reading — they’re the archive you go back to when you need to understand why something is the way it is.

Tier 3: Facts — Compiled Wisdom

Facts are patterns distilled from journals. The rule: if something appears in 3 or more journal entries, it gets promoted to a fact. A fact is permanent, concise, and owned by the project — not by any single session.

Examples of what becomes a fact: “Maven always needs the -pl module flag or the build runs the wrong thing.” “The CI pipeline treats warning-level Checkstyle violations as errors — don’t downgrade the rule, fix the code.” “When the type-checker complains about nullable optionals, the underlying issue is usually a missing null check two layers up.”

These aren’t things you want to rediscover every session.

The Compression Insight

The most important realization: rehydration over narration.

The naive approach is to tell the AI the whole story. “We started in session 1, decided to use this architecture, hit a problem with X in session 4, solved it by doing Y…” That’s slow and token-expensive, and it puts the agent in a passive listening mode instead of an active working mode.

The better approach is compression. STATUS.md doesn’t tell a story. It presents a state. Five bullet points instead of five pages. Current facts about the world, not the history of how we got here.

On a project I’d been running for several weeks, I measured this: the full journal history was over 50,000 words. STATUS.md was 4,000 words. The agent could restore effective context from STATUS.md alone. That’s a 12.5x compression ratio — same practical utility, a fraction of the tokens.

12.5x compression: 50,000 words to 4,000 words

Why This Matters

Without persistent memory, every session is session 1. The agent is capable, but ignorant. It doesn’t know the conventions, the pitfalls, the decisions already made. You end up either re-explaining everything or watching it rediscover the same lessons.

With persistent memory, session 24 knows what session 1 learned. The agent accumulates domain knowledge. It stops making the mistakes it already made. It starts making better decisions faster because the context is there.

The memory system isn’t just a convenience — it’s the agent’s identity. Strip it away and you have a capable tool. Keep it and you have something that feels more like a colleague. (That identity became concrete when the system earned a name — see Naming Cairn.)

The Failure Case That Proved the Point

The daily journal system wasn’t established until session 23. (The gap this created — and how I recovered it later — is part of mining conversation logs.) Sessions 3 through 22 had no journals. The gaps are real — 20 sessions of work with no formal record.

When I finally set up the system, I went back and reconstructed those 20 sessions from git history — commit messages, PR descriptions, comments — and later from 700MB of Claude Code conversation transcripts that reached back to November 2025. It took several hours and produced something imperfect but recoverable. The reconstruction gave the system a working past it could learn from.

The lesson wasn’t that imperfect memory is fine. It’s that even imperfect memory — even partially reconstructed memory — is dramatically better than none. The agent that read those reconstructed journals behaved differently from an agent starting cold. It knew things it couldn’t have known from STATUS.md alone.

If I were doing it again: start the journals on session 1, even if they’re rough. The git history is a fallback, not a substitute.

Setting It Up

In practice, the system is three things:

A STATUS.md file at the project root, with instructions to the agent to read it first and update it before ending the session.
A journal/ directory with one file per session, named YYYY-MM-DD-session-N.md.
A facts/ directory with topic-organized fact files — one for the tech stack, one for the project conventions, one for recurring pitfalls.

The agent gets explicit instructions in CLAUDE.md: read STATUS.md at session start, append to today’s journal as you work, promote patterns to facts when they hit 3 occurrences.

That’s the whole system. No special tools, no external services. Just structured markdown files and a discipline around keeping them current.

The AI doesn’t maintain these automatically out of the box. You have to set it up, wire it into the instructions, and enforce the discipline. But once it’s running, it pays back that setup cost every single session. (What happens when you don’t enforce the discipline is documented in The Memory Bloat Crisis — two weeks in, agent files had grown to 95KB.)

The Three Tiers

The Compression Insight

Why This Matters

The Failure Case That Proved the Point

Setting It Up

What the models think