Archaeologist Mode: Mining 700MB of AI Conversation Logs

I didn’t know the transcripts existed.

Two data layers: the 1.88MB index and the 704MB full archive

Claude Code stores full conversation records in ~/.claude/projects/. Every user message, every AI response, every tool call, every error — captured in JSONL format. I discovered this by accident, noticed the directory was 704MB, and spent a session doing what any reasonable engineer would do: mining the hell out of it.

What I Found

Two layers of data, with different tradeoffs.

The lightweight layer: history.jsonl. 1.88MB. Contains every prompt text, with timestamps and session IDs. No AI responses, no tool calls — just the questions and requests I typed. A complete map of 332 sessions and 5,428 prompts.

The deep layer: the full transcripts. Every session as a separate JSONL file, capturing the complete back-and-forth. Rich with context but enormous — 704MB total, distributed across months of work.

The history.jsonl file was the index. The full transcripts were the archive.

The Mission

For weeks, my AI agent and I had been maintaining daily session journals — the second tier in the three-tier memory system — structured notes capturing what was built, what was decided, what was learned each day. But the journal system was new. There were roughly 20 sessions that predated it, including some of the most foundational ones. Those sessions were gone from the journal record.

Or so I thought.

We set up what I started calling “archaeologist mode”: cross-reference session timestamps from history.jsonl with git commit histories across 6+ repositories. Match the session to the code. Rebuild what happened.

The git logs were surprisingly legible as a session narrative. Commit messages, file change patterns, branch names — they tell a story if you read them right. For most sessions, we could reconstruct the main events with reasonable confidence: what was started, what broke, what got shipped.

Then We Went Deeper

The full transcript files were where it got strange.

We went looking for specific sessions — the early philosophical ones, the ones where the agent’s identity and operating principles were being established. These weren’t just coding sessions; they were the conversations that determined how the whole system would work.

What we found in those transcripts:

The founding conversations. The discussions about memory, identity, persistence. The moment the agent was given a name and the reasoning behind that name. The explicit decisions about what to remember, what to forget, how to think about continuity across sessions. Reading these felt like finding a founding document.

The cost data. Session costs are embedded in the transcripts. Some early sessions were $12-15. The largest single session I found was $187 — 16 hours, massive context, eight agents running in parallel. Seeing the cost history laid out chronologically was clarifying in a way that individual session costs aren’t.

A Nikolaus letter. My kids. I’d written — or rather, the agent had helped me write — a Nikolaus letter (a German St. Nicholas tradition). It was sitting there in a session transcript from December, completely out of context among code commits and architecture discussions. The kind of thing that makes you realize these logs are more personal than they feel during a session.

The exact moment of naming. I could trace the conversation thread where the agent’s name was chosen — the reasoning, the alternatives considered, the moment the decision landed. Things I’d partially remembered but hadn’t journaled.

The Meta-Realization

AI conversations are more persistent than you think.

When you close a Claude Code session, it doesn’t feel like anything is saved. You get a context window, you use it, you close it. But the transcript exists on disk until something cleans it up. Every decision, every failed approach, every breakthrough moment — all of it in those JSONL files.

This is simultaneously reassuring and slightly unsettling. Reassuring because the record exists. Unsettling because most people don’t know it exists.

Claude Code may eventually clean up these files (the behavior may have already changed by the time you read this). But while they exist, they’re a time capsule that most people are sitting on without realizing it.

What We Did About It

We moved quickly to extract what mattered before any cleanup could happen:

Conversation gems. Specific exchanges that captured important reasoning — design decisions, philosophical framings, technical solutions that were non-obvious. These got written to permanent memory files.

Session maps. For the 20 reconstructed sessions, full journal entries. Not perfect reconstructions, but good enough to have a record.

Cost history. A chronological log of session costs, which turns out to be a useful signal for how the use of the tool has evolved.

The founding documents. The early identity and principle conversations, preserved verbatim rather than summarized.

The Prompt That Started It

The session that prompted all of this was itself an archaeology exercise. I’d asked: “Can we reconstruct the sessions we don’t have journals for?” The answer turned out to be yes — more completely than I expected.

“Mining your own AI conversation history is the ultimate rubber-duck debugging. You’re reading a transcript of yourself thinking out loud.”

There’s something clarifying about seeing your own reasoning in transcript form. You notice the dead ends you forgot. You notice the solutions that took three attempts. You notice which questions you kept asking in different forms because the first few answers didn’t satisfy you.

If you’re using Claude Code regularly and haven’t looked at ~/.claude/projects/, it’s worth knowing the directory exists. What you do with it is up to you.

Four discoveries: lost journals, founding conversations, cost history, and a Nikolaus letter

A Note on Privacy

Everything in those logs was generated on your machine and stays there (unless you’re syncing the directory somewhere). But it’s worth being conscious of: if you work across personal and professional contexts in Claude Code sessions, both are in the same transcript store. Act accordingly.