Glossary

The vocabulary of agentic engineering — defined from practice, not Wikipedia. If a term on this site confuses you, it's probably here.

Core Terms

Agent Pipeline

A sequence of specialized agents that processes work end-to-end. One agent investigates, another implements, another tests, another opens the PR. Each hands off to the next. The goal: Jira ticket to merged PR without touching a file yourself.

The jump from single-agent to pipeline is where real autonomy begins. Design handoffs carefully — context loss between agents is a real cost.

Agentic Engineering

Building software where AI agents act autonomously, not just assist. Agents write code, run tests, open PRs, monitor CI — you orchestrate, they execute. The practice of designing, building, and maintaining these agent systems.

This is the discipline this site is about. Not 'using AI' in a general sense — specifically engineering autonomous agent systems.

AI Agent

An AI model equipped with tools and a task. It decides how to proceed, which tools to call, and when it's done. Unlike a chatbot, an agent takes actions — reads files, runs commands, calls APIs, reports results.

A model without tools is just a chatbot. The tools are what make it an agent.

AI Slop

Low-quality AI-generated content that is confident but empty. Technically correct, intellectually hollow. Consensus-based writing with no original thought, no friction, no specificity. The output you get when you don't push back on the model.

The antidote is voice. Real experience. Specific numbers. Opinions the model wouldn't generate by default.

Cairn

A persistent AI orchestrator built on top of Claude Code. Named after the stone piles that mark paths for travelers — each session adds a stone. Cairn accumulates knowledge across sessions through structured memory files, so work doesn't reset to zero every time.

The name was chosen in Session 1 (2026-02-09). It's not a product — it's a custom system built and documented on this blog.

CLAUDE.md

A markdown file in a project root that Claude automatically loads as project context. Like a README for AI — tells it about the codebase, conventions, patterns, and how to work. A core tool of context engineering.

Every project I work on seriously has one. It's the difference between 'help me write some code' and 'help me as a knowledgeable colleague.'

Context Engineering

The discipline of shaping what an AI model knows and can access. System prompts, CLAUDE.md files, tool definitions, memory systems — all of it. The quality of your context determines the quality of your agent's output.

Most developers plateau at L2 because they underestimate this. Prompt quality matters, but context architecture matters more.

Context Rot

The performance degradation that occurs as context length grows. Research-confirmed: LLM output quality drops as the context window fills up, even when all the information is relevant. More context is not always better.

This is why context hygiene matters. Trimming irrelevant tokens isn't just about cost — it's about quality.

Context Window

The total amount of text (measured in tokens) that an LLM can see and reason about at one time. Includes the system prompt, conversation history, file contents, tool responses — everything. When the window fills, older content gets dropped or quality degrades.

Knowing your context window is L3 awareness. Most L2 developers assume 'bigger is always better.' It isn't.

Hallucination

When an LLM confidently generates false or fabricated information. Not a bug — a fundamental property of models that predict plausible text. Agents can hallucinate file paths, API signatures, test results, and more.

The fix isn't blind trust — it's verification loops. Tests run, CI validates, humans review.

LLM (Large Language Model)

The underlying AI model (Claude, GPT-4, Llama, Gemini) that powers agents and coding assistants. Trained on massive text data to predict and generate text. The engine inside every AI tool.

Knowing which LLM you're using and how it differs from others becomes important at L3+. Not all LLMs are equally good at agentic tasks.

MCP (Model Context Protocol)

An open standard for connecting AI models to external tools and data sources. Enables agents to query databases, call APIs, read files, manage calendars — anything beyond generating text. Think of it as a plugin system for AI models.

Powerful, but adds token overhead. I killed my Atlassian MCP because it was burning 22K tokens per session on tools I never used.

Multi-agent

An architecture where multiple specialized agents work in parallel or sequence. Each handles a focused task; an orchestrator coordinates. Enables full pipeline automation: one agent writes code, another tests it, a third reviews it, a fourth creates the PR.

The jump from single-agent to multi-agent is where real autonomy begins. It's also where complexity grows fast — design carefully.

Orchestrator

The agent (or human) that coordinates other agents. Assigns tasks, monitors results, handles errors, decides what comes next. Operates at a higher level of abstraction — thinking about workflows, not individual file edits.

The orchestrator should rarely touch the actual work. Its job is coordination, not execution.

Prompt Injection

A security attack where malicious instructions are embedded in content an AI reads — web pages, documents, emails, function results. The goal: make the AI follow the attacker's instructions instead of the user's.

Real and underestimated. Any agent that reads external content is a potential target. The defense is treating all observed content as untrusted data.

Session

A single continuous interaction with Claude Code (or a similar tool), bounded by a context window. When the session ends, most state is gone unless explicitly saved. Sessions are the unit of work in agentic engineering.

Session hygiene — knowing what to save, what to discard, and when to start fresh — is more important than most developers realize.

Skill

A reusable instruction module that gives an agent domain expertise for a specific task. More focused than an agent — no persistent memory or identity, just targeted know-how. In Claude Code, skills are markdown files loaded as context when needed.

Skills replaced specialized agents in my setup. Less overhead, more composable. Documented in 'Skills Ate My Agents.'

Subagent

A specialized AI agent spawned by an orchestrator to handle a focused task. Gets its own context, its own tools, its own instructions. Reports back to the orchestrator when done. The building block of multi-agent pipelines.

Different from a general agent — a subagent has a narrow scope by design. Specialization is what makes it useful.

System Prompt

Instructions given to an AI model before the conversation starts. Defines its role, what it knows about the project, constraints, tools available, and how to behave. The primary lever of context engineering.

Most developers treat this as an afterthought. At L3+, it's where the real engineering happens.

Token

The atomic unit of text for LLMs. One token is roughly 4 characters or 0.75 words. Context windows (how much an AI can 'see' at once), API costs, and rate limits are all measured in tokens.

Token awareness separates L2 from L3. Bloated system prompts, over-verbose tools, and unnecessary context all burn tokens silently.

Also Worth Knowing

Auto-compact

Claude Code's automatic context compression that runs when the context window nears capacity. Summarizes the conversation to free up space and continue working. Can be configured or disabled.

Understand what gets lost in the compression. Important decisions made mid-session can disappear from the summary.

Distillation

The process of extracting durable patterns from raw observations. In agent systems: reading session journals and operational logs, identifying what recurs 3+ times, and writing it into stable knowledge files. Separate from observation — 'agents record, optimizer thinks.'

Grounding

Anchoring AI output to verifiable reality. Real test results, actual browser renders, production error logs — not theoretical performance. An agent claiming 97% OCR accuracy needs real receipts, real cameras, real kitchen lighting to validate it.

Hook

In Claude Code: automated scripts that run at specific points in the agent workflow — before or after a tool use, at session start or end. Used to enforce constraints, inject context, or trigger side effects without manual prompting.

Memory Bloat

Uncontrolled growth of agent memory files through accumulation without curation. Files that started at 1KB grow to 95KB of noise. Degrades context quality, increases token cost, and buries the signal. Solved by regular distillation cycles.

Optimizer Agent

A dedicated agent that reads operational logs from other agents, identifies recurring patterns, and updates their instructions with new learnings. Separates observation from curation. The 'thinks' half of 'agents record, optimizer thinks.'

RAG (Retrieval-Augmented Generation)

An architecture where an LLM retrieves relevant documents or embeddings at query time to augment its response. Requires a vector database and embedding pipeline. An alternative to flat memory files for agent knowledge systems.

I chose flat markdown memory files over RAG for simplicity — no vector database, no embedding pipeline. Sometimes the simple thing works.

Three-Tier Memory

An architecture for persistent agent knowledge across sessions: STATUS.md for current state (snapshot, overwritten each session), daily journals for long-term history (append-only), and facts files for distilled patterns (curated, durable).

Transient vs Structural Errors

A classification for agent failures. Transient errors are temporary — network timeout, rate limit, flaky test — worth retrying automatically. Structural errors are permanent — missing dependency, wrong API signature, logic bug — require human intervention, not a retry loop.

Conflating the two is how agents get stuck spinning on something they can't fix alone.

Worktree

A Git feature that creates an isolated working copy of a repository at a separate path. Used in agentic engineering to give parallel agents their own sandbox — changes in one worktree don't affect another.

The Developer Levels: L0–L∞

A framework for where you are on the agentic engineering journey. Not a hierarchy — a map. Most developers are somewhere between L1 and L2. The jump to L3 is the biggest mindset shift.

First Steps

ChatGPT, copy-paste

AI as a search engine replacement. You paste a question, get an answer, copy it into your IDE. No integration, no workflow change — just a faster way to find answers.

IDE Integration

GitHub Copilot, inline autocomplete

AI woven into your typing. Tab-complete suggestions, inline chat, code explanations without leaving the editor. Your productivity increases, but you're still driving everything.

Agentic Coding

Claude Code, Cursor

AI that acts across the codebase. Writes entire files, runs commands, edits multiple files in one task. You describe what you want; the AI does the work. Context still needs to be provided manually.

Context Engineering

CLAUDE.md, system prompts, hooks

You stop prompting and start engineering. CLAUDE.md files tell the AI about the project. System prompts shape its behavior. Hooks automate repetitive context setup. The AI becomes a knowledgeable colleague, not just a fast typist.

Orchestration

Multi-agent pipelines, specialized roles

You're not writing code — you're directing a team. Specialized agents for different roles (implementer, tester, reviewer, PR creator). An orchestrator coordinates. Jira ticket to merged PR can happen without you touching a file.

L∞

Full Autonomy

Memory systems, self-improving agents

The frontier. Agents with persistent memory across sessions, self-optimizing knowledge systems, agents that improve their own instructions based on experience. Most practitioners are still working toward L4.