concept evergreen ·
Codebase as memory
Codebase as Memory
The project state — git history, tests, file system — is a more reliable memory for execution work than conversation history or session caches. Implementation memory (cached observations, session summaries, “what we tried last time”) goes stale the moment someone pushes a commit. Durable memory (decisions, principles, patterns) is worth curating; implementation memory should be re-derived from source of truth every time.
Core formulations
- Memory is liability — Cached observations encode assumptions that go stale. A fresh agent reading the code is working from truth; an agent reading session memory from three sessions ago is working from a stale snapshot of someone else’s understanding. (ralph-loop-agent, Profile)
- The codebase is the database, CLAUDE.md is the schema — Persistent memory tools accumulate derived data and call it knowledge. It’s a cache pretending to be a database. (agent-memory-is-a-cache)
- Beginner’s mind advantage — A fresh agent reads code as it is, not as it was. Same applies to humans: “I know how this works” bypasses verification.
- Asterisk: comprehension — Codebase-as-truth only works if tests capture intent, not just behavior. The codebase remembers what was written, not what was understood. (comprehension-debt)
Distinction
| Durable memory | Implementation memory |
|---|---|
| Decisions, principles, patterns | Session history, cached observations, “where we were” |
| Slow-changing, human-reviewed | Fast-changing, agent-written |
Lives in CLAUDE.md, principles, design docs | Lives in session caches, claude-mem, mem0 |
| Worth curating | Should be re-derived from source of truth |
Pattern instances
- Ralph Loop —
while :; do cat PROMPT.md \| agent; done. Progress lives in commits, not conversation. Context rotation as feature, not bug. (ralph-loop-agent) - Karpathy LLM Wiki — Same architecture for knowledge work: raw sources + schema + LLM-curated artifact. Fresh agent re-derives state. (karpathy-llm-wiki)
- Shin journaling — Vault is the durable layer. Session has no memory beyond what gets captured back to disk.
Failure modes
- claude-mem stale-poisoning — Parallel agent told work wasn’t done; observation was stale. Crystallized “memory is liability” (2026-03-14, Profile changelog).
- Comprehension debt — Tests verify behavior you considered, not behaviors you didn’t. Codebase remembers code, not understanding. (comprehension-debt)
- Mistaking transcripts for knowledge — Session logs aren’t the lesson. The lesson is what survives editing into a durable artifact.
Related
- Belief: “Codebase as memory / memory is liability” (Profile)
- Bet: verification-cost-grows — verification cost grows as generation cost falls; durable artifacts shift the cost balance.
- Principle: “Durable artifacts over session memory” (PRINCIPLES)
- Tool eval: Tool-Eval-Mem0, Tool-Eval-txtai — failure cases that motivated the belief.
Open questions
- Where’s the boundary? Voice/preference data (Profile, principles) is durable. Code observations are not. What about user observations (e.g. learned preferences across sessions)?
- Can durable memory be auto-promoted from implementation memory, or does promotion require human review?
Recent edits
- vault: update site