Back to garden
concept evergreen ·

Codebase as memory

Codebase as Memory

The project state — git history, tests, file system — is a more reliable memory for execution work than conversation history or session caches. Implementation memory (cached observations, session summaries, “what we tried last time”) goes stale the moment someone pushes a commit. Durable memory (decisions, principles, patterns) is worth curating; implementation memory should be re-derived from source of truth every time.

Core formulations

  • Memory is liability — Cached observations encode assumptions that go stale. A fresh agent reading the code is working from truth; an agent reading session memory from three sessions ago is working from a stale snapshot of someone else’s understanding. (ralph-loop-agent, Profile)
  • The codebase is the database, CLAUDE.md is the schema — Persistent memory tools accumulate derived data and call it knowledge. It’s a cache pretending to be a database. (agent-memory-is-a-cache)
  • Beginner’s mind advantage — A fresh agent reads code as it is, not as it was. Same applies to humans: “I know how this works” bypasses verification.
  • Asterisk: comprehension — Codebase-as-truth only works if tests capture intent, not just behavior. The codebase remembers what was written, not what was understood. (comprehension-debt)

Distinction

Durable memoryImplementation memory
Decisions, principles, patternsSession history, cached observations, “where we were”
Slow-changing, human-reviewedFast-changing, agent-written
Lives in CLAUDE.md, principles, design docsLives in session caches, claude-mem, mem0
Worth curatingShould be re-derived from source of truth

Pattern instances

  • Ralph Loopwhile :; do cat PROMPT.md \| agent; done. Progress lives in commits, not conversation. Context rotation as feature, not bug. (ralph-loop-agent)
  • Karpathy LLM Wiki — Same architecture for knowledge work: raw sources + schema + LLM-curated artifact. Fresh agent re-derives state. (karpathy-llm-wiki)
  • Shin journaling — Vault is the durable layer. Session has no memory beyond what gets captured back to disk.

Failure modes

  • claude-mem stale-poisoning — Parallel agent told work wasn’t done; observation was stale. Crystallized “memory is liability” (2026-03-14, Profile changelog).
  • Comprehension debt — Tests verify behavior you considered, not behaviors you didn’t. Codebase remembers code, not understanding. (comprehension-debt)
  • Mistaking transcripts for knowledge — Session logs aren’t the lesson. The lesson is what survives editing into a durable artifact.
  • Belief: “Codebase as memory / memory is liability” (Profile)
  • Bet: verification-cost-grows — verification cost grows as generation cost falls; durable artifacts shift the cost balance.
  • Principle: “Durable artifacts over session memory” (PRINCIPLES)
  • Tool eval: Tool-Eval-Mem0, Tool-Eval-txtai — failure cases that motivated the belief.

Open questions

  • Where’s the boundary? Voice/preference data (Profile, principles) is durable. Code observations are not. What about user observations (e.g. learned preferences across sessions)?
  • Can durable memory be auto-promoted from implementation memory, or does promotion require human review?

Recent edits

  • vault: update site