# Memory Bank

> Source: `skills/memory-bank/SKILL.md` from https://github.com/Nagendhra-web/memory-bank

Install the skill into `~/.claude/skills`:

```sh
# Option 1: clone the whole repository
git clone https://github.com/Nagendhra-web/memory-bank

# Option 2: copy just the skill in one step
T=$(mktemp -d) && git clone --depth=1 https://github.com/Nagendhra-web/memory-bank "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/memory-bank" ~/.claude/skills/nagendhra-web-memory-bank-memory-bank && rm -rf "$T"
```
An adaptive memory system that gives Claude Code persistent, intelligent context across sessions — while cutting token waste so your sessions last 3-5x longer. Not a flat file — a layered architecture that compresses, branches, diffs, self-heals, and loads only what matters.
## Core Architecture
Memory Bank operates on three layers:
```
┌─────────────────────────────────────────────┐
│ Layer 2: GLOBAL MEMORY                      │
│ ~/.claude/GLOBAL-MEMORY.md                  │
│ Cross-project patterns, user preferences,   │
│ reusable decisions. Permanent.              │
├─────────────────────────────────────────────┤
│ Layer 1: PROJECT MEMORY                     │
│ ./MEMORY.md (+ branch overlays)             │
│ Architecture, decisions, active work.       │
│ Lives as long as the project.               │
├─────────────────────────────────────────────┤
│ Layer 0: SESSION CONTEXT                    │
│ In-conversation only.                       │
│ Current task focus, scratch notes.          │
│ Dies when session ends (persisted to L1).   │
└─────────────────────────────────────────────┘
```
- **Layer 0 (Session)** — Ephemeral. Tracks what you're doing right now. Automatically flushed to Layer 1 at session end.
- **Layer 1 (Project)** — The primary memory file. Tracks project state, decisions, active work, blockers. Branch-aware: each git branch can have its own overlay that merges with the base memory.
- **Layer 2 (Global)** — Cross-project knowledge. Your coding preferences, tool choices, patterns you always use. Lives in `~/.claude/GLOBAL-MEMORY.md`. Loaded alongside Layer 1 at session start.
See `references/memory-layers.md` for full architecture details.
## When to Activate
| Trigger | Action |
|---|---|
| Session starts and `MEMORY.md` exists | Full load sequence |
| "remember this", significant milestone completed | Mid-session update |
| "wrap up", "save", "done for now" | Full session write |
| User asks to be caught up on where things stand | Load + summarize |
| Git branch switched, branch overlay may exist | Branch-aware load |
| Memory freshness or accuracy in question | Health check |
| "generate a handoff", "onboard someone to this project" | Generate handoff doc |
| `MEMORY.md` over the size threshold | Run compression |
| Memory missing, corrupted, or severely stale | Recovery mode |
| "save state", "continue this later" | Session continuation protocol |
| Context usage climbing past 40% | Context budget check |
| "I'm running out of context", context near its limit | Emergency save + continuation file |
## Workflow
### 1. Session Start — The Load Sequence
Execute this sequence before doing anything else:
```
Step 1: Detect memory files
 └─ Check for MEMORY.md in project root
 └─ Check for ~/.claude/GLOBAL-MEMORY.md
 └─ Check for MEMORY-ARCHIVE.md (has history been archived?)

Step 2: Detect git context
 └─ Current branch name
 └─ Check for .memory/branches/<branch>.md overlay
 └─ Days since last session (from "Last updated" field)

Step 3: Session diff (if git available)
 └─ Commits since last memory update
 └─ Files changed since last session
 └─ Any conflicts between memory and current code state

Step 4: Health check
 └─ Score memory freshness (see Health Scoring below)
 └─ Flag stale entries
 └─ Flag referenced files that no longer exist

Step 5: Context-aware greeting
 └─ Summarize where we left off (2-3 sentences, specific)
 └─ Report any drift detected (code changed, memory stale)
 └─ State the next immediate action
 └─ Ask: "Ready to continue, or has the plan changed?"
```
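A minimal sketch of Steps 1-2 in Python. The file locations follow the layer diagram, and the `Last updated: YYYY-MM-DD` header format is an assumption taken from the template in this skill:

```python
import re
from datetime import date
from pathlib import Path

def detect_memory(root: Path, home: Path) -> dict:
    """Steps 1-2: locate memory files and measure staleness."""
    memory = root / "MEMORY.md"
    info = {
        "project": memory.exists(),
        "global": (home / ".claude" / "GLOBAL-MEMORY.md").exists(),
        "archive": (root / "MEMORY-ARCHIVE.md").exists(),
        "days_stale": None,
    }
    if info["project"]:
        # Parse the "Last updated: YYYY-MM-DD" header field (assumed format)
        m = re.search(r"Last updated: (\d{4})-(\d{2})-(\d{2})", memory.read_text())
        if m:
            last = date(*map(int, m.groups()))
            info["days_stale"] = (date.today() - last).days
    return info
```

The staleness number feeds the Freshness dimension of health scoring later in the workflow.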
Example greeting (fresh memory, same branch):
> "Welcome back! Last session you finished the Stripe webhook handler in `src/api/webhooks/stripe.ts` and were about to write integration tests. The `handlePaymentSuccess()` function is complete but `handleRefund()` is stubbed out. 3 commits have landed since — all yours, no surprises. Ready to pick up with the integration tests?"
Example greeting (stale memory, branch switched):
> "Welcome back! Your memory is from 5 days ago on `main`, but you're now on `feature/user-profiles`. I found a branch overlay from 3 days ago with context about the profile avatar upload. However, `src/components/Avatar.tsx` referenced in memory was renamed to `ProfileImage.tsx`. Want me to update memory with the current state before we continue?"
If no MEMORY.md exists:
- Proceed normally
- After first meaningful work, offer: "Want me to start tracking our progress? I'll create a memory file so next session picks up instantly."
### 2. Mid-Session Updates
When the user says "remember this" or you complete a significant milestone:
- Read the current `MEMORY.md`
- Determine what changed:
  - New decision made? → Update `Key Decisions`
  - Task completed? → Move from `Active Work` to `Completed`, update `Where We Left Off`
  - New blocker? → Add to `Blockers`
  - Important context? → Add to `Notes`
- Write the updated file
- Confirm with specifics: "Saved — added the Zod migration decision and marked the user model as complete."
Do NOT rewrite the entire file on mid-session updates. Only modify the sections that changed. This preserves context from session start.
### 3. Session End — The Write Sequence
When wrapping up, execute a full memory write:
```
Step 1: Audit the session
 └─ What was accomplished? (be specific: files, functions, lines)
 └─ What decisions were made and why?
 └─ What's blocked or unresolved?
 └─ What should happen next? (crystal clear next step)

Step 2: Compress completed work
 └─ Move finished items to Completed with one-line summaries
 └─ Remove resolved blockers
 └─ Archive stale notes

Step 3: Update memory health metadata
 └─ Update "Last updated" timestamp
 └─ Increment session counter
 └─ Update file reference table (verify paths still exist)

Step 4: Write MEMORY.md
 └─ Full overwrite with current state
 └─ Verify the file was written successfully

Step 5: Check compression threshold
 └─ If > 150 lines, suggest compression
 └─ If > 200 lines, auto-compress (see Smart Compression)

Step 6: Prompt for global memory
 └─ Any cross-project learnings worth saving to Layer 2?
 └─ New user preferences discovered?
```
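Step 3's metadata refresh is a small text transform. A sketch, assuming the `Last updated: [DATE] | Session [N] | Branch: [BRANCH]` header line from the MEMORY.md template:

```python
import re
from datetime import date

def refresh_metadata(memory_text: str, branch: str) -> str:
    """Bump the 'Last updated' date, session counter, and branch in the header line."""
    def bump(m: re.Match) -> str:
        session = int(m.group(1)) + 1
        return f"Last updated: {date.today().isoformat()} | Session {session} | Branch: {branch}"
    # Header line assumed: "Last updated: <date> | Session <N> | Branch: <name>"
    return re.sub(r"Last updated: \S+ \| Session (\d+) \| Branch: \S+", bump, memory_text)
```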
## MEMORY.md Template
```markdown
# Project Memory
Last updated: [DATE] | Session [N] | Branch: [BRANCH]
Memory health: [SCORE]/10

## Project Overview
[1-2 sentences. What this is, what stack, what stage.]

## Where We Left Off
- **Current task:** [specific task with file/function reference]
- **Status:** [done | in progress | blocked]
- **Next immediate step:** [so clear Claude can start without asking anything]
- **Open question:** [decision pending, if any]

## Completed
- [DATE] [one-line summary with key files touched]
- [DATE] [one-line summary]

## Active Work
- [ ] [task — specific file, function, or component]
- [ ] [task]
- [x] [recently completed, will archive on next compression]

## Blockers
- [blocker with context on what's needed to unblock]

## Key Decisions
| Date | Decision | Reasoning | Affects |
|------|----------|-----------|---------|
| [DATE] | [what was decided] | [why] | [files/areas impacted] |

## Key Files
| File | Purpose | Last Modified |
|------|---------|---------------|
| [path] | [what it does] | [session N] |

## Architecture Notes
[Non-obvious design choices, data flow, system boundaries]

## Known Issues
- [issue, severity, and workaround if any]

## Session Log
| Session | Date | Summary |
|---------|------|---------|
| [N] | [DATE] | [one-line summary of what happened] |

## User Preferences
[How the user likes to work — discovered across sessions]

## External Context
[APIs, services, env setup — NO secrets, NO credentials, NEVER]
```
## Branch-Aware Memory
When working across multiple git branches, memory adapts:
```
MEMORY.md                      <- Base project memory (main/trunk)
.memory/
  branches/
    feature-auth.md            <- Overlay for feature/auth branch
    feature-payments.md        <- Overlay for feature/payments branch
    bugfix-race-condition.md   <- Overlay for bugfix branch
```
**How it works:**

- At session start, detect the current git branch
- Load base `MEMORY.md` first
- Check `.memory/branches/<branch-slug>.md` for an overlay
- Merge the overlay on top of the base (overlay sections take priority)
- At session end, write changes back to the correct layer:
  - Architecture decisions → base `MEMORY.md` (shared across branches)
  - Branch-specific work → `.memory/branches/<branch>.md`
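The overlay lookup and merge could be sketched like this. The slug rule and the section-level merge are illustrative assumptions; `references/branch-aware-memory.md` covers the real merge strategies:

```python
import re

def branch_slug(branch: str) -> str:
    """feature/auth -> feature-auth (slug used for the overlay filename)."""
    return re.sub(r"[^A-Za-z0-9.-]+", "-", branch)

def merge_overlay(base: str, overlay: str) -> str:
    """Merge markdown by '## ' section; overlay sections take priority."""
    def sections(text: str) -> dict:
        parts = re.split(r"(?m)^(?=## )", text)
        return {p.split("\n", 1)[0]: p for p in parts if p.strip()}
    merged = sections(base)
    merged.update(sections(overlay))  # overlay wins on matching headings
    return "".join(merged.values())
```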
**On branch merge:** When a feature branch merges to main, prompt:

> "The `feature/auth` branch just merged. Want me to fold its memory overlay into the base MEMORY.md and clean up the branch file?"
See `references/branch-aware-memory.md` for merge strategies.
## Smart Compression
Memory files grow. Smart Compression keeps them useful:
Auto-compress triggers:
- MEMORY.md exceeds 150 lines → suggest compression
- MEMORY.md exceeds 200 lines → auto-compress
- Entries older than 5 sessions → candidates for archival
Compression rules:
- Completed tasks older than 3 sessions → collapse to one-liner in Session Log
- Resolved blockers → remove entirely
- Stale "Active Work" items (no progress in 3+ sessions) → flag for user
- Decision Log entries → NEVER compress (permanent record)
- Architecture Notes → NEVER compress (permanent record)
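The thresholds and never-compress rules above, as a sketch (the line limits come from this section; the function name is illustrative):

```python
def compression_action(memory_text: str) -> str:
    """Map MEMORY.md size to the compression behavior described above."""
    lines = len(memory_text.splitlines())
    if lines > 200:
        return "auto-compress"
    if lines > 150:
        return "suggest"
    return "none"

# Sections exempt from compression — they are the permanent record
NEVER_COMPRESS = {"## Key Decisions", "## Architecture Notes"}
```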
**Archival:** When the session count exceeds 10, create `MEMORY-ARCHIVE.md`:
```markdown
# Memory Archive
Archived sessions from Project Memory.

## Sessions 1-8 Summary
[Paragraph summary of early project work]

## Key Milestones
- Session 2: Initial project scaffolding complete
- Session 5: Auth system shipped
- Session 8: Database migration to Prisma complete
```
See `references/smart-compression.md` for the full compression algorithm.
## Session Diffing
At session start, detect what changed since memory was last written:
```sh
# Get the date from MEMORY.md's "Last updated" field,
# then check what happened since
git log --oneline --since="[last-updated-date]"
git diff --stat HEAD~[commits-since]
```
**Report format:**

> "Since your last session (3 days ago), there have been 7 commits: 4 by you, 3 by @teammate. Key changes: `src/api/users.ts` was refactored, `package.json` has 2 new dependencies (zod, @tanstack/query). Your memory references `src/api/users.ts` — I'll verify it's still accurate."
**Conflict detection:** When the session diff reveals changes that contradict memory:

- Memory says "using Express" but `package.json` now has Fastify → flag
- Memory references `src/auth/login.ts` but the file was deleted → flag
- Memory says "blocked on API key" but `.env` now has it → update
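The "referenced file was deleted" check can be sketched as a path scan over the memory text (the path-matching regex is a rough heuristic, not part of the skill):

```python
import re
from pathlib import Path

def flag_missing_references(memory_text: str, root: Path) -> list:
    """Flag file paths mentioned in memory that no longer exist on disk."""
    # Heuristic: anything that looks like a path ending in a source extension
    paths = set(re.findall(r"\b[\w./-]+\.(?:ts|tsx|js|py|md)\b", memory_text))
    return sorted(p for p in paths if not (root / p).exists())
```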
See `references/session-diffing.md` for conflict resolution strategies.
## Memory Health Scoring
Rate memory on a 1-10 scale across four dimensions:
| Dimension | Weight | Score 10 | Score 1 |
|---|---|---|---|
| Freshness | 30% | Updated today | > 14 days old |
| Relevance | 30% | All referenced files exist | Most files missing/renamed |
| Completeness | 20% | All sections filled, next step clear | Missing key sections |
| Actionability | 20% | Can start working immediately | Need to ask 3+ questions |
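A sketch of the weighted score (weights come from the table; the sub-scores are whatever the health check produced):

```python
# Dimension weights from the health scoring table
WEIGHTS = {"freshness": 0.30, "relevance": 0.30, "completeness": 0.20, "actionability": 0.20}

def health_score(scores: dict) -> int:
    """Weighted 1-10 memory health score, rounded to the nearest integer."""
    return round(sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS))
```

With sub-scores of 9, 7, 8, and 9, this yields 8/10, matching the session-start display example.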
Display at session start:
```
Memory health: 8/10
  Freshness:     9/10 (updated yesterday)
  Relevance:     7/10 (2 file paths changed)
  Completeness:  8/10 (all sections present)
  Actionability: 9/10 (next step is crystal clear)
```
**If health < 5:** Trigger recovery mode or suggest a memory rebuild.
## Recovery Mode
When memory is severely stale, corrupted, or missing critical context:
```
Step 1: Scan the project
 └─ Read package.json / pyproject.toml / go.mod (detect stack)
 └─ Read README.md and CLAUDE.md (project context)
 └─ List key directories and recent files

Step 2: Read git history
 └─ Last 20 commits (who, what, when)
 └─ Current branch and recent branches
 └─ Any open/recent PRs

Step 3: Reconstruct memory
 └─ Build Project Overview from package.json + README
 └─ Build Key Files from most-modified files in git log
 └─ Build Key Decisions from commit messages and code patterns
 └─ Set "Where We Left Off" from most recent commits
 └─ Flag confidence level: "Reconstructed from code — verify with user"

Step 4: Present and confirm
 └─ Show reconstructed memory to user
 └─ Ask for corrections
 └─ Write verified MEMORY.md
```
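Part of Steps 1 and 3, deriving a draft Project Overview from `package.json`, might look like this sketch for Node projects (field choices and wording are illustrative):

```python
import json
from pathlib import Path

def reconstruct_overview(root: Path) -> str:
    """Draft a one-line Project Overview from package.json, flagged as reconstructed."""
    pkg = json.loads((root / "package.json").read_text())
    deps = sorted(pkg.get("dependencies", {}))
    stack = ", ".join(deps[:4]) or "unknown stack"
    return f"{pkg.get('name', 'unnamed')}: Node project. Stack: {stack}. (Reconstructed from code — verify with user.)"
```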
## Handoff Protocol
Generate a developer handoff document that's optimized for humans (not Claude):
```markdown
# Project Handoff: [Project Name]
Generated: [DATE] | By: [user] via Claude Code

## Quick Start
1. Clone: `git clone [repo]`
2. Install: `[package manager] install`
3. Setup: [env vars, database, etc.]
4. Run: `[dev command]`

## Current State
[Where the project is right now — what works, what doesn't]

## Architecture
[System diagram, key components, data flow]

## Active Work
[What's in progress, what's next, what's blocked]

## Key Decisions & Why
[Decisions that a new developer would question — with the reasoning]

## Gotchas
[Things that will bite you if you don't know about them]

## Who to Ask
[People, channels, or docs for domain-specific questions]
```
**Trigger with:** "generate a handoff", "onboard someone to this project", "write a handoff doc"
## Context Efficiency Engine
The #1 complaint with Claude Code: sessions hit context limits too fast. You spend half your tokens re-explaining context, and the other half doing actual work. Memory Bank flips this ratio.
### The Token Problem (Verified with tiktoken)
Without memory-bank, every session start costs ~1,200 tokens:
```
Conversation overhead (4 exchanges):           ~566 tokens
  User re-explains project, stack, status
  Claude asks clarifying questions
  User answers follow-ups
  Back-and-forth until Claude understands

File reads (Claude reads 3+ files to orient):  ~634 tokens
  Webhook handler:  ~344 tokens
  Checkout route:   ~257 tokens
  Stripe client:     ~33 tokens

TOTAL per session:       ~1,200 tokens
TOTAL over 10 sessions: ~12,000 tokens wasted
```
### The Token Solution (Verified)
With memory-bank, the same session start costs ~400 tokens:
```
MEMORY.md loads (compact format):  ~334 tokens
  Entire project context in one structured file
  Decisions, files, status, blockers, architecture

Claude greeting + user confirms:    ~60 tokens
  Claude already knows everything, no questions needed

File reads needed:                    0 tokens
  Memory has file purposes, no need to read source

TOTAL per session:       ~394 tokens (67% reduction)
TOTAL over 10 sessions: ~3,940 tokens (saved ~8,060)
```
These numbers were measured using tiktoken on our example files. Actual savings depend on project complexity (larger projects save more).
### Progressive Loading
Don't dump everything into context. Load in tiers:
```
Tier 1: ALWAYS load (costs ~200 tokens)
 └─ Project Overview (1-2 sentences)
 └─ Where We Left Off (current task, status, next step)
 └─ Active Blockers

Tier 2: Load on DEMAND (costs ~300 tokens when needed)
 └─ Key Decisions (only when a decision comes up)
 └─ Key Files (only when working with files not in Tier 1)
 └─ Architecture Notes (only when touching architecture)

Tier 3: Load ONLY when asked (costs ~200 tokens when needed)
 └─ Session Log (only for velocity/history questions)
 └─ User Preferences (only on first session or when relevant)
 └─ External Context (only when working with APIs/services)
```
Result: Instead of loading 800 tokens of memory at once, load 200 tokens immediately and the rest only when actually needed. Most sessions never need Tier 3 at all.
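Tier-based loading can be sketched as a section filter over MEMORY.md. The tier-to-section mapping follows the tiers above, with the template's `## Blockers` standing in for "Active Blockers":

```python
import re

# Which template sections belong to which loading tier (assumed mapping)
TIERS = {
    1: {"## Project Overview", "## Where We Left Off", "## Blockers"},
    2: {"## Key Decisions", "## Key Files", "## Architecture Notes"},
    3: {"## Session Log", "## User Preferences", "## External Context"},
}

def load_tier(memory_text: str, tier: int) -> str:
    """Return only the sections belonging to the requested tier."""
    parts = re.split(r"(?m)^(?=## )", memory_text)
    return "".join(p for p in parts if p.split("\n", 1)[0].strip() in TIERS[tier])
```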
### Compact Encoding Rules
Every line in MEMORY.md is optimized for maximum information per token:
Use structured shorthand, not prose:
```
BAD (38 tokens):
"We made the decision to use Prisma as our ORM instead of Drizzle because it
provides better TypeScript type inference and the team is already familiar
with it from previous projects."

GOOD (14 tokens):
| 2025-04-01 | Prisma over Drizzle | Type inference, team familiarity | All DB |
```
Use tables for structured data (they compress well):
```
BAD (scattered prose — 120 tokens for 5 files):
"The main checkout route is in src/app/api/checkout/route.ts. The Stripe
client is configured in src/lib/stripe.ts. Cart state management is in..."

GOOD (table — 60 tokens for 5 files):
| File | Purpose |
| src/app/api/checkout/route.ts | Stripe session creation |
| src/lib/stripe.ts | Stripe client singleton |
| src/stores/cart.ts | Zustand cart + persistence |
```
Use checklists for active work (scannable, dense):
```
BAD (prose):
"We are currently working on the webhook handler, which is partially
complete. We also need to write tests and haven't started yet."

GOOD (checklist):
- [x] Stripe webhook handler — handlePaymentSuccess()
- [ ] handleRefund() — stubbed, needs implementation
- [ ] Integration tests for webhook endpoints
```
One line, one fact. No filler words:
```
BAD:  "The project is essentially a web application that was built for..."
GOOD: "Bakery e-commerce. Next.js 14, Prisma, Stripe. Launching April."
```
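To sanity-check an encoding without tooling, a crude chars/4 estimate is enough to compare forms. tiktoken gives real counts; this heuristic is only directional:

```python
def rough_tokens(text: str) -> int:
    """Very rough token estimate (~4 chars per token for English prose)."""
    return max(1, len(text) // 4)

# The prose vs. table-row comparison from the first encoding rule
prose = ("We made the decision to use Prisma as our ORM instead of Drizzle because "
         "it provides better TypeScript type inference and the team is already "
         "familiar with it from previous projects.")
table_row = "| 2025-04-01 | Prisma over Drizzle | Type inference, team familiarity | All DB |"
```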
### Context Budget Tracking
Monitor token usage and warn before hitting limits:
```
At session start, estimate the context budget:
  Available context:  ~200,000 tokens (Claude's window)
  Memory load:           ~800 tokens (Tier 1 + loaded Tiers)
  System prompt:       ~2,000 tokens
  Remaining for work: ~197,200 tokens

At 40% usage (~80,000 tokens consumed):
  → Suggest: "We're at 40% context. Consider compacting soon."

At 60% usage (~120,000 tokens consumed):
  → Save a session checkpoint automatically
  → Suggest: "Context at 60%. Good time to /compact or start fresh."

At 80% usage (~160,000 tokens consumed):
  → Auto-save full state to MEMORY.md
  → Alert: "Context is at 80%. Saving state now — you can continue in a new
    session with zero loss. Say 'wrap up' or keep going."
```
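The escalation ladder, as a sketch (the percentages come from this section; the action names are illustrative):

```python
def budget_action(used: int, window: int = 200_000):
    """Map context usage to the escalation ladder described above."""
    ratio = used / window
    if ratio >= 0.80:
        return "emergency-save"  # auto-save full state to MEMORY.md
    if ratio >= 0.60:
        return "checkpoint"      # save a session checkpoint, suggest /compact
    if ratio >= 0.40:
        return "warn"            # "consider compacting soon"
    return None
```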
### Session Continuation Protocol
When a session hits context limits or user wants to start fresh:
```
Step 1: EMERGENCY SAVE (before context dies)
 └─ Write MEMORY.md with EVERYTHING from current session
 └─ Include exact cursor position: file, function, line number
 └─ Include any uncommitted mental model (what Claude was thinking)
 └─ Include partial work state: what's done, what's half-done, what's next

Step 2: Write CONTINUATION.md (a one-shot warm-up file)
 └─ Ultra-compact: under 50 lines, under 500 tokens
 └─ Contains ONLY what the next session needs to start immediately
```

Format:

```markdown
# Continue: [task name]
Resume from: `src/auth/refresh.ts:47` — writing rotateToken()

## State
- handlePaymentSuccess(): DONE ✓
- handleRefund(): stubbed at line 89, needs Stripe refund.created event
- Tests: NOT STARTED

## Context
- Stripe webhook sig verified in middleware (line 12)
- Using stripe.webhooks.constructEvent() not manual HMAC
- Refund handler follows same pattern as payment handler

## Immediate Next Action
Implement handleRefund() in src/api/webhooks/stripe/route.ts:89 using the
stripe.refund.created event payload. Pattern: extract refund.payment_intent
→ find order → update status to "refunded"
```
```
Step 3: GREET AND GO (next session)
 └─ Read CONTINUATION.md first (it's the fast-path)
 └─ Read MEMORY.md for full context only if needed
 └─ Delete CONTINUATION.md after loading
 └─ Start working immediately — no questions, no warm-up
```
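The file handling in Steps 2-3 can be sketched as a write/load pair. The 50-line cap and the delete-after-load behavior follow the protocol above; everything else is illustrative:

```python
from pathlib import Path

def write_continuation(root: Path, body: str) -> Path:
    """Emergency-save the one-shot warm-up file; it must stay under 50 lines."""
    assert len(body.splitlines()) < 50, "CONTINUATION.md must stay under 50 lines"
    path = root / "CONTINUATION.md"
    path.write_text(body)
    return path

def load_continuation(root: Path):
    """Fast-path load: read once, then delete so it is truly one-shot."""
    path = root / "CONTINUATION.md"
    if not path.exists():
        return None
    body = path.read_text()
    path.unlink()  # delete after loading, per Step 3
    return body
```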
**Trigger phrases:** "save state", "I'm running out of context", "continue this later", "session is getting long"

### Token Savings By Feature (Verified)

| Feature | How It Saves | Measured Impact |
|---------|-------------|-----------------|
| Structured memory vs re-explaining | Compact file replaces 4+ conversation exchanges | ~566 tokens/session |
| Eliminating orientation file reads | Claude doesn't need to read source files to understand project | ~634 tokens/session |
| Compact encoding (tables > prose) | Same info, 39-42% fewer tokens than prose | 39-42% reduction in memory size |
| Session continuation protocol | CONTINUATION.md is under 200 tokens vs full re-warm-up | ~1,000 tokens on session handoff |
| Smart compression | Keeps memory under 150 lines / ~700 tokens | Prevents bloat over time |

**Verified totals (measured with tiktoken):**

| Scenario | Tokens | Turns |
|----------|--------|-------|
| Without memory-bank | ~1,200/session | 8 turns |
| With memory-bank | ~394/session | 2 turns |
| **Savings** | **~806/session (67%)** | **6 turns** |
| Over 10 sessions | **~8,060 saved** | 60 turns saved |
| Over 30 sessions | **~24,180 saved** | 180 turns saved |

### Anti-Patterns That Waste Tokens

**Never do these in memory files:**
- ✗ Verbose prose where a table works
- ✗ Repeating the same information in multiple sections
- ✗ Storing code snippets in memory (reference file:line instead)
- ✗ Long descriptions of completed work (one-line summaries only)
- ✗ Keeping resolved blockers (delete them)
- ✗ Storing information that's in README.md or CLAUDE.md already
- ✗ Using memory for things Git tracks (commit history, diffs, blame)
**Always do these:**
- ✓ Tables for structured data (decisions, files, tasks)
- ✓ Checklists for active work
- ✓ One sentence for Project Overview (not a paragraph)
- ✓ File:line references instead of describing code
- ✓ Delete resolved items (they're in git history)
- ✓ Reference other files instead of duplicating content
> See `references/context-efficiency.md` for the full token optimization guide.

---

## Rules for Excellent Memory

**Be surgical, not vague.**
Bad: "Working on auth"
Good: "Implementing JWT refresh token rotation in `src/auth/refresh.ts` — `rotateToken()` is complete, needs Redis TTL logic in `src/cache/tokens.ts:47`"

**The "Next immediate step" is the single most important line.** It should be so precise that Claude can start coding the instant a session begins, with zero clarifying questions.

**Capture the "why" behind every decision.** Future Claude will encounter the same trade-offs and re-litigate them unless the reasoning is recorded.

**Never store secrets.** No API keys, passwords, tokens, or credentials. Ever. Not even "temporarily". Reference `.env` or a secrets manager instead.

**Overwrite on session end, surgical update mid-session.** Session end = full rewrite for consistency. Mid-session = targeted section updates to avoid losing context.

**Keep it under 150 lines.** Compress aggressively. Stale information is actively harmful — it misleads more than it helps.

---

## Auto-Setup via CLAUDE.md

For fully automatic memory with all features, add to project `CLAUDE.md` (or `~/.claude/CLAUDE.md` for all projects):

```markdown
## Memory

At the start of every session:
1. Check for MEMORY.md in the project root
2. Check for ~/.claude/GLOBAL-MEMORY.md
3. Check current git branch and look for .memory/branches/<branch>.md
4. Run session diff — what changed since last memory update
5. Score memory health and flag any issues
6. Greet me with a specific summary and the next immediate step

During sessions:
- Update memory when I say "remember this" or complete a milestone
- Track key decisions with reasoning in the decision table

At session end (when I say "wrap up", "save", "done for now"):
1. Write comprehensive MEMORY.md with full current state
2. Ensure "Next immediate step" is crystal clear
3. Run compression if over 150 lines
4. Confirm what was saved
```
See `references/claude-md-integration.md` for the full integration guide.
## Reference Files
- `references/memory-layers.md` — Full architecture of the 3-tier memory system with promotion rules and cross-layer interactions
- `references/branch-aware-memory.md` — Git branch integration, overlay merging, and cleanup strategies
- `references/smart-compression.md` — Compression algorithm, archival thresholds, and what to never compress
- `references/session-diffing.md` — Cross-session change detection, conflict resolution, and drift correction
- `references/advanced-patterns.md` — Team workflows, velocity tracking, handoff protocol, and enterprise patterns
- `references/context-efficiency.md` — Token optimization guide, progressive loading details, compact encoding reference
- `references/claude-md-integration.md` — Complete setup guide for automatic triggering across all projects
## Examples
- `examples/solo-fullstack.md` — Memory for a solo developer on a Next.js app
- `examples/team-backend.md` — Team-shared memory for a backend service
- `examples/monorepo.md` — Multi-domain memory for a monorepo
- `examples/minimal.md` — 5-line memory for quick prototypes