Claude-skill-registry-data moai-alfred-context-budget
Enterprise Claude Code context window optimization with 2025 best practices: aggressive clearing, memory file management, MCP optimization, strategic chunking, and quality-over-quantity principles for 200K token context windows.
```bash
# Clone the full registry
git clone https://github.com/majiayu000/claude-skill-registry-data

# Or install just this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry-data "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/moai-alfred-context-budget" ~/.claude/skills/majiayu000-claude-skill-registry-data-moai-alfred-context-budget && rm -rf "$T"
```
data/moai-alfred-context-budget/SKILL.md
Enterprise Context Window Optimization for Claude Code
Overview
Enterprise-grade context window management for Claude Code covering 200K token optimization, aggressive clearing strategies, memory file management, MCP server optimization, and 2025 best practices for maintaining high-quality AI-assisted development sessions.
Core Capabilities:
- ✅ Context budget allocation (200K tokens)
- ✅ Aggressive context clearing patterns
- ✅ Memory file optimization (<500 lines each)
- ✅ MCP server efficiency monitoring
- ✅ Strategic chunking for long-running tasks
- ✅ Quality-over-quantity principles
Quick Reference
When to Use This Skill
Automatic Activation:
- Context window approaching 80% usage
- Performance degradation detected
- Session handoff preparation
- Large project context management
Manual Invocation:
Skill("moai-alfred-context-budget")
Key Principles (2025)
- Avoid Last 20% - Performance degrades in final fifth of context
- Aggressive Clearing - `/clear` every 1-3 messages for quality
- Lean Memory Files - Keep each file < 500 lines
- Disable Unused MCPs - Each server adds tool definitions
- Quality > Quantity - 10% with relevant info beats 90% with noise
Pattern 1: Context Budget Allocation
Overview
Claude Code provides 200K token context window with strategic allocation across system, tools, history, and working context.
Context Budget Breakdown
```text
# Claude Code Context Budget (200K tokens)

Total Context Window: 200,000 tokens

Allocation:
  System Prompt: 2,000 tokens (1%)
    - Core instructions
    - CLAUDE.md project guidelines
    - Agent directives
  Tool Definitions: 5,000 tokens (2.5%)
    - Read, Write, Edit, Bash, etc.
    - MCP server tools (Context7, Playwright, etc.)
    - Skill() invocation metadata
  Session History: 30,000 tokens (15%)
    - Previous messages
    - Tool call results
    - User interactions
  Project Context: 40,000 tokens (20%)
    - Memory files (.moai/memory/)
    - Key source files
    - Documentation snippets
  Available for Response: 123,000 tokens (61.5%)
    - Current task processing
    - Code generation
    - Analysis output
```
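The arithmetic behind this breakdown can be sketched in a few lines. This is a back-of-envelope model using the budget figures above; the category sizes are the document's suggested targets, not values Claude Code enforces.

```python
# Budget model for the 200K-token window, using the allocation targets above.
TOTAL_CONTEXT = 200_000

BUDGET = {
    "system_prompt": 2_000,      # CLAUDE.md, core instructions (1%)
    "tool_definitions": 5_000,   # built-in tools + MCP servers (2.5%)
    "session_history": 30_000,   # recent messages and tool results (15%)
    "project_context": 40_000,   # memory files, key source files (20%)
}

def available_for_response(budget: dict, total: int = TOTAL_CONTEXT) -> int:
    """Tokens left for generation after the fixed allocations."""
    return total - sum(budget.values())

print(available_for_response(BUDGET))  # 123000, i.e. 61.5% of the window
```

Keeping the fixed allocations small is what leaves the 100K+ response capacity that the later checklists call for.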
Monitoring Context Usage
```bash
# Check current context usage
/context

# Example output interpretation:
# Context Usage: 156,234 / 200,000 tokens (78%)
# ⚠️ WARNING: Approaching 80% threshold
# Action: Consider /clear or archive old discussions
```
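The interpretation logic above can be expressed as a small helper. This is a hypothetical function mirroring the `/context` reading, not a real Claude Code API; the 75% "approaching" band is an assumption chosen so the 78% example above triggers a warning before the hard 80% ceiling.

```python
# Hypothetical status check for context usage; thresholds are illustrative.
def context_status(used: int, total: int = 200_000) -> str:
    ratio = used / total
    if ratio >= 0.80:
        return f"CRITICAL: {ratio:.0%} used - /clear now"
    if ratio >= 0.75:  # assumed "approaching" band below the 80% ceiling
        return f"WARNING: {ratio:.0%} used - approaching 80% threshold"
    return f"OK: {ratio:.0%} used"

print(context_status(156_234))  # WARNING: 78% used - approaching 80% threshold
```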
Context Budget Anti-Patterns
```text
# BAD: Unoptimized Context

Session History: 80,000 tokens (40%)   # Too much history
  - 50 messages of exploratory debugging
  - Stale error logs from 2 hours ago
  - Repeated "try this" iterations

Project Context: 90,000 tokens (45%)   # Too much loaded
  - Entire src/ directory (unnecessary)
  - node_modules types (never needed)
  - 10 documentation files (only need 2)

Available for Response: 23,000 tokens (11.5%)   # TOO LOW!
  - Can't generate quality code
  - Forced to truncate responses
  - Poor reasoning quality

# GOOD: Optimized Context

Session History: 15,000 tokens (7.5%)   # Cleared regularly
  - Only last 5-7 relevant messages
  - Current task discussion
  - Key decisions documented

Project Context: 25,000 tokens (12.5%)   # Targeted loading
  - 3-4 files for current task
  - CLAUDE.md (always)
  - Specific memory files (on-demand)

Available for Response: 155,000 tokens (77.5%)   # OPTIMAL!
  - High-quality code generation
  - Deep reasoning capacity
  - Complex refactoring support
```
Pattern 2: Aggressive Context Clearing
Overview
The `/clear` command should become muscle memory, executed every 1-3 messages to maintain output quality.
When to Clear Context
```typescript
// Decision tree for /clear usage
interface ContextClearingStrategy {
  trigger: string;
  frequency: string;
  action: string;
}

const clearingStrategies: ContextClearingStrategy[] = [
  {
    trigger: "Task completed",
    frequency: "Every task",
    action: "/clear immediately after success"
  },
  {
    trigger: "Context > 80%",
    frequency: "Automatic",
    action: "/clear + document key decisions in memory file"
  },
  {
    trigger: "Debugging session",
    frequency: "Every 3 attempts",
    action: "/clear stale error logs, keep only current"
  },
  {
    trigger: "Switching tasks",
    frequency: "Every switch",
    action: "/clear + update session-summary.md"
  },
  {
    trigger: "Poor output quality",
    frequency: "Immediate",
    action: "/clear + re-state requirements concisely"
  }
];
```
Clearing Workflow Pattern
```bash
#!/bin/bash
# Example: task completion workflow with clearing

# Step 1: Complete current task
implement_feature() {
  echo "Implementing authentication..."
  # ... work done ...
  echo "✓ Authentication implemented"
}

# Step 2: Document key decisions BEFORE clearing
document_decision() {
  cat >> .moai/memory/auth-decisions.md <<EOF
## Authentication Implementation ($(date +%Y-%m-%d))

**Approach**: JWT with httpOnly cookies
**Rationale**: Prevents XSS attacks, CSRF protection via SameSite
**Key Files**: src/auth/jwt.ts, src/middleware/auth-check.ts
EOF
}

# Step 3: Clear context
# /clear

# Step 4: Start fresh with next task
start_next_task() {
  echo "Context cleared. Starting API rate limiting..."
}

# Workflow
implement_feature
document_decision
# User manually executes: /clear
# start_next_task
```
What to Clear vs Keep
```text
# ✅ ALWAYS CLEAR (After Documenting)
Clear:
  - Exploratory debugging sessions
  - "Try this" iteration history
  - Stale error logs
  - Completed task discussions
  - Old file diffs
  - Abandoned approaches

# 📝 DOCUMENT BEFORE CLEARING
Document First:
  - Key architectural decisions
  - Non-obvious implementation choices
  - Failed approaches (why they failed)
  - Performance insights
  - Security considerations

# ❌ NEVER CLEAR (Part of Project Context)
Keep:
  - CLAUDE.md (project guidelines)
  - Active memory files
  - Current task requirements
  - Ongoing conversation
  - Recent (< 3 messages) exchanges
```
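The clear/document/keep rules above amount to a simple dispatch table. A minimal sketch follows; the category names are illustrative labels for this document's rules, not an actual Claude Code API.

```python
# Toy dispatcher for the clear/document/keep rules; categories are illustrative.
CLEAR = {"exploratory_debugging", "stale_error_log",
         "completed_task_discussion", "old_diff", "abandoned_approach"}
DOCUMENT_FIRST = {"architectural_decision", "failed_approach_rationale",
                  "performance_insight", "security_consideration"}
KEEP = {"claude_md", "active_memory_file",
        "current_requirements", "recent_exchange"}

def disposition(category: str) -> str:
    """Map a context-item category to the recommended action."""
    if category in DOCUMENT_FIRST:
        return "document in memory file, then /clear"
    if category in CLEAR:
        return "/clear"
    if category in KEEP:
        return "keep"
    return "review manually"

print(disposition("stale_error_log"))         # /clear
print(disposition("architectural_decision"))  # document in memory file, then /clear
```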
Pattern 3: Memory File Management
Overview
Memory files are read at session start, consuming context tokens. Keep them lean and focused.
Memory File Structure (Best Practices)
```text
.moai/memory/
├── session-summary.md          # < 300 lines (current session state)
├── architectural-decisions.md  # < 400 lines (ADRs)
├── api-contracts.md            # < 200 lines (interface specs)
├── known-issues.md             # < 150 lines (blockers, workarounds)
└── team-conventions.md         # < 200 lines (code style, patterns)

Total Memory Budget: < 1,250 lines (~25K tokens)
```
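These budgets are easy to enforce with a short check script. The sketch below uses the per-file limits listed above and the ~20 tokens/line ratio implied by "1,250 lines ≈ 25K tokens"; both are this document's heuristics, not tokenizer-accurate figures.

```python
# Sketch: warn when a memory file exceeds its line budget from the tree above.
from pathlib import Path

LIMITS = {  # lines per file, per the structure above
    "session-summary.md": 300,
    "architectural-decisions.md": 400,
    "api-contracts.md": 200,
    "known-issues.md": 150,
    "team-conventions.md": 200,
}
TOKENS_PER_LINE = 20  # rough estimate: 1,250 lines ~ 25K tokens

def check_memory_budget(memory_dir: str = ".moai/memory") -> list:
    """Return one warning string per memory file over its line budget."""
    warnings = []
    for name, limit in LIMITS.items():
        path = Path(memory_dir) / name
        if not path.exists():
            continue
        lines = len(path.read_text().splitlines())
        if lines > limit:
            warnings.append(f"{name}: {lines} lines > {limit} "
                            f"(~{lines * TOKENS_PER_LINE} tokens)")
    return warnings
```

Run it from a pre-commit hook or session-start script so oversized files surface before they eat into the context budget.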
Memory File Template
```markdown
<!-- .moai/memory/session-summary.md -->
# Session Summary

**Last Updated**: 2025-01-12 14:30
**Current Sprint**: Feature/Auth-Refactor
**Active Tasks**: 2 in progress, 3 pending

## Current State

### ✅ Completed This Session
1. JWT authentication implementation (commit: abc123)
2. Password hashing with bcrypt (commit: def456)

### 🔄 In Progress
1. OAuth2 integration (70% complete)
   - Provider setup done
   - Callback handler in progress
   - Files: src/auth/oauth.ts

### 📋 Pending
1. Rate limiting middleware
2. Session management
3. CSRF protection

## Key Decisions
**Auth Strategy**: JWT in httpOnly cookies (XSS prevention)
**Password Min Length**: 12 chars (OWASP 2025 recommendation)

## Blockers
None currently.

## Next Actions
1. Complete OAuth callback handler
2. Add tests for OAuth flow
3. Document OAuth setup in README
```
Memory File Anti-Patterns
````markdown
<!-- ❌ BAD: Bloated Memory File (1,200 lines) -->
# Session Summary

## Completed Tasks (Last 3 Weeks)
<!-- 800 lines of old task history -->
<!-- This is what git commit history is for! -->

## All Code Snippets Ever Written
```javascript
// 400 lines of full code snippets
// Should be in git, not memory files
```
````

```markdown
<!-- ✅ GOOD: Lean Memory File (180 lines) -->
# Session Summary

**Last Updated**: 2025-01-12 14:30

## Active Work (This Session)
- OAuth integration: 70% (src/auth/oauth.ts)
- Blocker: None

## Key Decisions (Last 7 Days)
- Auth: JWT in httpOnly cookies (XSS prevention)
- Hashing: bcrypt, cost factor 12

## Next Actions
- Complete OAuth callback
- Add OAuth tests
- Update README
```
### Memory File Rotation Strategy

```bash
#!/bin/bash
# Rotate memory files when they exceed limits

rotate_memory_file() {
  local file="$1"
  local max_lines=500
  local current_lines
  current_lines=$(wc -l < "$file")

  if [[ $current_lines -gt $max_lines ]]; then
    echo "Rotating $file ($current_lines lines > $max_lines limit)"

    # Archive old content
    local timestamp
    timestamp=$(date +%Y%m%d)
    local archive_dir=".moai/memory/archive"
    mkdir -p "$archive_dir"

    # Keep only recent content (last 300 lines)
    tail -n 300 "$file" > "${file}.tmp"

    # Archive full file
    mv "$file" "${archive_dir}/$(basename "$file" .md)-${timestamp}.md"

    # Replace with trimmed version
    mv "${file}.tmp" "$file"

    echo "✓ Archived to ${archive_dir}/"
  fi
}

# Check all memory files
for file in .moai/memory/*.md; do
  rotate_memory_file "$file"
done
```
Pattern 4: MCP Server Optimization
Overview
Each enabled MCP server adds tool definitions to system prompt, consuming context tokens. Disable unused servers.
MCP Context Impact
```jsonc
// .claude/mcp.json - context-optimized configuration
{
  "mcpServers": {
    // ✅ ENABLED: Active development tools
    "context7": {
      "command": "npx",
      "args": ["-y", "@context7/mcp"],
      "env": { "CONTEXT7_API_KEY": "your-key" }
    },

    // ❌ DISABLED: Not needed for current project
    // "playwright": {
    //   "command": "npx",
    //   "args": ["-y", "@playwright/mcp"]
    // },

    // ✅ ENABLED: Documentation research
    "sequential-thinking": {
      "command": "npx",
      "args": ["-y", "@sequential-thinking/mcp"]
    }

    // ❌ DISABLED: Slackbot not in use
    // "slack": {
    //   "command": "npx",
    //   "args": ["-y", "@slack/mcp"]
    // }
  }
}
```
Measuring MCP Overhead
```bash
# Monitor MCP context usage
/context

# Example output:
# MCP Servers (2 enabled):
#   - context7: 847 tokens (tool definitions)
#   - sequential-thinking: 412 tokens
#   - playwright: disabled (saves 1,234 tokens)
#
# Total MCP Overhead: 1,259 tokens
```
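The overhead figure is just the sum over enabled servers. A minimal sketch, using the illustrative token counts from the example reading above (the per-server numbers are not something Claude Code exposes programmatically):

```python
# Hypothetical tally of per-server tool-definition costs; figures mirror the
# example /context output above and are illustrative only.
MCP_SERVERS = {
    "context7": {"tokens": 847, "enabled": True},
    "sequential-thinking": {"tokens": 412, "enabled": True},
    "playwright": {"tokens": 1_234, "enabled": False},  # disabled to save tokens
}

def mcp_overhead(servers: dict) -> int:
    """Tokens consumed by tool definitions of enabled servers only."""
    return sum(s["tokens"] for s in servers.values() if s["enabled"])

print(mcp_overhead(MCP_SERVERS))  # 1259; disabling playwright saved 1,234 tokens
```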
MCP Usage Strategy
```python
class MCPUsageStrategy:
    """Strategic MCP server management for context optimization."""

    STRATEGIES = {
        "documentation_heavy": {
            "enable": ["context7"],
            "disable": ["playwright", "slack", "github"],
            "rationale": "Research phase, need API docs access"
        },
        "testing_phase": {
            "enable": ["playwright", "sequential-thinking"],
            "disable": ["context7", "slack"],
            "rationale": "E2E testing, browser automation needed"
        },
        "code_review": {
            "enable": ["github", "sequential-thinking"],
            "disable": ["context7", "playwright", "slack"],
            "rationale": "PR review, need GitHub API access"
        },
        "minimal": {
            "enable": [],
            "disable": ["*"],
            "rationale": "Maximum context availability, no external tools"
        }
    }

    @staticmethod
    def optimize_for_phase(phase: str):
        """Reconfigure .claude/mcp.json for the current development phase."""
        # Fall back to the "minimal" strategy for unknown phases
        strategy = MCPUsageStrategy.STRATEGIES.get(
            phase, MCPUsageStrategy.STRATEGIES["minimal"])

        print(f"Optimizing MCP servers for: {phase}")
        print(f"Enable: {strategy['enable']}")
        print(f"Disable: {strategy['disable']}")
        print(f"Rationale: {strategy['rationale']}")
        # Update .claude/mcp.json accordingly
```
Pattern 5: Strategic Chunking
Overview
Break large tasks into smaller pieces completable within optimal context bounds (< 80% usage).
Task Chunking Strategy
```typescript
// Task size estimation and chunking
interface Task {
  name: string;
  estimatedTokens: number;
  dependencies: string[];
}

const chunkTask = (largeTask: Task): Task[] => {
  const MAX_CHUNK_TOKENS = 120_000; // 60% of 200K context

  if (largeTask.estimatedTokens <= MAX_CHUNK_TOKENS) {
    return [largeTask]; // No chunking needed
  }

  // Example: authentication system (estimated 250K tokens)
  const chunks: Task[] = [
    {
      name: "Chunk 1: User model & password hashing",
      estimatedTokens: 80_000,
      dependencies: []
    },
    {
      name: "Chunk 2: JWT generation & validation",
      estimatedTokens: 70_000,
      dependencies: ["Chunk 1"]
    },
    {
      name: "Chunk 3: Login/logout endpoints",
      estimatedTokens: 60_000,
      dependencies: ["Chunk 2"]
    },
    {
      name: "Chunk 4: Session middleware & guards",
      estimatedTokens: 40_000,
      dependencies: ["Chunk 3"]
    }
  ];

  return chunks;
};

// Workflow:
// 1. Complete Chunk 1
// 2. /clear
// 3. Document Chunk 1 results in memory file
// 4. Start Chunk 2 with minimal context
```
Chunking Anti-Patterns
```text
# ❌ BAD: Mixing Unrelated Tasks
Chunk 1 (200K tokens - OVERLOADED):
  - User authentication
  - Payment processing
  - Email notifications
  - Admin dashboard
  - Analytics integration
# Result: Poor quality on ALL tasks, context overflow

# ✅ GOOD: Focused Chunks
Chunk 1 (60K tokens):
  - User authentication only
  - Complete, test, document

Chunk 2 (70K tokens):
  - Payment processing only
  - Builds on auth from Chunk 1

Chunk 3 (50K tokens):
  - Email notifications
  - Uses auth + payment data
```
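A chunk plan like the good example above can be validated mechanically: each chunk must fit under the token ceiling, and each dependency must appear earlier in the sequence. A sketch under those assumptions (the plan data and field names are illustrative):

```python
# Validate a chunk plan against the 120K ceiling and dependency ordering.
MAX_CHUNK_TOKENS = 120_000  # 60% of the 200K window, as above

def validate_chunks(chunks: list) -> list:
    """Return error strings for oversized or out-of-order chunks."""
    errors = []
    seen = set()
    for chunk in chunks:
        if chunk["tokens"] > MAX_CHUNK_TOKENS:
            errors.append(f"{chunk['name']}: {chunk['tokens']} tokens over budget")
        for dep in chunk["deps"]:
            if dep not in seen:
                errors.append(f"{chunk['name']}: dependency '{dep}' not yet completed")
        seen.add(chunk["name"])
    return errors

plan = [
    {"name": "auth", "tokens": 60_000, "deps": []},
    {"name": "payments", "tokens": 70_000, "deps": ["auth"]},
    {"name": "email", "tokens": 50_000, "deps": ["auth", "payments"]},
]
print(validate_chunks(plan))  # [] - well-ordered and within budget
```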
Pattern 6: Quality Over Quantity Context
Overview
10% context with highly relevant information produces better results than 90% filled with noise.
Context Quality Checklist
```markdown
## Before Adding to Context

Ask yourself:

1. **Relevance**: Does this directly support the current task?
   - ✅ YES: Load file
   - ❌ NO: Skip or summarize
2. **Freshness**: Is this information current?
   - ✅ Current: Keep in context
   - ❌ Stale (> 1 hour): Archive or delete
3. **Actionability**: Will Claude use this to generate code?
   - ✅ Actionable: Include
   - ❌ FYI only: Document in memory file, remove from context
4. **Uniqueness**: Is this duplicated elsewhere?
   - ✅ Unique: Keep
   - ❌ Duplicate: Remove duplicates, keep one canonical source

## High-Quality Context Example (30K tokens, 15%)

Context Contents:
1. CLAUDE.md (2K tokens) - Always loaded
2. src/auth/jwt.ts (5K tokens) - Current file being edited
3. src/types/auth.ts (3K tokens) - Type definitions needed
4. .moai/memory/session-summary.md (4K tokens) - Current session state
5. tests/auth.test.ts (8K tokens) - Test file for reference
6. Last 5 messages (8K tokens) - Recent discussion

Total: 30K tokens
Quality: HIGH - every token is directly relevant to the current task

## Low-Quality Context Example (170K tokens, 85%)

Context Contents:
1. CLAUDE.md (2K tokens)
2. Entire src/ directory (80K tokens) - ❌ 90% irrelevant
3. node_modules/ types (40K tokens) - ❌ Never needed
4. 50 previous messages (30K tokens) - ❌ Stale debugging sessions
5. 10 documentation files (18K tokens) - ❌ Only need 1-2

Total: 170K tokens
Quality: LOW - < 10% of tokens are actually useful
Result: Poor code generation, missed context, truncated responses
```
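The two examples above can be compared with a back-of-envelope quality score: the share of loaded tokens that are directly relevant to the current task. This is a heuristic for illustration, not a metric Claude Code computes.

```python
# Relevant-token ratio for a list of (name, tokens, relevant) context items.
def context_quality(items: list) -> float:
    """Fraction of loaded tokens marked relevant to the current task."""
    total = sum(tokens for _, tokens, _ in items)
    relevant = sum(tokens for _, tokens, rel in items if rel)
    return relevant / total if total else 0.0

high = [("CLAUDE.md", 2_000, True), ("src/auth/jwt.ts", 5_000, True),
        ("src/types/auth.ts", 3_000, True), ("session-summary.md", 4_000, True),
        ("tests/auth.test.ts", 8_000, True), ("last 5 messages", 8_000, True)]
low = [("CLAUDE.md", 2_000, True), ("entire src/", 80_000, False),
       ("node_modules types", 40_000, False), ("50 old messages", 30_000, False),
       ("10 doc files", 18_000, False)]

print(f"{context_quality(high):.0%}")  # 100%
print(f"{context_quality(low):.0%}")   # 1%
```

The low-quality session loads more than five times as many tokens yet delivers almost no usable signal, which is the quantitative version of "10% with relevant info beats 90% with noise".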
Best Practices Checklist
Context Allocation:
- Context usage maintained below 80%
- System prompt < 2K tokens
- MCP tools < 5K tokens total
- Session history < 30K tokens
- Project context < 40K tokens
- Available response capacity > 100K tokens
Aggressive Clearing:
- `/clear` executed every 1-3 messages
- Context cleared after each task completion
- Key decisions documented before clearing
- Stale error logs removed immediately
- Exploratory sessions cleared regularly
Memory File Management:
- Each memory file < 500 lines
- Total memory files < 1,250 lines
- session-summary.md updated before task switches
- Old content archived to .moai/memory/archive/
- No raw code stored in memory (summarize instead)
MCP Optimization:
- Unused MCP servers disabled
- `/context` checked regularly
- MCP overhead < 5K tokens
- Servers enabled/disabled per development phase
Strategic Chunking:
- Large tasks split into < 120K token chunks
- Related work grouped in same chunk
- Chunk dependencies documented
- `/clear` between chunks
- Previous chunk results in memory file
Quality Over Quantity:
- Only load files needed for current task
- Remove stale information (>1 hour old)
- Eliminate duplicate context
- Summarize instead of including full files
- Verify every loaded item is actionable
Common Pitfalls to Avoid
Pitfall 1: Loading Entire Codebase
```text
# ❌ BAD
# User: "Help me understand this project"
# Claude loads all 200 files in src/

# ✅ GOOD
# User: "Help me understand the authentication flow"
# Claude loads only:
#   - src/auth/jwt.ts
#   - src/middleware/auth-check.ts
#   - tests/auth.test.ts
```
Pitfall 2: Never Clearing Context
```text
# ❌ BAD: 3-Hour Session Without Clearing
Context: 195K / 200K tokens (97.5%)
  - 80 messages of trial-and-error debugging
  - 15 failed approaches still in context
  - Stale error logs from 2 hours ago
Result: "I need to truncate my response..."

# ✅ GOOD: Clearing Every 5-10 Minutes
Context: 45K / 200K tokens (22.5%)
  - Only last 5 relevant messages
  - Current task files
  - Fresh, high-quality context
Result: Complete, high-quality responses
```
Pitfall 3: Bloated Memory Files
```text
<!-- ❌ BAD: 2,000-line session-summary.md -->
- Takes 40K tokens just to load
- 90% is outdated information
- Prevents loading actual source files

<!-- ✅ GOOD: 250-line session-summary.md -->
- Takes 5K tokens to load
- 100% current and relevant
- Leaves room for source files
```
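The token figures above follow from the same ~20 tokens/line heuristic used in the memory budget, which is an estimate rather than a tokenizer-accurate count:

```python
# Rough load cost of a memory file, using the ~20 tokens/line heuristic.
TOKENS_PER_LINE = 20  # assumption, not a real tokenizer ratio

def load_cost(lines: int) -> int:
    return lines * TOKENS_PER_LINE

print(load_cost(2_000))  # 40000 tokens just to load the bloated file
print(load_cost(250))    # 5000 tokens for the lean version
```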
Tool Versions (2025)
| Tool | Version | Purpose |
|---|---|---|
| Claude Code | 1.5.0+ | CLI interface |
| Claude Sonnet | 4.5+ | Model (200K context) |
| Context7 MCP | Latest | Documentation research |
| Sequential Thinking MCP | Latest | Problem solving |
References
- Claude Code Context Management - Official documentation
- Claude Code Best Practices - Community guide
- Context Window Optimization - 2025 deep dive
- Memory Management Strategies - Advanced patterns
Changelog
- v4.0.0 (2025-01-12): Enterprise upgrade with 2025 best practices, aggressive clearing patterns, MCP optimization, strategic chunking
- v1.0.0 (2025-03-29): Initial release
Works Well With
- moai-alfred-practices - Development best practices
- moai-alfred-session-state - Session management
- moai-cc-memory - Memory file patterns
- moai-alfred-workflow - 4-step workflow optimization