Pro-workflow context-engineering
Master the four operations of context engineering — Write, Select, Compress, Isolate. Manage token budgets, compaction strategies, and context partitioning to keep AI sessions sharp and efficient.
git clone https://github.com/rohitg00/pro-workflow
T=$(mktemp -d) && git clone --depth=1 https://github.com/rohitg00/pro-workflow "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/context-engineering" ~/.claude/skills/rohitg00-pro-workflow-context-engineering && rm -rf "$T"
skills/context-engineering/SKILL.mdContext Engineering
Four operations control everything about how context flows through an AI coding session. Master them and you control the quality of every response.
The Four Operations
1. Write — Persist Info Outside Context
Move information out of the context window into durable storage so it survives compaction and session boundaries.
Where to write:
| Target | When | Example |
|---|---|---|
| CLAUDE.md | Permanent project rules | "Always use pnpm, never npm" |
| NOTES.md / scratchpad | Working state for current task | Architecture decisions, open questions |
| Learnings and patterns | rules from corrections |
| External files | Data too large for context | Test plans, migration checklists |
Pattern — Scratchpad workflow:
1. Start complex task → create NOTES.md with goals and constraints 2. After research → write findings to NOTES.md 3. After compaction → NOTES.md survives, context does not 4. Resume → read NOTES.md to recover full state
2. Select — Retrieve Relevant Info
Pull the right information into context at the right time. Precision matters more than volume.
Methods ranked by precision:
references — exact file injection@file
/grep
— targeted pattern searchGlob- Subagent exploration — delegated deep search
- RAG / embeddings — semantic retrieval for large codebases
Key principle: Focused 300 tokens > unfocused 113K tokens.
A surgical grep result that returns the exact function signature beats dumping an entire module into context. Every irrelevant token dilutes attention.
Pattern — Progressive retrieval:
1. Start with file names (Glob) 2. Narrow to specific functions (Grep) 3. Read only the relevant lines (Read with offset+limit) 4. Never read entire large files when you need one function
3. Compress — Reduce Tokens, Preserve Signal
Shrink context without losing the information that matters.
Compaction strategies:
| Strategy | How | When |
|---|---|---|
with focus | | Task boundaries |
| Microcompact | Ask Claude to summarize tool output inline | After large reads/searches |
| Head+tail | Read first 20 + last 20 lines of large output | Log analysis, test results |
| Tool result clearing | Subagent results auto-clear after reporting | Heavy exploration |
| Semantic selection | Summarize findings, discard raw data | Research phases |
Compaction triggers:
- After planning, before implementation
- After completing a feature or milestone
- When context exceeds 50% (set
)CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50 - Before switching task domains
- After heavy search/read operations
PostCompact hook — Re-inject critical context:
{ "type": "PostCompact", "command": "cat .claude/critical-context.md" }
Use this to ensure project rules, current task state, or architecture constraints survive every compaction.
4. Isolate — Partition Across Execution Spaces
Don't load everything into one context. Split work across independent execution spaces.
| Method | Isolation Level | Use When |
|---|---|---|
| Subagents | Forked context | Heavy exploration, test runs, doc generation |
Worktrees () | Full repo copy | Parallel features, competing approaches |
(built-in Claude Code) | Temporary overlay | Quick questions without entering conversation history |
| Agent teams | Independent sessions | Cross-layer changes, parallel reviews |
Fresh session () | Clean slate | Unrelated work, degraded context |
Pattern — Subagent delegation:
Main session: planning, coordination, commits Subagent 1: explore auth module, report findings Subagent 2: run test suite, report failures Subagent 3: generate migration script
Main context stays clean. Subagents handle the volume.
Context Budget Planning
Example baseline (calibrate with
/context): ~200K total window, ~20K overhead (CLAUDE.md, tool definitions, MCP schemas). Plan around ~180K usable — actual budgets vary by model and configuration.
| Allocation | Budget | What Goes Here |
|---|---|---|
| Static context | 20-30K | CLAUDE.md, tool schemas, MCP definitions |
| Dynamic context | 150-180K | Code, conversation, tool results |
Put static context first. CLAUDE.md and tool definitions load before conversation. Keeping them stable maximizes prompt cache hits — saves cost and latency.
| Phase | Target Usage | Action If Over |
|---|---|---|
| Planning | < 20% | Keep plans concise, write to scratchpad |
| Implementation | < 50% | Compact between files, delegate reads |
| Testing | < 70% | Delegate test runs to subagents |
| Review | < 85% | Start fresh session if degraded |
When to /clear vs /compact vs Subagent
| Situation | Action |
|---|---|
| Task boundary, want to keep learnings | with focus |
| Context degraded, Claude repeating itself | , then if still bad |
| Starting unrelated work | or new session |
| Heavy read/search operation | Delegate to subagent |
| Quick side question | (doesn't pollute main context) |
| Exploring multiple approaches | Worktrees or agent teams |
Anti-Patterns
- Loading entire files when you need one function
- Keeping MCP tool results in context after extracting what you need
- Running 15+ MCPs (each adds tool schema overhead to every request)
- Vague prompts that force Claude to search broadly ("fix the code")
- Never compacting until auto-compact triggers at 95%
Add to CLAUDE.md
## Context Engineering Write to NOTES.md for working state that must survive compaction. Select with precision — grep first, read specific lines, never dump whole files. Compact at 50% or task boundaries. Set CLAUDE_AUTOCOMPACT_PCT_OVERRIDE=50. Isolate heavy work to subagents. Main session stays for coordination and commits.