Claude-skill-registry context-optimizer
Format and structure data files for optimal Claude reasoning and extraction. Use when: (1) Preparing documents/data for Claude analysis, (2) User asks to 'optimize context' or 'format for Claude', (3) Structuring multi-document corpora, (4) Improving LLM retrieval/reasoning over files, (5) Converting data between formats for Claude consumption, (6) User mentions 'lost in the middle' or context window issues. Supports XML, JSON, YAML, CSV, and markdown.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/context-optimizer" ~/.claude/skills/majiayu000-claude-skill-registry-context-optimizer-be60d8 && rm -rf "$T"
skills/data/context-optimizer/SKILL.md
Context Optimizer
Optimize data file structure and formatting for Claude's reasoning and information extraction.
Core Principle
Claude's attention is roughly U-shaped across the context: strongest at the boundaries (start and end), weakest in the middle. Optimization strategy:
- Position - Where content appears matters more than format choice
- Structure - Explicit delimiters enable selective retrieval
- Metadata - Source attribution enables grounding in evidence
Quick Reference
| Content Type | Optimal Position | Recommended Format |
|---|---|---|
| Reference documents | TOP of context | XML with metadata |
| Instructions | After documents | Markdown/plain text |
| Examples | Between docs & instructions | XML with `<example>` tags |
| Query/task | END of context | Plain text |
| State/status data | Any | JSON |
| Progress notes | Any | Unstructured text |
| Tabular data | TOP | CSV or markdown tables |
Workflow
Step 1: Classify Content
Determine content type and select format:
- Structured data (schemas, records, API responses) → JSON
- Multi-document corpus (reports, research, contracts) → XML with metadata
- Tabular data (metrics, comparisons) → CSV or markdown tables
- Configuration (settings, parameters) → YAML
- Narrative/progress → Unstructured markdown
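The classification above can be expressed as a simple lookup. This is a minimal sketch, not part of the skill itself: the key names (`structured_data`, `multi_document_corpus`, and so on) and the `recommend_format` helper are hypothetical labels chosen for illustration.

```python
# Hypothetical lookup table mirroring the Step 1 classification.
# The keys are illustrative names, not identifiers defined by the skill.
FORMAT_BY_CONTENT_TYPE = {
    "structured_data": "JSON",
    "multi_document_corpus": "XML with metadata",
    "tabular_data": "CSV or markdown tables",
    "configuration": "YAML",
    "narrative": "Unstructured markdown",
}


def recommend_format(content_type: str) -> str:
    """Return the recommended format for a content type key."""
    try:
        return FORMAT_BY_CONTENT_TYPE[content_type]
    except KeyError:
        known = ", ".join(sorted(FORMAT_BY_CONTENT_TYPE))
        raise ValueError(
            f"unknown content type {content_type!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(recommend_format("configuration"))
```

Raising on an unknown key, rather than defaulting to one format, keeps a misclassified input from silently getting the wrong structure.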
Step 2: Apply Positioning Rule
┌─────────────────────────────────────┐
│ DOCUMENTS/DATA (position: TOP)      │ ← Strongest attention
├─────────────────────────────────────┤
│ EXAMPLES (if applicable)            │
├─────────────────────────────────────┤
│ INSTRUCTIONS                        │
├─────────────────────────────────────┤
│ QUERY/TASK (position: END)          │ ← Strongest attention
└─────────────────────────────────────┘
Placing the query at the end can improve response quality by up to 30% in Anthropic's reported tests, especially with complex, multi-document inputs.
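The positioning rule amounts to concatenating sections in a fixed order. The following is a rough sketch under that assumption; `assemble_context` and its parameter names are hypothetical and not part of the skill's scripts.

```python
def assemble_context(
    documents: str,
    instructions: str,
    query: str,
    examples: str = "",
) -> str:
    """Join prompt sections in the order: documents, examples, instructions, query.

    Empty sections are skipped so no stray blank blocks appear in the prompt.
    """
    ordered_sections = [documents, examples, instructions, query]
    return "\n\n".join(
        section.strip() for section in ordered_sections if section and section.strip()
    )


if __name__ == "__main__":
    prompt = assemble_context(
        documents="<documents>...</documents>",
        instructions="Answer using only the documents above.",
        query="What was the total revenue in Q3?",
    )
    print(prompt)
```

Keeping the order inside one function means callers cannot accidentally place the query ahead of the documents.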
Step 3: Structure Documents
For multi-document contexts, use the standard wrapper pattern:
<documents>
  <document index="1" priority="primary">
    <source>filename.pdf</source>
    <type>financial_report</type>
    <content>
      {{DOCUMENT_CONTENT}}
    </content>
  </document>
</documents>
Always include:
- `index` - For unambiguous reference
- `source` - For citation/attribution
- `type` - For semantic filtering
See references/format-patterns.md for complete templates.
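As an illustration of the wrapper pattern, the sketch below builds the `<documents>` block from a list of dictionaries using only the standard library. The `wrap_documents` helper and the dictionary keys (`source`, `type`, `content`, `priority`) are assumptions made for this example; the templates in references/format-patterns.md remain the authoritative patterns.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_documents(docs: list[dict]) -> str:
    """Wrap documents in the <documents>/<document> pattern with 1-based indexes.

    Each dict is assumed to carry 'source', 'type', and 'content' keys,
    plus an optional 'priority' key (hypothetical input shape).
    """
    lines = ["<documents>"]
    for index, doc in enumerate(docs, start=1):
        attrs = f"index={quoteattr(str(index))}"
        if doc.get("priority"):
            attrs += f" priority={quoteattr(doc['priority'])}"
        lines.append(f"  <document {attrs}>")
        lines.append(f"    <source>{escape(doc['source'])}</source>")
        lines.append(f"    <type>{escape(doc['type'])}</type>")
        lines.append("    <content>")
        # Escape &, <, and > so the wrapper stays well-formed XML.
        lines.append(escape(doc["content"]))
        lines.append("    </content>")
        lines.append("  </document>")
    lines.append("</documents>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        wrap_documents(
            [
                {
                    "source": "filename.pdf",
                    "type": "financial_report",
                    "content": "Revenue & costs for Q3.",
                    "priority": "primary",
                }
            ]
        )
    )
```

Escaping the content is a design choice for this sketch: it guarantees well-formed output for downstream parsers, at the cost of showing entities such as `&amp;` in the prompt text.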
Step 4: Add Retrieval Instructions
For long documents, add explicit grounding instructions:
Find quotes from the documents that are relevant to [TASK]. Place these in <quotes> tags with document index references. Then, based on these quotes, [PERFORM ANALYSIS].
This helps Claude cut through noise and ground responses in evidence.
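The grounding instruction is a template with two slots, so it can be filled programmatically. This is a small sketch assuming the wording shown above; `GROUNDING_TEMPLATE` and `build_grounding_instruction` are hypothetical names.

```python
# Template copied from the grounding instruction in Step 4; the two
# placeholders stand in for [TASK] and [PERFORM ANALYSIS].
GROUNDING_TEMPLATE = (
    "Find quotes from the documents that are relevant to {task}. "
    "Place these in <quotes> tags with document index references. "
    "Then, based on these quotes, {analysis}."
)


def build_grounding_instruction(task: str, analysis: str) -> str:
    """Fill the quote-first grounding template for a specific task."""
    return GROUNDING_TEMPLATE.format(
        task=task.strip(),
        # Drop a trailing period so the sentence does not end in "..".
        analysis=analysis.strip().rstrip("."),
    )


if __name__ == "__main__":
    print(
        build_grounding_instruction(
            task="supplier risk",
            analysis="summarize the three largest risks.",
        )
    )
```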
Format Selection Guide
When to Use XML
- Multi-document corpora requiring source attribution
- Content with hierarchical structure
- When parseability of output matters
- Combining with chain-of-thought (`<thinking>`, `<answer>`)
When to Use JSON
- State tracking (test results, task status)
- API request/response data
- When schema enforcement is needed
- Structured output requirements
When to Use Markdown Tables
- Comparisons and rankings
- Tabular data under ~50 rows
- When human readability matters
When to Use Plain Text
- Instructions and queries
- Progress notes and narrative
- Simple, single-purpose content
Critical Rules
- Never bury instructions in data - Instructions surrounded by data get attention-diluted
- Reference documents explicitly - Avoid "this document"; use `document index="3"`
- Use consistent tag names - Same vocabulary throughout; reference by name in instructions
- Nest semantically - `<outer><inner></inner></outer>` for hierarchical content
- Include source attribution - Every document wrapper needs `<source>`
Common Pitfalls
See references/pitfalls.md for detailed mitigations.
| Pitfall | Quick Fix |
|---|---|
| Lost in the middle | Put critical info at START or END |
| Ambiguous references | Use explicit index numbers |
| Context overflow | Pre-calculate tokens; split strategically |
| Format mismatch | Match structure complexity to data complexity |
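For the context overflow row, one way to pre-calculate and split is sketched below. It assumes a crude heuristic of about 4 characters per token, which is only an approximation; a real tokenizer or token-counting API should be used when precision matters. The helper names are hypothetical, and references/pitfalls.md may recommend a different approach.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; the 4 chars/token ratio is an assumption."""
    return math.ceil(len(text) / chars_per_token)


def split_by_budget(text: str, max_tokens: int, chars_per_token: float = 4.0) -> list[str]:
    """Group paragraphs into chunks whose estimated size fits max_tokens.

    Splits only on blank lines, so a single paragraph larger than the
    budget is kept whole as its own oversized chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_tokens = 0
    for paragraph in text.split("\n\n"):
        if not paragraph.strip():
            continue  # ignore empty paragraphs
        cost = estimate_tokens(paragraph, chars_per_token)
        if current and current_tokens + cost > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_tokens = [], 0
        current.append(paragraph)
        current_tokens += cost
    if current:
        chunks.append("\n\n".join(current))
    return chunks


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    for chunk in split_by_budget(sample, max_tokens=8):
        print(estimate_tokens(chunk), repr(chunk))
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which matters more for retrieval than hitting the budget exactly.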
Transformation Script
For automated restructuring, use the transform script:
python scripts/transform_context.py input.json --format xml --output optimized.xml
python scripts/transform_context.py input.csv --format xml --add-metadata --output optimized.xml
See script help for all options:
python scripts/transform_context.py --help