Memstack compress

Use when the user says 'headroom', 'compression', 'token savings', 'proxy status', or asks about context window usage.

install
source · Clone the upstream repo
git clone https://github.com/cwinvestments/memstack
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/cwinvestments/memstack "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/compress" ~/.claude/skills/cwinvestments-memstack-compress && rm -rf "$T"
manifest: skills/compress/SKILL.md
source content

⚙️ Compress — Headroom Proxy Manager

Monitor and manage Headroom context compression for CC sessions.

Activation

When this skill activates, output:

⚙️ Compress — Checking Headroom status...

Then execute the protocol below.

  • Keywords: headroom, compression stats, token savings, proxy status, check headroom
  • Contextual: When user asks about token usage, context window limits, or session cost optimization
  • Level: 2 (explicit trigger only)

Context Guard

ContextStatus
User says "headroom", "compression stats", "check proxy"ACTIVE — run status check
User asks about token savings or context windowACTIVE — run session report
Proxy errors or API connection failures appearACTIVE — run health diagnostics
General discussion about CC featuresDORMANT — do not activate
User is actively coding (no proxy issues)DORMANT — do not activate

What It Does

Headroom is a transparent proxy between Claude Code and the Anthropic API that compresses tool outputs by removing redundant boilerplate. It extends effective context window by 30–40%.

This skill checks proxy health, reports compression stats, and troubleshoots connection issues.

Prerequisites

  • Headroom installed:
    pip install headroom-ai[code]
    • The
      [code]
      extra installs tree-sitter for AST-based code compression. Without it, Code-Aware compression is disabled and CC sessions get 0% compression.
  • Proxy running:
    headroom proxy --llmlingua-device cpu
    (defaults to
    localhost:8787
    )
  • CC configured:
    ANTHROPIC_BASE_URL=http://127.0.0.1:8787

Recommended Startup

headroom proxy --llmlingua-device cpu
  • --llmlingua-device cpu
    — Forces LLMLingua to use CPU (avoids silent CUDA failures on machines without GPU)
  • Default port is 8787, no other flags needed
  • Code-Aware and LLMLingua both load lazily when relevant content is detected

Workflow

1. Status Check

Run:

curl -s http://127.0.0.1:8787/stats | python -m json.tool

Report: proxy up/down, requests processed, compression ratio, tokens saved, estimated cost savings.

2. Health Diagnostics

If proxy is unreachable:

  1. Check if process is running:
    # Windows
    tasklist | findstr headroom
    # Linux/macOS
    ps aux | grep headroom
    
  2. Check port binding:
    netstat -ano | findstr 8787
    
  3. Verify
    ANTHROPIC_BASE_URL
    is set:
    echo $ANTHROPIC_BASE_URL
    
  4. Restart:
    headroom proxy
    in a separate terminal

3. Session Report

When triggered at session end or on request, report:

  • Requests this session
  • Tokens before/after compression
  • Compression ratio (target: 30–40%)
  • Estimated dollar savings (at $15/MTok input, $75/MTok output for Opus)

4. Configuration Reference

SettingValueNotes
Proxy URL
http://127.0.0.1:8787
Default port
DashboardAdminStack Infrastructure tabHeadroom monitoring panel
Repo
github.com/chopratejas/headroom
Apache 2.0
Python3.14 compatibleTested Feb 2026

Troubleshooting

SymptomFix
0% compression / 0.00x ratio
headroom-ai[code]
is not installed. Run:
pip install headroom-ai[code]
. Restart proxy.
"Code-Aware: NOT INSTALLED" in startup bannerSame fix — install the
[code]
extra and restart.
Cost figures don't match Anthropic ConsoleHeadroom estimates costs at list token prices without accounting for Anthropic's server-side prompt caching discounts. For actual costs, check console.anthropic.com.

Output Format

⚙️ Headroom Status
├── Proxy: ✅ Running on :8787
├── Requests: 47 processed
├── Compression: 46.2% reduction
├── Tokens saved: ~18,500 tokens
└── Cost savings: ~$0.28 this session

Integration

  • AdminStack: Infrastructure page has Headroom tab with live dashboard
  • CC Sessions: Auto-routed when
    ANTHROPIC_BASE_URL
    is set
  • Monitoring: Stats endpoint polled every 30s with visibility-aware polling

Level History

  • Lv.1 — Base: Health check and stats reporting for Headroom proxy. (Origin: MemStack v3.0, Feb 2026)
  • Lv.2 — Fixed: Added
    [code]
    extra for tree-sitter AST compression, updated startup flags (
    --llmlingua-device cpu
    ), added troubleshooting. Compression 0% → 46%. (Feb 24, 2026)