Babysitter prompt-compression

Token-efficient prompt compression techniques for cost optimization

install
source · Clone the upstream repo
git clone https://github.com/a5c-ai/babysitter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/ai-agents-conversational/skills/prompt-compression" ~/.claude/skills/a5c-ai-babysitter-prompt-compression && rm -rf "$T"
manifest: library/specializations/ai-agents-conversational/skills/prompt-compression/SKILL.md
source content

Prompt Compression Skill

Capabilities

  • Implement token-efficient prompt compression
  • Design context pruning strategies
  • Configure selective context inclusion
  • Implement LLMLingua-style compression
  • Design summary-based compression
  • Create compression quality metrics

Target Processes

  • cost-optimization-llm
  • agent-performance-optimization

Implementation Details

Compression Techniques

  1. LLMLingua: Token-level compression
  2. Summary Compression: LLM-based summarization
  3. Selective Context: Relevant section extraction
  4. Token Pruning: Remove low-importance tokens
  5. Document Filtering: Pre-retrieval filtering

Configuration Options

  • Compression ratio targets
  • Quality threshold settings
  • Token budget constraints
  • Compression model selection
  • Evaluation metrics

Best Practices

  • Monitor quality vs compression tradeoff
  • Test with representative prompts
  • Set appropriate compression ratios
  • Validate compressed prompt quality
  • Track cost savings

Dependencies

  • llmlingua (optional)
  • tiktoken
  • transformers