Openclaw-superpowers context-assembly-scorer

Scores how well the current context represents the full conversation — detects information blind spots, stale summaries, and coverage gaps that cause the agent to forget critical details.

Install
source · Clone the upstream repo
git clone https://github.com/ArchieIndian/openclaw-superpowers
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ArchieIndian/openclaw-superpowers "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/openclaw-native/context-assembly-scorer" ~/.claude/skills/archieindian-openclaw-superpowers-context-assembly-scorer && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ArchieIndian/openclaw-superpowers "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/openclaw-native/context-assembly-scorer" ~/.openclaw/skills/archieindian-openclaw-superpowers-context-assembly-scorer && rm -rf "$T"
manifest: skills/openclaw-native/context-assembly-scorer/SKILL.md
source content

Context Assembly Scorer

What it does

When an agent compacts context, it loses information. But how much? And which information? Context Assembly Scorer answers these questions by measuring coverage — the ratio of important topics in the full conversation history that are represented in the current assembled context.

Inspired by lossless-claw's context assembly system, which carefully selects which summaries to include in each turn's context to maximize information coverage.

When to invoke

  • Automatically every 4 hours (cron) — silent coverage check
  • Before starting a task that depends on prior context — verify nothing critical is missing
  • After compaction — measure information loss
  • When the agent says "I don't remember" — diagnose why

Coverage dimensions

| Dimension | What it measures | Weight |
| --- | --- | --- |
| Topic coverage | % of conversation topics present in current context | 2x |
| Recency bias | Whether recent context is over-represented vs. older important context | 1.5x |
| Entity continuity | Named entities (files, people, APIs) mentioned in history that are missing from context | 2x |
| Decision retention | Architectural decisions and user preferences still accessible | 2x |
| Task continuity | Active/pending tasks that might be lost after compaction | 1.5x |
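The dimensions above combine into a single score. A minimal sketch of that combination, assuming each dimension produces a 0.0–1.0 sub-score and the weights match the table (the actual internals of `score.py` are not shown in this document, so names here are illustrative):

```python
# Weights mirror the dimension table above (assumed, not confirmed from score.py).
WEIGHTS = {
    "topic_coverage": 2.0,
    "recency_bias": 1.5,
    "entity_continuity": 2.0,
    "decision_retention": 2.0,
    "task_continuity": 1.5,
}

def weighted_score(sub_scores: dict) -> float:
    """Combine per-dimension 0.0-1.0 sub-scores into a 0-100% coverage score."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[d] * sub_scores.get(d, 0.0) for d in WEIGHTS)
    return round(100.0 * weighted / total_weight, 1)

print(weighted_score({d: 1.0 for d in WEIGHTS}))  # perfect coverage -> 100.0
```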

How to use

python3 score.py --score                      # Score current context assembly
python3 score.py --score --verbose             # Detailed per-dimension breakdown
python3 score.py --blind-spots                 # List topics missing from context
python3 score.py --drift                       # Compare current vs. previous scores
python3 score.py --status                      # Last score summary
python3 score.py --format json                 # Machine-readable output

Procedure

Step 1 — Score context coverage

python3 score.py --score

The scorer reads MEMORY.md (full history) and compares it against what's currently accessible. Outputs a coverage score from 0–100% with a letter grade.
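Topic coverage, the core ratio described above, reduces to a set comparison. A hypothetical sketch (the real topic extraction in `score.py` is not shown here):

```python
def topic_coverage(history_topics: set, context_topics: set) -> float:
    """Fraction of full-history topics represented in the assembled context."""
    if not history_topics:
        return 1.0  # nothing to cover means nothing has been lost
    return len(history_topics & context_topics) / len(history_topics)
```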

Step 2 — Find blind spots

python3 score.py --blind-spots

Lists specific topics, entities, and decisions that exist in full history but are missing from current context — these are what the agent has effectively "forgotten."
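Conceptually, a blind spot is just a set difference between what the history contains and what the context retains. A minimal sketch under that assumption:

```python
def blind_spots(history_items, context_items):
    """Items present in full history but absent from the current context."""
    return sorted(set(history_items) - set(context_items))
```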

Step 3 — Track drift over time

python3 score.py --drift

Shows how coverage has changed across the last 20 scores, helping you identify whether compaction is progressively losing more information.
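Drift over the score history can be sketched as a simple delta across the tracked window (a hypothetical illustration; `score.py` may compute this differently):

```python
def drift(score_history, window=20):
    """Change in coverage across the last `window` scores (negative = losing info)."""
    recent = score_history[-window:]
    if len(recent) < 2:
        return 0.0
    return recent[-1] - recent[0]
```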

Grading

| Grade | Coverage | Meaning |
| --- | --- | --- |
| A | 90–100% | Excellent — minimal information loss |
| B | 75–89% | Good — minor gaps, unlikely to cause issues |
| C | 60–74% | Fair — some important context missing |
| D | 40–59% | Poor — significant blind spots |
| F | 0–39% | Critical — agent is operating with major gaps |
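The grade thresholds map to a simple lookup, sketched here under the banding shown in the table:

```python
def grade(coverage: float) -> str:
    """Map a 0-100% coverage score to the letter grades in the table above."""
    if coverage >= 90:
        return "A"
    if coverage >= 75:
        return "B"
    if coverage >= 60:
        return "C"
    if coverage >= 40:
        return "D"
    return "F"
```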

State

Coverage scores and blind spot history are stored in ~/.openclaw/skill-state/context-assembly-scorer/state.yaml.

Fields: last_score_at, current_score, blind_spots, score_history.

Notes

  • Read-only — does not modify context or memory
  • Topic extraction uses keyword clustering, not LLM calls
  • Entity detection uses regex patterns for file paths, URLs, class names, API endpoints
  • Decision detection looks for markers: "decided", "chose", "prefer", "always", "never"
  • Recency bias is measured as the ratio of recent-vs-old entry representation
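The regex-based entity and decision detection described in the notes can be sketched as follows. These patterns are hypothetical illustrations of the approach; the actual patterns in `score.py` are not shown in this document:

```python
import re

# Decision markers listed in the notes above.
DECISION_MARKERS = re.compile(r"\b(decided|chose|prefer|always|never)\b", re.IGNORECASE)

# A simple file-path pattern (assumed extension list, for illustration only).
FILE_PATH = re.compile(r"\b[\w./-]+\.(?:py|md|yaml|json|toml|ts|js)\b")

def find_decisions(text: str) -> list:
    """Return lines that contain a decision marker."""
    return [line for line in text.splitlines() if DECISION_MARKERS.search(line)]

def find_entities(text: str) -> list:
    """Return file-path-like entities mentioned in the text."""
    return FILE_PATH.findall(text)
```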