Openclaw-superpowers context-assembly-scorer

Scores how well the current context represents the full conversation — detects information blind spots, stale summaries, and coverage gaps that cause the agent to forget critical details.

Install
source · Clone the upstream repo
git clone https://github.com/ArchieIndian/openclaw-superpowers
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ArchieIndian/openclaw-superpowers "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/openclaw-native/context-assembly-scorer" ~/.claude/skills/archieindian-openclaw-superpowers-context-assembly-scorer && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ArchieIndian/openclaw-superpowers "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/openclaw-native/context-assembly-scorer" ~/.openclaw/skills/archieindian-openclaw-superpowers-context-assembly-scorer && rm -rf "$T"
manifest: skills/openclaw-native/context-assembly-scorer/SKILL.md
source content

Context Assembly Scorer

What it does

When an agent compacts context, it loses information. But how much? And which information? Context Assembly Scorer answers these questions by measuring coverage — the ratio of important topics in the full conversation history that are represented in the current assembled context.

Inspired by lossless-claw's context assembly system, which carefully selects which summaries to include in each turn's context to maximize information coverage.

When to invoke

  • Automatically every 4 hours (cron) — silent coverage check
  • Before starting a task that depends on prior context — verify nothing critical is missing
  • After compaction — measure information loss
  • When the agent says "I don't remember" — diagnose why

Coverage dimensions

| Dimension | What it measures | Weight |
| --- | --- | --- |
| Topic coverage | % of conversation topics present in current context | 2x |
| Recency bias | Whether recent context is over-represented vs. older important context | 1.5x |
| Entity continuity | Named entities (files, people, APIs) mentioned in history that are missing from context | 2x |
| Decision retention | Architectural decisions and user preferences still accessible | 2x |
| Task continuity | Active/pending tasks that might be lost after compaction | 1.5x |
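The dimensions above combine into a single score. A minimal sketch of that combination, assuming each dimension produces a 0.0–1.0 sub-score and the weights match the table (the actual internals of `score.py` are not shown in this document, so names here are illustrative):

```python
# Weights mirror the dimension table above (assumed, not confirmed from score.py).
WEIGHTS = {
    "topic_coverage": 2.0,
    "recency_bias": 1.5,
    "entity_continuity": 2.0,
    "decision_retention": 2.0,
    "task_continuity": 1.5,
}

def weighted_score(sub_scores: dict) -> float:
    """Combine per-dimension 0.0-1.0 sub-scores into a 0-100% coverage score."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[d] * sub_scores.get(d, 0.0) for d in WEIGHTS)
    return round(100.0 * weighted / total_weight, 1)

print(weighted_score({d: 1.0 for d in WEIGHTS}))  # perfect coverage -> 100.0
```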

How to use

python3 score.py --score                      # Score current context assembly
python3 score.py --score --verbose             # Detailed per-dimension breakdown
python3 score.py --blind-spots                 # List topics missing from context
python3 score.py --drift                       # Compare current vs. previous scores
python3 score.py --status                      # Last score summary
python3 score.py --format json                 # Machine-readable output

Procedure

Step 1 — Score context coverage

python3 score.py --score

The scorer reads MEMORY.md (full history) and compares it against what's currently accessible. Outputs a coverage score from 0–100% with a letter grade.
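Topic coverage, the core ratio described above, reduces to a set comparison. A hypothetical sketch (the real topic extraction in `score.py` is not shown here):

```python
def topic_coverage(history_topics: set, context_topics: set) -> float:
    """Fraction of full-history topics represented in the assembled context."""
    if not history_topics:
        return 1.0  # nothing to cover means nothing has been lost
    return len(history_topics & context_topics) / len(history_topics)
```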

Step 2 — Find blind spots

python3 score.py --blind-spots

Lists specific topics, entities, and decisions that exist in full history but are missing from current context — these are what the agent has effectively "forgotten."
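Conceptually, a blind spot is just a set difference between what the history contains and what the context retains. A minimal sketch under that assumption:

```python
def blind_spots(history_items, context_items):
    """Items present in full history but absent from the current context."""
    return sorted(set(history_items) - set(context_items))
```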

Step 3 — Track drift over time

python3 score.py --drift

Shows how coverage has changed across the last 20 scores, helping you identify whether compaction is progressively losing more information.
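Drift over the score history can be sketched as a simple delta across the tracked window (a hypothetical illustration; `score.py` may compute this differently):

```python
def drift(score_history, window=20):
    """Change in coverage across the last `window` scores (negative = losing info)."""
    recent = score_history[-window:]
    if len(recent) < 2:
        return 0.0
    return recent[-1] - recent[0]
```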

Grading

| Grade | Coverage | Meaning |
| --- | --- | --- |
| A | 90–100% | Excellent — minimal information loss |
| B | 75–89% | Good — minor gaps, unlikely to cause issues |
| C | 60–74% | Fair — some important context missing |
| D | 40–59% | Poor — significant blind spots |
| F | 0–39% | Critical — agent is operating with major gaps |
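The grade thresholds map to a simple lookup, sketched here under the banding shown in the table:

```python
def grade(coverage: float) -> str:
    """Map a 0-100% coverage score to the letter grades in the table above."""
    if coverage >= 90:
        return "A"
    if coverage >= 75:
        return "B"
    if coverage >= 60:
        return "C"
    if coverage >= 40:
        return "D"
    return "F"
```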

State

Coverage scores and blind spot history are stored in ~/.openclaw/skill-state/context-assembly-scorer/state.yaml.

Fields: last_score_at, current_score, blind_spots, score_history.

Notes

  • Read-only — does not modify context or memory
  • Topic extraction uses keyword clustering, not LLM calls
  • Entity detection uses regex patterns for file paths, URLs, class names, API endpoints
  • Decision detection looks for markers: "decided", "chose", "prefer", "always", "never"
  • Recency bias is measured as the ratio of recent-vs-old entry representation
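The regex-based entity and decision detection described in the notes can be sketched as follows. These patterns are hypothetical illustrations of the approach; the actual patterns in `score.py` are not shown in this document:

```python
import re

# Decision markers listed in the notes above.
DECISION_MARKERS = re.compile(r"\b(decided|chose|prefer|always|never)\b", re.IGNORECASE)

# A simple file-path pattern (assumed extension list, for illustration only).
FILE_PATH = re.compile(r"\b[\w./-]+\.(?:py|md|yaml|json|toml|ts|js)\b")

def find_decisions(text: str) -> list:
    """Return lines that contain a decision marker."""
    return [line for line in text.splitlines() if DECISION_MARKERS.search(line)]

def find_entities(text: str) -> list:
    """Return file-path-like entities mentioned in the text."""
    return FILE_PATH.findall(text)
```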