Claude-skill-registry batch-quality
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/batch-quality" ~/.claude/skills/majiayu000-claude-skill-registry-batch-quality && rm -rf "$T"
manifest: skills/data/batch-quality/SKILL.md
Batch Quality Skill
Prevent wasted LLM calls by validating quality BEFORE running full batch operations.
What This Skill Actually Does
Unlike simple file-existence checks, this skill:
- Actually runs LLM on N samples using scillm
- Validates JSON response structure (excerpts, source_quality, etc.)
- Uses SPARTA contracts for DuckDB validation queries
- Integrates with task-monitor for enforced quality gates
Quick Start
```bash
cd .pi/skills/batch-quality

# Preflight: Test 3 samples through actual LLM
uv run python cli.py preflight \
  --stage 05 \
  --run-id run-recovery-verify \
  --samples 3

# If preflight passes, run your batch
# ...batch operation...

# Validate: Check DuckDB against contract
uv run python cli.py validate \
  --stage 05 \
  --run-id run-recovery-verify \
  --task-name "sparta-stage-05"
```
Commands
preflight
Test N samples through actual LLM before running full batch.
```bash
uv run python cli.py preflight \
  --stage <stage-name> \
  --run-id <sparta-run-id> \
  --samples 3 \
  --prompt <optional-prompt-file>
```
What it actually does:
- Loads SPARTA contract for the stage (if exists)
- Checks environment variables (CHUTES_API_KEY, CHUTES_TEXT_MODEL)
- Connects to DuckDB for the run
- Samples N items from the input queue
- Runs each sample through scillm (actual LLM call)
- Validates JSON response structure
- Requires 50%+ samples to pass
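The response check and the 50% gate can be sketched roughly as below. The required keys and helper names (`validate_response`, `preflight_passes`) are illustrative assumptions, not the skill's real API; the actual skill checks fields such as `excerpts` and `source_quality`:

```python
import json

def validate_response(payload):
    """Check that a parsed LLM response has the expected structure.
    The required keys here are illustrative stand-ins."""
    required = ("excerpts", "source_quality")
    return isinstance(payload, dict) and all(k in payload for k in required)

def preflight_passes(raw_samples, threshold=0.5):
    """Require at least `threshold` of samples (50% by default) to parse
    as JSON and pass the structure check."""
    ok = 0
    for raw in raw_samples:
        try:
            if validate_response(json.loads(raw)):
                ok += 1
        except json.JSONDecodeError:
            pass  # unparseable output counts as a failed sample
    return len(raw_samples) > 0 and ok / len(raw_samples) >= threshold
```

With three samples, two valid responses pass the gate (2/3 ≥ 0.5) while one does not.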
Exit codes:
- 0: PASSED - safe to proceed
- 1: FAILED - fix issues first
validate
Validate batch output using SPARTA contracts.
```bash
uv run python cli.py validate \
  --stage <stage-name> \
  --run-id <sparta-run-id> \
  --task-name <task-monitor-name>
```
What it actually does:
- Loads SPARTA contract (e.g., `05_extract_knowledge.json`)
- Runs all `validation_queries` from the contract against DuckDB
- Checks each query result against `expected_min`
- Notifies task-monitor of pass/fail

Contract example (`05_extract_knowledge.json`):
```json
{
  "validation_queries": [
    {
      "name": "url_knowledge_count",
      "query": "SELECT COUNT(*) FROM url_knowledge",
      "expected_min": 10
    },
    {
      "name": "urls_processed",
      "query": "SELECT COUNT(*) FROM url_extraction_log WHERE ok = true",
      "expected_min": 5
    }
  ]
}
```
status
Check current preflight status (JSON output).
uv run python cli.py status
clear
Clear preflight state (requires new preflight).
uv run python cli.py clear
SPARTA Pipeline Integration
```bash
# 1. Register task with validation requirement
uv run python .pi/skills/task-monitor/monitor.py register \
  --name "sparta-stage-05" \
  --require-validation

# 2. Run preflight (ACTUALLY tests LLM)
uv run python .pi/skills/batch-quality/cli.py preflight \
  --stage 05 \
  --run-id run-recovery-verify \
  --samples 3

# 3. Run batch (only if preflight passed)
uv run python -m sparta.pipeline_duckdb.05_extract_knowledge \
  --run-id run-recovery-verify

# 4. Validate using contract queries
uv run python .pi/skills/batch-quality/cli.py validate \
  --stage 05 \
  --run-id run-recovery-verify \
  --task-name "sparta-stage-05"
```
Configuration
Environment variables:
- `SPARTA_ROOT`: Path to SPARTA project (default: `/home/graham/workspace/experiments/sparta`)
- `CHUTES_API_KEY`: API key for LLM calls
- `CHUTES_API_BASE`: API base URL (default: `https://llm.chutes.ai/v1`)
- `CHUTES_TEXT_MODEL`: Model ID for text extraction
Contract location:
$SPARTA_ROOT/tools/pipeline_gates/fixtures/D3-FEV/contracts/
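Resolving a contract file then combines the `SPARTA_ROOT` default with the fixtures path above. A small sketch; the helper name is hypothetical:

```python
import os

def contract_path(stage_file):
    """Locate a contract file under SPARTA_ROOT, falling back to the
    documented default project path."""
    root = os.environ.get("SPARTA_ROOT",
                          "/home/graham/workspace/experiments/sparta")
    return os.path.join(root,
                        "tools/pipeline_gates/fixtures/D3-FEV/contracts",
                        stage_file)
```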
Dependencies
- `typer` - CLI framework
- `duckdb` - Database queries
- `scillm` - LLM batch processing (for actual sample testing)
Key Principle
Preflight is cheap. Failed batches are expensive.
Testing 3 samples costs ~$0.01 and takes 30 seconds. Running 1000 items with a broken prompt costs ~$3 and takes hours.
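The arithmetic behind that claim is a simple per-item rate. A back-of-the-envelope estimator, assuming the ~$3 / 1000 items figure above (the per-item rate is an assumption, not a measured constant):

```python
def estimated_cost(n_items, cost_per_item=0.003):
    """Rough batch cost: n_items at ~$0.003 each, per the figures above."""
    return n_items * cost_per_item
```

At that rate a 3-sample preflight costs about a cent, while a 1000-item batch costs about $3, so a failed preflight saves roughly 300x its own cost.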