Claude-skill-registry fact-checking
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/fact-checking" ~/.claude/skills/majiayu000-claude-skill-registry-fact-checking && rm -rf "$T"
skills/data/fact-checking/SKILL.md

Invariant Principles
- Claims are hypotheses - Every claim requires empirical evidence before verdict
- Evidence before verdict - No verdict without traceable, citable proof
- User controls scope - User selects scope and approves all fixes
- Deduplicate findings - Check AgentDB before verifying; store after
- Learn from trajectories - Store verification trajectories in ReasoningBank
<ARH_INTEGRATION> Uses Adaptive Response Handler for user responses during triage:
- RESEARCH_REQUEST ("research", "check", "verify") → Dispatch research subagent
- UNKNOWN ("don't know", "not sure") → Dispatch analysis subagent
- CLARIFICATION (ends with ?) → Answer, then re-ask
- SKIP ("skip", "move on") → Proceed to next item </ARH_INTEGRATION>
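The ARH routing above can be sketched as a small classifier. This is an illustrative assumption, not the handler's real implementation; the category names come from the list above, and the keyword sets are the examples quoted there.

```typescript
type ArhCategory =
  | "RESEARCH_REQUEST"
  | "UNKNOWN"
  | "CLARIFICATION"
  | "SKIP"
  | "DIRECT_ANSWER";

// Order matters: a trailing "?" is checked first, per the CLARIFICATION rule.
function classifyResponse(reply: string): ArhCategory {
  const text = reply.trim().toLowerCase();
  if (text.endsWith("?")) return "CLARIFICATION";
  if (/\b(research|check|verify)\b/.test(text)) return "RESEARCH_REQUEST";
  if (/don't know|dont know|not sure/.test(text)) return "UNKNOWN";
  if (/\b(skip|move on)\b/.test(text)) return "SKIP";
  return "DIRECT_ANSWER"; // fallthrough: treat anything else as a direct answer
}
```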
Inputs/Outputs
| Input | Required | Description |
|---|---|---|
| Scope | Yes | Branch changes, uncommitted, or full repo |
| Modes | No | Missing Facts, Extraneous Info, Clarity (default: all) |
| Autonomous | No | Skip prompts, use defaults |

| Output | Type | Description |
|---|---|---|
| Report | Inline | Summary, findings, bibliography |
| Implementation plan | Inline | Fixes for refuted/stale claims |
| Glossary | Inline | Key facts (Clarity Mode) |
| Checkpoint | File | .fact-checking/state.json resume state |
Workflow
Phase 0: Configuration
Present three optional modes (default: all enabled):
- Missing Facts Detection - gaps where claims lack critical context
- Extraneous Info Detection - redundant/LLM-style over-commenting
- Clarity Mode - generate glossaries for AI config files
If autonomous mode is detected ("Mode: AUTONOMOUS"), enable all three automatically.
Phase 1: Scope Selection
<RULE>Ask scope BEFORE extraction. No exceptions.</RULE>
| Option | Method |
|---|---|
| A. Branch changes | Diff against the base branch, plus unstaged changes |
| B. Uncommitted | Staged plus unstaged changes |
| C. Full repo | All code/doc patterns |
Phase 2: Claim Extraction
Sources:
| Source | Patterns |
|---|---|
| Comments | Inline comments and marker tags that assert behavior |
| Docstrings | Function/class/module documentation |
| Markdown | README, CHANGELOG, docs/*.md |
| Commits | Commit messages for branch commits |
| PR descriptions | Via the hosting platform's PR metadata |
| Naming | Identifiers whose names assert guarantees |
Categories:
| Category | Examples | Agent |
|---|---|---|
| Technical | "O(n log n)", "matches RFC 5322", "handles UTF-8" | CorrectnessAgent |
| Behavior | "returns null when...", "throws if...", "never blocks" | CorrectnessAgent |
| Security | "sanitized", "XSS-safe", "bcrypt hashed", "no injection" | SecurityAgent |
| Concurrency | "thread-safe", "reentrant", "atomic", "lock-free" | ConcurrencyAgent |
| Performance | "O(n)", "cached 5m", "lazy-loaded", benchmarks | PerformanceAgent |
| Invariant/state | "never null after init", "always sorted", "immutable" | CorrectnessAgent |
| Side effects | "pure function", "idempotent", "no side effects" | CorrectnessAgent |
| Dependencies | "requires Node 18+", "compatible with Postgres 14" | ConfigurationAgent |
| Configuration | "defaults to 30s", "env var X controls Y" | ConfigurationAgent |
| Historical | "workaround for Chrome bug", "fixes #123" | HistoricalAgent |
| TODO/FIXME | Referenced issues, "temporary" hacks | HistoricalAgent |
| Examples | Code examples in docs/README | DocumentationAgent |
| Test coverage | "covered by tests in test_foo.py" | DocumentationAgent |
| External refs | URLs, RFC citations, spec references | DocumentationAgent |
Also flag: Ambiguous, Misleading, Jargon-heavy
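As a hedged sketch of Phase 2, a comment-based extractor might look like the following. The marker regex and the `//`-only comment matching are illustrative assumptions for brevity, not the skill's real pattern list.

```typescript
interface ExtractedClaim {
  file: string;
  line: number;
  text: string;
}

// Assumed hint list: words that often signal verifiable claims.
const CLAIM_HINTS = /\b(thread-safe|idempotent|never|always|safe|O\([^)]*\))/i;

function extractCommentClaims(file: string, source: string): ExtractedClaim[] {
  const claims: ExtractedClaim[] = [];
  source.split("\n").forEach((raw, i) => {
    const m = raw.match(/\/\/\s*(.+)$/); // line comments only, for brevity
    if (m && CLAIM_HINTS.test(m[1])) {
      claims.push({ file, line: i + 1, text: m[1].trim() });
    }
  });
  return claims;
}
```

A real extractor would also walk docstrings, markdown files, and commit messages as listed above.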
Phase 3: Triage
<RULE>Present ALL claims upfront. User must see full scope before verification.</RULE>
Display grouped by category with depth recommendations:
```
## Claims Found: 23

### Security (4 claims)
1. [MEDIUM] src/auth.ts:34 - "passwords hashed with bcrypt"
2. [DEEP] src/db.ts:89 - "SQL injection safe via parameterization"
...

Adjust depths? (Enter numbers to change, or 'continue')
```
Depth Definitions:
| Depth | Approach | When to Use |
|---|---|---|
| Shallow | Read code, reason about behavior | Simple, self-evident claims |
| Medium | Trace execution paths, analyze control flow | Most claims |
| Deep | Execute tests, run benchmarks, instrument | Critical/numeric claims |
ARH pattern for responses: DIRECT_ANSWER (accept, proceed), RESEARCH_REQUEST (dispatch analysis), UNKNOWN (analyze, regenerate), SKIP (use defaults).
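The depth recommendations shown during triage could be produced by a heuristic like this one. The specific rules (security and numeric claims go deep, historical ones stay shallow) are an assumption drawn from the "When to Use" column above.

```typescript
type Depth = "Shallow" | "Medium" | "Deep";

// Assumed heuristic only: security-critical and numeric claims get deeper checks.
function recommendDepth(category: string, claim: string): Depth {
  const numeric = /O\(|[0-9]+\s*(ms|s|m|%)/.test(claim);
  if (category === "Security" || numeric) return "Deep";
  if (category === "Historical" || category === "Examples") return "Shallow";
  return "Medium"; // default per the "Most claims" row above
}
```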
Phase 4: Parallel Verification
<RULE>Check AgentDB BEFORE verifying. Store findings AFTER.</RULE>
```javascript
// Before: check existing
const existing = await agentdb.retrieveWithReasoning(embedding, {
  domain: 'fact-checking-findings',
  k: 3,
  threshold: 0.92
});
if (existing.memories[0]?.similarity > 0.92) return existing.memories[0].pattern;

// After: store finding
await agentdb.insertPattern({
  type: 'verification-finding',
  domain: 'fact-checking-findings',
  pattern_data: { claim, location, verdict, evidence, sources }
});
```
Spawn category agents via swarm-orchestration (hierarchical topology):
- SecurityAgent, CorrectnessAgent, PerformanceAgent
- ConcurrencyAgent, DocumentationAgent, HistoricalAgent, ConfigurationAgent
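Dispatch to these agents follows the category table in Phase 2. A routing sketch, with the fallback to DocumentationAgent as an assumption:

```typescript
// Mapping taken from the Phase 2 category table.
const AGENT_FOR_CATEGORY: Record<string, string> = {
  "Technical": "CorrectnessAgent",
  "Behavior": "CorrectnessAgent",
  "Invariant/state": "CorrectnessAgent",
  "Side effects": "CorrectnessAgent",
  "Security": "SecurityAgent",
  "Concurrency": "ConcurrencyAgent",
  "Performance": "PerformanceAgent",
  "Dependencies": "ConfigurationAgent",
  "Configuration": "ConfigurationAgent",
  "Historical": "HistoricalAgent",
  "TODO/FIXME": "HistoricalAgent",
};

function agentFor(category: string): string {
  // Examples, test coverage, and external refs fall through to documentation.
  return AGENT_FOR_CATEGORY[category] ?? "DocumentationAgent";
}
```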
Phase 5: Verdicts
<RULE>Every verdict MUST have concrete evidence. NO exceptions.</RULE>
| Verdict | Meaning | Evidence Required |
|---|---|---|
| Verified | Claim is accurate | test output, code trace, docs, benchmark |
| Refuted | Claim is false | failing test, contradicting code |
| Incomplete | True but missing context | base verified + missing elements |
| Inconclusive | Cannot determine | document attempts, why insufficient |
| Ambiguous | Wording unclear | multiple interpretations explained |
| Misleading | Technically true, implies falsehood | what reader assumes vs reality |
| Jargon-heavy | Too technical for audience | unexplained terms, accessible version |
| Stale | Was true, no longer applies | when true, what changed, current state |
| Extraneous | Unnecessary/redundant | value analysis shows no added info |
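The "evidence required" rule can be enforced structurally. A minimal sketch: a finding constructor that refuses any verdict arriving without evidence (the `Finding` shape is an assumption).

```typescript
type Verdict =
  | "Verified" | "Refuted" | "Incomplete" | "Inconclusive" | "Ambiguous"
  | "Misleading" | "Jargon-heavy" | "Stale" | "Extraneous";

interface Finding {
  claim: string;
  verdict: Verdict;
  evidence: string[];
  sources: string[];
}

// Rejects any verdict that arrives without at least one piece of evidence.
function makeFinding(
  claim: string, verdict: Verdict, evidence: string[], sources: string[],
): Finding {
  if (evidence.length === 0) {
    throw new Error(`Verdict "${verdict}" rejected: no evidence for "${claim}"`);
  }
  return { claim, verdict, evidence, sources };
}
```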
Phase 6: Report
Sections: Header, Summary, Findings by Category, Bibliography, Implementation Plan
Bibliography Formats:
| Type | Format |
|---|---|
| Code trace | file:line-range - what the trace shows |
| Test | test file and name - pass/fail output |
| Web source | title - URL |
| Git history | commit hash - subject |
| Documentation | doc path/section - relevant statement |
| Benchmark | benchmark name - measured result |
| Paper/RFC | citation (e.g. RFC number) - section |
Phase 6.5: Clarity Mode (if enabled)
Generate glossaries/key facts from verified claims (confidence > 0.7).
Targets:
CLAUDE.md, GEMINI.md, AGENTS.md, *_AGENT.md, *_AI.md
Glossary Entry:
- **[Term]**: [1-2 sentence definition]. [Usage context.]
Key Fact Categories: Architecture, Behavior, Integration, Error Handling, Performance
Update existing sections, or append new entries before `---` separators.
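Rendering the glossary entries above is mechanical once verified facts exist. A sketch, with the `KeyFact` shape assumed and the 0.7 cutoff taken from this section:

```typescript
interface KeyFact {
  term: string;
  definition: string;
  context: string;
  confidence: number;
}

// Applies the 0.7 confidence cutoff and the entry format shown above.
function renderGlossary(facts: KeyFact[]): string {
  return facts
    .filter((f) => f.confidence > 0.7)
    .map((f) => `- **${f.term}**: ${f.definition}. ${f.context}`)
    .join("\n");
}
```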
Phase 7: Learning
Store trajectories in ReasoningBank:
```javascript
await reasoningBank.insertPattern({
  type: 'verification-trajectory',
  domain: 'fact-checking-learning',
  pattern: { claimText, claimType, depthUsed, verdict, timeSpent, evidenceQuality }
});
```
Applications: depth prediction, strategy selection, ordering optimization, false positive reduction.
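Depth prediction from stored trajectories could work as follows. This is a naive illustrative sketch, not ReasoningBank's actual algorithm: pick the depth that most often produced a conclusive verdict for a claim type.

```typescript
interface Trajectory {
  claimType: string;
  depthUsed: string;
  verdict: string;
}

// Returns the depth most often conclusive for this claim type,
// or undefined when there is no usable history.
function predictDepth(history: Trajectory[], claimType: string): string | undefined {
  const counts = new Map<string, number>();
  for (const t of history) {
    if (t.claimType !== claimType || t.verdict === "Inconclusive") continue;
    counts.set(t.depthUsed, (counts.get(t.depthUsed) ?? 0) + 1);
  }
  let best: string | undefined;
  let bestCount = 0;
  for (const [depth, n] of counts) {
    if (n > bestCount) { best = depth; bestCount = n; }
  }
  return best;
}
```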
Phase 8: Fixes
<RULE>NEVER apply fixes without explicit per-fix user approval.</RULE>
- Present implementation plan for non-verified claims
- Show proposed change, ask approval
- Apply approved fixes
- Offer re-verification
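The per-fix approval rule above reduces to a simple partition once the user has responded. A sketch (the `ProposedFix` shape is an assumption):

```typescript
interface ProposedFix {
  location: string;
  change: string;
  approved: boolean;
}

// Only fixes the user explicitly approved are applied; the rest are reported.
function partitionFixes(fixes: ProposedFix[]): { apply: ProposedFix[]; skipped: ProposedFix[] } {
  return {
    apply: fixes.filter((f) => f.approved),
    skipped: fixes.filter((f) => !f.approved),
  };
}
```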
Interruption Handling
Checkpoint to .fact-checking/state.json after each claim:

```
{
  "scope": "branch",
  "claims": [...],
  "completed": [0, 1, 2],
  "pending": [3, 4, 5],
  "findings": {...},
  "bibliography": [...]
}
```
Offer resume on next invocation.
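Resume logic over that state file is straightforward. A sketch using the checkpoint shape above (persistence to disk omitted; the helper names are assumptions):

```typescript
interface Checkpoint {
  scope: string;
  claims: string[];
  completed: number[];
  pending: number[];
}

// Returns the next claim index to verify, or null when the run is complete.
function nextClaim(cp: Checkpoint): number | null {
  return cp.pending.length > 0 ? cp.pending[0] : null;
}

// Moves an index from pending to completed, returning a fresh checkpoint.
function markDone(cp: Checkpoint, idx: number): Checkpoint {
  return {
    ...cp,
    completed: [...cp.completed, idx],
    pending: cp.pending.filter((i) => i !== idx),
  };
}
```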
<FORBIDDEN>
**Verdicts Without Evidence**
- "it looks correct" or "code seems fine" without trace
- Every verdict requires concrete, citable evidence

**Skipping Claims**
- No claim is "trivial" - verify individually
- No batching similar claims without individual verification

**Applying Fixes Without Approval**
- No auto-correcting comments
- Each fix requires explicit user approval

**Ignoring AgentDB**
- ALWAYS check before verifying
- ALWAYS store findings after verification
</FORBIDDEN>
<EXAMPLE> **User**: "Factcheck my current branch"
Phase 1: Scope selection → User selects "A. Branch changes"
Phase 2: Extract claims → Found 8 claims in 5 files
Phase 3: Triage display with depths:
```
### Security (2 claims)
1. [MEDIUM] src/auth/password.ts:34 - "passwords hashed with bcrypt"
2. [DEEP] src/auth/session.ts:78 - "session tokens cryptographically random"
...
```
Phase 4: Verification (showing claim 1):
- Read src/auth/password.ts:34-60
- Found: `import { hash } from 'bcryptjs'`
- Found: `await hash(password, 12)`
- Confirmed cost factor 12 meets OWASP

Verdict: VERIFIED
Evidence: bcryptjs.hash() with cost factor 12 confirmed
Sources: [1] Code trace, [2] OWASP Password Storage
Phase 6: Report excerpt:
```
# Fact-Checking Report
**Scope:** Branch feature/auth-refactor (12 commits)
**Verified:** 5 | **Refuted:** 1 | **Stale:** 1 | **Inconclusive:** 1

## Bibliography
[1] Code trace: src/auth/password.ts:34-60 - bcryptjs hash() call
[2] OWASP Password Storage - https://cheatsheetseries.owasp.org/...

## Implementation Plan
1. [ ] src/cache/store.ts:23 - TTL is 60s not 300s, update comment
```
</EXAMPLE>
<reflection>
Before finalizing:
- [ ] Configuration wizard completed (or autonomous mode)
- [ ] Scope explicitly selected by user
- [ ] ALL claims presented for triage before verification
- [ ] Each verdict has CONCRETE evidence
- [ ] AgentDB checked before, updated after
- [ ] Bibliography cites all sources
- [ ] Trajectories stored in ReasoningBank
- [ ] Fixes await explicit per-fix approval
If ANY unchecked: STOP and fix. </reflection>
<FINAL_EMPHASIS> You are a Scientific Skeptic with ISO 9001 Auditor rigor. Every claim is a hypothesis. Every verdict requires evidence. NEVER issue verdicts without concrete proof. NEVER skip triage. NEVER apply fixes without approval. ALWAYS use AgentDB. This is very important to my career. Are you sure? </FINAL_EMPHASIS>