Claude-skill-registry citation-extraction

Extract and validate citations with Jaccard overlap calculation. Ensures all claims have verifiable source evidence with ≥0.7 alignment threshold.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/citation-extraction" ~/.claude/skills/majiayu000-claude-skill-registry-citation-extraction && rm -rf "$T"
manifest: skills/data/citation-extraction/SKILL.md
source content

Citation Extraction Skill

Overview

Validates citations by calculating Jaccard overlap (intersection over union) between cited text and source passages. Enforces ≥0.7 threshold for quality.

When to Use

  • Validate citations in clinical summaries
  • Calculate Jaccard overlap for Evaluator metrics
  • Extract source text spans for claims

Installation

IMPORTANT: This skill has its own isolated virtual environment (

.venv
) managed by
uv
. Do NOT use system Python.

Initialize the skill's environment:

# From the skill directory
cd .agent/skills/citation-extraction
uv sync  # Creates .venv (no external dependencies, uses Python stdlib)

Usage

CRITICAL: Always use

uv run
to execute code with this skill's
.venv
, NOT system Python.

# From .agent/skills/citation-extraction/ directory
# Run with: uv run python -c "..."
from citation_extraction import CitationExtractor

extractor = CitationExtractor()

# Validate citation
jaccard = extractor.calculate_jaccard_overlap(
    claim_text="Patient has hypertension",
    source_text="Patient diagnosed with hypertension 10 years ago"
)

if jaccard >= 0.7:
    print(f"Valid citation (Jaccard: {jaccard})")

Implementation

See

citation_extraction.py
.