Learn-skills.dev blog-factcheck

install
source · Clone the upstream repo
git clone https://github.com/NeverSight/learn-skills.dev
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/agricidaniel/claude-blog/blog-factcheck" ~/.claude/skills/neversight-learn-skills-dev-blog-factcheck && rm -rf "$T"
manifest: data/skills-md/agricidaniel/claude-blog/blog-factcheck/SKILL.md
Blog Fact-Check

Verify statistics, claims, and source attributions in blog posts. Pure Claude pipeline with no external NLP dependencies.

Workflow

Step 1: Read the Blog Post

Read the target file and identify all sections containing data claims.

Step 2: Extract Statistical Claims

Scan the full text for every claim that includes a number, percentage, dollar amount, or named source. Build a claims list with these fields:

| Field | Description |
| --- | --- |
| claim_text | The exact sentence or phrase containing the statistic |
| value | The numeric value (e.g., "42%", "$1.2M", "3x") |
| attribution | Named source if present (e.g., "HubSpot", "Gartner 2025") |
| url | Cited URL if present (from markdown link or parenthetical) |
| location | Heading or line number where the claim appears |
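
One way to hold the fields above in code is a small record type. This is only a sketch: the `Claim` class and its `is_cited` helper are hypothetical names, not part of the skill, and the optional fields default to `None` on the assumption that a claim may lack an attribution, URL, or location.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    """One extracted claim, mirroring the fields in the table above."""
    claim_text: str                    # exact sentence or phrase
    value: str                         # e.g. "42%", "$1.2M", "3x"
    attribution: Optional[str] = None  # named source, if any
    url: Optional[str] = None          # cited URL, if any
    location: Optional[str] = None     # heading or line number

    @property
    def is_cited(self) -> bool:
        # Assumption: a claim counts as cited only when it carries a URL.
        return bool(self.url)


example = Claim(
    claim_text="73% of marketers...",
    value="73%",
    url="https://example.com/report",
)
```

Claims where `is_cited` is true would go to Step 3; the rest go to Step 4.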

Step 3: Verify Cited Claims

For each claim that includes a URL:

  1. Fetch the source page via WebFetch
  2. Search the returned content for the specific numeric value
  3. If the exact value is found, check that the surrounding context matches the claim's topic
  4. Assign a confidence score (see Verification Scoring below)

Process claims sequentially to avoid rate-limiting source sites.
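
The value search in steps 2 and 3 can be illustrated outside the skill as a plain text lookup. The sketch below assumes the page content has already been fetched into a string (WebFetch is an agent tool, not a Python API), and `find_value_context` is a hypothetical helper. It returns the text around each exact occurrence of the value so the context can be compared with the claim's topic.

```python
import re
from typing import List


def find_value_context(page_text: str, value: str, window: int = 60) -> List[str]:
    """Return snippets of page_text surrounding each exact occurrence of value."""
    # Guard against partial numeric matches: "42%" must not match inside "142%".
    pattern = re.compile(r"(?<![\d.,])" + re.escape(value) + r"(?!\d)")
    snippets = []
    for match in pattern.finditer(page_text):
        start = max(0, match.start() - window)
        end = min(len(page_text), match.end() + window)
        snippets.append(page_text[start:end])
    return snippets


page = "In 2024, 142% growth was reported. Only 42% of teams agreed."
hits = find_value_context(page, "42%")
```

An empty result does not by itself mean NOT FOUND: the source may state the figure in different wording or rounding, which the scoring table treats as PARAPHRASE.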

Step 4: Flag Uncited Claims

For claims without a URL:

  • Mark status as UNVERIFIED
  • Suggest a search query the user can run to find a source
  • If the attribution names a specific organization, suggest their domain

Step 5: Generate Verification Report

Output the full results table, summary statistics, and recommended actions.

Claim Extraction Patterns

Identify claims matching these structures:

Fully cited (highest priority):

  • [Number]% [claim] ([Source], [Year]) - parenthetical citation
  • [claim] [Number]% ... [markdown link to source] - inline link
  • According to [Source], [Number]... - attribution lead

Uncited statistics (flag for sourcing):

  • [Number]% of [noun phrase] - standalone percentage
  • [Number]x more/less/higher/lower - multiplier claims
  • $[Number] [claim] - dollar figures without attribution

Weak signals (check context before extracting):

  • "studies show", "research indicates", "data suggests" + nearby number
  • "survey found", "report reveals", "analysis shows" + nearby number
  • Round numbers in isolation (e.g., "millions of users") - skip unless specific
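
The skill itself relies on Claude reading the text, not on regular expressions, but the uncited-statistic shapes above can be approximated with a few patterns for illustration. The sketch below is an assumption-laden stand-in: `PATTERNS` and `extract_candidates` are hypothetical names, and the expressions cover only simple forms of percentages, multipliers, and dollar figures.

```python
import re

# Rough approximations of the three uncited-statistic shapes.
PATTERNS = {
    "percentage": re.compile(r"\b\d+(?:\.\d+)?%"),
    "multiplier": re.compile(r"\b\d+(?:\.\d+)?x\b"),
    "dollar": re.compile(r"\$\d[\d,]*(?:\.\d+)?[KMB]?\b"),
}


def extract_candidates(text):
    """Return (line_number, kind, matched_value) for every pattern hit."""
    found = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                found.append((line_no, kind, match.group()))
    return found


sample = "Video drives 3x more shares.\nAbout 42% of readers spent $1.2M."
candidates = extract_candidates(sample)
```

The line number can feed the `location` field of the claims list. Weak-signal phrases still need a contextual check that patterns like these cannot provide.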

Verification Scoring

| Score | Status | Criteria |
| --- | --- | --- |
| 1.0 | VERIFIED | Exact number found on cited page in matching context |
| 0.7-0.9 | PARAPHRASE | Similar data found but with different wording, rounding, or timeframe |
| 0.3-0.6 | WEAK | Source page exists and covers the topic but the specific statistic is not visible |
| 0.0 | NOT FOUND | Cited page does not contain the claimed data anywhere |
| N/A | UNVERIFIED | No source URL provided for the claim |

Scoring guidance:

  • A claim of "43%" when the source says "nearly half" scores 0.8
  • A claim of "2024" data when the source only has "2023" scores 0.7
  • A claim citing a homepage when the stat lives on a subpage scores 0.3
  • A 404 or unreachable URL scores 0.0
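
The score bands translate into a simple lookup. In this sketch, `status_for` is a hypothetical helper and `None` stands in for the N/A case. The table leaves gaps between bands (for example 0.65 or 0.95); assigning such values to the lower band is an assumption made here, not something the skill specifies.

```python
from typing import Optional


def status_for(score: Optional[float]) -> str:
    """Map a confidence score to its status label."""
    if score is None:      # no source URL was provided
        return "UNVERIFIED"
    if score >= 1.0:
        return "VERIFIED"
    if score >= 0.7:       # assumption: 0.9 < score < 1.0 stays PARAPHRASE
        return "PARAPHRASE"
    if score >= 0.3:       # assumption: 0.6 < score < 0.7 stays WEAK
        return "WEAK"
    return "NOT FOUND"     # assumption: 0.0 < score < 0.3 counts as NOT FOUND


labels = [status_for(s) for s in (1.0, 0.8, 0.3, 0.0, None)]
```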

Output Format

Verification Report: [Post Title]

File: [path]
Claims found: [total]
Verified: [count] | Paraphrase: [count] | Weak: [count] | Not Found: [count] | Unverified: [count]

| # | Claim | Source URL | Score | Status | Notes |
| --- | --- | --- | --- | --- | --- |
| 1 | "73% of marketers..." | https://example.com/report | 1.0 | VERIFIED | Exact match found in section 3 |
| 2 | "5x ROI improvement" | https://example.com/study | 0.8 | PARAPHRASE | Source says "nearly 5x" |
| 3 | "60% prefer video" | (none) | N/A | UNVERIFIED | Try: "video preference statistics 2025" |
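
The per-status counts in the report header can be derived from the Status column. A minimal sketch, assuming the statuses are available as a list of strings; `summary_line` and `STATUS_ORDER` are hypothetical names.

```python
from collections import Counter

# Order in which the counts appear in the report header.
STATUS_ORDER = ["VERIFIED", "PARAPHRASE", "WEAK", "NOT FOUND", "UNVERIFIED"]


def summary_line(statuses):
    """Build the 'Verified: n | Paraphrase: n | ...' header line."""
    counts = Counter(statuses)  # missing statuses count as 0
    parts = [f"{status.title()}: {counts[status]}" for status in STATUS_ORDER]
    return " | ".join(parts)


line = summary_line(["VERIFIED", "PARAPHRASE", "UNVERIFIED"])
```

For the three example rows above, this gives one Verified, one Paraphrase, and one Unverified claim, with the remaining counts at zero.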

Recommended Actions

  • [List claims that need source URLs]
  • [List claims with weak or not-found scores that need replacement sources]
  • [List claims where the source data may be outdated]

Integration

This skill can be called from blog-analyze as an optional deep-verification step. When invoked from the analyzer, only claims scoring below 0.7 are flagged in the analysis report.
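
The analyzer-side filter amounts to a threshold check. In this sketch, `claims_to_flag` is a hypothetical helper working on (claim text, score) pairs. Leaving out claims with no score (UNVERIFIED) is an assumption, since the text only mentions claims scoring below 0.7.

```python
FLAG_THRESHOLD = 0.7  # threshold stated in the text


def claims_to_flag(scored):
    """Keep (claim_text, score) pairs whose score is below the threshold."""
    return [
        (text, score)
        for text, score in scored
        # Assumption: unscored (None) claims are handled elsewhere.
        if score is not None and score < FLAG_THRESHOLD
    ]


flagged = claims_to_flag([("a", 1.0), ("b", 0.5), ("c", None), ("d", 0.7)])
```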

Standalone usage:

/blog factcheck path/to/post.md

Limitations

  • Paywalled content: WebFetch cannot access content behind login walls. These score as WEAK (0.5) with a note about paywall detection.
  • Dynamic pages: JavaScript-rendered content may not be available via WebFetch. If the page returns minimal content, note this in the status.
  • PDF sources: WebFetch may not extract PDF text reliably. Flag PDF URLs for manual verification.
  • Archived pages: If a URL returns 404, suggest checking web.archive.org.
  • Rate limits: Process no more than 10 URLs per run to avoid overwhelming source servers. If a post has more than 10 cited URLs, verify the first 10 and list the remainder as SKIPPED.
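
The 10-URL cap from the rate-limit rule can be sketched as a split of the cited URLs into a batch to verify and a remainder to report as SKIPPED. `split_for_run` is a hypothetical helper; it assumes the URLs arrive in document order and does not deduplicate them.

```python
MAX_URLS_PER_RUN = 10  # cap stated in the limitation above


def split_for_run(urls, limit=MAX_URLS_PER_RUN):
    """Return (to_verify, skipped): the first `limit` URLs and the remainder."""
    return urls[:limit], urls[limit:]


cited = [f"https://example.com/source-{i}" for i in range(1, 13)]
to_verify, skipped = split_for_run(cited)
```

With 12 cited URLs, the first 10 are verified sequentially and the last 2 are listed as SKIPPED.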