AlterLab-Academic-Skills alterlab-link-health
Part of the AlterLab Academic Skills suite. Meta-skill for auditing and repairing Markdown link health across a skills repo. Runs a four-tier pipeline (config hardening, intra-repo file-ref fixes, external URL substitutions, residual exclusions) and enforces a Tier 3 substitution guardrail that prevents regressions of previously-passing links. Designed for lychee-based GitHub Actions link checkers, but the methodology generalizes to markdown-link-check and similar tools. Triggers on: link audit, dead links, link health, lychee, broken links, link checker, markdown link audit, link-health audit, 404 audit, check-links failing, CI link-check, 連結健檢, 死鏈, 失效連結, 斷鏈檢查.
git clone https://github.com/AlterLab-IEU/AlterLab-Academic-Skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/AlterLab-IEU/AlterLab-Academic-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/core/alterlab-link-health" ~/.claude/skills/alterlab-ieu-alterlab-academic-skills-alterlab-link-health && rm -rf "$T"
skills/core/alterlab-link-health/SKILL.md
Link Health: Repo-wide Markdown Link Audit Methodology
A reusable methodology for bringing a broken docs-heavy repo's link checker to green. Codified from a real audit that took AlterLab-IEU/AlterLab-Academic-Skills from 1208 errors out of 1966 links to 0 errors out of 1912 links across 8 commits, with an auto-detected Tier 3 regression that validated the guardrail rule.
Quick Start
Full audit (fresh repo, failing link checker):
Audit and repair the link health of <owner/repo>. Run the full four-tier pipeline.
→ Dispatch the 10-agent audit from playbooks/full-audit.md, then the tiered APPLY phase.
Targeted residual pass (first dispatch reduced errors but some remain):
The link checker is down from 1208 to 67 errors. Close the residuals.
→ Dispatch the 3-agent followup pass from playbooks/followup-pass.md.
Post-merge cleanup (PR is green, need to finalize human-decision items):
Finalize the post-merge cleanup: resolve pending human-decision items, file follow-up issues, document link debt.
→ Dispatch the 4-agent post-merge pass from playbooks/post-merge.md.
Trigger Conditions
Trigger Keywords
English: link audit, dead links, link health, lychee, broken links, link checker, markdown link audit, link-health audit, 404 audit, check-links failing, CI link-check
繁體中文: 連結健檢, 死鏈, 失效連結, 斷鏈檢查, 連結審計
When This Skill Applies
- A weekly Check Links (or similar lychee / markdown-link-check) workflow has been failing.
- The user mentions a large error count (hundreds+) that they suspect is mostly config-driven false positives.
- The user wants to refactor broken intra-repo file references across many skills / docs.
- The user wants a reusable process for link debt maintenance going forward.
Non-Trigger Scenarios
| Scenario | Skill / Tool to Use Instead |
|---|---|
| Fix a single broken link in a single file | Direct Edit — no pipeline needed |
| Add a new URL to skill docs | or the relevant domain skill |
| Audit citations (DOI resolution, author verification) | integrity-check mode |
| Audit repo structure beyond links (schema, metadata) | Separate audit (out of scope) |
Pipeline Overview (4 Tiers)
Each tier is a single reviewable commit. Run them in order — each unlocks the next by making the error signal cleaner.
| Tier | Scope | Typical Delta |
|---|---|---|
| 1 — Config | Introduce with an additive accept set, a for permanent noise hosts, and a hardened CI workflow. | Biggest single win — often -70% to -90% of errors. Fixes the " replaces the default set" gotcha. |
| 2 — Intra-repo refs | Repair entries: singular/plural directory typos, missing path prefixes, YAML frontmatter bugs. Wrap pedagogical placeholder paths as inline code. | Eliminates the bulk of real breakage — usually 200-400 entries collapse to zero. |
| 3 — URL substitutions | Replace MOVED external URLs with verified-live substitutes; replace DEAD_INFRA URLs with replacement resources. Never substitute without verification. | Reduces residuals to the low dozens. |
| 4 — Exclusions | Everything left that cannot be fixed: bot-hostile hosts, pedagogical placeholders, expired upstream infrastructure, chronically flaky academic sites. | Gets to 0 errors or stable single-digit residuals. |
See references/tier1-config.md through references/tier4-exclusions.md for the decision rules in each tier.
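To make Tier 2 concrete, here is a minimal sketch of how broken relative file references could be listed before repair. The function name find_broken_refs is hypothetical and not part of this skill. It assumes simple inline links of the form [text](path) with no spaces or nested parentheses in the path, and it skips absolute paths. The link checker's own report remains the source of truth.

```shell
# Hypothetical helper: list relative Markdown link targets that do not exist on disk.
# Assumes inline links like [text](path); skips URLs, mailto links, absolute paths,
# and pure #anchor links.
find_broken_refs() {
  md_file=$1
  base_dir=$(dirname "$md_file")
  # Extract every "](target)" occurrence, strip the wrapper, then drop any #fragment.
  grep -o ']([^)]*)' "$md_file" |
    sed -e 's/^](//' -e 's/)$//' -e 's/#.*$//' |
    while IFS= read -r target; do
      case $target in
        ''|/*|*://*|mailto:*) continue ;;
      esac
      # Resolve the target relative to the file that contains the link.
      [ -e "$base_dir/$target" ] || printf '%s: %s\n' "$md_file" "$target"
    done
}
```

Running it over each Markdown file gives a worklist in which singular/plural directory typos (for example reference/ versus references/) tend to stand out as repeated patterns.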
The Tier 3 Guardrail
After any URL substitution pass, re-run the link checker and diff against the baseline success set. Any URL that returned 200 OK in baseline and is non-200 after substitution is a regression and MUST be reverted before commit.
Self-check:
diff <(grep "^\[200\]" baseline.log | sort -u) <(grep "^\[200\]" post.log | sort -u)
Output should show no deletions (lines prefixed with <), only additions (lines prefixed with >). Deletions mean a substitution regressed a previously-working URL.
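The same check can be written so that it prints only the lost URLs, which is easier to act on in CI. This is a sketch under the same assumption as the self-check, namely that passing lines in the log start with [200]. The function name list_regressions is illustrative.

```shell
# Hypothetical helper: print every [200] line present in the baseline log
# but missing from the post-substitution log (a Tier 3 regression).
list_regressions() {
  baseline_log=$1
  post_log=$2
  work_dir=$(mktemp -d)
  grep '^\[200\]' "$baseline_log" | sort -u > "$work_dir/before"
  grep '^\[200\]' "$post_log" | sort -u > "$work_dir/after"
  # comm -23 keeps only the lines unique to the first (baseline) file.
  comm -23 "$work_dir/before" "$work_dir/after"
  rm -rf "$work_dir"
}
```

An empty result means the substitution pass is safe to commit. Any printed line is a URL whose substitution should be reverted first.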
This rule exists because during the source audit, broad sed prefix substitutions silently concatenated onto more-specific paths (e.g. /v3/ → /v3/docs turned an already-correct /v3/docs into /v3/docsdocs). The guardrail caught it on the second CI dispatch, not the commit itself. Assume your Tier 3 pass will have regressions. Verify.
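The concatenation failure is easy to reproduce. The sketch below contrasts the broad prefix rewrite from the example with one possible safer form that only fires when /v3/ is the end of the URL. The host example.com and both function names are illustrative and do not come from the source audit. The anchored pattern is one option, not necessarily the one references/tier3-substitution.md prescribes, and it assumes one URL per input line.

```shell
# Broad prefix rewrite: also hits URLs that already continue past /v3/.
broad_substitute() {
  sed 's|/v3/|/v3/docs|'
}

# Anchored rewrite: only touches URLs that end exactly at /v3/.
anchored_substitute() {
  sed 's|/v3/$|/v3/docs|'
}

printf '%s\n' 'https://example.com/v3/docs' | broad_substitute     # https://example.com/v3/docsdocs
printf '%s\n' 'https://example.com/v3/docs' | anchored_substitute  # https://example.com/v3/docs
printf '%s\n' 'https://example.com/v3/'     | anchored_substitute  # https://example.com/v3/docs
```

Anchoring narrows what a substitution can touch, but it does not replace the guardrail diff, which still catches cases the pattern author did not anticipate.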
Full detail: references/tier3-substitution.md.
The Verification-First Rule
DEFAULT to probe-verification before any URL substitution. Unverified substitutions are how phantom URLs land in public skills. Before every [old] → [new] replacement:
- For GitHub repos: gh api repos/owner/name must return status 200 (repo exists), and the repo must not be archived.
- For HTTP URLs: curl -sSI -L --max-time 15 '<new>' must end with a final status of 200 (after redirects).
- For PyPI / npm / crates packages: check the registry API or landing page directly.
If verification fails, exclude the dead target via .lycheeignore with a commented reason rather than guessing a replacement. An excluded dead link is honest; a substituted wrong link is a time bomb.
Full detail: references/tier3-substitution.md § "Verification rules".
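The rule can be sketched as a small decision helper: substitute only when the candidate verifies, otherwise record an exclusion with a reason. The function names probe_url and choose_replacement are hypothetical, and the ignore-file path defaults to .lycheeignore only for illustration. The sketch appends the dead URL verbatim, so any escaping the ignore file needs is left to the maintainer.

```shell
# Hypothetical helper: succeed only if the final status after redirects is 200.
probe_url() {
  status=$(curl -sSI -L --max-time 15 -o /dev/null -w '%{http_code}' "$1") || return 1
  [ "$status" = "200" ]
}

# Hypothetical helper: print the URL that should end up in the document.
# A verified candidate replaces the old URL; otherwise the old URL is kept
# and appended to the ignore file with a commented reason.
choose_replacement() {
  old_url=$1
  new_url=$2
  ignore_file=${3:-.lycheeignore}
  if probe_url "$new_url"; then
    printf '%s\n' "$new_url"
  else
    printf '# dead target, no verified replacement\n%s\n' "$old_url" >> "$ignore_file"
    printf '%s\n' "$old_url"
  fi
}
```

Keeping the probe in its own function makes it straightforward to swap in a gh api or registry check for GitHub repos and packages.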
Playbooks
Three ready-to-dispatch prompt bundles that call this skill's tiers in the right order.
| Playbook | When to Use | Agents |
|---|---|---|
| playbooks/full-audit.md | Fresh audit, failing CI, no prior work. | 10 parallel subagents + synthesis |
| playbooks/followup-pass.md | Errors significantly reduced but residuals remain. | 3 targeted subagents |
| playbooks/post-merge.md | PR green, time to resolve pending human-decision items. | 4 parallel subagents |
All three follow the same shape: pre-flight → parallel dispatch → synthesize → commit/PR/merge → verify.
References
| File | Content |
|---|---|
| references/tier1-config.md | schema, workflow YAML, accept-code gotchas |
| | Intra-repo path audit, singular/plural directory patterns, frontmatter fixes |
| references/tier3-substitution.md | URL substitution rules, the guardrail, sed-safety patterns |
| references/tier4-exclusions.md | When to exclude vs substitute, category rubric |
| | The 5-category layout for maintainers |
Examples
examples/pr-1-retrospective.md: the source audit that generated this skill. 1208 → 0 errors, 8 commits, auto-detected Tier 3 regression in commit 5 (9cbd801), merged as 93a72fe.
Scope Discipline
This skill fixes link health. It does NOT:
- Standardize SKILL.md schemas across the repo. File schema drift as a separate issue.
- Refactor skill content, examples, or prose. Only touches link URLs and the CI config.
- Modify .lychee.toml's accept list to mask real breakage. Flaky upstream 5xx / timeouts get excluded per-host with rationale, not blanket-accepted.
Scope discipline keeps the PR reviewable and the link-check signal honest.