Claude-seo-skills seo-internal-links

install
source · Clone the upstream repo
git clone https://github.com/lionkiii/claude-seo-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/lionkiii/claude-seo-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/seo-internal-links" ~/.claude/skills/lionkiii-claude-seo-skills-seo-internal-links && rm -rf "$T"
manifest: skills/seo-internal-links/SKILL.md
source content

Internal Link Audit

Crawls a domain to build an internal link graph. Identifies orphan pages, underlinked pages, and broken internal links. Suggests anchor text improvements.

Inputs

  • domain
    : Target domain URL (e.g.,
    https://example.com
    ). Include protocol.
  • max_pages
    (optional): Max pages to crawl. Default: 100. Maximum: 200.

Execution

Step 1: Crawl Site

Use

scripts/fetch_page.py
and
scripts/parse_html.py
to crawl the site:

# Fetch homepage
python3 scripts/fetch_page.py <domain>
# Parse HTML to extract links
python3 scripts/parse_html.py <html_content>

Follow internal links only (same domain hostname). Normalize URLs:

  • Remove trailing slashes (treat
    /about
    and
    /about/
    as same)
  • Remove URL fragments (
    #section
    )
  • Preserve query strings only if they appear to be content (not tracking: strip
    utm_*
    ,
    ref=
    ,
    source=
    )

Respect robots.txt: fetch

<domain>/robots.txt
first, skip disallowed paths. Cap crawl at
max_pages
. Track crawl queue (BFS order from homepage).

Step 2: Build Link Graph

For each crawled page, record:

  • Source URL
  • Target URL (all internal links found)
  • Anchor text for each link

Build adjacency structure:

  • outbound[url]
    = list of (target, anchor_text)
  • inbound[url]
    = list of (source, anchor_text)

Step 3: Calculate Per-Page Metrics

For each crawled page:

  • Inbound internal links count (links from other crawled pages pointing here)
  • Outbound internal links count (links from this page to other pages)
  • Link depth from homepage (BFS level at which this page was first discovered)

Step 4: Identify Issues

  • Orphan pages: Pages discovered via sitemap (
    <domain>/sitemap.xml
    ) or linked from other pages but with 0 inbound internal links. Fetch sitemap first to identify all known URLs.
  • Underlinked pages: Pages with < 3 inbound internal links (high priority for internal linking).
  • Excessive outbound: Pages with > 100 outbound internal links (PageRank dilution).
  • Broken internal links: For top 20 most-linked pages, verify HTTP status. Flag 4xx/5xx.

Step 5: Anchor Text Suggestions for Top 5 Underlinked Pages

For each of the 5 most underlinked pages (fewest inbound links, excluding orphans):

  1. Fetch the page and extract H1 and
    <title>
    tag
  2. Identify top 3 relevant anchor text options based on: H1 noun phrases, title keywords, page URL slug
  3. Find pages in crawl that mention related terms and could link to this page
  4. Output: target URL, suggested anchors (ranked), recommended source pages

Step 6: Optional Ahrefs Enrichment

If Ahrefs available (ToolSearch '+ahrefs'):

  • Fetch
    site-explorer-pages-by-internal-links
    for the domain
  • Cross-reference crawl findings with Ahrefs data
  • Add
    ### Ahrefs Internal Link Data
    section showing discrepancies

Output Format

## Internal Link Audit: [domain]

### Summary Stats

| Metric | Value |
|--------|-------|
| Pages crawled | N / max_pages |
| Total internal links | N |
| Avg inbound links per page | N.N |
| Orphan pages found | N |
| Underlinked pages (<3 inbound) | N |
| Pages with broken internal links | N |

### Orphan Pages

Pages with 0 inbound internal links:

| URL | Depth | Found via |
|-----|-------|-----------|
| /page | — | sitemap |

### Top 5 Underlinked Pages — Suggested Anchors

**1. [URL]** — N inbound links
- H1: "[heading]"
- Suggested anchors: "[anchor 1]", "[anchor 2]", "[anchor 3]"
- Recommended source pages: [list pages that mention related terms]

### Broken Internal Links

| Source Page | Broken Link | Status Code |
|---|---|---|

### Link Depth Distribution

| Depth | Pages | % of Total |
|-------|-------|------------|
| 1 (homepage) | 1 | X% |
| 2 | N | X% |
| 3+ | N | X% |

### Pages with Excessive Outbound Links (>100)

| Page | Outbound Links |
|------|----------------|

## Data Sources

- Internal crawl via fetch_page.py + parse_html.py ([N] pages, [date])
- [Ahrefs MCP — if used]