Claude-Skills programmatic-seo

Install

Source · Clone the upstream repo:

git clone https://github.com/borghei/Claude-Skills

Claude Code · Install into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/borghei/Claude-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/marketing/programmatic-seo" ~/.claude/skills/borghei-claude-skills-programmatic-seo && rm -rf "$T"

Manifest: marketing/programmatic-seo/SKILL.md

Programmatic SEO

Production-grade framework for building SEO page sets at scale. Covers the full lifecycle from keyword pattern discovery through template design, data pipeline construction, quality assurance, and post-launch optimization. Designed for deployments ranging from 50 to 100,000+ pages.


When to Use vs When Not To

Use this skill when:

  • You have a repeating keyword pattern with 50+ variations
  • You have (or can acquire) structured data to populate pages
  • The search intent is consistent across variations
  • Your domain has sufficient authority to compete

Do NOT use when:

  • Each page requires unique editorial content (use content-creator instead)
  • Total addressable pages < 30 (manual content is more effective)
  • You lack a data source and would be generating thin placeholder content
  • Your domain authority is below DR 20 and competitors are DR 60+

Initial Assessment

Before designing any pSEO strategy, answer these questions. Skip nothing.

1. Opportunity Validation

| Question | Why It Matters | Red Flag |
| --- | --- | --- |
| What is the repeating keyword pattern? | Defines the template structure | Pattern is vague or inconsistent |
| What is the aggregate monthly search volume? | Determines ROI ceiling | < 5,000 aggregate monthly searches |
| How many unique pages can you generate? | Scopes the project | < 50 pages (too few) or > 50K without data infrastructure |
| What does the SERP look like for sample queries? | Competitive feasibility | Page 1 dominated by DR 80+ editorial content |
| Is intent informational, navigational, or transactional? | Template design | Mixed intent across the same pattern |

2. Data Source Evaluation

Rate your data source on this scale:

| Tier | Source Type | Defensibility | Example |
| --- | --- | --- | --- |
| S | Proprietary first-party | Unbeatable | Your product usage data, internal benchmarks |
| A | Product-derived | Strong | Aggregated user analytics, customer outcomes |
| B | User-generated | Moderate | Community reviews, submitted content |
| C | Licensed exclusive | Moderate | Paid data feed no competitor has |
| D | Public aggregated | Weak | Government data, public APIs |
| F | Scraped commodity | None | Wikipedia rewrites, copied listings |

Rule: Do not build pSEO on Tier F data. Google penalizes commodity rewrites. If your only data source is public and easily replicable, invest in acquiring Tier A-C data first.

3. Competitive Moat Assessment

For 5 sample queries in your pattern, analyze page 1 results:

  • What is the average Domain Rating of ranking pages?
  • Are existing results programmatic or editorial?
  • What unique data do ranking pages provide?
  • What is the content depth (word count, data richness, UX quality)?

Go/No-Go threshold: If the average DR gap between you and page 1 is > 30 AND existing results have proprietary data, the opportunity requires either a differentiated approach or domain authority building first.


The 14 Playbooks

| # | Playbook | Pattern | Example | Data Requirement |
| --- | --- | --- | --- | --- |
| 1 | Templates | "[Type] template" | "resume template", "invoice template" | Template files + metadata |
| 2 | Curation | "best [category]" | "best CRM for startups" | Product/service reviews + ratings |
| 3 | Conversions | "[X] to [Y]" | "100 USD to EUR" | Conversion logic/API |
| 4 | Comparisons | "[X] vs [Y]" | "Notion vs Confluence" | Feature data for both products |
| 5 | Examples | "[type] examples" | "landing page examples" | Curated example collection |
| 6 | Locations | "[service] in [city]" | "coworking in Austin" | Location-specific data |
| 7 | Personas | "[product] for [audience]" | "CRM for real estate" | Audience-specific use cases |
| 8 | Integrations | "[A] + [B] integration" | "Slack Asana integration" | Integration documentation |
| 9 | Glossary | "what is [term]" | "what is churn rate" | Domain expertise |
| 10 | Translations | Content in N languages | Localized guides | Translation + localization data |
| 11 | Directory | "[category] tools" | "AI writing tools" | Tool listings + evaluations |
| 12 | Profiles | "[entity name]" | "Stripe company profile" | Entity-level data |
| 13 | Statistics | "[topic] statistics" | "SaaS churn statistics 2026" | Verified statistical data |
| 14 | Calculators | "[topic] calculator" | "LTV calculator" | Calculation logic + inputs |

Playbook Selection Matrix

| If you have... | Primary Playbook | Secondary Layer |
| --- | --- | --- |
| A product with many integrations | Integrations | Comparisons |
| A design/creative tool | Templates + Examples | Personas |
| A multi-segment audience | Personas | Comparisons |
| Local/regional presence | Locations | Directory |
| A tool/utility product | Calculators + Conversions | Glossary |
| Deep domain expertise | Glossary + Statistics | Curation |
| A competitor landscape to exploit | Comparisons + Curation | Directory |
| User-generated content | Examples + Directory | Profiles |

Layering rule: Combine up to 2 playbooks per page set. Example: "Best coworking spaces in [city]" = Curation + Locations.


Keyword Pattern Mining

Step 1: Pattern Identification

Extract the repeating structure from seed keywords:

Seed: "react developer salary san francisco"
Pattern: [role] salary [city]
Variables: role (200+ options), city (500+ options)
Max pages: 200 x 500 = 100,000
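
A quick sketch of expanding such a pattern into candidate keywords, assuming the variable values are held as plain lists (the two-item `roles`/`cities` lists here are placeholders for the real 200+ and 500+ option sets):

```python
from itertools import product

def expand_pattern(template: str, variables: dict) -> list:
    """Expand a pattern like '[role] salary [city]' into every keyword combination."""
    names = list(variables)
    keywords = []
    for combo in product(*(variables[n] for n in names)):
        keyword = template
        for name, value in zip(names, combo):
            keyword = keyword.replace(f"[{name}]", value)
        keywords.append(keyword)
    return keywords

# Placeholder seed lists; a real deployment would load the full option sets.
roles = ["react developer", "data engineer"]
cities = ["san francisco", "austin"]
pages = expand_pattern("[role] salary [city]", {"role": roles, "city": cities})
# 2 roles x 2 cities = 4 candidate keywords
```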

Step 2: Volume Distribution Analysis

Not all variable combinations have search volume. Map the distribution:

| Tier | Volume Range | Typical % of Total Pages | Strategy |
| --- | --- | --- | --- |
| Head | 1,000+ monthly | 2-5% | Priority indexation, highest content quality |
| Torso | 100-999 monthly | 15-25% | Standard template, full deployment |
| Long-tail | 10-99 monthly | 40-50% | Template with conditional content blocks |
| Zero-volume | < 10 monthly | 20-40% | Noindex OR skip unless data is uniquely valuable |
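
The tier boundaries above translate directly into a bucketing function, handy for annotating a keyword export before deciding what to build:

```python
def volume_tier(monthly_searches: int) -> str:
    """Bucket a keyword into the volume tiers defined in the table above."""
    if monthly_searches >= 1000:
        return "head"
    if monthly_searches >= 100:
        return "torso"
    if monthly_searches >= 10:
        return "long-tail"
    return "zero-volume"
```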

Step 3: Intent Classification

For each pattern, verify intent consistency:

| Intent Type | Template Implications | CTA Strategy |
| --- | --- | --- |
| Informational | Data-heavy, educational content | Newsletter, related content |
| Commercial investigation | Comparison tables, pros/cons | Free trial, demo |
| Transactional | Pricing, availability, features | Buy now, sign up |
| Navigational | Brand-specific, direct answer | Product page link |

Data Pipeline Architecture

Pipeline Design

[Data Source] → [Extraction] → [Transformation] → [Enrichment] → [Validation] → [Template Population] → [Quality Check] → [Publish]

Data Quality Gates

Every record must pass these gates before page generation:

| Gate | Check | Failure Action |
| --- | --- | --- |
| Completeness | All required fields populated | Skip page, log for manual review |
| Accuracy | Data matches source, no staleness > 90 days | Flag for refresh |
| Uniqueness | No duplicate records | Merge or deduplicate |
| Minimum richness | Page will have > 300 words of unique content | Skip or enrich |
| Legal compliance | Data usage rights verified | Block publication |
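
A minimal sketch of the completeness and minimum-richness gates; the field names and the `unique_word_count` attribute are hypothetical stand-ins for whatever your records actually carry:

```python
REQUIRED_FIELDS = ["name", "city", "price"]  # hypothetical template fields

def passes_gates(record: dict, required=REQUIRED_FIELDS, min_words=300):
    """Return (passed, failed_gate_names) for one record."""
    failures = []
    # Completeness: every required field must be populated.
    # Note: empty strings and zeros count as missing here.
    if any(not record.get(field) for field in required):
        failures.append("completeness")
    # Minimum richness: the page must yield > 300 words of unique content.
    if record.get("unique_word_count", 0) < min_words:
        failures.append("richness")
    return (not failures, failures)
```

Records that fail a gate should be skipped and logged for review rather than published with gaps.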

Update Cadence

| Data Type | Recommended Update Frequency | Staleness Penalty |
| --- | --- | --- |
| Pricing data | Weekly | High (users notice immediately) |
| Company/product data | Monthly | Medium |
| Statistical data | Quarterly | Low if year-tagged |
| Glossary/educational | Semi-annually | Very low |
| Location data | Monthly | Medium (closures, address changes) |

Template Design System

Page Architecture

Every programmatic page must have these zones:

┌─────────────────────────────────────┐
│ Zone 1: Unique Header               │  H1 with target keyword, unique intro paragraph
├─────────────────────────────────────┤
│ Zone 2: Primary Data Section         │  The core data/content for this specific page
├─────────────────────────────────────┤
│ Zone 3: Contextual Analysis          │  Insights, comparisons, trends specific to this entity
├─────────────────────────────────────┤
│ Zone 4: Related Data                 │  Adjacent data points that add depth
├─────────────────────────────────────┤
│ Zone 5: Internal Navigation          │  Related pages, breadcrumbs, category links
├─────────────────────────────────────┤
│ Zone 6: CTA                         │  Conversion element matched to intent
└─────────────────────────────────────┘

Uniqueness Requirements

Each page MUST have at least 3 of these 5 uniqueness sources:

  1. Unique data points -- Numbers, facts, or attributes specific to this entity
  2. Conditional content blocks -- Sections that appear/disappear based on data attributes
  3. Calculated insights -- Derived metrics (percentages, comparisons, rankings)
  4. Contextual recommendations -- "If X, then Y" advice blocks based on the data
  5. User-generated content -- Reviews, comments, or community contributions
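
One way to enforce the 3-of-5 rule mechanically: flag each uniqueness source per page during generation, then gate publication on the count (the flag names below are illustrative):

```python
UNIQUENESS_SOURCES = [
    "unique_data_points",
    "conditional_blocks",
    "calculated_insights",
    "contextual_recommendations",
    "user_generated_content",
]

def uniqueness_ok(page_flags: dict, minimum: int = 3) -> bool:
    """True if the page draws on at least `minimum` of the five uniqueness sources."""
    return sum(bool(page_flags.get(s)) for s in UNIQUENESS_SOURCES) >= minimum
```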

URL Structure

Always use subfolders. Never subdomains for pSEO.

| Pattern | URL Template | Example |
| --- | --- | --- |
| Location | /[service]/[city]/ | /coworking/austin/ |
| Comparison | /compare/[a]-vs-[b]/ | /compare/notion-vs-confluence/ |
| Integration | /integrations/[partner]/ | /integrations/slack/ |
| Glossary | /glossary/[term]/ | /glossary/churn-rate/ |
| Persona | /[product]-for-[audience]/ | /crm-for-real-estate/ |

Quality Control Framework

Pre-Publication QA Checklist

Content Quality:

  • Each page has > 300 words of unique content (not counting shared template elements)
  • H1 is unique and contains the target keyword
  • Meta title is unique (< 60 chars) and meta description is unique (< 155 chars)
  • No broken data references (empty fields rendered as "N/A" or blank)
  • At least 2 conditional content blocks triggered per page
  • No duplicate pages targeting the same keyword
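
The two meta-length checks lend themselves to automation across the whole set; a small sketch using the limits from the checklist above:

```python
def qa_meta(title: str, description: str) -> list:
    """Return a list of problems; an empty list means the meta tags pass."""
    problems = []
    if len(title) >= 60:
        problems.append(f"title too long ({len(title)} chars, limit < 60)")
    if len(description) >= 155:
        problems.append(f"description too long ({len(description)} chars, limit < 155)")
    return problems
```

Run it against every generated page and block publication on any non-empty result.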

Technical SEO:

  • Canonical tag points to self
  • Hreflang tags if multilingual
  • Schema markup renders without errors
  • Page loads in < 3 seconds
  • Mobile responsive

Internal Linking:

  • Breadcrumb trail is complete
  • 3-5 related pages linked contextually
  • Hub page links to this page
  • No orphan pages in the set

Thin Content Detection

Run this check against every generated page:

| Signal | Threshold | Action |
| --- | --- | --- |
| Unique word count | < 200 unique words | Block publication |
| Content similarity to another page in set | > 80% Jaccard similarity | Merge or differentiate |
| Data fields populated | < 60% of template fields | Skip or enrich |
| User time-on-page (post-launch) | < 15 seconds average | Review and improve |
| Bounce rate (post-launch) | > 85% | Review intent match |
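
The similarity check can be a plain word-level Jaccard computation; any pair scoring above 0.8 should be merged or differentiated:

```python
def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Word-level Jaccard similarity between two pages' unique content."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Word-level Jaccard is a coarse proxy; shingled n-grams catch reordered boilerplate better, but this version is enough for a first-pass gate.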

Internal Linking Architecture

Hub-and-Spoke Model

                    ┌─────────┐
                    │  HUB    │  /coworking/
                    │  PAGE   │  (ranks for "coworking spaces")
                    └────┬────┘
          ┌──────────────┼──────────────┐
     ┌────┴────┐    ┌────┴────┐    ┌────┴────┐
     │ SPOKE 1 │    │ SPOKE 2 │    │ SPOKE 3 │
     │ /austin/│    │ /denver/│    │ /seattle/│
     └────┬────┘    └────┬────┘    └────┬────┘
          │              │              │
     Cross-links between related spokes

Linking rules:

  • Hub links DOWN to every spoke (or top 50 spokes if > 200 pages)
  • Every spoke links UP to the hub
  • Spokes link ACROSS to 3-5 related spokes (geographic proximity, thematic similarity)
  • Deep pages link UP to their spoke AND the hub
  • Cross-silo links only when contextually genuine
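
The "3-5 related spokes" rule can be sketched as a tag-overlap ranking; the `slug` and `tags` fields are hypothetical stand-ins for whatever metadata your spoke records carry:

```python
def related_spokes(spoke: dict, all_spokes: list, k: int = 4) -> list:
    """Pick the k spokes most related to `spoke` by shared tags."""
    def overlap(other):
        return len(set(spoke["tags"]) & set(other["tags"]))
    candidates = [s for s in all_spokes if s["slug"] != spoke["slug"]]
    candidates.sort(key=overlap, reverse=True)
    return [s["slug"] for s in candidates[:k]]

# Example spoke records (hypothetical slugs and tags)
spokes = [
    {"slug": "austin", "tags": ["texas", "tech"]},
    {"slug": "denver", "tags": ["colorado", "tech"]},
    {"slug": "seattle", "tags": ["tech", "coastal"]},
    {"slug": "paris", "tags": ["europe"]},
]
neighbors = related_spokes(spokes[0], spokes, k=2)
```

Geographic proximity works the same way: swap the overlap function for a distance metric.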

Pagination for Large Sets

If a hub page has > 50 spokes, implement paginated sub-hubs:

/coworking/                     → Top cities + browse by state
/coworking/california/          → All California cities
/coworking/california/page/2/   → Paginated if > 25 cities

Indexation Strategy

Crawl Budget Management

| Page Set Size | Strategy |
| --- | --- |
| < 500 pages | Single XML sitemap, submit all |
| 500-5,000 | Segmented sitemaps by category |
| 5,000-50,000 | Segmented sitemaps + priority scoring + IndexNow |
| 50,000+ | Programmatic sitemap generation + crawl budget monitoring + strategic noindex |

Indexation Priority

| Priority | Pages | Action |
| --- | --- | --- |
| P0 | Hub pages | Submit immediately, internal link from homepage |
| P1 | Head-volume spokes (top 10%) | Submit in first sitemap batch |
| P2 | Torso-volume spokes | Submit in second batch, 1-2 weeks later |
| P3 | Long-tail spokes | Submit gradually over 4-6 weeks |
| P4 | Zero-volume pages | Noindex unless data is uniquely valuable |

IndexNow Integration

For large-scale updates, use IndexNow to notify search engines immediately:

POST https://api.indexnow.org/indexnow
{
  "host": "yoursite.com",
  "key": "your-api-key",
  "urlList": ["https://yoursite.com/page1", "https://yoursite.com/page2"]
}
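
A Python sketch of building that submission with the standard library alone; the host and key are placeholders, and the payload shape follows the IndexNow request shown above:

```python
import json
import urllib.request

def indexnow_request(host: str, key: str, urls: list) -> urllib.request.Request:
    """Build (but do not send) an IndexNow batch submission request."""
    payload = {"host": host, "key": key, "urlList": urls}
    return urllib.request.Request(
        "https://api.indexnow.org/indexnow",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )

# Sending it is a single call: urllib.request.urlopen(indexnow_request(...))
```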

Launch Sequence

Phase 1: Pilot (Week 1-2)

  • Deploy 20-50 pages from head-volume tier
  • Submit sitemap with pilot pages only
  • Monitor indexation rate daily
  • Check for crawl errors in Search Console

Phase 2: Scale (Week 3-6)

  • Deploy remaining torso-volume pages in batches of 100-500
  • Add cross-links between deployed pages
  • Monitor thin content warnings
  • Track impressions in Search Console

Phase 3: Long-Tail (Week 7-12)

  • Deploy long-tail pages
  • Noindex zero-volume pages (keep them crawlable but not indexed)
  • Begin link acquisition outreach for hub pages

Phase 4: Optimization (Ongoing)

  • A/B test template variations on head-volume pages
  • Refresh stale data quarterly
  • Add conditional content blocks based on engagement data
  • Monitor for keyword cannibalization across the set

Post-Launch Optimization

Metrics Dashboard

| Metric | Frequency | Target |
| --- | --- | --- |
| Indexation rate | Weekly | > 90% of submitted pages indexed within 60 days |
| Organic impressions | Weekly | Trending up month-over-month |
| Average position (by tier) | Bi-weekly | Head pages: top 10; Torso: top 30 |
| Click-through rate | Monthly | > 3% for head pages |
| Bounce rate | Monthly | < 70% |
| Conversion rate | Monthly | > 1% for transactional intent |
| Pages per session | Monthly | > 1.5 |

Optimization Playbook

| Signal | Diagnosis | Action |
| --- | --- | --- |
| Indexed but not ranking | Content quality or authority gap | Enrich content, build links to hub |
| Ranking but low CTR | Title/description not compelling | A/B test meta titles |
| Ranking but high bounce | Intent mismatch or thin content | Audit against search intent, add data |
| Deindexed after initial indexing | Thin content penalty | Improve uniqueness, reduce similarity |
| Crawled but not indexed | Quality threshold not met | Add more unique content per page |

Anti-Patterns and Penalty Avoidance

| Anti-Pattern | Why It Fails | Prevention |
| --- | --- | --- |
| City-name swapping | Same content + different city = doorway page penalty | Each location page needs unique local data |
| Keyword stuffing in templates | Unnatural density triggers spam filters | Keep keyword density 1-2%, write naturally |
| Generating pages for zero-demand queries | Wastes crawl budget, signals low quality | Validate demand before generating |
| No internal links to pSEO pages | Orphan pages get deprioritized | Connect every page to the hub-spoke structure |
| Stale data never refreshed | Users lose trust, Google notices | Set update cadence per data type |
| All pages identical structure | Lack of variation signals automation to Google | Use 3-5 template variants |

Decision Matrix: Build vs Skip

Score each dimension 1-5, then apply the threshold.

| Dimension | Weight | 1 (Skip) | 5 (Build) |
| --- | --- | --- | --- |
| Search demand | 30% | < 1K aggregate monthly | > 50K aggregate monthly |
| Data quality | 25% | Public/scraped, easily replicated | Proprietary, defensible |
| Competitive gap | 20% | DR gap > 40, strong incumbents | DR gap < 15, weak/no incumbents |
| Template feasibility | 15% | Each page needs unique editorial | Clean template fits all variations |
| Business alignment | 10% | No conversion path from these pages | Direct path to core product |

Scoring guide:

  • 4.0+ weighted average: Build immediately
  • 3.0-3.9: Build if resources allow, validate with pilot first
  • 2.0-2.9: Invest in data quality or authority first
  • < 2.0: Do not build
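
The weighting and thresholds above fit in a few lines; a sketch assuming scores arrive as a dict keyed by dimension:

```python
WEIGHTS = {
    "search_demand": 0.30,
    "data_quality": 0.25,
    "competitive_gap": 0.20,
    "template_feasibility": 0.15,
    "business_alignment": 0.10,
}

def decision(scores: dict):
    """Weighted average of 1-5 scores, mapped to the build/skip thresholds."""
    avg = sum(WEIGHTS[d] * scores[d] for d in WEIGHTS)
    if avg >= 4.0:
        verdict = "build immediately"
    elif avg >= 3.0:
        verdict = "build if resources allow; validate with pilot"
    elif avg >= 2.0:
        verdict = "invest in data quality or authority first"
    else:
        verdict = "do not build"
    return avg, verdict
```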

Output Artifacts

| Artifact | Format | Description |
| --- | --- | --- |
| Opportunity Analysis | Markdown table | Keyword patterns x volume x data source x difficulty x business alignment |
| Playbook Recommendation | Decision matrix | If/then mapping with rationale and real-world examples |
| Page Template Specification | Annotated wireframe (markdown) | URL pattern, zone structure, uniqueness sources, conditional logic |
| Data Pipeline Spec | Flow diagram (text) | Source > extraction > transformation > validation > publication |
| Quality Scorecard | Checklist + thresholds | Pre-publication QA gates with pass/fail criteria |
| Indexation Plan | Phased timeline | Priority tiers, sitemap structure, crawl budget allocation |
| Post-Launch Dashboard | Metric table | KPIs, targets, review cadence, optimization triggers |

Related Skills

  • seo-audit -- Run after pSEO pages are live to diagnose indexation issues, thin content warnings, or ranking problems across the page set.
  • schema-markup -- Add structured data to pSEO templates (Product, FAQ, LocalBusiness) for rich snippet eligibility at scale.
  • site-architecture -- Plan hub-and-spoke structure and crawl budget management for large pSEO deployments (500+ pages).
  • competitor-alternatives -- Use the Comparisons playbook when building "[X] vs [Y]" pages; competitor-alternatives has dedicated comparison page frameworks.
  • content-creator -- Use when individual pages in the set need editorial-quality unique content beyond template generation.

Troubleshooting

| Problem | Likely Cause | Fix |
| --- | --- | --- |
| Google deindexed 90%+ of pSEO pages | Thin content: pages have insufficient unique content (< 300 words) or > 80% similarity | Increase unique content per page to 500+ words; ensure 30-40% differentiation between pages |
| Pages indexed but getting zero traffic | Pages target zero-volume keywords or content does not match search intent | Validate demand before generating; noindex zero-volume pages; verify intent alignment |
| "Doorway pages" manual action in GSC | Template pages with only variable substitution (city name swap) and no unique value | Add genuinely unique data per page: local stats, specific recommendations, conditional content blocks |
| Hub page ranks but spokes do not | Spokes missing inbound internal links or hub not linking down to spokes | Verify bidirectional hub-spoke linking; add contextual cross-links between related spokes |
| Crawl budget exhausted before all pages indexed | Too many pages submitted at once or low-value pages consuming crawl resources | Phase deployment in batches of 100-500; use tiered indexation with strategic noindex |
| Content similarity too high across page set | Template lacks conditional content blocks; only variable substitution used | Add 3-5 conditional content sections per template that change based on data attributes |
| AI content detection flagging pSEO pages | Over-reliance on AI generation without human editorial review | Use AI for data enrichment only, not full content generation; sample 5-10% for quality review |

Success Criteria

  • Indexation rate: 90%+ of submitted pages indexed within 60 days of deployment
  • Content uniqueness: Every page has 500+ unique words with <40% similarity to any other page in the set (2026 Google threshold)
  • Head keyword rankings: Top 10% of pages (by volume) ranking in top 30 within 90 days
  • Organic traffic growth: Page set generating measurable organic traffic within 60 days of full deployment
  • Thin content rate: Zero pages flagged as thin content in Google Search Console
  • Bounce rate: Below 70% average across the page set (indicating intent match)
  • Conversion rate: 1%+ for transactional intent pages, measurable lead capture for informational pages

Scope & Limitations

In scope:

  • Keyword pattern mining and volume distribution analysis
  • Data pipeline design (source > extraction > transformation > validation > publication)
  • Template architecture with uniqueness requirements
  • Quality control frameworks including thin content detection
  • Hub-and-spoke internal linking for pSEO page sets
  • Phased indexation strategy and crawl budget management
  • Post-launch optimization and monitoring dashboards

Out of scope:

  • Individual editorial content creation (use Content Production)
  • Data collection or web scraping implementation
  • CMS or static site generator setup and configuration
  • Server infrastructure for large-scale deployments
  • Paid acquisition for pSEO pages
  • Legal compliance for data usage rights

Known limitations:

  • Google's 2026 helpful content system can deindex large page sets retroactively if quality drops below threshold
  • Programmatic SEO at Tier F data (public/scraped) carries high penalty risk regardless of template quality
  • Engagement metrics (bounce rate, time on page) now influence indexation decisions for pSEO pages
  • AI content detection is improving — fully automated content generation without human oversight is increasingly risky
  • Travel site case study: 50,000 city-swap pages had 98% deindexed within 3 months (per 2025 industry data)

Scripts

# Analyze keyword patterns for pSEO opportunities
python scripts/keyword_pattern_miner.py --keywords keywords.csv --json

# Score page templates for content quality and uniqueness
python scripts/template_scorer.py --template template.html --data sample_data.json

# Validate data quality for pSEO data pipeline
python scripts/data_validator.py --file data.csv --rules rules.json --json