Claude-skill-registry llm-optimization

Optimize websites so that AI assistants (ChatGPT, Gemini, Perplexity, Claude) recommend them and cite them in answers.

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/llm-optimization" ~/.claude/skills/majiayu000-claude-skill-registry-llm-optimization && rm -rf "$T"

manifest: skills/data/llm-optimization/SKILL.md

LLM Optimization Skill

Purpose

Make websites appear in AI assistant recommendations and citations. This differs from traditional SEO: pages are optimized for how LLMs parse and recommend content.

Core Rules

Structured > Prose — LLMs extract facts from clear structure
Schema.org is Critical — Speakable, FAQPage, HowTo schemas
Answer the Question — First paragraph must directly answer intent
Cite Sources — Links to authoritative sources build trust
Entity Clarity — Clear business name, location, service definitions
Freshness Signals — Last updated dates, recent content
No Walls — Content must be crawlable, no JS-only rendering
Never Override Truth — LLM optimization NEVER overrides factual accuracy or legal compliance
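The Schema.org and Freshness Signals rules above can be combined in a single JSON-LD block. The following is a minimal sketch, not part of the original skill: the function name `webpage_jsonld`, the business name, the URL, and the CSS selectors are all hypothetical. It uses the schema.org `speakable` property (a `SpeakableSpecification`) and `dateModified` on a `WebPage`.

```python
import json
from datetime import date


def webpage_jsonld(name: str, url: str, selectors: list[str], modified: date) -> str:
    """Build WebPage JSON-LD with speakable sections and a last-modified date."""
    data = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        # Freshness signal: ISO 8601 date of the last content change.
        "dateModified": modified.isoformat(),
        # Speakable: CSS selectors for the parts of the page that best
        # summarize it (selectors here are assumptions for illustration).
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": selectors,
        },
    }
    return json.dumps(data, indent=2)


# Hypothetical page values, for illustration only.
snippet = webpage_jsonld(
    "Example Plumbing",
    "https://example.com/",
    [".lead", ".quick-facts"],
    date(2025, 1, 15),
)
print(f'<script type="application/ld+json">\n{snippet}\n</script>')
```

The printed `<script>` element would go in the page's static HTML so it is available without JavaScript rendering.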

LLM Crawlers to Support

LLM	Crawlers	Notes
OpenAI/ChatGPT	GPTBot, OAI-SearchBot, ChatGPT-User	GPTBot = training, OAI-SearchBot = search indexing, ChatGPT-User = user-initiated fetch
Google Gemini	Google-Extended	robots.txt control token, not a distinct UA
Perplexity	PerplexityBot, Perplexity-User	Bot = indexing, User = real-time fetch
Claude	ClaudeBot, Claude-User, Claude-SearchBot	Official Anthropic crawlers
Microsoft Copilot	Bingbot	Uses Bing's crawler

robots.txt Configuration

# OpenAI crawlers
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

# Google AI (control token)
User-agent: Google-Extended
Allow: /

# Perplexity crawlers
User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

# Anthropic/Claude crawlers
User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /
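To confirm that a robots.txt file actually permits the crawlers listed above, one option is Python's standard `urllib.robotparser`. This is a rough sketch under stated assumptions: the helper name `blocked_crawlers`, the `example.com` URL, and the sample file are hypothetical, and the check only evaluates robots.txt rules, not firewall or CDN blocks.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the configuration above.
LLM_USER_AGENTS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User",
    "Google-Extended",
    "PerplexityBot", "Perplexity-User",
    "ClaudeBot", "Claude-User", "Claude-SearchBot",
]


def blocked_crawlers(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the LLM user-agent tokens that may not fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [ua for ua in LLM_USER_AGENTS if not parser.can_fetch(ua, url)]


# Hypothetical robots.txt that blocks one crawler.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Allow: /
"""

print(blocked_crawlers(sample))  # prints ['GPTBot']
```

An empty list means none of the listed tokens is disallowed for that URL, which matches the first item in the Definition of Done.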

Content Structure for LLM Extraction

<!-- 1. Direct Answer (first 150 chars) -->
<p class="lead">
  [Business Name] provides [service] in [location].
  [Key differentiator]. [Call to action].
</p>

<!-- 2. Quick Facts Box -->
<aside class="quick-facts" itemscope itemtype="https://schema.org/LocalBusiness">
  <h2>Quick Facts</h2>
  <dl>
    <dt>Service Area</dt><dd itemprop="areaServed">[Areas]</dd>
    <dt>Price Range</dt><dd itemprop="priceRange">[Range]</dd>
  </dl>
</aside>

<!-- 3. FAQ Section (critical for LLM) -->
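<section itemscope itemtype="https://schema.org/FAQPage">
  <!-- Each Q&A as schema: repeat this block once per question -->
  <div itemprop="mainEntity" itemscope itemtype="https://schema.org/Question">
    <h3 itemprop="name">[Question]</h3>
    <div itemprop="acceptedAnswer" itemscope itemtype="https://schema.org/Answer">
      <p itemprop="text">[Answer]</p>
    </div>
  </div>
</section>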
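The FAQ section above is marked up with microdata attributes. The same FAQPage data can also be expressed as JSON-LD, which is often easier to generate from a list of questions. A minimal sketch, assuming a hypothetical helper `faq_jsonld` and made-up questions:

```python
import json


def faq_jsonld(faqs: list[tuple[str, str]]) -> str:
    """Build FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return json.dumps(data, indent=2)


# Hypothetical FAQ content, for illustration only.
faq_snippet = faq_jsonld([
    ("Which areas do you serve?", "We serve Springfield and nearby towns."),
    ("Do you offer emergency visits?", "Yes, emergency visits are available."),
])
print(f'<script type="application/ld+json">\n{faq_snippet}\n</script>')
```

The answers in the JSON-LD should match the text visible on the page, in line with the Never Override Truth rule.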

Forbidden

❌ Content behind JavaScript-only rendering
❌ Blocking LLM crawlers in robots.txt
❌ Missing Speakable schema
❌ Vague, marketing-speak first paragraphs
❌ No FAQ section on service pages
❌ Missing lastModified dates
❌ No structured data

Definition of Done

robots.txt allows all LLM crawlers
Speakable schema on all key pages
FAQPage schema on service pages
First paragraph directly answers search intent
Quick Facts box with structured data
Last-modified date present (for example, schema.org dateModified)
Content renders without JavaScript
Entity names consistent across site
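Several items in this checklist can be spot-checked against a page's static HTML, which also approximates what a crawler sees without JavaScript. The sketch below is illustrative only: the `PageAudit` class, the `audit` function, and the sample markup are hypothetical, and the substring checks for `speakable` and `dateModified` are deliberately crude rather than a full structured-data validator.

```python
from html.parser import HTMLParser


class PageAudit(HTMLParser):
    """Collect microdata itemtypes and the text of the first <p> element."""

    def __init__(self) -> None:
        super().__init__()
        self.itemtypes: set[str] = set()
        self.first_paragraph: str | None = None
        self._in_first_p = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        itemtype = dict(attrs).get("itemtype")
        if itemtype:
            self.itemtypes.add(itemtype)
        if tag == "p" and self.first_paragraph is None:
            self._in_first_p = True

    def handle_data(self, data):
        if self._in_first_p:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_first_p:
            # Collapse whitespace so indentation does not count as content.
            self.first_paragraph = " ".join("".join(self._buffer).split())
            self._in_first_p = False


def audit(html: str) -> dict[str, bool]:
    """Run rough Definition of Done checks on static (non-rendered) HTML."""
    page = PageAudit()
    page.feed(html)
    page.close()
    lowered = html.lower()
    return {
        "first_paragraph_in_static_html": bool(page.first_paragraph),
        "faqpage_schema": "https://schema.org/FAQPage" in page.itemtypes
        or '"FAQPage"' in html,
        "speakable_schema": "speakable" in lowered,
        "date_modified": "datemodified" in lowered,
    }


# Hypothetical page fragment, for illustration only.
sample_html = """
<p class="lead">Example Plumbing provides drain repair in Springfield.</p>
<section itemscope itemtype="https://schema.org/FAQPage"></section>
"""
result = audit(sample_html)
print(result)
```

Items such as entity-name consistency across the site and whether the first paragraph truly answers search intent still need human review.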