install
source · Clone the upstream repo
git clone https://github.com/plurigrid/asi
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/llms-txt-discovery" ~/.claude/skills/plurigrid-asi-llms-txt-discovery && rm -rf "$T"
manifest: skills/llms-txt-discovery/SKILL.md
llms-txt-discovery: Largest AI Documentation Directory
Status: Stepping Stone 🌉
Information Energy: 0.90 (High aspiration, partial implementation)
Trit Assignment: 0 (Coordinator - Indexes and balances documentation)
GF(3) Color: 🔵 #0000FF (Blue - Coordinator)
Purpose
Create the largest indexed directory of AI documentation by:
- Crawling: Find /llms.txt files across 10,000+ repositories
- Parsing: Extract structured metadata + linked resources
- Indexing: Build full-text search over documentation
- Ranking: Sort results by relevance + credibility
- Integration: Connect to code-context for implementation discovery
Architecture
    asi-skills/llms-txt-discovery/
    ├── crawler.bb   # Repository + llms.txt discovery
    ├── parser.bb    # Parse llms.txt markdown format
    ├── indexer.bb   # Full-text search index
    ├── ranker.bb    # Relevance + credibility ranking
    ├── storage.bb   # DuckDB persistence
    └── SKILL.md     # This file
Format: llms.txt
Each project provides structured documentation via /llms.txt:

    # Project Name

    > Brief description

    Detailed notes about the project...

    ## Section Name

    - [Link Title](https://url): Description of resource

    ## Optional

    - [Link Title](https://url): Can be skipped for shorter context
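The format above is regular enough to parse line by line. The following is a minimal Python sketch rather than the skill's actual parser.bb (which is not shown in this document); the function name `parse_llms_txt` and the output keys are hypothetical, chosen to resemble the fields listed under Stage 2.

```python
import re

# Matches "- [Link Title](https://url): Description"; the description is optional.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Parse llms.txt markdown into name, description, notes, and sections."""
    doc = {"name": None, "description": None, "notes": [], "sections": []}
    section = None  # the "## ..." section currently being filled, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            title = line[3:].strip()
            section = {
                "title": title,
                "optional": title.lower() == "optional",
                "links": [],
            }
            doc["sections"].append(section)
        elif line.startswith("# ") and doc["name"] is None:
            doc["name"] = line[2:].strip()
        elif line.startswith(">") and section is None and doc["description"] is None:
            doc["description"] = line[1:].strip()
        elif section is not None:
            match = LINK_RE.match(line)
            if match:
                section["links"].append({
                    "title": match["title"],
                    "url": match["url"],
                    "description": match["desc"] or "",
                })
        else:
            # Free text before the first section is kept as notes.
            doc["notes"].append(line)
    return doc
```

Lines inside a section that are not links are ignored in this sketch; a fuller parser would probably keep them.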
Discovery Pipeline
Stage 1: Repository Discovery
Search GitHub for repos with /llms.txt:
site:github.com llms.txt file:llms.txt path:/ language:markdown
Expected results: 8,000+ repositories with documented APIs (an estimate; indexing is still pending per Success Metrics)
Stage 2: Parse & Extract
For each repository:
- Fetch /llms.txt (or /docs/llms.txt)
- Parse markdown structure
- Extract: project name, description, sections, links
- Rank links by position (earlier = more important)
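The fetch fallback and the position-based ranking can be sketched as below. This is illustrative Python, not the crawler's real code: `fetch_llms_txt` and `rank_by_position` are hypothetical names, and the `fetch` argument is an injected callable (path to body, or `None` when missing) so the example runs without network access. Whether ranks restart per section or run across the whole file is not stated in the source; this sketch ranks whatever list it is given.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Paths tried in order, as listed in Stage 2.
CANDIDATE_PATHS = ["/llms.txt", "/docs/llms.txt"]


def fetch_llms_txt(
    fetch: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Return (path, body) for the first candidate path that `fetch` resolves."""
    for path in CANDIDATE_PATHS:
        body = fetch(path)
        if body is not None:
            return path, body
    return None


def rank_by_position(links: List[Dict]) -> List[Dict]:
    """Attach a 1-based rank to each link; earlier links get smaller ranks."""
    return [{**link, "rank": i} for i, link in enumerate(links, start=1)]
```

For example, a dictionary's `get` method can stand in for the fetcher: `fetch_llms_txt({"/docs/llms.txt": "# Demo"}.get)` falls through to the second path.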
Stage 3: Index Content
    {:repo-id "github:bmorphism/Gay.jl"
     :name "Gay.jl"
     :description "Deterministic color generation..."
     :sections [{:title "Basics" :links [...]}
                {:title "Advanced" :links [...]}]
     :crawl-time "2026-01-04T12:00:00Z"
     :credibility 0.95  ; Based on stars, age, activity
    }
Stage 4: Search Interface
    duck llms-txt-search "Julia ACSet implementation"
    → Top results from indexed documentation

    duck llms-txt-search "skill registry patterns"
    → Results from npm, pip, Hackage, Crates docs
Integration: Finding Skill Implementations
Use Case: Implement polyglot-orchestration skill
    # Step 1: Search for polyglot patterns
    duck llms-txt-search "polyglot language execution"
    # Returns:
    # - duckCloud documentation (orchestration patterns)
    # - Red Planet Labs Rama (distributed systems)
    # - Babashka (JVM-less execution)

    # Step 2: Get parsed documentation
    duck llms-txt-get "github:red-planet-labs/rama"
    # Returns full llms.txt with indexed links

    # Step 3: Use with code-context
    use code-context to find [patterns from returned links]
Data Structure
    {:llms-txt-entry
     {:id "github:owner/repo"
      :host "github.com"
      :owner "owner"
      :repo "repo"
      :url "https://github.com/owner/repo/llms.txt"
      :title "Project Title"
      :description "Short summary"
      :sections [{:type :required  ; or :optional
                  :title "Section Name"
                  :links [{:title "Link Title"
                           :url "https://url"
                           :description "Optional desc"
                           :rank 1}]}]
      :metadata {:stars 1234
                 :created "2023-01-01"
                 :updated "2025-01-04"
                 :language "julia"
                 :topics ["color" "gf3"]}
      :credibility 0.95  ; stars + activity + age
      :indexed-at "2026-01-04"}}
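The identity fields of an entry (:id, :host, :owner, :repo, :url) all follow from the "github:owner/repo" identifier. A small Python sketch of that derivation follows; `entry_identity` is a hypothetical helper, and the sketch assumes only the GitHub host and the URL pattern shown in the entry above.

```python
def entry_identity(repo_id: str) -> dict:
    """Derive the identity fields of an entry from an id like 'github:owner/repo'."""
    scheme, _, path = repo_id.partition(":")
    if scheme != "github" or path.count("/") != 1:
        raise ValueError(f"expected 'github:owner/repo', got {repo_id!r}")
    owner, repo = path.split("/")
    if not owner or not repo:
        raise ValueError(f"owner and repo must be non-empty in {repo_id!r}")
    return {
        "id": repo_id,
        "host": "github.com",
        "owner": owner,
        "repo": repo,
        # URL pattern copied from the :url field in the entry above.
        "url": f"https://github.com/{owner}/{repo}/llms.txt",
    }
```

Deriving these fields in one place keeps :id and :url from drifting apart when entries are re-crawled.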
Search Ranking
Results ranked by:
- Relevance (BM25): How well query matches content
- Credibility (0-1): Based on:
  - GitHub stars (0-0.4)
  - Activity (commits/year, 0-0.3)
  - Age (older = more stable, 0-0.2)
  - Community size (watchers, 0-0.1)
- Recency: More recent results weighted higher
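The two scored components above can be sketched in Python as follows. The credibility weights (0.4, 0.3, 0.2, 0.1) come from the list; the saturation caps in `CAPS` and the linear scaling are assumptions made only for illustration, since the source does not say how raw counts map onto each range. The relevance function is a plain Okapi BM25 with the commonly used defaults k1=1.5 and b=0.75 and naive whitespace tokenization. How relevance, credibility, and recency combine into one final score is not specified, so the sketch leaves them separate.

```python
import math
from collections import Counter

# Maximum contribution of each factor, taken from the list above.
WEIGHTS = {"stars": 0.4, "activity": 0.3, "age": 0.2, "community": 0.1}
# Hypothetical saturation points: a value at or above the cap earns full weight.
CAPS = {"stars": 10_000, "activity": 500, "age": 5.0, "community": 1_000}


def credibility(stars, commits_per_year, age_years, watchers):
    """Return a 0-1 credibility score as a capped, weighted linear sum."""
    raw = {
        "stars": stars,
        "activity": commits_per_year,
        "age": age_years,
        "community": watchers,
    }
    score = sum(
        WEIGHTS[key] * min(max(raw[key], 0) / CAPS[key], 1.0) for key in WEIGHTS
    )
    return round(score, 4)


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document string against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0
    # Number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            freq = term_freq[term]
            if freq == 0:
                continue
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            norm = freq + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * freq * (k1 + 1) / norm
        scores.append(score)
    return scores
```

With these caps, a repository at or above every cap scores 1.0, and one with 5,000 stars and nothing else scores 0.2.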
Example Queries
    ; Find Julia skill implementations
    (search "Julia" {:language "julia" :top 10})

    ; Find agent coordination patterns
    (search "coordination" {:topics ["agent" "multi-agent"]})

    ; Find implementations for GF(3) systems
    (search "finite field" {:keywords "gf3"})

    ; All documentation for ACSet libraries
    (search "acset" {:exact true})
GF(3) Integration
The llms-txt-discovery skill itself participates in GF(3) balance:
- Trit: 0 (Coordinator) - Balances generators and validators
- Role: Provides documentation for both implementation (generators) and verification (validators)
- Balance: Works with documentation-indexing (trit +1) and skill-taxonomy (trit -1)
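One reading of "balance" here, stated as an assumption rather than as the gf3-conservation skill's actual rule, is that the trits of cooperating skills sum to zero modulo 3. Under that reading the triad above can be checked with a few lines of Python; `is_balanced` is a hypothetical helper name.

```python
# Trit assignments as stated in this section.
TRITS = {
    "documentation-indexing": 1,
    "llms-txt-discovery": 0,
    "skill-taxonomy": -1,
}


def is_balanced(trits) -> bool:
    """Assumed balance rule: the trits sum to 0 in GF(3), i.e. modulo 3."""
    return sum(trits) % 3 == 0
```

Calling `is_balanced(TRITS.values())` returns `True` for this triad, since +1, 0, and -1 cancel.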
Deployment
Local Development
    # Crawl a single repo
    bb crawler.bb crawl "github:bmorphism/Gay.jl"

    # Index all crawled repos
    bb indexer.bb build

    # Search indexed documentation
    bb search.bb "julia color"
Duck Integration
    # Pre-hook loads llms-txt-discovery
    just duck-propagate

    # Search in interactions
    duck llms-txt-search "polyglot execution"
Cloud Scaling
    # Initialize 10,000 crawl tasks
    just llms-txt-distribute 10000

    # Monitor progress
    just llms-txt-status

    # Export indexed database
    just llms-txt-export llms-txt-index.duckdb
Success Metrics
| Metric | Target | Status |
|---|---|---|
| Repositories indexed | 10,000+ | ⏳ Pending |
| Documentation links | 100,000+ | ⏳ Pending |
| Search latency | < 100ms | ⏳ Pending |
| Credibility accuracy | > 90% | ⏳ Pending |
| Coverage of plurigrid | 100% | ⏳ Pending |
Related Skills
Dependencies:
- skill-taxonomy - Uses registry to organize search results
- gf3-conservation - Validates coordinator role
- gay-integration - Colors documentation by language/topic
Dependents:
- polyglot-orchestration - Uses discovered patterns in its implementation
- code-context integration - Feeds results to code-context MCP
- documentation-indexing - Aggregates into searchable corpus
References
- llms.txt Standard: https://llmstxt.org/
- Directories:
- Implementations:
  - llms_txt2ctx (Python)
  - fasthtml llms.txt examples
  - nbdev auto-generation
Status: 😢 SAD STATE → 🌉 STEPPING STONE
Next: Implement crawler in duck/asi-skills/llms-txt-discovery/crawler.bb
Owner: bmorphism + code-codex
Created: 2026-01-04