Claude-skill-registry doc-search
Token-efficient documentation search using Serena Document Index. 90%+ token savings vs reading full files. Use BEFORE reading README.md or docs/ files. Triggers on architecture questions, pattern lookups, and project-specific documentation needs.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/doc-search" ~/.claude/skills/majiayu000-claude-skill-registry-doc-search && rm -rf "$T"
manifest:
skills/data/doc-search/SKILL.mdsource content
Document Search
Search project documentation efficiently using the Serena Document Index system.
Why This Matters
| Approach | Tokens | Use Case |
|---|---|---|
| Read full README.md | 3000-8000 | Never (wasteful) |
| Read docs/*.md | 2000-5000 each | Rarely needed |
| Document Index Search | 100-500 | Always prefer |
| Section Retrieval | 200-800 | After finding relevant section |
Rule: Never read documentation files until the document index fails to answer.
Document Index Location
.serena/cache/documents/document_index.json
Index Types Available:
- Search by tags (architecture, api, testing, etc.)tag_index
- Search by section titlestitle_index
- Filter by project (basecamp-server, interface-cli, etc.)project_index
- Filter by document type (readme, guide, api-reference, etc.)doc_type_index
- Keyword-based content searchcontent_index
Workflow Pattern
Step 1: Search Document Index (Python CLI)
# Search for relevant documentation sections cd /Users/kun/github/1ambda/dataops-platform python3 scripts/serena/document_indexer.py --search "hexagonal architecture" --max-results 5
Step 2: Read Specific Section Only
After finding relevant section from search:
# Use section coordinates from search result # Example: project-basecamp-server/docs/PATTERNS.md#module-placement-rules # Read only that section (lines 45-80) instead of entire file Read(file_path="project-basecamp-server/docs/PATTERNS.md", offset=45, limit=35)
Step 3: Alternative - Direct JSON Query
# For programmatic access in agent workflows import json from pathlib import Path cache_path = Path(".serena/cache/documents/document_index.json") index = json.loads(cache_path.read_text()) # Search by tag architecture_docs = index['tag_index'].get('architecture', []) # Search by project server_docs = index['project_index'].get('project-basecamp-server', []) # Get section content for ref in architecture_docs[:3]: print(f"Section: {ref['section_title']}") print(f"File: {ref['relative_path']}") print(f"Lines: {ref['line_start']}-{ref['line_end']}")
Decision Tree
Need documentation? | +-- What patterns exist for X? | +-- doc-search: tag_index["patterns"] or tag_index["architecture"] | +-- How to implement feature in project Y? | +-- doc-search: project_index["project-Y"] + tag_index["implementation"] | +-- What does README say about Z? | +-- doc-search: title_index["Z"] or content_index["keyword"] | +-- Full context needed? +-- Read specific section (lines from search result) +-- LAST RESORT: Read full file
Integration with mcp-efficiency
Document search is the first step before Serena symbol queries:
# 1. Search docs for patterns/context doc_search("hexagonal architecture", max_results=3) # 2. Use Serena for code structure serena.get_symbols_overview("module-core-domain/") # 3. Find specific symbols serena.find_symbol("RepositoryJpa", depth=1)
Common Search Queries
| Need | Search Query |
|---|---|
| Architecture patterns | |
| API endpoints | |
| Testing patterns | |
| Entity relationships | |
| CLI commands | |
| Configuration | |
Token Savings Examples
| Task | Without Doc Search | With Doc Search | Savings |
|---|---|---|---|
| Find architecture pattern | 5000 tokens (full PATTERNS.md) | 300 tokens | 94% |
| Check entity rules | 3000 tokens (full README) | 400 tokens | 87% |
| Find API reference | 4000 tokens (full docs) | 250 tokens | 94% |
| Implementation guide | 6000 tokens (multiple files) | 500 tokens | 92% |
Updating the Index
# Rebuild after documentation changes python3 scripts/serena/update-symbols.py --with-docs # Incremental update (changed files only) python3 scripts/serena/update-symbols.py --changed-only --with-docs # Full rebuild python3 scripts/serena/document_indexer.py --project-root . --rebuild
Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Read full README.md first | 3000+ tokens wasted | Search index, read section |
| Read all docs/*.md | 10000+ tokens wasted | Search by tag/title |
| Skip doc search, use web | Slower, less relevant | Use indexed local docs |
| Guess file locations | Miss relevant docs | Use project_index filter |
Quick Reference
# CLI search (recommended) python3 scripts/serena/document_indexer.py --search "QUERY" --max-results 5 # Build/rebuild index python3 scripts/serena/update-symbols.py --with-docs # Check index stats python3 -c "import json; d=json.load(open('.serena/cache/documents/document_index.json')); print(d['metadata'])"