Claude-skills featuring
git clone https://github.com/oaustegard/claude-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/oaustegard/claude-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/featuring" ~/.claude/skills/oaustegard-claude-skills-featuring && rm -rf "$T"
featuring/SKILL.mdFeaturing
Generate
_FEATURES.md files — top-down documentation of what a codebase does,
organized by feature/capability, anchored to specific source symbols.
tree-sitting tells you WHAT symbols exist. _FEATURES.md tells you WHY they exist and what they accomplish together.
For large codebases, the root
_FEATURES.md decomposes into sub-feature files
linked by capability area — not by folder structure. An agent starts at the root
and is drawn into sub-files only when working on a relevant area.
Dependency
Requires tree-sitting skill. Uses its engine for AST scanning.
uv venv /home/claude/.venv 2>/dev/null uv pip install tree-sitter-language-pack --python /home/claude/.venv/bin/python
For quick structural orientation before running gather.py, use tree-sitting's CLI:
TREESIT=/mnt/skills/user/tree-sitting/scripts/treesit.py # Complete tree, sparse detail — see the full shape /home/claude/.venv/bin/python $TREESIT /path/to/repo --depth=-1 --detail=sparse
Workflow: Multi-Pass Synthesis
Feature documentation is built in three passes. The overview is written LAST, after all features are understood — not first.
Pass 1: Orientation (quick scan)
/home/claude/.venv/bin/python /mnt/skills/user/featuring/scripts/gather.py /path/to/repo \ --skip tests,.github,node_modules --source-budget 8000
Read the gather output. Before writing anything, form a hypothesis:
"This codebase appears to be a [what it is] that provides [capability A], [capability B], and [capability C]."
Write this down as a DRAFT overview. It will be wrong or incomplete — that's fine. The point is to orient before diving into detail.
How to identify capability areas:
- What can a user/consumer DO with this? (commands, API endpoints, UI actions)
- What problems does it solve? (the WHY behind the code)
- What are the main workflows? (how features compose)
- What are the constraints/invariants? (rules the code enforces)
Pass 2: Detailed feature extraction
For each capability area identified in Pass 1:
- Gather the symbols that implement it (from gather output + targeted
)get_source() - Understand how they collaborate — the workflow
- Identify constraints and invariants
- Write the feature section
During this pass, you'll discover:
- Capabilities you missed in Pass 1
- Features that are more complex than expected (decomposition candidates)
- Features that are simpler than expected (merge candidates)
- Cross-cutting concerns that span multiple capability areas
Hierarchy decision (per feature, during this pass):
| Signal | Action |
|---|---|
| ≤6 key symbols, self-contained | Inline in root |
| >6 key symbols OR clear sub-capabilities | Own sub-file |
| Spans many files but is ONE capability | Inline (breadth ≠ complexity) |
| Has sub-features that are independently useful | Own sub-file |
| Is infrastructure (logging, DB layer) | Inline briefly, unless it IS the product |
Pass 3: Overview rewrite
NOW — after all features are documented — rewrite the overview. The Pass 1 draft was a hypothesis. Pass 3 replaces it with a proper progressive-disclosure overview that:
- States what the codebase is in one sentence
- Lists the top-level capability areas (3-8 items)
- For each area that has a sub-file: one sentence + link + "read when" guidance
- For inline features: just the list entry (detail is below in the same file)
This is the most important part. The overview IS the entry point for every agent session. It must be accurate, complete, and fast to scan.
_FEATURES.md Format
Root file
# Features: {project-name} > One-sentence description of what this codebase is and does. **Capability areas:** - **[Area A]** — one-sentence summary - **[Area B]** — one-sentence summary → [details](path/to/_FEATURES.md) - **[Area C]** — one-sentence summary ## {Inline Feature Name} {2-3 sentences: what this feature does from a user perspective.} **Key symbols:** - `file.py#function_name` — role in this feature - `file.py#ClassName` — role in this feature **Workflow:** {How a user exercises this feature or how symbols collaborate.} **Constraints:** {Invariants, limits, rules.} --- ## {Complex Feature Area} > One-sentence summary of what this area covers. This area is documented in detail in [{area-name}/_FEATURES.md]({path}). Read it when working on {specific trigger — e.g., "the memory retrieval pipeline", "adding a new API endpoint", "modifying the build system"}. At a glance, this area provides: - {sub-capability 1} — one line - {sub-capability 2} — one line - {sub-capability 3} — one line
Sub-feature files
Sub-feature files follow the SAME format as the root, recursively. They can contain inline features and further sub-file references. Each sub-file:
- Has its own
header# Features: {area-name} - Has its own overview paragraph
- Is self-contained — an agent reading only this file understands the area
- Links back to the root:
← [Root features](../_FEATURES.md)
Format rules
- Organized by capability, not by file/directory
- Symbol references use
notation (relative to repo root)file#symbol - Leading paragraph per feature: what a user gets, not implementation details
- Key symbols: the 2-6 most important symbols, with their role explained
- Workflow: how the feature works end-to-end (include when non-obvious)
- Constraints: rules/invariants (include when they exist)
- No source code in _FEATURES.md — it's a map, not a mirror
- "Read when" guidance on every sub-file link — tells agents WHEN to drill in
What makes a good feature entry
Good: "Memory Storage — Persist observations across sessions. Stores typed, tagged memories to a Turso database with BM25 full-text search. Memories have priority levels that affect retrieval ranking."
Bad: "memory.py — Contains
remember(), recall(), forget(), and
supersede() functions."
The first tells you WHAT you can do. The second describes file contents — tree-sitting already gives you that.
Hierarchy design principles
The hierarchy is feature-driven, not folder-driven. Folders are natural candidates for decomposition boundaries, but the decision is based on:
- Does this capability area have enough complexity to warrant its own file? (>6 key symbols, multiple sub-workflows, or independently useful sub-features)
- Would an agent working on this area benefit from focused context? (if yes, a sub-file saves them from parsing unrelated features)
- Is this area likely to be read independently of the rest? (if yes, it should be self-contained in its own file)
Counter-examples — do NOT split just because:
- The code lives in a separate folder (folder ≠ feature)
- There are many files (files ≠ complexity)
- A class has many methods (one class = one feature unless methods serve distinct user-facing purposes)
Identifying features
Heuristics for finding feature boundaries:
- Entry points (main, CLI commands, route handlers) often map 1:1 to features
- Public API functions that aren't helpers are usually feature surfaces
- Type hierarchies (class + methods) often represent a cohesive feature
- Config/constants clusters sometimes reveal features (e.g., a group of timeout constants → a retry feature)
- Import clusters — files that import each other heavily are likely co-implementing a feature
Features to SKIP in _FEATURES.md:
- Pure infrastructure (logging, error handling) unless it's the project's purpose
- Internal utilities that only serve other features
- Test code (unless the testing approach IS a feature, e.g., a testing framework)
Keeping _FEATURES.md in Sync
Three mechanisms, layered:
1. Check script (detect drift)
/home/claude/.venv/bin/python /mnt/skills/user/featuring/scripts/check.py /path/to/repo \ [--features _FEATURES.md] [--skip tests,.github]
Parses
file#symbol references from ALL _FEATURES.md files (root + sub-files),
resolves them against the live codebase via tree-sitting, and reports:
- Broken refs — symbol deleted or renamed (exit code 1)
- Moved symbols — symbol exists but in a different file than referenced
- Dead features — ALL key symbols in a feature section are gone
- Uncovered symbols — new public API not mentioned in any feature
- Orphan sub-files — sub-feature files not linked from any parent
Exit code 0 = clean, 1 = drift detected. Suitable for CI or pre-commit hooks.
2. Agent instructions (prevent drift)
Add to CLAUDE.md or equivalent:
## Feature Documentation - `_FEATURES.md` documents what this codebase does, organized by capability. - Start here when orienting to the codebase. Follow sub-file links as needed. - After changing behavior (new feature, renamed API, deleted functionality): run `python featuring/scripts/check.py .` and fix any broken refs. - After adding a new public API surface: add it to the appropriate feature section, or create a new feature section if it's a new capability. - Run check before committing. Broken refs = broken documentation.
3. Targeted regeneration (fix drift)
When check reports broken refs, the fix is usually surgical: update the
file#symbol reference to the new name/location. For dead features (all refs
gone), either delete the section or regenerate it.
Full regeneration (re-running all three passes) is the nuclear option. Prefer targeted updates — they're cheaper and preserve hand-written narrative.
CI Integration
# .github/workflows/features-check.yml name: Check _FEATURES.md on: [push, pull_request] jobs: check: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - uses: astral-sh/setup-uv@v4 - run: uv pip install tree-sitter-language-pack - run: python featuring/scripts/check.py . --skip tests
Claude Code Integration
In Claude Code, use tree-sitting's CLI or engine directly. The agent should:
- Run
for full structural overviewtreesit.py /path --depth=-1 --detail=sparse - Pass 1: Form a hypothesis about what the codebase does
- Run
for each capability areatreesit.py /path --path=DIR --detail=full - Run
where intent isn't cleartreesit.py /path --no-tree 'source:symbol_name' - Pass 2: Write detailed feature sections, deciding hierarchy per-feature
- Pass 3: Rewrite the overview now that all features are documented
Add to CLAUDE.md:
## Codebase Understanding Read `_FEATURES.md` for top-down feature orientation before modifying code. Follow links to sub-feature files when working on a specific area. Use tree-sitting MCP tools for structural queries (symbol lookup, source retrieval). After adding new features or changing behavior, update the relevant _FEATURES.md.
Example: Small Codebase (flat)
A CLI tool with 15 public symbols → single
_FEATURES.md, all features inline.
No sub-files needed.
Example: Large Codebase (hierarchical)
The
remembering skill (memory system for an AI agent) has ~60 public symbols
across 8 files. Hierarchical decomposition:
_FEATURES.md (root — overview + 3 inline features + 3 sub-file refs) ├── scripts/_FEATURES.md (memory operations — storage, retrieval, lifecycle, maintenance) └── utils/_FEATURES.md (utility modules — therapy, reminders, blog publishing)
Root
_FEATURES.md would contain:
- Overview: "Persistent memory system for Muninn. Stores, retrieves, and maintains typed memories across sessions via Turso."
- Inline: Boot Sequence, Configuration, Task Tracking (simple, ≤4 symbols each)
- Sub-file ref: Memory Operations →
("Read when working on storage, retrieval, or memory lifecycle")scripts/_FEATURES.md - Sub-file ref: Utility Modules →
("Read when working on therapy sessions, reminders, or blog publishing")utils/_FEATURES.md
Relationship to Other Skills
| Skill | What it provides | Drift detection |
|---|---|---|
| tree-sitting | Structural inventory (symbols, signatures) | N/A (live queries) |
| featuring | Feature documentation (what/why), hierarchical | — docs → code |
| generating-lattice | Bidirectional knowledge graph | — docs ↔ code |
| mapping-webapp | Web app behavioral docs (pages, flows) | None |
featuring's check is lighter than lattice's: no source code annotations needed, no
@lat: comments, just reference resolution. The trade-off is that new code
without docs is only flagged as "uncovered symbols" — it's advisory, not
enforced. Use lattice when you need strict bidirectional traceability; use
featuring when you need good-enough orientation docs that catch renames and
deletions.