Learn-skills.dev llm-wiki-skill
A CLI agent skill based on Karpathy's LLM Wiki. It creates and maintains a persistent, interconnected Markdown knowledge base: ingesting sources, answering queries over the compiled knowledge, and keeping it consistent through linting.
git clone https://github.com/NeverSight/learn-skills.dev
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/aaronoah/llm-wiki-skill/llm-wiki-skill" ~/.claude/skills/neversight-learn-skills-dev-llm-wiki-skill && rm -rf "$T"
data/skills-md/aaronoah/llm-wiki-skill/llm-wiki-skill/SKILL.md
LLM Wiki Skill
Continuously update and grow a persistent knowledge base composed of interlinked Markdown files. Humans curate the raw content; the agent uses this skill to collect, deduplicate, cross-reference, and summarize that content into structured Markdown files. This skill is activated when the user wants to:
- Create or start a wiki or knowledge base
- Ingest, add, or process original content into a wiki
- Ask questions about a wiki or knowledge base
- Lint, audit, or health-check the wiki to make sure it is in good shape
Quick Start
Directory Structure
Below is a sample wiki directory structure:
llm-wiki/
├── SCHEMA.md            # Layer 3: A document for the user and the LLM to co-evolve the wiki conventions, wiki structure, and tag taxonomy
├── index.md             # Always exists, regardless of SCHEMA.md definition. Catalog of everything, organized by categories (entities, concepts etc), each page listed with a link and a one-line summary
├── log.md               # Always exists, regardless of SCHEMA.md definition. Chronological action log (append-only, rotated yearly)
├── raw/                 # Layer 1: Always exists, regardless of SCHEMA.md definition. Immutable content curated by humans
│   ├── documents/       # Web articles, clippings, PDFs
│   └── assets/          # Images, diagrams referenced by sources
├── generated/           # Layer 2: Always exists, regardless of SCHEMA.md definition. LLM-generated directories and markdown files
│   ├── entities/        # Always exists, regardless of SCHEMA.md definition. Entity pages (people, orgs, products, models)
│   ├── topics/          # Always exists, regardless of SCHEMA.md definition. Topic pages (concepts, terms)
│   ├── comparisons/     # Side-by-side analysis (between entities or between topics)
The three layers are explained in the Architecture section. The wiki (knowledge base) is built using the structure above. Because agents have scoped file-system permissions, this skill can ONLY be applied when the CLI agent (Codex, Claude Code, Gemini etc.) is invoked at the root of the llm-wiki/ folder, meaning SCHEMA.md, index.md, log.md, raw/, and generated/ must sit directly underneath it. If the user invokes the CLI agent from inside any of the wiki subfolders, this skill aborts.
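The root-level requirement above can be checked mechanically before any operation. The following is a minimal sketch, not part of the skill itself; the function names `missing_wiki_entries` and `is_wiki_root` are hypothetical, and the required entries are taken from the list in the paragraph above.

```python
from pathlib import Path

# Entries the text above says must sit directly under the wiki root.
REQUIRED_FILES = ("SCHEMA.md", "index.md", "log.md")
REQUIRED_DIRS = ("raw", "generated")


def missing_wiki_entries(root: Path) -> list[str]:
    """Return the required files and folders that are absent from `root`."""
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


def is_wiki_root(root: Path) -> bool:
    """True when `root` looks like the top of an llm-wiki/ folder."""
    return not missing_wiki_entries(root)
```

An agent could call `is_wiki_root(Path.cwd())` first and abort with the list from `missing_wiki_entries` when it returns False.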
index.md Template
The index is sectioned by type. Each entry is one line: wikilink + summary.
# Wiki Index

> Format: `## Last Updated: [YYYY-MM-DDThh:mm:ss] | subject | Total pages: N`
> Subject: the term (entity/topic/comparison etc)
> Total pages: how many pages the term is brought up

## Entities
<!-- Alphabetical within section -->

## Topics

## Comparisons
log.md Template
# Log File

> Format: `## [YYYY-MM-DDThh:mm:ss] action | subject | files`
> Actions: ingest, update, lint, archive, delete
> Subject: the summary of what happened within 300 characters
> Files: related files such as raw documents locations, generated wikis
> When log.md exceeds 500 entries, rotate: rename to log-YYYY.md, start fresh.
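To make the entry format and the rotation rule concrete, here is a rough sketch. It assumes that entries are the lines starting with `## [`, that multiple files are joined with commas (the template does not say how they are separated), and that the caller supplies the year for the rotated file name. The helper names are hypothetical.

```python
from datetime import datetime
from pathlib import Path

MAX_ENTRIES = 500  # rotation threshold from the template
ACTIONS = {"ingest", "update", "lint", "archive", "delete"}


def format_entry(when: datetime, action: str, subject: str, files: list[str]) -> str:
    """Build one log line in the `## [timestamp] action | subject | files` format."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if len(subject) > 300:
        raise ValueError("subject must stay within 300 characters")
    stamp = when.strftime("%Y-%m-%dT%H:%M:%S")
    # Assumption: files are comma-separated.
    return f"## [{stamp}] {action} | {subject} | {', '.join(files)}"


def count_entries(log_text: str) -> int:
    """Count entry lines; the quoted `> Format:` header line is not counted."""
    return sum(1 for line in log_text.splitlines() if line.startswith("## ["))


def rotate_if_needed(log_path: Path, header: str, year: int) -> bool:
    """Rename log.md to log-<year>.md and start fresh once it exceeds the limit."""
    if count_entries(log_path.read_text(encoding="utf-8")) <= MAX_ENTRIES:
        return False
    log_path.rename(log_path.with_name(f"log-{year}.md"))
    log_path.write_text(header, encoding="utf-8")
    return True
```

Note that `Path.rename` silently replaces an existing target on POSIX systems, so a real implementation would want to check whether `log-<year>.md` already exists.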
SCHEMA.md Template
Adapt to the user's preference. The schema constrains agent behavior and ensures consistency:
# Wiki Schema

## Conventions
- Raw files can be broken down to several markdown files that live in `entities/`, `topics/` etc.
- Only create the wiki pages when an entity/topic is mentioned in 2+ sources or is central to one source.
- File names: lowercase, hyphens, no spaces (e.g., `transformer-architecture.md`)
- Every wiki page starts with YAML frontmatter (see below)
- Use `[[wikilinks]]` to cross-link between pages (minimum 2 outbound links per page)
- Every new or updated page must link to at least 2 other pages via `[[wikilinks]]`.
- When updating a page, always bump the `updated` date
- When new information conflicts with existing content:
  1. Check the dates: newer sources generally supersede older ones
  2. If genuinely contradictory, note both positions with dates and sources
  3. Mark the contradiction in frontmatter: `contradictions: [page-name]`
  4. Flag for user review

## Frontmatter
> ---
> title: Page Title
> created: YYYY-MM-DDThh:mm:ss
> updated: YYYY-MM-DDThh:mm:ss
> type: entity | topic | comparison
> sources: [raw/documents/source-name.md]
> ---

## Layer 1 (User can specify, otherwise default to this file)

### Documents
Any single document in different formats user put in

### Assets
Any media files, images, video links etc.

## Layer 2 (User can specify, otherwise default to this file)

### Entities
One markdown page per notable entity. Include:
- Overview / what it is
- Key facts and dates
- Relationships to other entities ([[wikilinks]])
- Source references

### Topics
One markdown page per concept or topic. Include:
- Definition / explanation
- Current state of knowledge
- Open questions or debates
- Related concepts ([[wikilinks]])

### Comparison Pages
Side-by-side analysis in markdown. Include:
- What is being compared and why
- Dimensions of comparison (table format preferred)
- Verdict or synthesis
- Sources
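Several of these conventions (file naming, required frontmatter, minimum of 2 wikilinks) can be verified per page. The sketch below is one possible check using only the standard library. It assumes digits are allowed in file names, that the five frontmatter keys from the template are all required, and that `[[page|alias]]` and `[[page#heading]]` link forms should resolve to `page`. None of these details are stated by the schema, and `check_page` is a hypothetical name.

```python
import re

# Assumption: "lowercase, hyphens, no spaces" also permits digits.
FILENAME_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*\.md$")
# Captures the target of [[target]], [[target|alias]] or [[target#heading]].
WIKILINK_RE = re.compile(r"\[\[([^\[\]|#]+)")
REQUIRED_KEYS = {"title", "created", "updated", "type", "sources"}


def frontmatter_keys(text: str) -> set[str]:
    """Return top-level keys of a leading `---` block, or an empty set if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys: set[str] = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        if ":" in line and not line.startswith((" ", "\t", "-")):
            keys.add(line.split(":", 1)[0].strip())
    return set()  # the block was never closed


def check_page(filename: str, text: str) -> list[str]:
    """List convention violations for one wiki page."""
    problems = []
    if not FILENAME_RE.match(filename):
        problems.append("file name is not lowercase-hyphenated")
    missing = REQUIRED_KEYS - frontmatter_keys(text)
    if missing:
        problems.append("missing frontmatter keys: " + ", ".join(sorted(missing)))
    links = {target.strip() for target in WIKILINK_RE.findall(text)}
    if len(links) < 2:
        problems.append("fewer than 2 distinct wikilinks")
    return problems
```

This is a line-based key scan rather than a real YAML parser, which keeps the sketch dependency-free but will not catch malformed values.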
Architecture
Three Layers
Layer 1 (Raw Contents): Immutable directory and files. The agent can read but can never modify them.
Layer 2 (The Wiki or knowledge base): Agent-owned directories and markdown files. Created, updated, and cross-referenced by the agent.
Layer 3 (The Schema): SCHEMA.md defines user preferences for llm-wiki/ conventions and tag taxonomy.
Create a New Wiki
1. Create the directory structure as in Directory Structure.
2. Ask for the user's preferences on the Layer 1 and Layer 2 directory structures, and print the directory structure to confirm user intent.
3. Print the conventions based on SCHEMA.md and ask whether the user wants to change any of them.
4. Write the updated SCHEMA.md template to SCHEMA.md with the user preferences from steps 2 and 3.
5. Write the initial index.md based on the index.md template with sectioned header.
6. Write the initial log.md based on the log.md template with a creation entry.
7. Confirm the wiki is ready and suggest first sources to ingest.
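The file-writing part of these steps could look roughly like the sketch below. It is an illustration only: `scaffold` is a hypothetical helper, the schema text is passed in after the user has confirmed it, and the creation entry uses the `update` action because the log template lists no dedicated "create" action (an assumption).

```python
from datetime import datetime
from pathlib import Path

# Default directories from the sample structure; the user may change them.
DIRS = [
    "raw/documents",
    "raw/assets",
    "generated/entities",
    "generated/topics",
    "generated/comparisons",
]
INDEX = "# Wiki Index\n\n## Entities\n\n## Topics\n\n## Comparisons\n"


def scaffold(root: Path, schema_text: str, now: datetime) -> list[Path]:
    """Create the wiki skeleton under `root`; return the files actually written."""
    for sub in DIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    stamp = now.strftime("%Y-%m-%dT%H:%M:%S")
    files = {
        "SCHEMA.md": schema_text,
        "index.md": INDEX,
        # Assumption: the creation entry is logged with the "update" action.
        "log.md": f"# Log File\n\n## [{stamp}] update | Wiki created | SCHEMA.md, index.md, log.md\n",
    }
    created = []
    for name, body in files.items():
        path = root / name
        if not path.exists():  # never overwrite an existing wiki
            path.write_text(body, encoding="utf-8")
            created.append(path)
    return created
```

Skipping files that already exist makes the helper safe to re-run against a partially created wiki.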
Ingest original content into a Wiki
1. Read the files in raw/ using the command below. The output is a list of metadata for each file, including its mtime.
python3 ./scripts/ingest.py --collect ./raw
2. Read the last line of log.md to retrieve its "date time", then convert it to an mtime with the command below:
python3 ./scripts/ingest.py --iso-to-mtime "date time"
Then find the files from step 1 that were added after the "date time".
3. Based on the SCHEMA.md conventions section, summarize each newly added file.
4. Find the files that were added on or before the "date time" to prepare for steps 5 and 6.
5. For file content conflicts, refer to the conventions section defined in SCHEMA.md.
6. Create new files and update existing files under generated/, adding cross-links to all of them.
7. Update index.md with sectioned header.
8. Update log.md with a new entry.
9. List all newly added or changed files for the user.
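Steps 1, 2, and 4 boil down to comparing file mtimes against the timestamp of the latest log entry. The sketch below illustrates that comparison. It is not the bundled scripts/ingest.py, whose implementation is not shown here; the function names are hypothetical, and it assumes log timestamps are in local time.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

STAMP_RE = re.compile(r"^## \[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\]")


def collect(raw_dir: Path) -> dict[str, float]:
    """Map each file under `raw_dir` (relative POSIX path) to its mtime."""
    return {
        p.relative_to(raw_dir).as_posix(): p.stat().st_mtime
        for p in sorted(raw_dir.rglob("*"))
        if p.is_file()
    }


def last_logged_mtime(log_text: str) -> Optional[float]:
    """Timestamp of the last log entry as epoch seconds, or None if no entries."""
    for line in reversed(log_text.splitlines()):
        match = STAMP_RE.match(line)
        if match:
            # Assumption: naive timestamps are interpreted as local time.
            return datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S").timestamp()
    return None


def split_by_cutoff(files: dict[str, float], cutoff: Optional[float]) -> tuple[list[str], list[str]]:
    """Return (new, existing): files modified after the cutoff, and the rest."""
    if cutoff is None:  # empty log: everything counts as new
        return sorted(files), []
    new = sorted(name for name, mtime in files.items() if mtime > cutoff)
    existing = sorted(name for name, mtime in files.items() if mtime <= cutoff)
    return new, existing
```

One caveat with any mtime-based approach: copying an old file into raw/ may preserve its original mtime, in which case it would be classified as existing rather than new.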
Ask questions about a Wiki
- Find relevant headings in index.md.
- Locate similar pages under the Wiki and summarize them based on SCHEMA.md, using a format like "Based on [[page-1]], [[page-2]], ...".
- Present the answer to the user in markdown.
- Ask the user if they want to store the answer back in the Wiki. If so, take the following steps:
  - Create a queries/ folder if it does not exist.
  - If the question is for comparisons, put the file in comparisons/, otherwise put it in queries/.
  - Update index.md with sectioned header.
  - Update log.md with creation entry.
  - List all files just created for the user; filenames are enough, no need to show content.
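As a small illustration of the citation format and the folder choice above, consider the sketch below. The helper names are hypothetical, and the base folder is passed in by the caller because the text does not say where queries/ lives.

```python
from pathlib import PurePosixPath


def citation_line(pages: list[str]) -> str:
    """Build the 'Based on [[page-1]], [[page-2]]' prefix for an answer."""
    return "Based on " + ", ".join(f"[[{page}]]" for page in pages)


def answer_path(base: str, slug: str, is_comparison: bool) -> str:
    """Pick comparisons/ for comparison questions, queries/ otherwise."""
    folder = "comparisons" if is_comparison else "queries"
    return str(PurePosixPath(base) / folder / f"{slug}.md")
```

For example, `citation_line(["page-1", "page-2"])` yields `Based on [[page-1]], [[page-2]]`.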
Lint, Audit or health-check a Wiki
Note: The 4 rules below are linted. After they are checked, add an entry in log.md with the lint action; the subject should be "Rules passed: N, rules failed: M, which rule failed and a brief summary within 200 characters".
- Always read SCHEMA.md first, then find any violations and unresolved contradictions in the wiki.
- Find any orphan pages that have no cross-links to other pages, using the command below. The output is a list of orphan page file names.
python3 ./scripts/links --orphan ./generated
- Find any cross-links that are broken (unreachable), using the command below. The output is a dictionary mapping each file name to its broken links.
python3 ./scripts/links --broken ./generated
- Every wiki page under generated/ should be listed in index.md; flag any pages that are not listed.
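The orphan, broken-link, and index-coverage rules can all be derived from one link graph. The sketch below shows one way to do that. It is not the bundled scripts/links tool, and it rests on several assumptions: wikilink targets are file stems that are unique across folders, index.md lists pages as `[[wikilinks]]`, and an orphan is a page that no other page links to (the usual wiki meaning, which may differ from what the bundled script reports).

```python
import re
from pathlib import Path

# Captures the target of [[target]], [[target|alias]] or [[target#heading]].
WIKILINK_RE = re.compile(r"\[\[([^\[\]|#]+)")


def load_links(generated: Path) -> dict[str, set[str]]:
    """Map each page name (file stem) to the set of page names it links to."""
    return {
        page.stem: {t.strip() for t in WIKILINK_RE.findall(page.read_text(encoding="utf-8"))}
        for page in sorted(generated.rglob("*.md"))
    }


def lint(links: dict[str, set[str]], index_text: str) -> dict[str, object]:
    """Report orphan pages, broken links per page, and pages missing from the index."""
    pages = set(links)
    linked_to: set[str] = set()
    for source, targets in links.items():
        linked_to |= targets - {source}  # self-links do not count as inbound
    broken = {
        source: sorted(targets - pages)
        for source, targets in links.items()
        if targets - pages
    }
    indexed = {t.strip() for t in WIKILINK_RE.findall(index_text)}
    return {
        "orphans": sorted(pages - linked_to),
        "broken": broken,
        "unlisted": sorted(pages - indexed),
    }
```

Typical usage would be `lint(load_links(Path("generated")), Path("index.md").read_text(encoding="utf-8"))`, with the counts of empty and non-empty results feeding the "Rules passed: N, rules failed: M" log subject.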
Pitfalls
- Never modify files in raw/: sources are immutable. Corrections go in wiki pages.
- Always orient first: read SCHEMA + index + recent log before any operation in a new session. Skipping this causes duplicates and missed cross-references.
- Always update index.md and log.md: skipping this makes the wiki degrade. These are the navigational backbone.
- Don't create pages without cross-references: isolated pages are invisible. Every page must link to at least 2 other pages.
- Frontmatter is required: it enables search, filtering, and staleness detection.
- Keep pages scannable: a wiki page should be readable in 30 seconds. Split pages over 200 lines. Move detailed analysis to dedicated deep-dive pages.
- Ask before mass-updating: if an ingest would touch 10+ existing pages or create 10+ new pages, confirm the scope with the user first.
- Rotate the log: when log.md exceeds 500 entries, rename it log-YYYY.md and start fresh. The agent should check log size during lint.
- Handle contradictions explicitly: don't silently overwrite. Note both claims with dates, mark in frontmatter, flag for user review.
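Two of the pitfalls above are numeric thresholds and are easy to encode. This is a minimal sketch with hypothetical helper names; the limits are taken directly from the list above.

```python
MAX_PAGE_LINES = 200         # pages longer than this should be split
MASS_UPDATE_THRESHOLD = 10   # 10+ touched or created pages needs confirmation


def needs_confirmation(updated: int, created: int) -> bool:
    """True when an ingest is large enough to confirm scope with the user first."""
    return updated >= MASS_UPDATE_THRESHOLD or created >= MASS_UPDATE_THRESHOLD


def oversized(pages: dict[str, str]) -> list[str]:
    """Return names of pages whose text runs past the line limit."""
    return sorted(
        name for name, text in pages.items()
        if len(text.splitlines()) > MAX_PAGE_LINES
    )
```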