Claude-obsidian wiki-lint
git clone https://github.com/AgriciDaniel/claude-obsidian
T=$(mktemp -d) && git clone --depth=1 https://github.com/AgriciDaniel/claude-obsidian "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/wiki-lint" ~/.claude/skills/agricidaniel-claude-obsidian-wiki-lint && rm -rf "$T"
skills/wiki-lint/SKILL.md
wiki-lint: Wiki Health Check
Run lint after every 10-15 ingests, or weekly. Ask before auto-fixing anything. Output a lint report to `wiki/meta/lint-report-YYYY-MM-DD.md`.
Lint Checks
Work through these in order:
- Orphan pages. Wiki pages with no inbound wikilinks. They exist but nothing points to them.
- Dead links. Wikilinks that reference a page that does not exist.
- Stale claims. Assertions on older pages that newer sources have contradicted or updated.
- Missing pages. Concepts or entities mentioned in multiple pages but lacking their own page.
- Missing cross-references. Entities mentioned in a page but not linked.
- Frontmatter gaps. Pages missing required fields (type, status, created, updated, tags).
- Empty sections. Headings with no content underneath.
- Stale index entries. Items in `wiki/index.md` pointing to renamed or deleted pages.
- Address validity (DragonScale Mechanism 2). For every page that has an `address:` frontmatter field, validate the format. See the Address Validation section below.
- Semantic tiling (DragonScale Mechanism 3, opt-in). Flag candidate duplicate pages (across all scanned types, not just concepts) via embedding cosine similarity. See the Semantic Tiling section below.
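The orphan and dead-link checks both reduce to comparing the set of wikilink targets against the set of page names. A minimal Python sketch, assuming links use the `[[Target]]`, `[[Target|alias]]`, or `[[Target#heading]]` forms and that pages are keyed by filename without the `.md` extension; the function names are illustrative and not part of the skill:

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures Target only.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")

def extract_links(text: str) -> set[str]:
    """Return the set of wikilink targets found in a page."""
    return {m.group(1).strip() for m in WIKILINK.finditer(text)}

def find_orphans_and_dead_links(pages: dict[str, str]):
    """pages maps page name to page text.

    Returns (orphans, dead_links); dead_links maps each missing
    target to the sorted list of pages that reference it.
    """
    inbound = {name: 0 for name in pages}
    dead: dict[str, list[str]] = {}
    for name, text in pages.items():
        for target in extract_links(text):
            if target in pages:
                if target != name:  # a self-link does not count as inbound
                    inbound[target] += 1
            else:
                dead.setdefault(target, []).append(name)
    orphans = sorted(n for n, count in inbound.items() if count == 0)
    return orphans, {t: sorted(srcs) for t, srcs in dead.items()}

def load_pages(root: Path) -> dict[str, str]:
    """Read every .md file under root, keyed by filename stem."""
    return {p.stem: p.read_text(encoding="utf-8") for p in root.rglob("*.md")}
```

In a real vault, `load_pages(Path("wiki"))` would supply the input. Index and meta pages would probably need excluding from the orphan list, which this sketch does not attempt.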
Lint Report Format
Create at `wiki/meta/lint-report-YYYY-MM-DD.md`:

```markdown
---
type: meta
title: "Lint Report YYYY-MM-DD"
created: YYYY-MM-DD
updated: YYYY-MM-DD
tags: [meta, lint]
status: developing
---
# Lint Report: YYYY-MM-DD

## Summary
- Pages scanned: N
- Issues found: N
- Auto-fixed: N
- Needs review: N

## Orphan Pages
- [[Page Name]]: no inbound links. Suggest: link from [[Related Page]] or delete.

## Dead Links
- [[Missing Page]]: referenced in [[Source Page]] but does not exist. Suggest: create stub or remove link.

## Missing Pages
- "concept name": mentioned in [[Page A]], [[Page B]], [[Page C]]. Suggest: create a concept page.

## Frontmatter Gaps
- [[Page Name]]: missing fields: status, tags

## Stale Claims
- [[Page Name]]: claim "X" may conflict with newer source [[Newer Source]].

## Cross-Reference Gaps
- [[Entity Name]] mentioned in [[Page A]] without a wikilink.
```
Naming Conventions
Enforce these during lint:
| Element | Convention | Example |
|---|---|---|
| Filenames | Title Case with spaces | |
| Folders | lowercase with dashes | |
| Tags | lowercase, hierarchical | |
| Wikilinks | match filename exactly | |
Filenames must be unique across the vault. Wikilinks work without paths only if filenames are unique.
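The uniqueness rule can be checked by grouping vault-relative paths by filename. A minimal sketch; the function name is hypothetical, and comparing case-insensitively is an assumption (it guards against collisions on case-insensitive filesystems, which the text does not mention):

```python
from collections import defaultdict
from pathlib import PurePosixPath

def duplicate_filenames(paths: list[str]) -> dict[str, list[str]]:
    """Group vault-relative paths by filename; keep groups with more than one path."""
    by_name: dict[str, list[str]] = defaultdict(list)
    for path in paths:
        # Assumption: compare case-insensitively via casefold().
        by_name[PurePosixPath(path).name.casefold()].append(path)
    return {name: sorted(group) for name, group in by_name.items() if len(group) > 1}
```

Any non-empty result means at least one bare wikilink is ambiguous and the affected pages should be listed in the lint report.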
Writing Style Check
During lint, flag pages that violate the style guide:
- Not declarative present tense ("X basically does Y" instead of "X does Y")
- Missing source citations where claims are made
- Uncertainty not flagged with `> [!gap]`
- Contradictions not flagged with `> [!contradiction]`
Dataview Dashboard
Create or update `wiki/meta/dashboard.md` with these queries:

````markdown
---
type: meta
title: "Dashboard"
updated: YYYY-MM-DD
---
# Wiki Dashboard

## Recent Activity
```dataview
TABLE type, status, updated
FROM "wiki"
SORT updated DESC
LIMIT 15
```

## Seed Pages (Need Development)
```dataview
LIST
FROM "wiki"
WHERE status = "seed"
SORT updated ASC
```

## Entities Missing Sources
```dataview
LIST
FROM "wiki/entities"
WHERE !sources OR length(sources) = 0
```

## Open Questions
```dataview
LIST
FROM "wiki/questions"
WHERE answer_quality = "draft"
SORT created DESC
```
````
Canvas Map
Create or update `wiki/meta/overview.canvas` for a visual domain map:

```json
{
  "nodes": [
    {
      "id": "1",
      "type": "file",
      "file": "wiki/overview.md",
      "x": 0,
      "y": 0,
      "width": 300,
      "height": 140,
      "color": "1"
    }
  ],
  "edges": []
}
```
Add one node per domain page. Connect domains that have significant cross-references. Colors map to the CSS scheme: 1=blue, 2=purple, 3=yellow, 4=orange, 5=green, 6=red.
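Building the canvas programmatically might look like the following sketch. It reuses the node fields from the example above; the grid layout, the `build_canvas` name, the edge ids, and the `wiki/domains/ML.md` path in the usage are illustrative assumptions. Edges use the `fromNode`/`toNode` fields of the JSON Canvas format:

```python
import json

def build_canvas(domain_pages: list[str], links: list[tuple[str, str]]) -> dict:
    """Build a canvas dict with one file node per page and one edge per link."""
    ids = {path: str(i + 1) for i, path in enumerate(domain_pages)}
    nodes = [
        {
            "id": ids[path],
            "type": "file",
            "file": path,
            "x": (i % 3) * 360,   # arbitrary 3-column grid
            "y": (i // 3) * 200,
            "width": 300,
            "height": 140,
            "color": "1",
        }
        for i, path in enumerate(domain_pages)
    ]
    edges = [
        {"id": f"e{n + 1}", "fromNode": ids[a], "toNode": ids[b]}
        for n, (a, b) in enumerate(links)
    ]
    return {"nodes": nodes, "edges": edges}

canvas = build_canvas(
    ["wiki/overview.md", "wiki/domains/ML.md"],
    [("wiki/overview.md", "wiki/domains/ML.md")],
)
canvas_json = json.dumps(canvas, indent=2)  # text to write to wiki/meta/overview.canvas
```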
Address Validation (DragonScale Mechanism 2 MVP)
Opt-in feature. Address Validation runs only if the vault is using DragonScale, detected by:
```bash
if [ -x ./scripts/allocate-address.sh ] && [ -f ./.vault-meta/address-counter.txt ]; then
  DRAGONSCALE_ADDRESSES=1
else
  DRAGONSCALE_ADDRESSES=0
fi
```
When `DRAGONSCALE_ADDRESSES=0`, skip this entire section. Missing `address:` fields are not flagged, not even informationally. Pages that happen to have an `address:` field are passed through unvalidated (treat as user-managed metadata).

When `DRAGONSCALE_ADDRESSES=1`, proceed with the rollout baseline and checks below.
Rollout baseline: 2026-04-23 (Phase 2 ship date in vaults that adopted DragonScale on that day). Vaults that adopted DragonScale later should override this baseline by setting the earliest `created:` date of any addressed page as their personal rollout date. Record the chosen baseline at the top of `.vault-meta/legacy-pages.txt` as a commented line: `# rollout: YYYY-MM-DD`.
Classification rule (applied per page)
Before validating anything, classify the page:
| Classification | Criteria |
|---|---|
| Meta / fold / excluded | File is in OR filename in . Address not required. |
| Post-rollout (must have address) | Page is not meta/fold AND frontmatter `created:` date is >= 2026-04-23 AND file path is NOT in the legacy baseline manifest. |
| Legacy (backfill-eligible) | Page is not meta/fold AND (frontmatter `created:` date is < 2026-04-23 OR file path IS in the legacy baseline manifest). Address not required until backfill. |
Legacy baseline manifest: optional file at `.vault-meta/legacy-pages.txt`, one relative path per line. Pages listed there are treated as legacy regardless of `created:` date. Use this to grandfather pages whose `created:` metadata is wrong or missing.
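The classification rule and the manifest format can be sketched as below. This is an illustration under stated assumptions, not the skill's implementation: `classify` and `parse_manifest` are hypothetical names, meta/fold detection by folder prefix or frontmatter `type` is a guess at the first table row, and returning `"unknown"` for an unlisted page with no `created:` date is a choice the text does not specify.

```python
from datetime import date
from typing import Optional

ROLLOUT = date(2026, 4, 23)  # default baseline from the text; override per vault

def parse_manifest(text: str) -> tuple[Optional[date], set[str]]:
    """Parse legacy-pages.txt: optional '# rollout: YYYY-MM-DD' line plus one path per line."""
    rollout: Optional[date] = None
    paths: set[str] = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            body = line.lstrip("#").strip()
            if body.startswith("rollout:"):
                rollout = date.fromisoformat(body.split(":", 1)[1].strip())
            continue
        paths.add(line)
    return rollout, paths

def classify(path: str, page_type: Optional[str], created: Optional[date],
             legacy_manifest: set[str], rollout: date = ROLLOUT) -> str:
    """Return 'excluded', 'legacy', 'post-rollout', or 'unknown' for one page."""
    # Assumption: meta/fold pages are recognised by folder or frontmatter type.
    if path.startswith(("wiki/meta/", "wiki/folds/")) or page_type in {"meta", "fold"}:
        return "excluded"
    if path in legacy_manifest:
        return "legacy"  # the manifest wins regardless of the created date
    if created is None:
        return "unknown"  # assumption: surface it instead of guessing
    return "post-rollout" if created >= rollout else "legacy"
```

A vault with a later adoption date would pass the parsed `rollout` value through to `classify` instead of relying on the default.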
Validation checks (run in order)
- Format check: any page with `address:` set must match one of:
  - `^c-[0-9]{6}$`: post-rollout creation address.
  - `^l-[0-9]{6}$`: legacy-backfill address.
  - Pages under `wiki/folds/` use `fold_id`, not `address`; do not apply the `c-`/`l-` regex there.
- Uniqueness check: no two pages share the same address value. Report both paths.
- Counter consistency: `./scripts/allocate-address.sh --peek` returns the next counter value. Every observed `c-NNNNNN` must satisfy `NNNNNN < peek_value`. Violation = counter drift.
- Post-rollout enforcement: every page classified as "post-rollout (must have address)" that LACKS the `address:` field is a lint error, not informational. This prevents the silent-regression path where a new page skips address assignment.
- Legacy identification: every page classified as "legacy" that LACKS an address is informational. The lint report lists them under "Pending backfill" with total count.
- Address-map consistency (`.raw/.manifest.json`): for every page path in `address_map`, the page must exist and its frontmatter `address` must match the mapping. Mismatches are errors (either a rename dropped the map update, or a manual edit diverged).
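The first three checks (format, uniqueness, counter consistency) can be sketched in a few lines. This assumes addresses have already been read from frontmatter into a path-to-address mapping and that the `--peek` output has been parsed to an integer; `validate_addresses` and the message wording are illustrative only:

```python
import re
from collections import defaultdict

# Combines the two patterns from the text: c-NNNNNN and l-NNNNNN.
ADDRESS_RE = re.compile(r"[cl]-[0-9]{6}")

def validate_addresses(addresses: dict[str, str], peek_value: int) -> list[str]:
    """addresses maps page path -> address (pages without one are omitted)."""
    errors: list[str] = []
    seen: dict[str, list[str]] = defaultdict(list)
    for path, addr in sorted(addresses.items()):
        if path.startswith("wiki/folds/"):
            continue  # fold pages use fold_id, not address
        if not ADDRESS_RE.fullmatch(addr):
            errors.append(f"{path}: invalid address format {addr!r}")
            continue
        seen[addr].append(path)
        # Counter consistency applies to c- addresses only.
        if addr.startswith("c-") and int(addr[2:]) >= peek_value:
            errors.append(f"{path}: counter drift, {addr} >= peek {peek_value}")
    for addr, paths in sorted(seen.items()):
        if len(paths) > 1:
            errors.append(f"{addr}: shared by {', '.join(paths)}")
    return errors
```

`fullmatch` is used instead of `^...$` because `$` in Python also matches before a trailing newline, which would let a malformed value through.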
Lint posture summary
- Pages that HAVE an address with bad format: error.
- Pages that HAVE colliding addresses: error.
- Pages classified post-rollout WITHOUT an address: error.
- Pages classified legacy WITHOUT an address: informational (expected).
- Meta and fold pages without `address`: ignored (not applicable).
- Counter drift (observed counter >= peek): error.
- Address-map mismatch: error.
Lint only observes. Do NOT auto-assign missing addresses during lint. Assignment is `wiki-ingest`'s responsibility only.
Output section in the lint report
```markdown
## Address Validation
- Counter state: `$(./scripts/allocate-address.sh --peek)`
- Highest c- address observed: c-XXXXXX
- Post-rollout pages checked: N (X passing, Y errors)
- Legacy pages pending backfill: M

### Errors
- [[Page Name]]: invalid address format `{value}`. Expected `c-NNNNNN` or `l-NNNNNN`.
- [[Page A]] and [[Page B]] share address `c-000042`.
- [[Post-Rollout Page]]: missing address. Page created 2026-04-25 (post-rollout); address required. Run wiki-ingest or manually run `./scripts/allocate-address.sh` and add to frontmatter.
- [[Page Name]] has address `c-000100` but counter peek is `50`. Counter drift; run `./scripts/allocate-address.sh --rebuild`.
- `.raw/.manifest.json` maps `wiki/foo.md` -> `c-000010` but page frontmatter has `c-000012`. Resolve mismatch.

### Pending backfill (informational)
- M legacy pages without addresses. See `.vault-meta/legacy-pages.txt` for the canonical legacy set, or filter by `created:` < 2026-04-23.
```
Semantic Tiling (DragonScale Mechanism 3 MVP, opt-in)
Opt-in feature. Semantic tiling flags candidate duplicate pages (not just concept pages — see Scope below) using embedding cosine similarity. Local ollama only by default; remote endpoints require an explicit override flag.
Detection and delegation
```bash
if [ -x ./scripts/tiling-check.py ] && command -v python3 >/dev/null 2>&1; then
  ./scripts/tiling-check.py --peek > /tmp/tiling-peek.json 2>/dev/null
  PEEK_EXIT=$?
  case $PEEK_EXIT in
    0)  TILING_READY=1 ;;  # ready
    2)  TILING_READY=0 ; echo "tiling ERROR: usage error (exit 2); inspect /tmp/tiling-peek.json" ;;
    3)  TILING_READY=0 ; echo "tiling ERROR: cache corrupt (exit 3); inspect .vault-meta/tiling-cache.json" ;;
    4)  TILING_READY=0 ; echo "tiling ERROR: vault exceeds scale hard-fail (exit 4); batching required" ;;
    10) TILING_READY=0 ; echo "tiling skipped: ollama not reachable (exit 10)" ;;
    11) TILING_READY=0 ; echo "tiling skipped: run 'ollama pull nomic-embed-text' to enable (exit 11)" ;;
    *)  TILING_READY=0 ; echo "tiling ERROR: unexpected exit code $PEEK_EXIT from tiling-check.py --peek" ;;
  esac
else
  TILING_READY=0
  echo "tiling skipped: scripts/tiling-check.py or python3 not available"
fi
```
Inspect `/tmp/tiling-peek.json` (structured diagnostics: script path, python interpreter, ollama URL, cache state, thresholds state) whenever the status is ambiguous. Never collapse unknown exits into "unknown status" silently.
When `TILING_READY=1`:

```bash
./scripts/tiling-check.py --report wiki/meta/tiling-report-YYYY-MM-DD.md
REPORT_EXIT=$?
case $REPORT_EXIT in
  0)  echo "tiling report written" ;;
  2)  echo "tiling ERROR: usage error during --report" ;;
  3)  echo "tiling ERROR: cache corrupt during --report" ;;
  4)  echo "tiling ERROR: scale hard-fail during --report" ;;
  10) echo "tiling ERROR: ollama became unreachable between --peek and --report" ;;
  11) echo "tiling ERROR: model became unavailable between --peek and --report" ;;
  *)  echo "tiling ERROR: unexpected exit code $REPORT_EXIT from tiling-check.py --report" ;;
esac
```
Scope (what the helper scans)
- Includes: every `.md` under `wiki/` except the exclusion set below. The scope is "candidate tileable pages," not just `type: concept`.
- Excludes (path): anything under `wiki/folds/` or `wiki/meta/`.
- Excludes (filename): `_index.md`, `index.md`, `log.md`, `hot.md`, `overview.md`, `dashboard.md`, `Wiki Map.md`, `getting-started.md`.
- Excludes (frontmatter): `type: meta` or `type: fold`.
- Excludes (security): symlinks. Any page file that is a symlink, or whose resolved path escapes the vault root, is skipped.
If you place a real concept under `wiki/meta/` it will be excluded by path regardless of content. Keep concepts in their canonical folders.
How the helper works
- Computes one embedding per included page via the ollama `nomic-embed-text` model by default.
- Caches embeddings at `.vault-meta/tiling-cache.json`, keyed on `sha256(model + body)` so model drift auto-invalidates. Frontmatter is not part of the hash or the embedding input; pure frontmatter edits (tag changes, status bumps) do not trigger recomputation.
- Orphans are GC'd: when a cached page path no longer exists on disk, its entry is dropped on save.
- Concurrent-safe: exclusive flock on `.vault-meta/.tiling.lock` around cache I/O; per-PID temp file for atomic writes.
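To illustrate why frontmatter edits do not invalidate the cache, here is a minimal sketch of the key derivation. The helper's actual frontmatter parsing and byte layout are not shown in this document, so `strip_frontmatter` and the plain string concatenation in `cache_key` are assumptions:

```python
import hashlib

def strip_frontmatter(text: str) -> str:
    """Return the page body with a leading '---' YAML frontmatter block removed."""
    if text.startswith("---\n"):
        end = text.find("\n---", 3)  # closing delimiter line
        if end != -1:
            newline = text.find("\n", end + 1)
            return "" if newline == -1 else text[newline + 1:]
    return text

def cache_key(model: str, body: str) -> str:
    """sha256 over model name plus body; the exact concatenation is an assumption."""
    return hashlib.sha256((model + body).encode("utf-8")).hexdigest()
```

Two versions of a page that differ only in tags or status produce the same key, while switching the model name changes every key and forces re-embedding.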
Security posture
- Defaults to `http://127.0.0.1:11434`. `OLLAMA_URL` env override is accepted only with `--allow-remote-ollama` because page bodies are POSTed as embedding input.
- Symlinks and vault-root escapes are rejected.
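Both guards can be sketched with the standard library. This is a hedged illustration, not the helper's code: the function names are hypothetical, and letting a loopback override through without the flag is an assumption, since the text does not say whether a local `OLLAMA_URL` also needs `--allow-remote-ollama`.

```python
import ipaddress
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse

DEFAULT_OLLAMA_URL = "http://127.0.0.1:11434"

def is_loopback_url(url: str) -> bool:
    """True when the URL host is localhost or a loopback IP literal."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as remote

def resolve_ollama_url(env_url: Optional[str], allow_remote: bool) -> str:
    """Pick the endpoint, refusing a remote override unless explicitly allowed."""
    if not env_url:
        return DEFAULT_OLLAMA_URL
    # Assumption: loopback overrides pass without the flag.
    if is_loopback_url(env_url) or allow_remote:
        return env_url
    raise ValueError("remote OLLAMA_URL requires --allow-remote-ollama")

def is_inside_vault(page: Path, vault_root: Path) -> bool:
    """Reject symlinks and any path that resolves outside the vault root."""
    if page.is_symlink():
        return False
    return page.resolve().is_relative_to(vault_root.resolve())  # Python 3.9+
```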
Default bands (conservative seeds, NOT calibrated)
| Band | Similarity | Report section |
|---|---|---|
| Error | >= 0.90 | Errors: strong near-duplicate, likely the same concept |
| Review | 0.80 to < 0.90 | Review: possible tile overlap; human judgement needed |
| Pass | < 0.80 | not emitted |
These values are conservative seeds, not literature-backed interpolation. Published reference points: Sentence Transformers `community_detection` defaults to 0.75; Quora-duplicate calibrations land around 0.7715-0.8352 depending on objective. The 0.80 review floor is already stricter than at least one cited Quora optimum, so expect false negatives against those baselines. Reduce the review floor during calibration if you want more sensitivity.
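Mapping a similarity score to a band is a two-threshold comparison. A sketch using the 0.90 and 0.80 defaults that appear in the lint report embed; treating both boundaries as inclusive is an assumption, and `band` is a hypothetical name:

```python
def band(similarity: float, error: float = 0.90, review: float = 0.80) -> str:
    """Classify a cosine similarity as 'error', 'review', or 'pass'."""
    if similarity >= error:
        return "error"
    if similarity >= review:
        return "review"
    return "pass"  # pass pairs are not emitted in the report
```

Lowering `review` to 0.70 during calibration moves pairs such as 0.75 from `pass` into `review` without touching the error band.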
Calibration procedure (manual, one-time per vault)
- Run the helper with defaults. Capture the Review band pairs.
- Temporarily lower `bands.review` to `0.70` in `.vault-meta/tiling-thresholds.json` to surface a wider sample. Aim for >=50 pairs spanning 0.70-0.95.
- Label each pair: `duplicate`, `similar`, `distinct`.
- Pick bands such that: (a) the `error` band contains >= 95% true duplicates; (b) the `review` band captures `similar` pairs without swamping the report with `distinct` ones.
- Edit `.vault-meta/tiling-thresholds.json`: set new `bands.error` and `bands.review`, set `calibrated: true`, set `calibration_pairs_labeled` to the label count.
- Re-run lint. Report footer now says `calibrated: true`.
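Criterion (a) can be evaluated mechanically once the pairs are labeled. The sketch below assumes labels are stored as `(similarity, label)` tuples and picks the lowest candidate threshold whose band is at least 95% duplicates; the function names are hypothetical, and the nested `bands` object in `thresholds_doc` is inferred from the dotted field names rather than confirmed by the text. Criterion (b) still needs human judgement.

```python
from typing import Optional

Pair = tuple[float, str]  # (cosine similarity, "duplicate" | "similar" | "distinct")

def error_band_precision(pairs: list[Pair], threshold: float) -> Optional[float]:
    """Share of true duplicates among pairs at or above threshold; None if the band is empty."""
    in_band = [label for sim, label in pairs if sim >= threshold]
    if not in_band:
        return None
    return sum(label == "duplicate" for label in in_band) / len(in_band)

def pick_error_threshold(pairs: list[Pair], candidates: list[float],
                         min_precision: float = 0.95) -> Optional[float]:
    """Lowest candidate whose non-empty band meets min_precision, else None."""
    for threshold in sorted(candidates):
        precision = error_band_precision(pairs, threshold)
        if precision is not None and precision >= min_precision:
            return threshold
    return None

def thresholds_doc(error: float, review: float, labeled: int) -> dict:
    """Assumed layout of tiling-thresholds.json after calibration."""
    return {
        "bands": {"error": error, "review": review},
        "calibrated": True,
        "calibration_pairs_labeled": labeled,
    }
```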
Scale
- Cold-cache cost is O(N) POSTs to ollama. Warm-cache cost is O(N^2) cosines in pure Python.
- Helper prints a warning at > 500 pages and hard-fails (exit 4) at > 5000. Revisit the implementation (batching, vectorized cosine, or external tooling) before exceeding either limit.
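The quadratic warm-cache cost comes from comparing every pair of cached embeddings. A pure-Python sketch of that loop, written for illustration and not taken from `tiling-check.py`; the function names and the 0.80 default floor are assumptions:

```python
import math
from itertools import combinations

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def similar_pairs(embeddings: dict[str, list[float]], floor: float = 0.80):
    """Return (page_a, page_b, similarity) for every pair at or above floor, highest first."""
    found = []
    # N pages produce N * (N - 1) / 2 comparisons.
    for (path_a, vec_a), (path_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        score = cosine(vec_a, vec_b)
        if score >= floor:
            found.append((path_a, path_b, score))
    return sorted(found, key=lambda item: -item[2])
```

At 500 pages this is 124,750 comparisons; at 5000 pages it is 12,497,500, which is where a vectorized or batched approach starts to matter.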
Lint report embed
```markdown
## Semantic Tiling
See [[tiling-report-YYYY-MM-DD]] for the full pair listing.
- Errors (>=0.90): N pairs
- Review (0.80-0.90): M pairs
- Calibrated: true|false
```
Invariants
- Read-only. `tiling-check.py` never modifies wiki pages.
- No auto-merge. Duplicates are listed, never resolved.
- Cache is incremental and model-scoped. Unchanged pages are not re-embedded.
- Exit codes: `0` ok, `2` usage error, `3` cache corrupt, `4` scale hard-fail, `10` ollama unreachable, `11` model missing. Surface all of them; do not collapse into a single "unknown" bucket.
Before Auto-Fixing
Always show the lint report first. Ask: "Should I fix these automatically, or do you want to review each one?"
Safe to auto-fix:
- Adding missing frontmatter fields with placeholder values
- Creating stub pages for missing entities
- Adding wikilinks for unlinked mentions
Needs review before fixing:
- Deleting orphan pages (they might be intentionally isolated)
- Resolving contradictions (requires human judgment)
- Merging duplicate pages