Understand-Anything — understand

Analyze a codebase to produce an interactive knowledge graph for understanding its architecture, components, and relationships.

Install

Source · Clone the upstream repo:

    git clone https://github.com/Lum1104/Understand-Anything

Claude Code · Install into `~/.claude/skills/`:

    T=$(mktemp -d) && git clone --depth=1 https://github.com/Lum1104/Understand-Anything "$T" && mkdir -p ~/.claude/skills && cp -r "$T/understand-anything-plugin/skills/understand" ~/.claude/skills/lum1104-understand-anything-understand && rm -rf "$T"

Manifest: `understand-anything-plugin/skills/understand/SKILL.md`
/understand

Analyze the current codebase and produce a `knowledge-graph.json` file in `.understand-anything/`. This file powers the interactive dashboard for exploring the project's architecture.

Options

  • `$ARGUMENTS` may contain:
    • `--full` — Force a full rebuild, ignoring any existing graph
    • `--auto-update` — Enable automatic graph updates on commit (writes `autoUpdate: true` to `.understand-anything/config.json`)
    • `--no-auto-update` — Disable automatic graph updates (writes `autoUpdate: false` to `.understand-anything/config.json`)
    • `--review` — Run the full LLM graph-reviewer instead of inline deterministic validation
    • A directory path (e.g. `/path/to/repo` or `../other-project`) — Analyze the given directory instead of the current working directory
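The flag handling above can be sketched as a small parser. This is a sketch only — `parseArguments` is an illustrative helper name, not part of the skill:

```javascript
// Illustrative sketch: split $ARGUMENTS into flags and an optional
// target directory. The first non-flag token is treated as the path.
function parseArguments(args) {
  const flags = new Set();
  let targetDir = null;
  for (const token of args) {
    if (token.startsWith('--')) flags.add(token);
    else if (targetDir === null) targetDir = token;
  }
  return { flags, targetDir };
}

const parsed = parseArguments(['--full', '../other-project']);
console.log(parsed.flags.has('--full'), parsed.targetDir);
// → true ../other-project
```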

Phase 0 — Pre-flight

Determine whether to run a full analysis or an incremental update.

  1. Resolve `PROJECT_ROOT`:

    • Parse `$ARGUMENTS` for a non-flag token (any argument that does not start with `--`). If found, treat it as the target directory path.
      • If the path is relative, resolve it against the current working directory.
      • Verify the resolved path exists and is a directory (run `test -d <path>`). If it does not exist or is not a directory, report an error to the user and STOP.
      • Set `PROJECT_ROOT` to the resolved absolute path.
    • If no directory path argument is found, set `PROJECT_ROOT` to the current working directory.

  1.5. Ensure the plugin is built. Later phases invoke Node scripts that import `@understand-anything/core`. On a fresh install `packages/core/dist/` does not exist yet — build once. This skill file lives at `<PLUGIN_ROOT>/skills/understand/SKILL.md`, so the plugin root is two directories above it.

    PLUGIN_ROOT="<two directories above this SKILL.md>"
    if [ ! -f "$PLUGIN_ROOT/packages/core/dist/index.js" ]; then
      cd "$PLUGIN_ROOT" && (pnpm install --frozen-lockfile 2>/dev/null || pnpm install) && pnpm --filter @understand-anything/core build
    fi

    If `pnpm` is missing, report to the user: "Install Node.js ≥ 22 and pnpm ≥ 10, then re-run `/understand`."

  2. Get the current git commit hash:

    git rev-parse HEAD

  3. Create the intermediate and temp output directories:

    mkdir -p $PROJECT_ROOT/.understand-anything/intermediate
    mkdir -p $PROJECT_ROOT/.understand-anything/tmp

3.5. Auto-update configuration:

  • If `--auto-update` is in `$ARGUMENTS`: write `{"autoUpdate": true}` to `$PROJECT_ROOT/.understand-anything/config.json`
  • If `--no-auto-update` is in `$ARGUMENTS`: write `{"autoUpdate": false}` to `$PROJECT_ROOT/.understand-anything/config.json`
  • These flags only set the config — analysis proceeds normally regardless.
  4. Check for subdomain knowledge graphs to merge: list all `*knowledge-graph*.json` files in `$PROJECT_ROOT/.understand-anything/`, excluding `knowledge-graph.json` itself (e.g. `frontend-knowledge-graph.json`, `backend-knowledge-graph.json`). If any subdomain graphs exist, run the merge script bundled with this skill (located next to this SKILL.md file — use the skill directory path, not the project root):

    python <SKILL_DIR>/merge-subdomain-graphs.py $PROJECT_ROOT

    The script discovers subdomain graphs, loads the existing `knowledge-graph.json` as a base (if present), and merges everything into `knowledge-graph.json`, deduplicating nodes and edges. Report the merge summary to the user, then continue with the merged graph.

  5. Check if `$PROJECT_ROOT/.understand-anything/knowledge-graph.json` exists. If it does, read it.

  6. Check if `$PROJECT_ROOT/.understand-anything/meta.json` exists. If it does, read it to get `gitCommitHash`.

  7. Decision logic:

    | Condition | Action |
    | --- | --- |
    | `--full` flag in `$ARGUMENTS` | Full analysis (all phases) |
    | No existing graph or meta | Full analysis (all phases) |
    | `--review` flag + existing graph + unchanged commit hash | Skip to Phase 6 (review-only — reuse existing assembled graph) |
    | Existing graph + unchanged commit hash | Ask the user: "The graph is up to date at this commit. Would you like to: (a) run a full rebuild (`--full`), (b) run the LLM graph reviewer (`--review`), or (c) do nothing?" Then follow their choice. If they pick (c), STOP. |
    | Existing graph + changed files | Incremental update (re-analyze changed files only) |

    Review-only path: copy the existing `knowledge-graph.json` to `$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json`, then jump directly to Phase 6 step 3.

    For incremental updates, get the changed file list:

    git diff <lastCommitHash>..HEAD --name-only

    If this returns no files, report "Graph is up to date" and STOP.
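The decision table reads as a single chain of conditionals. A sketch only — `decide` and its input names are illustrative, not part of the skill:

```javascript
// Illustrative sketch of the Phase 0 decision logic.
function decide({ flags, hasGraph, hasMeta, commitUnchanged, changedFiles }) {
  if (flags.has('--full')) return 'full';
  if (!hasGraph || !hasMeta) return 'full';
  if (flags.has('--review') && commitUnchanged) return 'review-only';
  if (commitUnchanged) return 'ask-user';
  if (changedFiles.length > 0) return 'incremental';
  return 'up-to-date'; // git diff returned no files: report and STOP
}
```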

  8. Collect project context for subagent injection:

    • Read `README.md` (or `README.rst`, `readme.md`) from `$PROJECT_ROOT` if it exists. Store the first 3000 characters as `$README_CONTENT`.
    • Read the primary package manifest (`package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, `pom.xml`) if it exists. Store as `$MANIFEST_CONTENT`.
    • Capture the top-level directory tree and store it as `$DIR_TREE`:

      find $PROJECT_ROOT -maxdepth 2 -type f -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*' | head -100

    • Detect the project entry point by checking for common patterns (in order): `src/index.ts`, `src/main.ts`, `src/App.tsx`, `index.js`, `main.py`, `manage.py`, `app.py`, `wsgi.py`, `asgi.py`, `run.py`, `__main__.py`, `main.go`, `cmd/*/main.go`, `src/main.rs`, `src/lib.rs`, `src/main/java/**/Application.java`, `Program.cs`, `config.ru`, `index.php`. Store the first match as `$ENTRY_POINT`.

Phase 0.5 — Ignore Configuration

Set up and verify the `.understandignore` file before scanning.

  1. Check if `$PROJECT_ROOT/.understand-anything/.understandignore` exists.
  2. If it does NOT exist, generate a starter file:
    • Run the following Node.js one-liner in `$PROJECT_ROOT` (it reads `.gitignore` and deduplicates against the built-in defaults):
      node -e "
      const fs = require('fs');
      const path = require('path');
      const root = process.cwd();
      const defaults = ['node_modules/','node_modules','.git/','vendor/','venv/','.venv/','__pycache__/','dist/','dist','build/','build','out/','coverage/','coverage','.next/','.cache/','.turbo/','target/','obj/','*.lock','package-lock.json','yarn.lock','pnpm-lock.yaml','*.png','*.jpg','*.jpeg','*.gif','*.svg','*.ico','*.woff','*.woff2','*.ttf','*.eot','*.mp3','*.mp4','*.pdf','*.zip','*.tar','*.gz','*.min.js','*.min.css','*.map','*.generated.*','.idea/','.vscode/','LICENSE','.gitignore','.editorconfig','.prettierrc','.eslintrc*','*.log'];
      const norm = p => p.replace(/\/+$/, '');
      const defaultSet = new Set(defaults.map(norm));
      const header = '# .understandignore — patterns for files/dirs to exclude from analysis\n# Syntax: same as .gitignore (globs, # comments, ! negation, trailing / for dirs)\n# Lines below are suggestions — uncomment to activate.\n# Use ! prefix to force-include something excluded by defaults.\n#\n# Built-in defaults (always excluded unless negated):\n#   node_modules/, .git/, dist/, build/, obj/, *.lock, *.min.js, etc.\n#\n';
      let body = '';
      const gitignorePath = path.join(root, '.gitignore');
      if (fs.existsSync(gitignorePath)) {
        const gi = fs.readFileSync(gitignorePath, 'utf-8').split('\n').map(l => l.trim()).filter(l => l && !l.startsWith('#')).filter(p => !defaultSet.has(norm(p)));
        if (gi.length) { body += '# --- From .gitignore (uncomment to exclude) ---\n\n' + gi.map(p => '# ' + p).join('\n') + '\n\n'; }
      }
      const dirs = ['__tests__','test','tests','fixtures','testdata','docs','examples','scripts','migrations','.storybook'];
      const found = dirs.filter(d => fs.existsSync(path.join(root, d)));
      if (found.length) { body += '# --- Detected directories (uncomment to exclude) ---\n\n' + found.map(d => '# ' + d + '/').join('\n') + '\n\n'; }
      body += '# --- Test file patterns (uncomment to exclude) ---\n\n# *.test.*\n# *.spec.*\n# *.snap\n';
      const outDir = path.join(root, '.understand-anything');
      if (!fs.existsSync(outDir)) fs.mkdirSync(outDir, { recursive: true });
      fs.writeFileSync(path.join(outDir, '.understandignore'), header + body);
      "
      
    • Report to the user: "Generated `.understand-anything/.understandignore` with suggested exclusions based on your project structure. Please review it and uncomment any patterns you'd like to exclude from analysis. When ready, confirm to continue."
    • Wait for user confirmation before proceeding.
  3. If it already exists, report: "Found `.understand-anything/.understandignore`. Review it if needed, then confirm to continue." Wait for user confirmation before proceeding.
  4. After confirmation, proceed to Phase 1.

Phase 1 — SCAN (Full analysis only)

Dispatch a subagent using the `project-scanner` agent definition (at `agents/project-scanner.md`). Append the following additional context:

Additional context from main session:

Project README (first 3000 chars):

$README_CONTENT

Package manifest:

$MANIFEST_CONTENT

Use this context to produce a more accurate project name, description, and framework detection. The README and manifest are authoritative — prefer their information over heuristics.

Pass these parameters in the dispatch prompt:

Scan this project directory to discover all project files (including non-code files such as configs, docs, and infrastructure) and detect languages and frameworks.
Project root: $PROJECT_ROOT
Write output to: $PROJECT_ROOT/.understand-anything/intermediate/scan-result.json

After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/scan-result.json` to get:

  • Project name and description
  • Languages and frameworks
  • File list with line counts and a `fileCategory` per file (`code`, `config`, `docs`, `infra`, `data`, `script`, `markup`)
  • Complexity estimate
  • Import map (`importMap`): pre-resolved project-internal imports per file (non-code files have empty arrays)

Store `importMap` in memory as `$IMPORT_MAP` and the file list (with its `fileCategory` metadata) as `$FILE_LIST`, both for use in Phase 2 batch construction.

Gate check: if there are more than 100 files, inform the user and suggest scoping with a subdirectory argument. Proceed only if the user confirms, and warn that the analysis may take a while.

If the scan result includes `filteredByIgnore > 0`, report: "Excluded {filteredByIgnore} files via `.understandignore`."


Phase 2 — ANALYZE

Full analysis path

Batch the file list from Phase 1 into groups of 20-30 files each (aim for ~25 files per batch for balanced sizes).

Batching strategy for non-code files:

  • Group related non-code files together in the same batch when possible:
    • Dockerfile + docker-compose.yml + .dockerignore → same batch
    • SQL migration files → same batch (ordered by filename)
    • CI/CD config files (.github/workflows/*) → same batch
    • Documentation files (docs/*.md) → same batch
  • This allows the file-analyzer to create cross-file edges (e.g., docker-compose `depends_on` Dockerfile)
  • Non-code files can be mixed with code files in the same batch if batch sizes are small
  • Each file's `fileCategory` from Phase 1 must be included in the batch file list

For each batch, dispatch a subagent using the `file-analyzer` agent definition (at `agents/file-analyzer.md`). Run up to 5 subagents concurrently using parallel dispatch. Append the following additional context:

Additional context from main session:

Project: <projectName>
<projectDescription>
Languages: <languages from Phase 1>

Before dispatching each batch, construct `batchImportData` from `$IMPORT_MAP`:

    batchImportData = {}
    for each file in this batch:
      batchImportData[file.path] = $IMPORT_MAP[file.path] ?? []
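The same construction as runnable Node (a sketch; `buildBatchImportData` is an illustrative name):

```javascript
// Illustrative sketch of the batchImportData construction above:
// every file in the batch gets its pre-resolved imports, or [].
function buildBatchImportData(batchFiles, importMap) {
  const batchImportData = {};
  for (const file of batchFiles) {
    batchImportData[file.path] = importMap[file.path] ?? [];
  }
  return batchImportData;
}
```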

Fill in batch-specific parameters below and dispatch:

Analyze these files and produce GraphNode and GraphEdge objects.
Project root: $PROJECT_ROOT
Project: <projectName>
Languages: <languages>
Batch index: <batchIndex>
Skill directory (for bundled scripts): <SKILL_DIR>
Write output to: $PROJECT_ROOT/.understand-anything/intermediate/batch-<batchIndex>.json

Pre-resolved import data for this batch (use this for all import edge creation — do NOT re-resolve imports from source):

<batchImportData JSON>

Files to analyze in this batch:

  1. <path> (<sizeLines> lines, fileCategory: <fileCategory>)
  2. <path> (<sizeLines> lines, fileCategory: <fileCategory>)
  ...

After ALL batches complete, run the merge-and-normalize script bundled with this skill (located next to this SKILL.md file — use the skill directory path, not the project root):

    python <SKILL_DIR>/merge-batch-graphs.py $PROJECT_ROOT

This script reads all `batch-*.json` files from `$PROJECT_ROOT/.understand-anything/intermediate/`, then in one pass:

  • Combines all nodes and edges across batches
  • Normalizes node IDs (strips double prefixes, strips project-name prefixes, adds missing prefixes)
  • Normalizes complexity values (`low` → `simple`, `medium` → `moderate`, `high` → `complex`, etc.)
  • Rewrites edge references to match corrected node IDs
  • Deduplicates nodes by ID (keeps the last occurrence) and edges by `(source, target, type)`
  • Drops dangling edges referencing missing nodes
  • Logs all corrections and dropped items to stderr
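The bundled script is Python; the sketch below only illustrates its dedup keying in JS — nodes keyed by `id` (last occurrence wins), edges keyed by `(source, target, type)`, dangling edges dropped:

```javascript
// Illustrative sketch of the merge script's dedup rules.
function dedupeGraph(nodes, edges) {
  const nodeById = new Map();
  for (const n of nodes) nodeById.set(n.id, n); // later entries overwrite earlier
  const edgeByKey = new Map();
  for (const e of edges) edgeByKey.set(`${e.source}|${e.target}|${e.type}`, e);
  const ids = new Set(nodeById.keys());
  return {
    nodes: [...nodeById.values()],
    // Drop edges whose source or target no longer exists.
    edges: [...edgeByKey.values()].filter(e => ids.has(e.source) && ids.has(e.target)),
  };
}
```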

Output: `$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json`

Include the script's warnings in `$PHASE_WARNINGS` for the reviewer.

Incremental update path

Use the changed-files list from Phase 0. Batch and dispatch file-analyzer subagents using the same process as above (20-30 files per batch, up to 5 concurrent, with `batchImportData` constructed from `$IMPORT_MAP`), but only for the changed files.

After batches complete:

  1. Remove from the existing graph all old nodes whose `filePath` matches any changed file
  2. Remove old edges whose `source` or `target` references a removed node
  3. Write the pruned existing nodes/edges as `batch-existing.json` in the intermediate directory
  4. Run the same merge script — it will combine `batch-existing.json` with the fresh `batch-*.json` files:

    python <SKILL_DIR>/merge-batch-graphs.py $PROJECT_ROOT

Phase 3 — ASSEMBLE REVIEW

Dispatch a subagent using the `assemble-reviewer` agent definition (at `agents/assemble-reviewer.md`).

Pass these parameters in the dispatch prompt:

Review the assembled graph at $PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json.
Project root: $PROJECT_ROOT
Batch files are at: $PROJECT_ROOT/.understand-anything/intermediate/batch-*.json
Write review output to: $PROJECT_ROOT/.understand-anything/intermediate/assemble-review.json

Merge script report:

<paste the full stderr output from merge-batch-graphs.py>

Import map for cross-batch edge verification:

$IMPORT_MAP

After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/assemble-review.json` and add any notes to `$PHASE_WARNINGS`.


Phase 4 — ARCHITECTURE

Build the combined prompt template:

  1. Use the `architecture-analyzer` agent definition (at `agents/architecture-analyzer.md`).
  2. Language context injection: For each language detected in Phase 1 (e.g., `python`, `markdown`, `dockerfile`, `yaml`, `sql`, `terraform`, `graphql`, `protobuf`, `shell`, `html`, `css`), read the file at `./languages/<language-id>.md` (e.g., `./languages/python.md`, `./languages/dockerfile.md`) and append its content after the base template under a `## Language Context` header. If the file does not exist for a detected language, skip it silently and continue. These files are in the `languages/` subdirectory next to this SKILL.md file. Include non-code language snippets — they provide edge patterns and summary styles for non-code files.
  3. Framework addendum injection: For each framework detected in Phase 1 (e.g., `Django`), read the file at `./frameworks/<framework-id-lowercase>.md` (e.g., `./frameworks/django.md`) and append its full content after the language context. If the file does not exist for a detected framework, skip it silently and continue. These files are in the `frameworks/` subdirectory next to this SKILL.md file.

Append the language/framework context and the following additional context to the agent's prompt:

Additional context from main session:

Frameworks detected: <frameworks from Phase 1>

Directory tree (top 2 levels):

$DIR_TREE

Use the directory tree, language context, and framework addendums (appended above) to inform layer assignments. Directory structure is strong evidence for layer boundaries. Non-code files (config, docs, infrastructure, data) should be assigned to appropriate layers — see the prompt template for guidance.

Pass these parameters in the dispatch prompt:

Analyze this codebase's structure to identify architectural layers.
Project root: $PROJECT_ROOT
Write output to: $PROJECT_ROOT/.understand-anything/intermediate/layers.json
Project: <projectName>
<projectDescription>

File nodes (all node types — includes code files, config, document, service, pipeline, table, schema, resource, endpoint):

[list of {id, type, name, filePath, summary, tags} for ALL file-level nodes — omit complexity, languageNotes]

Import edges:

[list of edges with type "imports"]

All edges (for cross-category analysis — includes configures, documents, deploys, triggers, etc.):

[list of ALL edges — include all edge types]

After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/layers.json` and normalize it into a final `layers` array. Apply these steps in order:

  1. Unwrap envelope: If the file contains `{ "layers": [...] }` instead of a plain array, extract the inner array. (The prompt requests a plain array, but LLMs may still produce an envelope.)
  2. Rename legacy fields: If any layer object has a `nodes` field instead of `nodeIds`, rename `nodes` → `nodeIds`. If `nodes` entries are objects with an `id` field rather than plain strings, extract just the `id` values into `nodeIds`.
  3. Synthesize missing IDs: If any layer is missing an `id`, generate one as `layer:<kebab-case-name>`.
  4. Convert file paths: If `nodeIds` entries are raw file paths without a known prefix (`file:`, `config:`, `document:`, `service:`, `pipeline:`, `table:`, `schema:`, `resource:`, `endpoint:`), convert them to `file:<relative-path>`.
  5. Drop dangling refs: Remove any `nodeIds` entries that do not exist in the merged node set.

Each element of the final `layers` array MUST have this shape:

    [
      {
        "id": "layer:<kebab-case-name>",
        "name": "<layer name>",
        "description": "<what belongs in this layer>",
        "nodeIds": ["file:src/App.tsx", "config:tsconfig.json", "document:README.md"]
      }
    ]

All four fields (`id`, `name`, `description`, `nodeIds`) are required.
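The five normalization steps can be sketched as one pass. Illustrative only: `normalizeLayers` is a hypothetical helper, and `validNodeIds` stands in for the merged node-id set:

```javascript
// Illustrative sketch of the layers normalization steps (numbered as above).
const KNOWN_PREFIXES = ['file:', 'config:', 'document:', 'service:',
  'pipeline:', 'table:', 'schema:', 'resource:', 'endpoint:'];

function normalizeLayers(raw, validNodeIds) {
  const layers = Array.isArray(raw) ? raw : (raw.layers ?? []); // 1. unwrap envelope
  return layers.map(layer => {
    let ids = layer.nodeIds ?? layer.nodes ?? [];               // 2. legacy field
    ids = ids
      .map(entry => (typeof entry === 'object' ? entry.id : entry)) // 2. {id} objects
      .map(id => KNOWN_PREFIXES.some(p => id.startsWith(p)) ? id : `file:${id}`) // 4.
      .filter(id => validNodeIds.has(id));                      // 5. drop dangling
    const id = layer.id ??
      `layer:${layer.name.trim().toLowerCase().replace(/\s+/g, '-')}`; // 3. synthesize
    return { id, name: layer.name, description: layer.description, nodeIds: ids };
  });
}
```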

For incremental updates: always re-run architecture analysis on the full merged node set, since layer assignments may shift when files change. When re-running, also inject the previous layer definitions:

Previous layer definitions (for naming consistency):

[previous layers from existing graph]

Maintain the same layer names and IDs where possible. Only add or remove layers if the file structure has materially changed.


Phase 5 — TOUR

Dispatch a subagent using the `tour-builder` agent definition (at `agents/tour-builder.md`). Append the following additional context:

Additional context from main session:

Project README (first 3000 chars):

$README_CONTENT

Project entry point:

$ENTRY_POINT

Use the README to align the tour narrative with the project's own documentation. Start the tour from the entry point if one was detected. The tour should tell the same story the README tells, but through the lens of the actual code structure.

Pass these parameters in the dispatch prompt:

Create a guided learning tour for this codebase.
Project root: $PROJECT_ROOT
Write output to: $PROJECT_ROOT/.understand-anything/intermediate/tour.json
Project: <projectName>
<projectDescription>
Languages: <languages>

Nodes (all file-level nodes — includes code files, config, document, service, pipeline, table, schema, resource, endpoint):

[list of {id, name, filePath, summary, type} for ALL file-level nodes — do NOT include function or class nodes]

Layers:

[list of {id, name, description} for each layer — omit nodeIds]

Edges (all types — includes imports, calls, configures, documents, deploys, triggers, etc.):

[list of ALL edges — include all edge types for complete graph topology analysis]

After the subagent completes, read `$PROJECT_ROOT/.understand-anything/intermediate/tour.json` and normalize it into a final `tour` array. Apply these steps in order:

  1. Unwrap envelope: If the file contains `{ "steps": [...] }` instead of a plain array, extract the inner array. (The prompt requests a plain array, but LLMs may still produce an envelope.)
  2. Rename legacy fields: If any step has `nodesToInspect` instead of `nodeIds`, rename it → `nodeIds`. If any step has `whyItMatters` instead of `description`, rename it → `description`.
  3. Convert file paths: If `nodeIds` entries are raw file paths without a known prefix (`file:`, `config:`, `document:`, `service:`, `pipeline:`, `table:`, `schema:`, `resource:`, `endpoint:`), convert them to `file:<relative-path>`.
  4. Drop dangling refs: Remove any `nodeIds` entries that do not exist in the merged node set.
  5. Sort by `order` before saving.

Each element of the final `tour` array MUST have this shape:

    [
      {
        "order": 1,
        "title": "Project Overview",
        "description": "Start with the README to understand the project's purpose and architecture.",
        "nodeIds": ["document:README.md"]
      },
      {
        "order": 2,
        "title": "Application Entry Point",
        "description": "This step explains how the frontend boots and mounts.",
        "nodeIds": ["file:src/main.tsx", "file:src/App.tsx"]
      }
    ]

Required fields: `order`, `title`, `description`, `nodeIds`. Preserve the optional `languageLesson` field when present.


Phase 6 — REVIEW

Assemble the full KnowledgeGraph JSON object:

    {
      "version": "1.0.0",
      "project": {
        "name": "<projectName>",
        "languages": ["<languages>"],
        "frameworks": ["<frameworks>"],
        "description": "<projectDescription>",
        "analyzedAt": "<ISO 8601 timestamp>",
        "gitCommitHash": "<commit hash from Phase 0>"
      },
      "nodes": [<all nodes from assembled-graph.json after Phase 3 review>],
      "edges": [<all edges from assembled-graph.json after Phase 3 review>],
      "layers": [<layers from Phase 4>],
      "tour": [<steps from Phase 5>]
    }

  1. Before writing the assembled graph, validate that:

    • `layers` is an array of objects with the required fields `id`, `name`, `description`, `nodeIds`
    • `tour` is an array of objects with the required fields `order`, `title`, `description`, `nodeIds`
    • `tour[*].languageLesson` is allowed as an optional string field
    • Every `layers[*].nodeIds` entry exists in the merged node set
    • Every `tour[*].nodeIds` entry exists in the merged node set

    If validation fails, automatically normalize and rewrite the graph into this shape before saving. If the graph still fails final validation after the normalization pass, save it with warnings but mark dashboard auto-launch as skipped.

  2. Write the assembled graph to `$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json`.

  3. Check `$ARGUMENTS` for the `--review` flag, then run the appropriate validation path:


Default path (no `--review`): inline deterministic validation

Write the following Node.js script to `$PROJECT_ROOT/.understand-anything/tmp/ua-inline-validate.cjs`:

#!/usr/bin/env node
const fs = require('fs');
const graphPath = process.argv[2];
const outputPath = process.argv[3];
try {
  const graph = JSON.parse(fs.readFileSync(graphPath, 'utf8'));
  const issues = [], warnings = [];
  if (!Array.isArray(graph.nodes)) { issues.push('graph.nodes is missing or not an array'); graph.nodes = []; }
  if (!Array.isArray(graph.edges)) { issues.push('graph.edges is missing or not an array'); graph.edges = []; }
  const nodeIds = new Set();
  const seen = new Map();
  graph.nodes.forEach((n, i) => {
    if (!n.id) { issues.push(`Node[${i}] missing id`); return; }
    if (!n.type) issues.push(`Node[${i}] '${n.id}' missing type`);
    if (!n.name) issues.push(`Node[${i}] '${n.id}' missing name`);
    if (!n.summary) issues.push(`Node[${i}] '${n.id}' missing summary`);
    if (!n.tags || !n.tags.length) issues.push(`Node[${i}] '${n.id}' missing tags`);
    if (seen.has(n.id)) issues.push(`Duplicate node ID '${n.id}' at indices ${seen.get(n.id)} and ${i}`);
    else seen.set(n.id, i);
    nodeIds.add(n.id);
  });
  graph.edges.forEach((e, i) => {
    if (!nodeIds.has(e.source)) issues.push(`Edge[${i}] source '${e.source}' not found`);
    if (!nodeIds.has(e.target)) issues.push(`Edge[${i}] target '${e.target}' not found`);
  });
  const fileLevelTypes = new Set(['file', 'config', 'document', 'service', 'pipeline', 'table', 'schema', 'resource', 'endpoint']);
  const fileNodes = graph.nodes.filter(n => fileLevelTypes.has(n.type)).map(n => n.id);
  const assigned = new Map();
  if (!Array.isArray(graph.layers)) { if (graph.layers) warnings.push('graph.layers is not an array'); graph.layers = []; }
  if (!Array.isArray(graph.tour)) { if (graph.tour) warnings.push('graph.tour is not an array'); graph.tour = []; }
  graph.layers.forEach(layer => {
    (layer.nodeIds || []).forEach(id => {
      if (!nodeIds.has(id)) issues.push(`Layer '${layer.id}' refs missing node '${id}'`);
      if (assigned.has(id)) issues.push(`Node '${id}' appears in multiple layers`);
      assigned.set(id, layer.id);
    });
  });
  fileNodes.forEach(id => {
    if (!assigned.has(id)) issues.push(`File node '${id}' not in any layer`);
  });
  graph.tour.forEach((step, i) => {
    (step.nodeIds || []).forEach(id => {
      if (!nodeIds.has(id)) issues.push(`Tour step[${i}] refs missing node '${id}'`);
    });
  });
  const withEdges = new Set([
    ...graph.edges.map(e => e.source),
    ...graph.edges.map(e => e.target)
  ]);
  graph.nodes.forEach(n => {
    if (!withEdges.has(n.id)) warnings.push(`Node '${n.id}' has no edges (orphan)`);
  });
  const stats = {
    totalNodes: graph.nodes.length,
    totalEdges: graph.edges.length,
    totalLayers: graph.layers.length,
    tourSteps: graph.tour.length,
    nodeTypes: graph.nodes.reduce((a, n) => { a[n.type] = (a[n.type]||0)+1; return a; }, {}),
    edgeTypes: graph.edges.reduce((a, e) => { a[e.type] = (a[e.type]||0)+1; return a; }, {})
  };
  fs.writeFileSync(outputPath, JSON.stringify({ issues, warnings, stats }, null, 2));
  process.exit(0);
} catch (err) { process.stderr.write(err.message + '\n'); process.exit(1); }

Execute it:

    node $PROJECT_ROOT/.understand-anything/tmp/ua-inline-validate.cjs \
      "$PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json" \
      "$PROJECT_ROOT/.understand-anything/intermediate/review.json"

If the script exits non-zero, read stderr, fix the script, and retry once.


`--review` path: full LLM reviewer

If `--review` IS in `$ARGUMENTS`, dispatch a subagent using the `graph-reviewer` agent definition (at `agents/graph-reviewer.md`). Append the following additional context:

Additional context from main session:

Phase 1 scan results (file inventory):

[list of {path, sizeLines} from scan-result.json]

Phase warnings/errors accumulated during analysis:

  • [list any batch failures, skipped files, or warnings from Phases 2-5]

Cross-validate: every file in the scan inventory should have a corresponding node in the graph (node types may vary: `file:`, `config:`, `document:`, `service:`, `pipeline:`, `table:`, `schema:`, `resource:`, `endpoint:`). Flag any missing files. Also flag any graph nodes whose `filePath` doesn't appear in the scan inventory.

Pass these parameters in the dispatch prompt:

Validate the knowledge graph at $PROJECT_ROOT/.understand-anything/intermediate/assembled-graph.json. Read the file and validate it for completeness and correctness.
Project root: $PROJECT_ROOT
Write output to: $PROJECT_ROOT/.understand-anything/intermediate/review.json


  1. Read `$PROJECT_ROOT/.understand-anything/intermediate/review.json`.

  2. If the `issues` array is non-empty:

    • Review the `issues` list
    • Apply automated fixes where possible:
      • Remove edges with dangling references
      • Fill missing required fields with sensible defaults (e.g., empty `tags` → `["untagged"]`, empty `summary` → `"No summary available"`)
      • Remove nodes with invalid types
    • Re-run the final graph validation after the automated fixes
    • If critical issues remain after one fix attempt, save the graph anyway, but include the warnings in the final report and mark dashboard auto-launch as skipped

  3. If the `issues` array is empty: proceed to Phase 7.
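The automated fixes in step 2 amount to a short pass over the graph. A sketch only; the defaults are the ones named above:

```javascript
// Illustrative sketch of the automated fixes from step 2.
function applyAutomatedFixes(graph) {
  const ids = new Set(graph.nodes.map(n => n.id));
  // Remove edges with dangling references.
  graph.edges = graph.edges.filter(e => ids.has(e.source) && ids.has(e.target));
  // Fill missing required fields with the documented defaults.
  for (const node of graph.nodes) {
    if (!node.tags || node.tags.length === 0) node.tags = ['untagged'];
    if (!node.summary) node.summary = 'No summary available';
  }
  return graph;
}
```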


Phase 7 — SAVE

  1. Write the final knowledge graph to `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`.

  2. Write metadata to `$PROJECT_ROOT/.understand-anything/meta.json`:

    {
      "lastAnalyzedAt": "<ISO 8601 timestamp>",
      "gitCommitHash": "<commit hash>",
      "version": "1.0.0",
      "analyzedFiles": <number of files analyzed>
    }

  2.5. Generate structural fingerprints for all analyzed files and save them to `$PROJECT_ROOT/.understand-anything/fingerprints.json`. This creates the baseline for future automatic incremental updates. Write and execute a Node.js script that uses the core fingerprint module (tree-sitter-based, not regex):

    import { buildFingerprintStore, saveFingerprints } from '@understand-anything/core';

    const store = await buildFingerprintStore('<PROJECT_ROOT>', sourceFilePaths);
    saveFingerprints('<PROJECT_ROOT>', store);

    Here `sourceFilePaths` is the list of all analyzed source file paths from Phase 1. This uses the same tree-sitter analysis pipeline as the main fingerprint engine, ensuring the baseline matches the comparison logic used during auto-updates.

  3. Clean up intermediate files:

    rm -rf $PROJECT_ROOT/.understand-anything/intermediate
    rm -rf $PROJECT_ROOT/.understand-anything/tmp

  4. Report a summary to the user containing:

    • Project name and description
    • Files analyzed / total files (with a breakdown by fileCategory: code, config, docs, infra, data, script, markup)
    • Nodes created (broken down by type: file, function, class, config, document, service, table, endpoint, pipeline, schema, resource)
    • Edges created (broken down by type)
    • Layers identified (with names)
    • Tour steps generated (count)
    • Any warnings from the reviewer
    • Path to the output file: `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`

  5. Automatically launch the dashboard by invoking the `/understand-dashboard` skill only if final graph validation passed after normalization/review fixes. If final validation did not pass, report that the graph was saved with warnings and dashboard launch was skipped.


Error Handling

  • If any subagent dispatch fails, retry once with the same prompt plus additional context about the failure. If it fails a second time, skip that phase and continue with partial results.
  • Track all warnings and errors from each phase in a `$PHASE_WARNINGS` list. When using `--review`, pass this list to the graph-reviewer in Phase 6. On the default path, include accumulated warnings in the Phase 7 final report.
  • ALWAYS save partial results — a partial graph is better than no graph.
  • Report any skipped phases or errors in the final summary so the user knows what happened.
  • NEVER silently drop errors. Every failure must be visible in the final report.

Reference: KnowledgeGraph Schema

Node Types (13 total)

| Type | Description | ID Convention |
| --- | --- | --- |
| `file` | Source code file | `file:<relative-path>` |
| `function` | Function or method | `function:<relative-path>:<name>` |
| `class` | Class, interface, or type | `class:<relative-path>:<name>` |
| `module` | Logical module or package | `module:<name>` |
| `concept` | Abstract concept or pattern | `concept:<name>` |
| `config` | Configuration file (YAML, JSON, TOML, env) | `config:<relative-path>` |
| `document` | Documentation file (Markdown, RST, TXT) | `document:<relative-path>` |
| `service` | Deployable service definition (Dockerfile, K8s) | `service:<relative-path>` |
| `table` | Database table or migration | `table:<relative-path>:<table-name>` |
| `endpoint` | API endpoint or route definition | `endpoint:<relative-path>:<endpoint-name>` |
| `pipeline` | CI/CD pipeline configuration | `pipeline:<relative-path>` |
| `schema` | Schema definition (GraphQL, Protobuf, Prisma) | `schema:<relative-path>` |
| `resource` | Infrastructure resource (Terraform, CloudFormation) | `resource:<relative-path>` |

Edge Types (26 total)

| Category | Types |
| --- | --- |
| Structural | `imports`, `exports`, `contains`, `inherits`, `implements` |
| Behavioral | `calls`, `subscribes`, `publishes`, `middleware` |
| Data flow | `reads_from`, `writes_to`, `transforms`, `validates` |
| Dependencies | `depends_on`, `tested_by`, `configures` |
| Semantic | `related`, `similar_to` |
| Infrastructure | `deploys`, `serves`, `provisions`, `triggers` |
| Schema/Data | `migrates`, `documents`, `routes`, `defines_schema` |

Edge Weight Conventions

| Edge Type | Weight |
| --- | --- |
| `contains` | 1.0 |
| `inherits`, `implements` | 0.9 |
| `calls`, `exports`, `defines_schema` | 0.8 |
| `imports`, `deploys`, `migrates` | 0.7 |
| `depends_on`, `configures`, `triggers` | 0.6 |
| `tested_by`, `documents`, `provisions`, `serves`, `routes` | 0.5 |
| All others | 0.5 (default) |
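The weight conventions amount to a lookup table with a 0.5 default. A sketch only — `weightFor` is an illustrative name:

```javascript
// Illustrative sketch of the edge-weight conventions above.
const EDGE_WEIGHTS = {
  contains: 1.0,
  inherits: 0.9, implements: 0.9,
  calls: 0.8, exports: 0.8, defines_schema: 0.8,
  imports: 0.7, deploys: 0.7, migrates: 0.7,
  depends_on: 0.6, configures: 0.6, triggers: 0.6,
  tested_by: 0.5, documents: 0.5, provisions: 0.5, serves: 0.5, routes: 0.5,
};

// Any edge type not in the table falls back to the 0.5 default.
const weightFor = edgeType => EDGE_WEIGHTS[edgeType] ?? 0.5;
```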