Session-orchestrator discovery
git clone https://github.com/Kanevry/session-orchestrator
T=$(mktemp -d) && git clone --depth=1 https://github.com/Kanevry/session-orchestrator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/discovery" ~/.claude/skills/kanevry-session-orchestrator-discovery && rm -rf "$T"
skills/discovery/SKILL.mdDiscovery Skill
Invocation Modes
Two modes of operation:
- Standalone (
): Full 6-phase flow with interactive triage (Phases 0-6)/discovery [scope] - Embedded (from session-end when
): Phases 0-4 only, returns structured findings to session-enddiscovery-on-close: true
The
scope argument accepts: all (default), code, infra, ui, arch, session, or comma-separated like code,session.
Phase 0: Bootstrap Gate
Read
skills/_shared/bootstrap-gate.md and execute the gate check. If the gate is CLOSED, invoke skills/bootstrap/SKILL.md and wait for completion before proceeding. If the gate is OPEN, continue to Phase 1.
<HARD-GATE>
Do NOT proceed past Phase 0 if GATE_CLOSED. There is no bypass. Refer to `skills/_shared/bootstrap-gate.md` for the full HARD-GATE constraints.
</HARD-GATE>
Phase 1: Read Session Config
Read and parse Session Config per
skills/_shared/config-reading.md. Store result as $CONFIG.
Discovery-relevant fields (parse these specifically):
,discovery-on-close
,discovery-probes
,discovery-exclude-paths
,discovery-severity-thresholddiscovery-confidence-threshold
,test-command
,typecheck-commandlint-command
,pencil
,vcs
,cross-reposstale-issue-days
Phase 2: Stack Detection & Probe Activation
Detect the project's tech stack via marker file checks. Use Glob and run checks in parallel:
| Marker File(s) | Activates |
|---|---|
| JS/TS probes |
| TypeScript probes |
/ | Python probes |
/ | Container probes |
/ | Vercel probes |
| GitHub CI probes |
| GitLab CI probes |
| Supabase probes |
/ | SSR probes |
| Tailwind probes |
| Pencil in Session Config | design-drift probe |
Build Activation Set
- Start with all probes whose marker files are present
- If
is set in config, intersect with that listdiscovery-probes - If a
argument was passed, restrict to that categoryscope - Remove probes whose activation conditions are not met
Exclude Paths
Default exclude paths (always apply):
,node_modules/
,.git/
,dist/
,build/
,.next/
,.nuxt/coverage/
Add any paths from
discovery-exclude-paths in Session Config.
VCS Detection
VCS Reference: Detect the VCS platform per the "VCS Auto-Detection" section of the gitlab-ops skill.
Status Report
Report: "Discovery: [N] probes active across [categories]. Stack: [detected]. Threshold: [severity]."
Phase 3: Probe Execution
Dispatch probe agents IN PARALLEL using the Agent tool. Group by category (max 5 agents):
Cursor IDE: No Agent() tool available. Run probes sequentially within the current session — one category at a time. Complete each category's analysis before moving to the next.
- Code probes agent: Runs all activated code probes (hardcoded-values, orphaned-annotations, dead-code, ai-slop, type-safety-gaps, test-coverage-gaps, test-anti-patterns, security-basics)
- Infra probes agent: Runs all activated infra probes
- UI probes agent: Runs all activated UI probes
- Arch probes agent: Runs all activated arch probes
- Session probes agent: Runs all activated session probes
Each agent receives:
- The probe definitions from
(confidence scoring reference) AND the category-specificprobes-intro.md
file for this agent's category (include the actual grep commands/patterns in the prompt)probes-<category>.md - The exclude paths list
- The project root path
- Tools: Read, Grep, Glob, Bash (read-only -- no Edit/Write)
- Instruction: "Run each probe. For each finding, output EXACTLY this format:"
FINDING: probe: <probe_name> category: <category> severity: <critical|high|medium|low> file_path: <absolute path> line_number: <number> matched_text: <exact text from tool output> title: <short title for the finding> description: <1-2 sentence description> recommended_fix: <concrete fix suggestion>
"If a probe's activation condition is not met, skip it with: SKIPPED: <probe_name> -- <reason>" "If a probe command fails, skip it with: FAILED: <probe_name> -- <error>" "Do NOT fabricate findings. Only report what tool output confirms."
CRITICAL:
run_in_background: false for all agents.
Skip categories with no activated probes (don't dispatch empty agents).
Phase 4: Verification & Scoring
After all probe agents complete:
4.1 Parse Findings
Collect all
FINDING: blocks from agent outputs into a unified findings list.
4.2 Verification Pass
For EACH finding:
- Read the file at
using the Read toolfile_path:line_number - Confirm
appears at or near that line (+/-3 lines tolerance)matched_text - If NOT confirmed, discard with note "false positive -- text not found at reported location"
- Report: "Verification: N confirmed, M discarded as false positives"
4.2a Confidence Scoring
For each verified finding, assign a confidence score (0-100) based on three factors:
| Factor | Low (+0) | Medium (+10) | High (+20) |
|---|---|---|---|
| Pattern specificity | Generic match (URL, TODO) | Moderate (orphaned annotation, magic number) | Specific (API key regex, eval(), SQL injection) |
| File context | Test fixture, example, seed data, docs | Utility, config, scripts | Production source, API handler, middleware |
| Historical signal | Previously dismissed as false positive | No prior data (first occurrence) | Recurring issue (confirmed in learnings.jsonl) |
Scoring rules:
- Start at 40 (baseline)
- Add each factor's score (+0/+10/+20)
- Clamp to 0-100 range
- Critical severity override: findings with severity
get a minimum confidence of 70 — they are NEVER auto-deferredcritical
Threshold: Read
discovery-confidence-threshold from Session Config (default: 60). If not configured, use 60.
Annotate each finding with its confidence score for Phase 5 presentation.
4.3 Deduplication
Two findings are duplicates if:
- Same
ANDfile_path - Overlapping line range (+/-5 lines) AND
- Different probes
Keep the higher severity finding. Merge descriptions.
4.4 Apply Thresholds
- Severity filter: Remove findings below
from Session Config.discovery-severity-threshold - Confidence filter: Remove findings with confidence score below
(default: 60). Log filtered-out findings: "Auto-dismissed N low-confidence findings (below threshold [T]). Usediscovery-confidence-threshold
to see all."discovery-confidence-threshold: 0
4.5 Group by Category
Group remaining findings by category for Phase 5 presentation.
4.6 Embedded Mode Exit
If in embedded mode (called from session-end): STOP HERE. Return structured findings to the caller using this schema:
Embedded mode return schema:
{ "findings": [ {"probe": "string", "category": "string", "severity": "critical|high|medium|low", "confidence": 0-100, "file": "string", "line": number, "description": "string", "recommendation": "string"} ], "stats": { "probes_run": number, "findings_raw": number, "findings_verified": number, "false_positives": number, "user_dismissed": 0, "issues_created": 0, "by_category": {"<category>": {"findings": number, "actioned": 0}} } }
Present both as structured data in your final output. Do not proceed to Phase 5.
Phase 5: Interactive Triage (Standalone Mode Only)
5.0 Auto-Defer Low-Confidence Findings
Before presenting findings for triage, separate by confidence threshold:
- Findings with confidence >= threshold → present for interactive triage (below)
- Findings with confidence < threshold → auto-defer with summary:
"Auto-deferred [N] low-confidence findings (score < [threshold]). Review with
."/discovery --include-deferred - List auto-deferred findings in a collapsed section (not interactive — informational only)
5.1 Present High-Confidence Findings
Present findings using AskUserQuestion -- NEVER plain text options. On Codex CLI where AskUserQuestion is unavailable, present as numbered Markdown lists.
Include confidence scores in the presentation:
[CRITICAL] (confidence: 85) hardcoded-values: API key found in src/config.ts:42 [HIGH] (confidence: 72) security-basics: eval() usage in src/utils/parser.ts:18 [MEDIUM] (confidence: 61) orphaned-annotations: TODO without issue in src/lib/auth.ts:55
Step 1: Summary
Present a findings overview table:
## Discovery Results Probes run: [N] | Findings verified: [N] | False positives discarded: [N] | Category | Critical | High | Medium | Low | Total | |----------|----------|------|--------|-----|-------| | Code | ... | ... | ... | ... | ... | | Infra | ... | ... | ... | ... | ... | | UI | ... | ... | ... | ... | ... | | Arch | ... | ... | ... | ... | ... | | Session | ... | ... | ... | ... | ... |
Step 2: Critical + High Findings -- Review Individually
For each Critical or High finding, use AskUserQuestion (on Codex CLI where AskUserQuestion is unavailable, present as numbered Markdown lists):
AskUserQuestion({ questions: [{ question: "<finding title>\n\n<file_path>:<line_number>\n```\n<matched_text with +/-3 lines context>\n```\n\n<description>\n\nRecommended fix: <recommended_fix>", header: "<severity>", options: [ { label: "Create issue (<severity>)", description: "Create a priority:<severity> issue for this finding" }, { label: "Adjust priority", description: "Create issue with different priority" }, { label: "Dismiss -- intentional", description: "This is by design, skip" }, { label: "Dismiss -- false positive", description: "Detection was wrong, skip" } ] }] })
If user selects "Adjust priority", ask which priority with another AskUserQuestion. On Codex CLI where AskUserQuestion is unavailable, present as numbered Markdown lists.
Step 3: Medium + Low Findings -- Review Batched
Group remaining findings by category. For each category with medium/low findings (on Codex CLI where AskUserQuestion is unavailable, present as numbered Markdown lists):
AskUserQuestion({ questions: [{ question: "[N] medium/low findings in [category]:\n\n1. [title] -- [file_path]:[line] ([severity])\n2. [title] -- [file_path]:[line] ([severity])\n...", header: "[Category]", options: [ { label: "Accept all (Recommended)", description: "Create issues for all [N] findings" }, { label: "Review individually", description: "Walk through each finding one by one" }, { label: "Dismiss all", description: "Skip all medium/low findings in this category" } ] }] })
If "Review individually" selected, walk through each like Step 2.
Step 4: Batch Confirmation
Before creating any issues (on Codex CLI where AskUserQuestion is unavailable, present as numbered Markdown lists):
AskUserQuestion({ questions: [{ question: "Ready to create [N] issues?\n\n- [X] critical\n- [Y] high\n- [Z] medium\n- [W] low", header: "Confirm", options: [ { label: "Create all [N] issues", description: "Proceed with issue creation" }, { label: "Review list first", description: "Show full list before creating" }, { label: "Cancel", description: "Do not create any issues" } ] }] })
Phase 6: Issue Creation & Report
6.1 Issue Creation
VCS Reference: Detect the VCS platform per the "VCS Auto-Detection" section of the gitlab-ops skill. Use CLI commands per the "Common CLI Commands" section.
For each approved finding:
- Format using the Discovery Finding Issue Template from
issue-templates.md - Determine labels:
+type:discovery
+priority:<level>
+area:<inferred from category/filepath>status:ready - Create issue via VCS CLI:
- GitLab:
glab issue create --title "[Discovery] <title>" --label "type:discovery,priority:<level>,area:<area>,status:ready" --description "<body>" - GitHub:
gh issue create --title "[Discovery] <title>" --label "type:discovery,priority:<level>,area:<area>,status:ready" --body "<body>"
- GitLab:
- Brief pause (1s) between creations for rate limiting
6.2 Final Report
## Discovery Report ### Summary - Probes run: [N] across [categories] - Raw findings: [N] - Verified: [N] (false positives discarded: [M]) - User approved: [N] - Issues created: [N] ### Created Issues | # | Title | Priority | Area | Probe | |---|-------|----------|------|-------| | <IID> | <title> | <priority> | <area> | <probe> | ### Dismissed Findings - [N] dismissed as intentional - [M] dismissed as false positive ### Recommendations - [suggestions based on finding patterns]
Critical Rules
- NEVER fabricate findings -- every finding must come from tool output with verifiable evidence
- NEVER create issues without user approval (standalone mode)
- ALWAYS verify findings by re-reading the file at the reported line (Phase 4)
- ALWAYS use AskUserQuestion for triage decisions -- never plain text options (On Codex CLI, use numbered lists as AskUserQuestion is not available)
- ALWAYS reference gitlab-ops for VCS operations -- never duplicate CLI logic inline
- If a probe command fails or tool is unavailable, skip gracefully, continue with others
- If in embedded mode, stop after Phase 4, return findings as structured data
- Respect
-- filter before presenting to userdiscovery-severity-threshold - Default exclude paths always apply:
,node_modules/
,.git/
,dist/
,build/
,.next/
,.nuxt/coverage/
Phase 7: Capture Discovery Stats (Standalone Mode Only)
This phase runs only in standalone mode. Embedded mode returns findings to the caller.
After Phase 6 (Issue Creation) completes, prepare discovery statistics for session metrics:
-
Count totals from the triage results:
: number of probes that were activated and executedprobes_run
: total findings before verificationfindings_raw
: findings that passed Phase 4.2 verificationfindings_verified
: findings discarded during verificationfalse_positives
: findings the user declined during Phase 5 triageuser_dismissed
: issues created in Phase 6issues_created
: per-category breakdown of findings and actioned itemsby_category
-
Report stats summary:
Discovery stats: [probes_run] probes, [findings_raw] raw → [findings_verified] verified ([false_positives] false positives). User dismissed [user_dismissed]. Created [issues_created] issues. -
These stats are available for session-end to include in
under thesessions.jsonl
field. The discovery skill does NOT write todiscovery_stats
directly — session-end handles that.sessions.jsonl
Anti-Patterns
- DO NOT report findings without verifying them first — false positives erode user trust faster than missed issues
- DO NOT skip the interactive triage phase — auto-creating issues for unconfirmed findings creates noise
- DO NOT run probes outside the configured scope — if the user asked for
scope, don't scan infrastructurecode - DO NOT present findings without evidence (file path, line number, actual content) — assertions without proof are useless
- DO NOT write to
directly — session-end handles metrics persistencesessions.jsonl