Oh-my-codex deep-interview
Socratic deep interview with mathematical ambiguity gating before execution
git clone https://github.com/Yeachan-Heo/oh-my-codex
T=$(mktemp -d) && git clone --depth=1 https://github.com/Yeachan-Heo/oh-my-codex "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/deep-interview" ~/.claude/skills/yeachan-heo-oh-my-codex-deep-interview && rm -rf "$T"
skills/deep-interview/SKILL.md
<Use_When>
- The request is broad, ambiguous, or missing concrete acceptance criteria
- The user says "deep interview", "interview me", "ask me everything", "don't assume", or "ouroboros"
- The user wants to avoid misaligned implementation from underspecified requirements
- You need a requirements artifact before handing off to ralplan, autopilot, ralph, or team
</Use_When>
<Do_Not_Use_When>
- The request already has concrete file/symbol targets and clear acceptance criteria
- The user explicitly asks to skip planning/interview and execute immediately
- The user asks for lightweight brainstorming only (use plan instead)
- A complete PRD/plan already exists and execution should start
</Do_Not_Use_When>
<Why_This_Exists> Execution quality is usually bottlenecked by intent clarity, not just missing implementation detail. A single expansion pass often misses why the user wants a change, where the scope should stop, which tradeoffs are unacceptable, and which decisions still require user approval. This workflow applies Socratic pressure + quantitative ambiguity scoring so orchestration modes begin with an explicit, testable, intent-aligned spec. </Why_This_Exists>
<Depth_Profiles>
- Quick (--quick): fast pre-PRD pass; target threshold <= 0.30; max rounds 5
- Standard (--standard, default): full requirement interview; target threshold <= 0.20; max rounds 12
- Deep (--deep): high-rigor exploration; target threshold <= 0.15; max rounds 20
- Autoresearch (--autoresearch): same interview rigor as Standard, but specialized for $autoresearch mission readiness and .omx/specs/ artifact handoff
If no flag is provided, use Standard.
<Mode_Flags>
--autoresearch: switch the interview into autoresearch-intake mode for $autoresearch handoff. In this mode, the interview should converge on a validator-ready research mission, write canonical artifacts under .omx/specs/, and preserve the explicit refine further vs launch boundary for downstream skill intake.
</Mode_Flags>
</Depth_Profiles>
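The profile list above can be read as a small lookup table. The following is a minimal sketch, not the skill's actual implementation: `Profile`, `PROFILES`, and `resolve_profile` are hypothetical names, and only the flags, thresholds, and round caps are taken from the text. Autoresearch is left out because the text does not state its numbers explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    threshold: float  # interview may stop once ambiguity <= threshold
    max_rounds: int   # hard cap on interview rounds


# Values copied from the depth-profile list; the structure is illustrative.
PROFILES = {
    "--quick": Profile("quick", 0.30, 5),
    "--standard": Profile("standard", 0.20, 12),
    "--deep": Profile("deep", 0.15, 20),
}


def resolve_profile(args: list[str]) -> Profile:
    """Return the first recognized profile flag, defaulting to Standard."""
    for arg in args:
        if arg in PROFILES:
            return PROFILES[arg]
    return PROFILES["--standard"]
```

Calling `resolve_profile([])` falls back to Standard, matching the rule that Standard applies when no flag is provided.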
<Execution_Policy>
- Ask ONE question per round (never batch)
- Ask about intent and boundaries before implementation detail
- Target the weakest clarity dimension each round after applying the stage-priority rules below
- Treat every answer as a claim to pressure-test before moving on: the next question should usually demand evidence or examples, expose a hidden assumption, force a tradeoff or boundary, or reframe root cause vs symptom
- Do not rotate to a new clarity dimension just for coverage when the current answer is still vague; stay on the same thread until one layer deeper, one assumption clearer, or one boundary tighter
- Before crystallizing, complete at least one explicit pressure pass that revisits an earlier answer with a deeper, assumption-focused, or tradeoff-focused follow-up
- Gather codebase facts via explore before asking user about internals
- When session guidance enables USE_OMX_EXPLORE_CMD, prefer omx explore for simple read-only brownfield fact gathering; keep prompts narrow and concrete, keep ambiguous or non-shell-only investigation on the richer normal path, and fall back normally if omx explore is unavailable.
- Always run a preflight context intake before the first interview question
- Reduce user effort: ask only the highest-leverage unresolved question, and never ask the user for codebase facts that can be discovered directly
- For brownfield work, prefer evidence-backed confirmation questions such as "I found X in Y. Should this change follow that pattern?"
- In Codex CLI, deep-interview uses omx question as the required OMX-owned structured questioning path for every interview round
- If omx question is unavailable in the current runtime, treat that as a blocker/error for deep-interview rather than falling back to request_user_input or plain-text questioning
- Re-score ambiguity after each answer and show progress transparently
- Do not hand off to execution while ambiguity remains above threshold unless user explicitly opts to proceed with warning
- Do not crystallize or hand off while Non-goals or Decision Boundaries remain unresolved, even if the weighted ambiguity threshold is met
- Treat early exit as a safety valve, not the default success path
- Persist mode state for resume safety (state_write / state_read)
</Execution_Policy>
Phase 0: Preflight Context Intake
- Parse {{ARGUMENTS}} and derive a short task slug.
- Attempt to load the latest relevant context snapshot from .omx/context/{slug}-*.md.
- If no snapshot exists, create a minimum context snapshot with:
  - Task statement
  - Desired outcome
  - Stated solution (what the user asked for)
  - Probable intent hypothesis (why they likely want it)
  - Known facts/evidence
  - Constraints
  - Unknowns/open questions
  - Decision-boundary unknowns
  - Likely codebase touchpoints
- Save snapshot to .omx/context/{slug}-{timestamp}.md (UTC YYYYMMDDTHHMMSSZ) and reference it in mode state.
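As an illustration of the snapshot naming convention, here is a hedged sketch that builds a path in the .omx/context/{slug}-{timestamp}.md shape with a UTC YYYYMMDDTHHMMSSZ stamp. The `slugify` rule (lowercase, first five alphanumeric words) is an assumption; the text only says to derive a short task slug.

```python
import re
from datetime import datetime, timezone


def slugify(task: str, max_words: int = 5) -> str:
    """Assumed slug rule: lowercase alphanumeric words joined by hyphens."""
    words = re.findall(r"[a-z0-9]+", task.lower())
    return "-".join(words[:max_words]) or "task"


def snapshot_path(task: str, now: datetime) -> str:
    """Build .omx/context/{slug}-{timestamp}.md using a UTC timestamp."""
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f".omx/context/{slugify(task)}-{stamp}.md"


if __name__ == "__main__":
    print(snapshot_path("Add OAuth login!", datetime.now(timezone.utc)))
```

Passing the clock in as `now` keeps the helper deterministic, which makes resume checks against an existing snapshot easier to reason about.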
Phase 1: Initialize
- Parse {{ARGUMENTS}} and depth profile (--quick|--standard|--deep).
- Detect project context:
  - Run explore to classify brownfield (existing codebase target) vs greenfield.
  - For brownfield, collect relevant codebase context before questioning.
- Initialize state via state_write(mode="deep-interview"):
{
  "active": true,
  "current_phase": "deep-interview",
  "state": {
    "interview_id": "<uuid>",
    "profile": "quick|standard|deep",
    "type": "greenfield|brownfield",
    "initial_idea": "<user input>",
    "rounds": [],
    "current_ambiguity": 1.0,
    "threshold": 0.3,
    "max_rounds": 5,
    "challenge_modes_used": [],
    "codebase_context": null,
    "current_stage": "intent-first",
    "current_focus": "intent",
    "context_snapshot_path": ".omx/context/<slug>-<timestamp>.md"
  }
}
- Announce kickoff with profile, threshold, and current ambiguity.
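A minimal sketch of how the initial state and the kickoff announcement could be assembled, assuming hypothetical helpers `initial_state` and `kickoff_line`. The field names mirror the state payload above; the wording of the kickoff line is an assumption, since the text only says it should mention profile, threshold, and current ambiguity.

```python
import uuid


def initial_state(idea: str, profile: str, threshold: float, max_rounds: int,
                  context_type: str, snapshot_path: str) -> dict:
    """Build the mode-state payload with the fields listed above."""
    return {
        "active": True,
        "current_phase": "deep-interview",
        "state": {
            "interview_id": str(uuid.uuid4()),
            "profile": profile,
            "type": context_type,        # "greenfield" or "brownfield"
            "initial_idea": idea,
            "rounds": [],
            "current_ambiguity": 1.0,    # nothing has been clarified yet
            "threshold": threshold,
            "max_rounds": max_rounds,
            "challenge_modes_used": [],
            "codebase_context": None,
            "current_stage": "intent-first",
            "current_focus": "intent",
            "context_snapshot_path": snapshot_path,
        },
    }


def kickoff_line(state: dict) -> str:
    """Assumed kickoff format: profile, threshold, and current ambiguity."""
    s = state["state"]
    return (f"Deep interview ({s['profile']}): threshold <= {s['threshold']:.2f}, "
            f"current ambiguity {s['current_ambiguity']:.0%}")
```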
Phase 2: Socratic Interview Loop
Repeat until ambiguity <= threshold with the pressure pass complete and the readiness gates explicit, or until the user exits with warning or max rounds are reached.
2a) Generate next question
Use:
- Original idea
- Prior Q&A rounds
- Current dimension scores
- Brownfield context (if any)
- Activated challenge mode injection (Phase 3)
Target the lowest-scoring dimension, but respect stage priority:
- Stage 1 — Intent-first: Intent, Outcome, Scope, Non-goals, Decision Boundaries
- Stage 2 — Feasibility: Constraints, Success Criteria
- Stage 3 — Brownfield grounding: Context Clarity (brownfield only)
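One way to combine "lowest-scoring dimension" with stage priority is sketched below. This is an interpretation, not the skill's documented algorithm: the `settled` cutoff of 0.8 and the idea of scoring Non-goals and Decision Boundaries numerically are assumptions made for illustration, and `next_focus` is a hypothetical name.

```python
# Stage order and membership follow the list above; key names are illustrative.
STAGES = [
    ("intent-first", ["intent", "outcome", "scope", "non_goals", "decision_boundaries"]),
    ("feasibility", ["constraints", "success"]),
    ("brownfield-grounding", ["context"]),
]


def next_focus(scores: dict[str, float], brownfield: bool, settled: float = 0.8):
    """Return (stage, dimension) for the weakest dimension in the earliest unsettled stage."""
    for stage, dims in STAGES:
        if stage == "brownfield-grounding" and not brownfield:
            continue  # context clarity only applies to brownfield work
        weak = [d for d in dims if d in scores and scores[d] < settled]
        if weak:
            return stage, min(weak, key=lambda d: scores[d])
    return None  # every applicable stage is settled
```

With this reading, a very low Constraints score does not pull focus away from Stage 1 while Scope is still below the cutoff.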
Follow-up pressure ladder after each answer:
- Ask for a concrete example, counterexample, or evidence signal behind the latest claim
- Probe the hidden assumption, dependency, or belief that makes the claim true
- Force a boundary or tradeoff: what would you explicitly not do, defer, or reject?
- If the answer still describes symptoms, reframe toward essence / root cause before moving on
Prefer staying on the same thread for multiple rounds when it has the highest leverage. Breadth without pressure is not progress.
Detailed dimensions:
- Intent Clarity — why the user wants this
- Outcome Clarity — what end state they want
- Scope Clarity — how far the change should go
- Constraint Clarity — technical or business limits that must hold
- Success Criteria Clarity — how completion will be judged
- Context Clarity — existing codebase understanding (brownfield only)
Non-goals and Decision Boundaries are mandatory readiness gates. Ask about them early and keep revisiting them until they are explicit.
2b) Ask the question
Use OMX-owned structured questioning via omx question for every interview round (this is the required AskUserQuestion equivalent for deep-interview) and present:
Round {n} | Target: {weakest_dimension} | Ambiguity: {score}%
{question}
omx question payload guidance for interview rounds:
- Use canonical type values instead of authoring raw multi_select flags by hand. type: "single-answerable" is the default for one-path decisions; type: "multi-answerable" is the canonical shape for bounded multi-select rounds. The runtime will keep multi_select aligned with type.
- Use single-answerable when exactly one answer should drive the next branch, the options are mutually exclusive, or selecting more than one answer would blur the decision boundary. Typical cases: handoff lane selection, choosing the primary failure mode, or confirming which of several competing interpretations is correct.
- Use multi-answerable when multiple options may all be true at once and you need to capture a bounded set of coexisting constraints, non-goals, risks, or acceptance checks in one round. Typical cases: selecting all out-of-scope items, all success metrics that must hold, or all deployment constraints that apply together.
- If one selected option would immediately require a follow-up question to disambiguate the others, prefer a single-answerable round now and ask the follow-up next. Do not hide a branching interview tree inside one overloaded multi-select prompt.
- Keep interview options bounded and concrete. If the valid answers are already known, set allow_other: false; only leave allow_other: true when the interview genuinely needs one user-supplied option that cannot be enumerated in advance.
- Read answers structurally. For single-answerable, expect one decisive selection in answer.value plus answer.selected_values. For multi-answerable, treat answer.selected_values as the source of truth for all chosen constraints/non-goals and preserve the full set in the transcript/spec.
Canonical bounded single-choice payload:
{
  "question": "Which execution lane should own this once the interview is complete?",
  "type": "single-answerable",
  "options": [
    { "label": "Plan first", "value": "ralplan", "description": "Need architecture and test-shape review before execution" },
    { "label": "Execute directly", "value": "autopilot", "description": "Requirements are already explicit enough for planning plus execution" },
    { "label": "Refine further", "value": "refine", "description": "Clarification is still needed before any handoff" }
  ],
  "allow_other": false,
  "other_label": "Other",
  "source": "deep-interview"
}
Canonical bounded multi-select payload:
{
  "question": "Which non-goals must stay out of scope for the first pass?",
  "type": "multi-answerable",
  "options": [
    { "label": "No UI redesign", "value": "no-ui-redesign", "description": "Keep layout and styling unchanged" },
    { "label": "No new dependencies", "value": "no-new-dependencies", "description": "Work within the existing toolchain" },
    { "label": "No API contract changes", "value": "no-api-contract-changes", "description": "Preserve external request and response shapes" }
  ],
  "allow_other": false,
  "other_label": "Other",
  "source": "deep-interview"
}
Canonical answer-shape reminders:
{ "answer": { "kind": "option", "value": "ralplan", "selected_labels": ["Plan first"], "selected_values": ["ralplan"] } }
{ "answer": { "kind": "multi", "value": ["no-new-dependencies", "no-api-contract-changes"], "selected_labels": ["No new dependencies", "No API contract changes"], "selected_values": ["no-new-dependencies", "no-api-contract-changes"] } }
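To show what "read answers structurally" can look like in practice, here is a small sketch that consumes the two answer shapes above. `read_answer` is a hypothetical helper; the `kind` values and field names come from the examples, while the strict one-selection check for single-answerable rounds is an assumption.

```python
def read_answer(response: dict):
    """Return one value for an "option" answer, or the full list for a "multi" answer."""
    answer = response["answer"]
    values = list(answer["selected_values"])  # source of truth for chosen options
    if answer["kind"] == "option":
        if len(values) != 1:
            raise ValueError("single-answerable rounds expect exactly one selection")
        return values[0]
    return values


if __name__ == "__main__":
    single = {"answer": {"kind": "option", "value": "ralplan",
                         "selected_labels": ["Plan first"],
                         "selected_values": ["ralplan"]}}
    print(read_answer(single))
```

Returning the whole list for multi-select answers preserves the full set of constraints or non-goals for the transcript and spec.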
2c) Score ambiguity
Score each weighted dimension in [0.0, 1.0] with justification + gap.
Greenfield:
ambiguity = 1 - (intent × 0.30 + outcome × 0.25 + scope × 0.20 + constraints × 0.15 + success × 0.10)
Brownfield:
ambiguity = 1 - (intent × 0.25 + outcome × 0.20 + scope × 0.20 + constraints × 0.15 + success × 0.10 + context × 0.10)
Readiness gate:
- Non-goals must be explicit
- Decision Boundaries must be explicit
- A pressure pass must be complete: at least one earlier answer has been revisited with an evidence, assumption, or tradeoff follow-up
- If either gate is unresolved, or the pressure pass is incomplete, continue interviewing even when weighted ambiguity is below threshold
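The two formulas and the readiness gate can be worked through in a few lines. This is a sketch under stated assumptions: the weights are copied from the formulas above, while `ambiguity` and `ready` are hypothetical helper names and the rounding to four decimals is only there to keep floating-point noise out of comparisons.

```python
GREENFIELD = {"intent": 0.30, "outcome": 0.25, "scope": 0.20,
              "constraints": 0.15, "success": 0.10}
BROWNFIELD = {"intent": 0.25, "outcome": 0.20, "scope": 0.20,
              "constraints": 0.15, "success": 0.10, "context": 0.10}


def ambiguity(scores: dict[str, float], brownfield: bool = False) -> float:
    """ambiguity = 1 - weighted sum of clarity scores (each in [0.0, 1.0])."""
    weights = BROWNFIELD if brownfield else GREENFIELD
    clarity = sum(scores[dim] * weight for dim, weight in weights.items())
    return round(1.0 - clarity, 4)


def ready(scores: dict[str, float], threshold: float, brownfield: bool,
          non_goals_explicit: bool, boundaries_explicit: bool,
          pressure_pass_done: bool) -> bool:
    """The threshold alone is not enough: both gates and the pressure pass must hold."""
    return (ambiguity(scores, brownfield) <= threshold
            and non_goals_explicit
            and boundaries_explicit
            and pressure_pass_done)
```

For example, greenfield scores of intent 0.9, outcome 0.8, scope 0.7, constraints 0.6, and success 0.5 give a weighted clarity of 0.27 + 0.20 + 0.14 + 0.09 + 0.05 = 0.75, so ambiguity is 0.25. That clears the Quick threshold (0.30) but not Standard (0.20), and even under Quick the interview continues if the pressure pass is still missing.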
2d) Report progress
Show weighted breakdown table, readiness-gate status (Non-goals, Decision Boundaries), and the next focus dimension.
2e) Persist state
Append round result and updated scores via state_write.
2f) Round controls
- Do not offer early exit before the first explicit assumption probe and one persistent follow-up have happened
- Round 4+: allow explicit early exit with risk warning
- Soft warning at profile midpoint (e.g., round 3/6/10 depending on profile)
- Hard cap at profile max_rounds
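The round controls above can be expressed as a small predicate set. A minimal sketch, assuming a hypothetical `round_controls` helper; the midpoint is computed as ceil(max_rounds / 2), which is an assumption that happens to reproduce the 3/6/10 examples for caps of 5/12/20.

```python
import math


def round_controls(round_no: int, max_rounds: int,
                   assumption_probed: bool, follow_up_done: bool) -> dict:
    """Evaluate early-exit, soft-warning, and hard-cap conditions for one round."""
    return {
        # Early exit needs round 4+ plus one assumption probe and one persistent follow-up.
        "early_exit_allowed": bool(round_no >= 4 and assumption_probed and follow_up_done),
        # Assumed midpoint rule: ceil(max_rounds / 2) gives 3, 6, and 10.
        "soft_warning": round_no == math.ceil(max_rounds / 2),
        "hard_cap_reached": round_no >= max_rounds,
    }
```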
Phase 3: Challenge Modes (assumption stress tests)
Use each mode once when applicable. These are normal escalation tools, not rare rescue moves:
- Contrarian (round 2+ or immediately when an answer rests on an untested assumption): challenge core assumptions
- Simplifier (round 4+ or when scope expands faster than outcome clarity): probe minimal viable scope
- Ontologist (round 5+ and ambiguity > 0.25, or when the user keeps describing symptoms): ask for essence-level reframing
Track used modes in state to prevent repetition.
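The activation rules for the three challenge modes can be sketched as follows. `applicable_modes` and its keyword flags are hypothetical names; the round numbers and the 0.25 ambiguity cutoff come from the list above, and the "use each mode once" rule is modeled by filtering against the set of modes already used.

```python
def applicable_modes(round_no: int, ambiguity: float, used: set[str], *,
                     untested_assumption: bool = False,
                     scope_outpacing_outcome: bool = False,
                     describing_symptoms: bool = False) -> list[str]:
    """Return challenge modes whose trigger holds and that have not been used yet."""
    triggers = {
        "contrarian": round_no >= 2 or untested_assumption,
        "simplifier": round_no >= 4 or scope_outpacing_outcome,
        "ontologist": (round_no >= 5 and ambiguity > 0.25) or describing_symptoms,
    }
    return [mode for mode, active in triggers.items() if active and mode not in used]
```

After a mode is injected into a question, adding it to `used` (the `challenge_modes_used` list in mode state) prevents it from firing again.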
Phase 4: Crystallize Artifacts
When threshold is met (or user exits with warning / hard cap):
- Write interview transcript summary to .omx/interviews/{slug}-{timestamp}.md (kept for ralph PRD compatibility)
- Write execution-ready spec to .omx/specs/deep-interview-{slug}.md
Spec should include:
- Metadata (profile, rounds, final ambiguity, threshold, context type)
- Context snapshot reference/path (for ralplan/team reuse)
- Clarity breakdown table
- Intent (why the user wants this)
- Desired Outcome
- In-Scope
- Out-of-Scope / Non-goals
- Decision Boundaries (what OMX may decide without confirmation)
- Constraints
- Testable acceptance criteria
- Assumptions exposed + resolutions
- Pressure-pass findings (which answer was revisited, and what changed)
- Brownfield evidence vs inference notes for any repository-grounded confirmation questions
- Technical context findings
- Full or condensed transcript
Autoresearch specialization
When the clarified task is specifically about $autoresearch, or the skill is invoked with --autoresearch, keep the interview domain-specific and emit skill-consumable artifacts without skipping clarification.
- Accepted seed inputs: topic, evaluator, keep-policy, slug, existing mission draft text, and prior evaluator examples/templates
- Required interview focus: mission clarity, evaluator readiness, keep policy, slug/session naming, and whether the draft is ready to launch now or should refine further
- Canonical artifact path: .omx/specs/deep-interview-autoresearch-{slug}.md
- Launch artifact bundle: .omx/specs/autoresearch-{slug}/mission.md, .omx/specs/autoresearch-{slug}/sandbox.md, and .omx/specs/autoresearch-{slug}/result.json
- Launch artifact directory: .omx/specs/autoresearch-{slug}/
- Required artifact sections:
  - Mission Draft
  - Evaluator Draft
  - Launch Readiness
  - Seed Inputs
  - Confirmation Bridge
- Required launch artifacts under .omx/specs/autoresearch-{slug}/:
  - mission.md
  - sandbox.md
  - result.json
- Launch-readiness rule: mark the draft as not launch-ready while the evaluator command still contains placeholder markers such as <...>, TODO, TBD, REPLACE_ME, CHANGEME, or your-command-here
- Structured result contract: result.json should point to the draft + mission/sandbox artifacts and carry the finalized topic, evaluatorCommand, keepPolicy, slug, launchReady, and blockedReasons fields so $autoresearch can consume it directly
- Confirmation bridge: after artifact generation, offer at least refine further and launch; do not run direct CLI launch or detached/split tmux launch, and only hand off to $autoresearch after explicit confirmation
- Handoff rule: downstream execution must preserve the clarified mission intent, evaluator expectations, decision boundaries, and launch-readiness status from this artifact rather than bypassing the draft review step
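The launch-readiness rule lends itself to a simple check. The sketch below is illustrative only: `blocked_reasons` and `build_result` are hypothetical helpers, the marker list and result field names come from the text, and the regular expression treats any whitespace-free `<token>` as a `<...>`-style placeholder, which is an assumption. The artifact-pointer fields of result.json are omitted because the text does not name them.

```python
import re

# Marker strings listed in the launch-readiness rule (matched case-sensitively here).
PLACEHOLDER_MARKERS = ("TODO", "TBD", "REPLACE_ME", "CHANGEME", "your-command-here")
# Assumption: a "<...>" placeholder is any angle-bracketed token without whitespace.
ANGLE_PLACEHOLDER = re.compile(r"<[^<>\s]+>")


def blocked_reasons(evaluator_command: str) -> list[str]:
    """List the reasons an evaluator command is not launch-ready."""
    reasons = []
    if ANGLE_PLACEHOLDER.search(evaluator_command):
        reasons.append("evaluator command contains a <...> placeholder")
    for marker in PLACEHOLDER_MARKERS:
        if marker in evaluator_command:
            reasons.append(f"evaluator command contains {marker}")
    return reasons


def build_result(topic: str, evaluator_command: str, keep_policy: str, slug: str) -> dict:
    """Assemble the finalized fields that result.json is expected to carry."""
    reasons = blocked_reasons(evaluator_command)
    return {
        "topic": topic,
        "evaluatorCommand": evaluator_command,
        "keepPolicy": keep_policy,
        "slug": slug,
        "launchReady": not reasons,
        "blockedReasons": reasons,
    }
```

A draft whose evaluator command is still `TODO` would come back with `launchReady` set to false and one blocked reason, so the confirmation bridge would only offer refine further.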
Phase 5: Execution Bridge
Present execution options after artifact generation using explicit handoff contracts. Treat the deep-interview spec as the current requirements source of truth and preserve intent, non-goals, decision boundaries, acceptance criteria, and any residual-risk warnings across the handoff.
1. $ralplan (Recommended)
- Input Artifact: .omx/specs/deep-interview-{slug}.md (optionally accompanied by the transcript/context snapshot for traceability)
- Invocation: $plan --consensus --direct <spec-path>
- Consumer Behavior: Treat the deep-interview spec as the requirements source of truth. Do not repeat the interview by default; refine architecture/feasibility around the clarified intent and boundaries instead.
- Skipped / Already-Satisfied Stages: Requirements discovery, ambiguity clarification, and early intent-boundary elicitation
- Expected Output: Canonical planning artifacts under .omx/plans/, especially prd-*.md and test-spec-*.md
- Best When: Requirements are clear enough to stop interviewing, but architectural validation / consensus planning is still desirable
- Next Recommended Step: Use the approved planning artifacts with $autopilot, $ralph, or $team depending on the desired execution style
2. $autopilot
- Input Artifact: .omx/specs/deep-interview-{slug}.md
- Invocation: $autopilot <spec-path>
- Consumer Behavior: Use the deep-interview spec as the clarified execution brief. Preserve intent, non-goals, decision boundaries, and acceptance criteria as binding context for planning/execution.
- Skipped / Already-Satisfied Stages: Initial requirement discovery and ambiguity reduction
- Expected Output: Planning/execution progress, QA evidence, and validation artifacts produced by autopilot
- Best When: The clarified spec is already strong enough for direct planning + execution without an additional consensus gate
- Next Recommended Step: Continue through autopilot's execution/QA/validation flow; if coordination-heavy execution emerges, prefer a follow-up $team or $ralph lane as appropriate
3. $ralph
- Input Artifact: .omx/specs/deep-interview-{slug}.md
- Invocation: $ralph <spec-path>
- Consumer Behavior: Use the spec's acceptance criteria and boundary constraints as the persistence target. Do not reopen requirements discovery unless the user explicitly asks to refine further.
- Skipped / Already-Satisfied Stages: Requirement interview, ambiguity clarification, and initial scope-definition work
- Expected Output: Iterative execution progress and verification evidence tracked against the clarified criteria
- Best When: The task benefits from persistent sequential completion pressure and the user wants execution to keep moving until the criteria are satisfied or a real blocker exists
- Next Recommended Step: Continue Ralph's persistence loop; if work expands into coordination-heavy lanes, hand off to $team and keep Ralph for verification continuity
4. $team
- Input Artifact: .omx/specs/deep-interview-{slug}.md
- Invocation: $team <spec-path>
- Consumer Behavior: Treat the spec as shared execution context for coordinated parallel work. Preserve the clarified intent, non-goals, decision boundaries, and acceptance criteria as common lane constraints.
- Skipped / Already-Satisfied Stages: Requirement clarification and early ambiguity reduction
- Expected Output: Coordinated multi-agent execution against the shared spec, with evidence that can later feed a Ralph verification pass when appropriate
- Best When: The task is large, multi-lane, or blocker-sensitive enough to justify coordinated parallel execution instead of a single persistent loop
- Next Recommended Step: Follow the team verification path when the coordinated execution phase finishes; escalate to a separate Ralph loop only when a later persistent verification/fix owner is still needed
5. Refine further
- Input Artifact: Existing transcript, context snapshot, and current spec draft
- Invocation: Continue the interview loop
- Consumer Behavior: Re-enter questioning to resolve the highest-leverage remaining uncertainty
- Skipped / Already-Satisfied Stages: None beyond already-captured context
- Expected Output: A lower-ambiguity spec with tighter boundaries and fewer unresolved assumptions
- Best When: Residual ambiguity is still too high, the user wants stronger clarity, or the above-threshold / early-exit warning indicates too much risk to proceed cleanly
- Next Recommended Step: Return to one of the execution handoff contracts above once the spec is sufficiently clarified
Residual-Risk Rule: If the interview ended via early exit, hard-cap completion, or above-threshold proceed-with-warning, explicitly preserve that residual-risk state in the handoff so the downstream skill knows it inherited a partially clarified brief.
IMPORTANT: Deep-interview is a requirements mode. On handoff, invoke the selected skill using the contract above. Do NOT implement directly inside deep-interview.
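The handoff contracts above share one shape: a lane, a spec path, an invocation string, and any residual-risk note. The sketch below only builds that description and does not invoke anything; `handoff` and the returned dictionary keys are hypothetical, while the invocation templates and the spec path pattern are taken from the contracts.

```python
from typing import Optional

# Invocation templates copied from the handoff contracts above.
INVOCATIONS = {
    "ralplan": "$plan --consensus --direct {spec}",
    "autopilot": "$autopilot {spec}",
    "ralph": "$ralph {spec}",
    "team": "$team {spec}",
}


def handoff(lane: str, slug: str, residual_risk: Optional[str] = None) -> dict:
    """Describe a handoff; residual risk travels with it so downstream skills see it."""
    if lane not in INVOCATIONS:
        raise ValueError(f"unknown lane: {lane}")
    spec = f".omx/specs/deep-interview-{slug}.md"
    return {
        "input_artifact": spec,
        "invocation": INVOCATIONS[lane].format(spec=spec),
        "residual_risk": residual_risk,  # e.g. "early exit" or "hard cap reached"
    }
```

Keeping `residual_risk` as an explicit field reflects the Residual-Risk Rule: a partially clarified brief should never look identical to a fully clarified one.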
</Steps>
<Tool_Usage>
- Use explore for codebase fact gathering
- Use omx question as the OMX-native structured user-input tool for each interview round
- If omx question is unavailable in the current runtime, stop and surface that deep-interview requires the OMX question tool rather than falling back to another questioning path
- Use state_write / state_read for resumable mode state
- Read/write context snapshots under .omx/context/
- Save transcript/spec artifacts under .omx/interviews/ and .omx/specs/
</Tool_Usage>
<Escalation_And_Stop_Conditions>
- User says stop/cancel/abort -> persist state and stop
- Ambiguity stalls for 3 rounds (+/- 0.05) -> force Ontologist mode once
- Max rounds reached -> proceed with explicit residual-risk warning
- All dimensions >= 0.9 -> allow early crystallization even before max rounds </Escalation_And_Stop_Conditions>
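The stall condition can be checked against the per-round ambiguity history. A minimal sketch, assuming "stalls for 3 rounds (+/- 0.05)" means the last three scores all sit within 0.05 of each other; `is_stalled` is a hypothetical helper and other readings of the rule are possible.

```python
def is_stalled(history: list[float], window: int = 3, tolerance: float = 0.05) -> bool:
    """True when the last `window` ambiguity scores span no more than `tolerance`."""
    if len(history) < window:
        return False  # not enough rounds to judge
    recent = history[-window:]
    return max(recent) - min(recent) <= tolerance
```

When this returns true and Ontologist has not been used yet, the escalation rule above would force that mode once.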
<Final_Checklist>
- Preflight context snapshot exists under .omx/context/{slug}-{timestamp}.md
- Ambiguity score shown each round
- Intent-first stage priority used before implementation detail
- Weakest-dimension targeting used within the active stage
- At least one explicit assumption probe happened before crystallization
- At least one persistent follow-up / pressure pass deepened a prior answer
- Challenge modes triggered at thresholds (when applicable)
- Transcript written to .omx/interviews/{slug}-{timestamp}.md
- Spec written to .omx/specs/deep-interview-{slug}.md
- Brownfield questions use evidence-backed confirmation when applicable
- Handoff options provided ($ralplan, $autopilot, $ralph, $team)
- No direct implementation performed in this mode
</Final_Checklist>
[omx.deepInterview]
defaultProfile = "standard"
quickThreshold = 0.30
standardThreshold = 0.20
deepThreshold = 0.15
quickMaxRounds = 5
standardMaxRounds = 12
deepMaxRounds = 20
enableChallengeModes = true
Resume
If interrupted, rerun $deep-interview. Resume from persisted mode state via state_read(mode="deep-interview").
Recommended 3-Stage Pipeline
deep-interview -> ralplan -> autopilot
- Stage 1 (deep-interview): clarity gate
- Stage 2 (ralplan): feasibility + architecture gate
- Stage 3 (autopilot): execution + QA + validation gate </Advanced>
Task: {{ARGUMENTS}}