git clone https://github.com/Yeachan-Heo/oh-my-codex
T=$(mktemp -d) && git clone --depth=1 https://github.com/Yeachan-Heo/oh-my-codex "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/autopilot" ~/.claude/skills/yeachan-heo-oh-my-codex-autopilot && rm -rf "$T"
skills/autopilot/SKILL.md<Use_When>
- User wants end-to-end autonomous execution from an idea to working code
- User says "autopilot", "auto pilot", "autonomous", "build me", "create me", "make me", "full auto", "handle it all", or "I want a/an..."
- Task requires multiple phases: planning, coding, testing, and validation
- User wants hands-off execution and is willing to let the system run to completion </Use_When>
<Do_Not_Use_When>
- User wants to explore options or brainstorm -- use
skill insteadplan - User says "just explain", "draft only", or "what would you suggest" -- respond conversationally
- User wants a single focused code change -- use
or delegate to an executor agentralph - User wants to review or critique an existing plan -- use
plan --review - Task is a quick fix or small bug -- use direct executor delegation </Do_Not_Use_When>
<Why_This_Exists> Most non-trivial software tasks require coordinated phases: understanding requirements, designing a solution, implementing in parallel, testing, and validating quality. Autopilot orchestrates all of these phases automatically so the user can describe what they want and receive working code without managing each step. </Why_This_Exists>
<Execution_Policy>
- Each phase must complete before the next begins
- Parallel execution is used within phases where possible (Phase 2 and Phase 4)
- QA cycles repeat up to 5 times; if the same error persists 3 times, stop and report the fundamental issue
- Validation requires approval from all reviewers; rejected items get fixed and re-validated
- Cancel with
at any time; progress is preserved for resume/cancel - If a deep-interview spec exists, use it as high-clarity phase input instead of re-expanding from scratch
- If input is too vague for reliable expansion, offer/trigger
first$deep-interview - Do not enter expansion/planning/execution-heavy phases until pre-context grounding exists; if fast execution is forced, proceed only with explicit risk notes
- Default to concise, evidence-dense progress and completion reporting unless the user or risk level requires more detail
- Treat newer user task updates as local overrides for the active workflow branch while preserving earlier non-conflicting constraints
- If correctness depends on additional inspection, retrieval, execution, or verification, keep using the relevant tools until the workflow is grounded
- Continue through clear, low-risk, reversible next steps automatically; ask only when the next step is materially branching, destructive, or preference-dependent </Execution_Policy>
-
Phase 0 - Expansion: Turn the user's idea into a detailed spec
- If
exists for this task: reuse it and skip redundant expansion work.omx/specs/deep-interview-*.md - If prompt is highly vague: route to
for Socratic ambiguity-gated clarification$deep-interview - Analyst (THOROUGH tier): Extract requirements
- Architect (THOROUGH tier): Create technical specification
- Output:
.omx/plans/autopilot-spec.md
- If
-
Phase 1 - Planning: Create an implementation plan from the spec
- Architect (THOROUGH tier): Create plan (direct mode, no interview)
- Critic (THOROUGH tier): Validate plan
- Output:
.omx/plans/autopilot-impl.md
-
Phase 2 - Execution: Implement the plan using Ralph + Ultrawork
- LOW-tier executor/search roles: Simple tasks
- STANDARD-tier executor roles: Standard tasks
- THOROUGH-tier executor/architect roles: Complex tasks
- Run independent tasks in parallel
-
Phase 3 - QA: Cycle until all tests pass (UltraQA mode)
- Build, lint, test, fix failures
- Repeat up to 5 cycles
- Stop early if the same error repeats 3 times (indicates a fundamental issue)
-
Phase 4 - Validation: Multi-perspective review in parallel
- Architect: Functional completeness
- Security-reviewer: Vulnerability check
- Code-reviewer: Quality review
- All must approve; fix and re-validate on rejection
-
Phase 5 - Cleanup: Clear all mode state via OMX MCP tools on successful completion
state_clear({mode: "autopilot"})state_clear({mode: "ralph"})state_clear({mode: "ultrawork"})state_clear({mode: "ultraqa"})- Or run
for clean exit </Steps>/cancel
<Tool_Usage>
- Before first MCP tool use, call
to discover deferred MCP toolsToolSearch("mcp") - Use
withask_codex
for Phase 4 architecture validationagent_role: "architect" - Use
withask_codex
for Phase 4 security reviewagent_role: "security-reviewer" - Use
withask_codex
for Phase 4 quality reviewagent_role: "code-reviewer" - Agents form their own analysis first, then consult Codex for cross-validation
- If ToolSearch finds no MCP tools or Codex is unavailable, proceed without it -- never block on external tools </Tool_Usage>
State Management
Use
omx_state MCP tools for autopilot lifecycle state.
- On start:
state_write({mode: "autopilot", active: true, current_phase: "expansion", started_at: "<now>", state: {context_snapshot_path: "<snapshot-path>"}}) - On phase transitions:
state_write({mode: "autopilot", current_phase: "planning"})state_write({mode: "autopilot", current_phase: "execution"})state_write({mode: "autopilot", current_phase: "qa"})state_write({mode: "autopilot", current_phase: "validation"}) - On completion:
state_write({mode: "autopilot", active: false, current_phase: "complete", completed_at: "<now>"}) - On cancellation/cleanup:
run
(which should call$cancel
)state_clear(mode="autopilot")
Scenario Examples
Good: The user says
continue after the workflow already has a clear next step. Continue the current branch of work instead of restarting or re-asking the same question.
Good: The user changes only the output shape or downstream delivery step (for example
make a PR). Preserve earlier non-conflicting workflow constraints and apply the update locally.
Bad: The user says
continue, and the workflow restarts discovery or stops before the missing verification/evidence is gathered.
<Examples>
<Good>
User: "autopilot A REST API for a bookstore inventory with CRUD operations using TypeScript"
Why good: Specific domain (bookstore), clear features (CRUD), technology constraint (TypeScript). Autopilot has enough context to expand into a full spec.
</Good>
<Good>
User: "build me a CLI tool that tracks daily habits with streak counting"
Why good: Clear product concept with a specific feature. The "build me" trigger activates autopilot.
</Good>
<Bad>
User: "fix the bug in the login page"
Why bad: This is a single focused fix, not a multi-phase project. Use direct executor delegation or ralph instead.
</Bad>
<Bad>
User: "what are some good approaches for adding caching?"
Why bad: This is an exploration/brainstorming request. Respond conversationally or use the plan skill.
</Bad>
</Examples>
<Escalation_And_Stop_Conditions>
- Stop and report when the same QA error persists across 3 cycles (fundamental issue requiring human input)
- Stop and report when validation keeps failing after 3 re-validation rounds
- Stop when the user says "stop", "cancel", or "abort"
- If requirements were too vague and expansion produces an unclear spec, pause and redirect to
before proceeding </Escalation_And_Stop_Conditions>$deep-interview
<Final_Checklist>
- All 5 phases completed (Expansion, Planning, Execution, QA, Validation)
- All validators approved in Phase 4
- Tests pass (verified with fresh test run output)
- Build succeeds (verified with fresh build output)
- State files cleaned up
- User informed of completion with summary of what was built </Final_Checklist>
Optional settings in
~/.codex/config.toml:
[omx.autopilot] maxIterations = 10 maxQaCycles = 5 maxValidationRounds = 3 pauseAfterExpansion = false pauseAfterPlanning = false skipQa = false skipValidation = false
Resume
If autopilot was cancelled or failed, run
/autopilot again to resume from where it stopped.
Recommended Clarity Pipeline
For ambiguous requests, prefer:
deep-interview -> ralplan -> autopilot
: ambiguity-gated Socratic requirementsdeep-interview
: consensus planning (planner/architect/critic)ralplan
: execution + QA + validationautopilot
Best Practices for Input
- Be specific about the domain -- "bookstore" not "store"
- Mention key features -- "with CRUD", "with authentication"
- Specify constraints -- "using TypeScript", "with PostgreSQL"
- Let it run -- avoid interrupting unless truly needed
Pipeline Orchestrator (v0.8+)
Autopilot can be driven by the configurable pipeline orchestrator (
src/pipeline/), which
sequences stages through a uniform PipelineStage interface:
RALPLAN (consensus planning) -> team-exec (Codex CLI workers) -> ralph-verify (architect verification)
Pipeline configuration options:
[omx.autopilot.pipeline] maxRalphIterations = 10 # Ralph verification iteration ceiling workerCount = 2 # Number of Codex CLI team workers agentType = "executor" # Agent type for team workers
The pipeline persists state via
pipeline-state.json and supports resume from the last
incomplete stage. See src/pipeline/orchestrator.ts for the full API.
Troubleshooting
Stuck in a phase? Check TODO list for blocked tasks, run
state_read({mode: "autopilot"}), or cancel and resume.
QA cycles exhausted? The same error 3 times indicates a fundamental issue. Review the error pattern; manual intervention may be needed.
Validation keeps failing? Review the specific issues. Requirements may have been too vague -- cancel and provide more detail. </Advanced>