Auto-claude-code-research-in-sleep idea-discovery
Workflow 1: Full idea discovery pipeline. Orchestrates research-lit \u2192 idea-creator \u2192 novelty-check \u2192 research-review to go from a broad research direction to validated, pilot-tested ideas. Use when user says \\\"\u627eidea\u5168\u6d41\u7a0b\\\", \\\"idea discovery pipeline\\\", \\\"\u4ece\u96f6\u5f00\u59cb\u627e\u65b9\u5411\\\", or wants the complete idea exploration workflow.
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep
T=$(mktemp -d) && git clone --depth=1 https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/skills-codex-gemini-review/idea-discovery" ~/.claude/skills/wanshuiyin-auto-claude-code-research-in-sleep-idea-discovery-2cb4ab && rm -rf "$T"
skills/skills-codex-gemini-review/idea-discovery/SKILL.md
Override for Codex users who want Gemini, not a second Codex agent, to act as the reviewer. Install this package after skills/skills-codex/*.
Workflow 1: Idea Discovery Pipeline
Orchestrate a complete idea discovery workflow for: $ARGUMENTS
Overview
This skill chains sub-skills into a single automated pipeline:
/research-lit (survey) → /idea-creator (brainstorm) → /novelty-check (verify novel) → /research-review (critical feedback) → /research-refine-pipeline (refine method + plan experiments)
Each phase builds on the previous one's output. The final deliverables are a validated idea-stage/IDEA_REPORT.md with ranked ideas, plus a refined proposal (refine-logs/FINAL_PROPOSAL.md) and experiment plan (refine-logs/EXPERIMENT_PLAN.md) for the top idea.
Constants
- PILOT_MAX_HOURS = 2 — Skip any pilot experiment estimated to take > 2 hours per GPU. Flag as "needs manual pilot" in the report.
- PILOT_TIMEOUT_HOURS = 3 — Hard timeout: kill any running pilot that exceeds 3 hours. Collect partial results if available.
- MAX_PILOT_IDEAS = 3 — Run pilots for at most 3 top ideas in parallel. Additional ideas are validated on paper only.
- MAX_TOTAL_GPU_HOURS = 8 — Total GPU budget across all pilots. If exceeded, skip remaining pilots and note in report.
- AUTO_PROCEED = true — If user doesn't respond at a checkpoint, automatically proceed with the best option after presenting results. Set to false to always wait for explicit user confirmation.
- OUTPUT_DIR = idea-stage/ — All idea-stage outputs go here. Create the directory if it doesn't exist.
- REVIEWER_MODEL = gemini-review — Gemini reviewer invoked through the local gemini-review MCP bridge. Passed to the reviewer-aware sub-skills installed by this overlay.
- ARXIV_DOWNLOAD = false — When true, /research-lit downloads the top relevant arXiv PDFs during Phase 1. When false (default), only fetches metadata. Passed through to /research-lit.
💡 These are defaults. Override by telling the skill, e.g.,
/idea-discovery "topic" — pilot budget: 4h per idea, 20h total
or
/idea-discovery "topic" — arxiv download: true
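The pilot-budget constants above interact, so a small sketch of one possible reading may help. It assumes each pilot runs on a single GPU (so "hours per GPU" and "GPU hours" coincide) and that budget decisions are made from estimates before launch. The `Idea` type, its fields, and `plan_pilots` are hypothetical names for illustration, not part of the skill.

```python
from dataclasses import dataclass

# Defaults copied from the Constants list.
PILOT_MAX_HOURS = 2
MAX_PILOT_IDEAS = 3
MAX_TOTAL_GPU_HOURS = 8


@dataclass
class Idea:
    title: str            # hypothetical field
    est_gpu_hours: float  # hypothetical: estimated pilot cost, one GPU assumed


def plan_pilots(ranked_ideas, max_hours=PILOT_MAX_HOURS,
                max_ideas=MAX_PILOT_IDEAS, total_budget=MAX_TOTAL_GPU_HOURS):
    """Split ranked ideas into pilots to launch and skipped ideas with a reason."""
    to_run, skipped = [], []
    used = 0.0
    budget_hit = False
    for idea in ranked_ideas:
        if idea.est_gpu_hours > max_hours:
            # Too long for an automatic pilot: flag it in the report.
            skipped.append((idea.title, "needs manual pilot"))
        elif len(to_run) >= max_ideas:
            skipped.append((idea.title, "validated on paper only"))
        elif budget_hit or used + idea.est_gpu_hours > total_budget:
            # Once the total budget would be exceeded, skip all remaining pilots.
            budget_hit = True
            skipped.append((idea.title, "GPU budget exceeded"))
        else:
            to_run.append(idea.title)
            used += idea.est_gpu_hours
    return to_run, skipped
```

With the defaults, three pilots of at most 2 hours each can never reach the 8-hour total, so the total budget only starts to bind once the per-idea limit or the idea count is overridden.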
Pipeline
Phase 1: Literature Survey
Invoke /research-lit to map the research landscape:
/research-lit "$ARGUMENTS"
What this does:
- Search arXiv, Google Scholar, Semantic Scholar for recent papers
- Build a landscape map: sub-directions, approaches, open problems
- Identify structural gaps and recurring limitations
- Output a literature summary (saved to working notes)
🚦 Checkpoint: Present the landscape summary to the user. Ask:
📚 Literature survey complete. Here's what I found:
- [key findings, gaps, open problems]

Does this match your understanding? Should I adjust the scope before generating ideas?
(If no response, I'll proceed with the top-ranked direction.)
- User approves (or no response + AUTO_PROCEED=true) → proceed to Phase 2 with best direction.
- User requests changes (e.g., "focus more on X", "ignore Y", "too broad") → refine the search with updated queries, re-run /research-lit with adjusted scope, and present again. Repeat until the user is satisfied.
Phase 2: Idea Generation + Filtering + Pilots
Invoke /idea-creator with the landscape context:
/idea-creator "$ARGUMENTS"
What this does:
- Brainstorm 8-12 concrete ideas via the Gemini-backed /idea-creator overlay
- Filter by feasibility, compute cost, quick novelty search
- Deep validate top ideas (full novelty check + devil's advocate)
- Run parallel pilot experiments on available GPUs (top 2-3 ideas)
- Rank by empirical signal
- Output idea-stage/IDEA_REPORT.md
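The hard timeout from PILOT_TIMEOUT_HOURS can be illustrated with Python's standard `subprocess` module. This is only a sketch of the kill-and-collect behaviour, not how /idea-creator actually launches pilots; `run_pilot` and its return shape are assumptions made for the example.

```python
import subprocess

PILOT_TIMEOUT_HOURS = 3  # default from the Constants list


def run_pilot(cmd, timeout_s=PILOT_TIMEOUT_HOURS * 3600):
    """Run one pilot command and return (status, stdout captured so far)."""
    try:
        done = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child on timeout; exc.stdout holds any
        # partial output that was captured, or None if there was none.
        partial = exc.stdout or b""
        return "timeout", partial.decode(errors="replace")
    status = "finished" if done.returncode == 0 else "failed"
    return status, done.stdout.decode(errors="replace")
```

A caller would record the "timeout" status and whatever partial output came back in the report rather than discarding the pilot entirely.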
🚦 Checkpoint: Present idea-stage/IDEA_REPORT.md ranked ideas to the user. Ask:
💡 Generated X ideas, filtered to Y, piloted Z. Top results:

1. [Idea 1] — Pilot: POSITIVE (+X%)
2. [Idea 2] — Pilot: WEAK POSITIVE (+Y%)
3. [Idea 3] — Pilot: NEGATIVE, eliminated

Which ideas should I validate further? Or should I regenerate with different constraints?
(If no response, I'll proceed with the top-ranked ideas.)
- User picks ideas (or no response + AUTO_PROCEED=true) → proceed to Phase 3 with top-ranked ideas.
- User unhappy with all ideas → collect feedback ("what's missing?", "what direction do you prefer?"), update the prompt with user's constraints, and re-run Phase 2 (idea generation). Repeat until the user selects at least 1 idea.
- User wants to adjust scope → go back to Phase 1 with refined direction.
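The three checkpoint outcomes above amount to a small routing decision. A minimal sketch, assuming the user's free-text reply has already been interpreted as one of three hypothetical labels (the labels and function name are illustrative, not defined by the skill):

```python
from typing import Optional

AUTO_PROCEED = True  # default from the Constants list

# Hypothetical labels for how the user's reply was interpreted.
PICKED, UNHAPPY, ADJUST_SCOPE = "picked", "unhappy", "adjust_scope"


def route_after_phase2(reply: Optional[str], auto_proceed: bool = AUTO_PROCEED) -> str:
    """Return the next pipeline step after the Phase 2 checkpoint."""
    if reply is None:
        # No response: proceed with top-ranked ideas only if AUTO_PROCEED is on.
        return "phase3" if auto_proceed else "wait"
    if reply == PICKED:
        return "phase3"
    if reply == UNHAPPY:
        return "phase2"  # collect feedback, then regenerate ideas
    if reply == ADJUST_SCOPE:
        return "phase1"  # redo the literature survey with a refined direction
    raise ValueError(f"unrecognised reply label: {reply!r}")
```

The "wait" branch reflects the AUTO_PROCEED = false setting, where the pipeline always waits for explicit confirmation.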
Phase 3: Deep Novelty Verification
For each top idea (positive pilot signal), run a thorough novelty check:
/novelty-check "[top idea 1 description]"
/novelty-check "[top idea 2 description]"
What this does:
- Multi-source literature search (arXiv, Scholar, Semantic Scholar)
- Cross-verify with the Gemini-backed /novelty-check overlay
- Check for concurrent work (last 3-6 months)
- Identify closest existing work and differentiation points
Update idea-stage/IDEA_REPORT.md with deep novelty results. Eliminate any idea that turns out to be already published.
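The selection and elimination in this phase can be sketched as a two-step filter. The dictionary keys, the `is_published` callable (a stand-in for the outcome of /novelty-check), and the choice to treat WEAK POSITIVE as a positive signal are all assumptions made for illustration.

```python
# Assumption: both labels count as "positive pilot signal".
POSITIVE_SIGNALS = {"POSITIVE", "WEAK POSITIVE"}


def verify_novelty(ideas, is_published):
    """Return (survivors, eliminated) after the deep novelty check.

    ideas: list of dicts with hypothetical keys "title" and "pilot".
    is_published: callable(title) -> bool, standing in for /novelty-check.
    """
    survivors, eliminated = [], []
    for idea in ideas:
        if idea["pilot"] not in POSITIVE_SIGNALS:
            # Only ideas with a positive pilot reach the novelty check.
            eliminated.append((idea["title"], "no positive pilot signal"))
        elif is_published(idea["title"]):
            eliminated.append((idea["title"], "already published"))
        else:
            survivors.append(idea["title"])
    return survivors, eliminated
```

Keeping the reason next to each eliminated title makes it straightforward to fill the "Eliminated Ideas" section of the final report.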
Phase 4: External Critical Review
For the surviving top idea(s), get brutal feedback:
/research-review "[top idea with hypothesis + pilot results]"
What this does:
- Gemini acts as a senior reviewer (NeurIPS/ICML level) via the local gemini-review MCP bridge
- Scores the idea, identifies weaknesses, suggests minimum viable improvements
- Provides concrete feedback on experimental design
Update idea-stage/IDEA_REPORT.md with reviewer feedback and revised plan.
Phase 4.5: Method Refinement + Experiment Planning
After review, refine the top idea into a concrete proposal and plan experiments:
/research-refine-pipeline "[top idea description + pilot results + reviewer feedback]"
What this does:
- Freeze a Problem Anchor to prevent scope drift
- Iteratively refine the method via Gemini review (up to 5 rounds, until score ≥ 9)
- Generate a claim-driven experiment roadmap with ablations, budgets, and run order
- Output: refine-logs/FINAL_PROPOSAL.md, refine-logs/EXPERIMENT_PLAN.md, refine-logs/EXPERIMENT_TRACKER.md
🚦 Checkpoint: Present the refined proposal summary:
🔬 Method refined and experiment plan ready:
- Problem anchor: [anchored problem]
- Method thesis: [one sentence]
- Dominant contribution: [what's new]
- Must-run experiments: [N blocks]
- First 3 runs to launch: [list]

Proceed to implementation? Or adjust the proposal?
- User approves (or AUTO_PROCEED=true) → proceed to Final Report.
- User requests changes → pass feedback to /research-refine for another round.
- Lite mode: If reviewer score < 6 or pilot was weak, run /research-refine only (skip /experiment-plan) and note remaining risks in the report.
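Two numeric rules in this phase, the Lite-mode threshold and the refinement stopping condition, can be sketched as follows. The function names are hypothetical, "pilot was weak" is read here as a WEAK POSITIVE label (an assumption), and `review_round` is a stand-in for one Gemini review round that returns a score.

```python
def choose_refinement(reviewer_score, pilot_label):
    """Pick the sub-skills to run: Lite mode skips /experiment-plan."""
    if reviewer_score < 6 or pilot_label == "WEAK POSITIVE":
        return ["/research-refine"]
    return ["/research-refine", "/experiment-plan"]


def refine_until_ready(review_round, max_rounds=5, target=9):
    """Call review_round(n) -> score for up to max_rounds rounds.

    Stops early once a round scores at least `target`; returns all scores seen.
    """
    scores = []
    for n in range(1, max_rounds + 1):
        scores.append(review_round(n))
        if scores[-1] >= target:
            break
    return scores
```

If the loop ends without reaching the target score, the remaining risks would be noted in the report, as Lite mode already requires.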
Phase 5: Final Report
Finalize idea-stage/IDEA_REPORT.md with all accumulated information:
# Idea Discovery Report

**Direction**: $ARGUMENTS
**Date**: [today]
**Pipeline**: research-lit → idea-creator → novelty-check → research-review → research-refine-pipeline

## Executive Summary
[2-3 sentences: best idea, key evidence, recommended next step]

## Literature Landscape
[from Phase 1]

## Ranked Ideas
[from Phase 2, updated with Phase 3-4 results]

### 🏆 Idea 1: [title] — RECOMMENDED
- Pilot: POSITIVE (+X%)
- Novelty: CONFIRMED (closest: [paper], differentiation: [what's different])
- Reviewer score: X/10
- Next step: implement full experiment → /auto-review-loop

### Idea 2: [title] — BACKUP
...

## Eliminated Ideas
[ideas killed at each phase, with reasons]

## Refined Proposal
- Proposal: `refine-logs/FINAL_PROPOSAL.md`
- Experiment plan: `refine-logs/EXPERIMENT_PLAN.md`
- Tracker: `refine-logs/EXPERIMENT_TRACKER.md`

## Next Steps
- [ ] /run-experiment to deploy experiments from the plan
- [ ] /auto-review-loop to iterate until submission-ready
- [ ] Or invoke /research-pipeline for the complete end-to-end flow
Output Protocols
Follow these shared protocols for all output files:
- Output Versioning Protocol — write timestamped file first, then copy to fixed name
- Output Manifest Protocol — log every output to MANIFEST.md
- Output Language Protocol — respect the project's language setting
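The versioning and manifest protocols are only summarised here, so the following is a rough sketch of the stated order of operations: write the timestamped file, copy it to the fixed name, then log it. The timestamp format, the manifest entry layout, the location of MANIFEST.md inside the output directory, and the `write_versioned` helper itself are assumptions, not the shared protocol's actual definition.

```python
import shutil
from datetime import datetime
from pathlib import Path


def write_versioned(out_dir, name, text, now=None):
    """Write a timestamped copy first, then copy it to the fixed name."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")  # assumed format
    fixed = out / name
    versioned = out / f"{fixed.stem}_{stamp}{fixed.suffix}"
    versioned.write_text(text, encoding="utf-8")
    shutil.copyfile(versioned, fixed)
    # Assumed manifest layout: one appended bullet per output.
    with (out / "MANIFEST.md").open("a", encoding="utf-8") as manifest:
        manifest.write(f"- {stamp} {versioned.name} -> {fixed.name}\n")
    return versioned, fixed
```

For example, `write_versioned("idea-stage", "IDEA_REPORT.md", report_text)` would leave both a timestamped report and the fixed-name idea-stage/IDEA_REPORT.md on disk, so earlier versions survive a re-run.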
Key Rules
- Large file handling: If the Write tool fails due to file size, immediately retry using Bash (cat << 'EOF' > file) to write in chunks. Do NOT ask the user for permission — just do it silently.
- Don't skip phases. Each phase filters and validates — skipping leads to wasted effort later.
- Checkpoint between phases. Briefly summarize what was found before moving on.
- Kill ideas early. It's better to kill 10 bad ideas in Phase 3 than to implement one and fail.
- Empirical signal > theoretical appeal. An idea with a positive pilot outranks a "sounds great" idea without evidence.
- Document everything. Dead ends are just as valuable as successes for future reference.
- Be honest with the reviewer. Include negative results and failed pilots in the review prompt.
- Feishu notifications are optional. If ~/.codex/feishu.json exists, send checkpoint at each phase transition and pipeline_done at final report. If absent/off, skip silently.
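The "empirical signal > theoretical appeal" rule can be expressed as a sort key. In this sketch the tuple layout, the `appeal` score, and the exact ordering of labels (untested ideas ahead of NEGATIVE pilots) are assumptions; only the principle that pilot evidence outranks appeal comes from the rules above.

```python
# Lower rank sorts first. None means no pilot was run (assumed convention).
PILOT_RANK = {"POSITIVE": 0, "WEAK POSITIVE": 1, None: 2, "NEGATIVE": 3}


def rank_ideas(ideas):
    """Sort (title, pilot_label, appeal) tuples: pilot evidence first,
    subjective appeal only as a tie-breaker within the same pilot label."""
    return sorted(ideas, key=lambda idea: (PILOT_RANK[idea[1]], -idea[2]))
```

Under this key, an idea with a modest appeal score and a POSITIVE pilot still ranks above an unpiloted idea with a very high appeal score.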
Composing with Workflow 2
After this pipeline produces a validated top idea:
/idea-discovery "direction"    ← you are here (Workflow 1, includes method refinement + experiment planning)
/run-experiment                ← deploy experiments from the plan
/auto-review-loop "top idea"   ← Workflow 2: iterate until submission-ready

Or use /research-pipeline for the full end-to-end flow.