Auto-claude-code-research-in-sleep research-review
Get a deep critical review of research from GPT using a secondary Codex agent. Use when the user says "review my research", "help me review", "get external review", or wants critical feedback on research ideas, papers, or experimental results.
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep
T=$(mktemp -d) && git clone --depth=1 https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/skills-codex/research-review" ~/.claude/skills/wanshuiyin-auto-claude-code-research-in-sleep-research-review-2a4810 && rm -rf "$T"
skills/skills-codex/research-review/SKILL.md
Research Review via a secondary Codex agent (xhigh reasoning)
Get a multi-round critical review of research work from an external LLM with maximum reasoning depth.
Constants
- REVIEWER_MODEL = gpt-5.4: Model used via a secondary Codex agent. Must be an OpenAI model (e.g., gpt-5.4, o3, gpt-4o)
Context: $ARGUMENTS
Prerequisites
- Use spawn_agent and send_input when the user has explicitly allowed delegation or subagents.
- If delegation is not allowed, run the same review loop locally and preserve the same deliverable structure.
Workflow
Step 1: Gather Research Context
Before calling the external reviewer, compile a comprehensive briefing:
- Read project narrative documents (e.g., STORY.md, README.md, paper drafts)
- Read any memory/notes files for key findings and experiment history
- Identify: core claims, methodology, key results, known weaknesses
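The briefing step above can be sketched in Python. This is a minimal illustration, not part of the skill itself: `compile_briefing`, `DEFAULT_SOURCES`, and the `max_chars` cap are hypothetical names and values, and only STORY.md and README.md come from the text.

```python
from pathlib import Path

# File names taken from the examples in the text; extend for your own project.
DEFAULT_SOURCES = ("STORY.md", "README.md")


def compile_briefing(project_root, sources=DEFAULT_SOURCES, max_chars=20000):
    """Concatenate the narrative files that exist into one briefing string.

    max_chars is an assumed per-file cap to keep the briefing manageable.
    """
    root = Path(project_root)
    parts = []
    for name in sources:
        path = root / name
        if not path.is_file():
            continue  # skip files this project does not have
        text = path.read_text(encoding="utf-8", errors="replace")
        parts.append(f"## {name}\n{text[:max_chars].strip()}")
    return "\n\n".join(parts)
```

The core claims, methodology, key results, and known weaknesses still need to be written by hand; the helper only collects the raw material.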
Step 2: Initial Review (Round 1)
Send a detailed prompt with xhigh reasoning:
    spawn_agent:
      reasoning_effort: xhigh
      message: |
        [Full research context + specific questions]

        Please act as a senior ML reviewer (NeurIPS/ICML level).
        Identify:
        1. Logical gaps or unjustified claims
        2. Missing experiments that would strengthen the story
        3. Narrative weaknesses
        4. Whether the contribution is sufficient for a top venue

        Please be brutally honest.
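One way to assemble the Round 1 message text is a small Python helper. It only builds the string; it does not call spawn_agent, and `build_round1_message` and `REVIEW_POINTS` are hypothetical names introduced for illustration.

```python
# The four review points listed in the Round 1 prompt.
REVIEW_POINTS = [
    "Logical gaps or unjustified claims",
    "Missing experiments that would strengthen the story",
    "Narrative weaknesses",
    "Whether the contribution is sufficient for a top venue",
]


def build_round1_message(context, questions=()):
    """Return the Round 1 prompt: context, optional questions, reviewer instructions."""
    lines = [context.strip(), ""]
    if questions:
        lines.append("Specific questions:")
        lines.extend(f"- {q}" for q in questions)
        lines.append("")
    lines.append("Please act as a senior ML reviewer (NeurIPS/ICML level). Identify:")
    lines.extend(f"{i}. {point}" for i, point in enumerate(REVIEW_POINTS, start=1))
    lines.append("Please be brutally honest.")
    return "\n".join(lines)
```

The returned string would go in the `message` field of the request, with the full research context passed as `context`.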
Step 3: Iterative Dialogue (Rounds 2-N)
Use send_input with the returned agent id to continue the conversation:
For each round:
- Respond to criticisms with evidence/counterarguments
- Ask targeted follow-ups on the most actionable points
- Request specific deliverables: experiment designs, paper outlines, claims matrices
Key follow-up patterns:
- "If we reframe X as Y, does that change your assessment?"
- "What's the minimum experiment to satisfy concern Z?"
- "Please design the minimal additional experiment package (highest acceptance lift per GPU week)"
- "Please write a mock NeurIPS/ICML review with scores"
- "Give me a results-to-claims matrix for possible experimental outcomes"
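A results-to-claims matrix, as requested in the last follow-up, can be pictured as a table with one row per combination of experimental outcomes. The sketch below is an assumption about its shape, not the format the reviewer will return; `claims_matrix` and `example_rule` are hypothetical, and the rule is purely illustrative.

```python
from itertools import product


def claims_matrix(outcomes, allowed_claim):
    """Enumerate every outcome combination and the claim it permits.

    outcomes: dict mapping experiment name -> list of possible outcomes.
    allowed_claim: callable taking a {experiment: outcome} dict, returning a claim.
    """
    names = list(outcomes)
    rows = []
    for combo in product(*(outcomes[name] for name in names)):
        scenario = dict(zip(names, combo))
        rows.append((scenario, allowed_claim(scenario)))
    return rows


def example_rule(scenario):
    # Illustrative only: real rules come out of the review discussion.
    results = list(scenario.values())
    if all(r == "positive" for r in results):
        return "full claim"
    if any(r == "positive" for r in results):
        return "partial claim"
    return "no claim"


matrix = claims_matrix(
    {"X": ["positive", "negative"], "Y": ["positive", "negative"]},
    example_rule,
)
for scenario, claim in matrix:
    print(scenario, "->", claim)
```

Writing the matrix down before running experiments X and Y fixes in advance which claims each outcome can support.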
Step 4: Convergence
Stop iterating when:
- Both sides agree on the core claims and their evidence requirements
- A concrete experiment plan is established
- The narrative structure is settled
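The stopping criteria can be tracked explicitly between rounds. The sketch below assumes all three conditions must hold before stopping, which the text does not state outright; `ReviewState`, `has_converged`, and `open_items` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ReviewState:
    claims_agreed: bool = False        # both sides agree on claims and evidence
    experiment_plan_set: bool = False  # a concrete experiment plan exists
    narrative_settled: bool = False    # the narrative structure is settled


def has_converged(state):
    """True only when every criterion is met (assumed AND semantics)."""
    return state.claims_agreed and state.experiment_plan_set and state.narrative_settled


def open_items(state):
    """List the criteria still unmet, to steer the next round's questions."""
    labels = [
        ("core claims", state.claims_agreed),
        ("experiment plan", state.experiment_plan_set),
        ("narrative structure", state.narrative_settled),
    ]
    return [label for label, done in labels if not done]
```

For example, `open_items(ReviewState(claims_agreed=True))` returns `["experiment plan", "narrative structure"]`, the topics for the next follow-up.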
Step 5: Document Everything
Save the full interaction and conclusions to a review document in the project root:
- Round-by-round summary of criticisms and responses
- Final consensus on claims, narrative, and experiments
- Claims matrix (what claims are allowed under each possible outcome)
- Prioritized TODO list with estimated compute costs
- Paper outline if discussed
Update project memory/notes with key review conclusions.
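A review document along these lines could be generated with a small helper. The layout, the section titles, and the name `render_review_doc` are assumptions for illustration; the skill does not prescribe a file format.

```python
def render_review_doc(rounds, consensus, todos, agent_id=None):
    """Build a self-contained plain-text review summary.

    rounds: list of (criticism, response) string pairs, one per round.
    todos: list of (task, estimated_cost) string pairs, already in priority order.
    agent_id: optional id recorded so the conversation can be resumed later.
    """
    lines = ["Research Review Summary", ""]
    if agent_id is not None:
        lines += [f"Agent id: {agent_id}", ""]
    for number, (criticism, response) in enumerate(rounds, start=1):
        lines += [
            f"Round {number}",
            f"- Criticism: {criticism}",
            f"- Response: {response}",
            "",
        ]
    lines += ["Final consensus", consensus, "", "Prioritized TODO"]
    for rank, (task, cost) in enumerate(todos, start=1):
        lines.append(f"{rank}. {task} (est. {cost})")
    return "\n".join(lines) + "\n"
```

The returned string can be written to a file in the project root; the claims matrix and paper outline would be appended as further sections when they exist.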
Key Rules
- ALWAYS use reasoning_effort: xhigh for reviews
- Send comprehensive context in Round 1; the external model cannot read your files
- Be honest about weaknesses — hiding them leads to worse feedback
- Push back on criticisms you disagree with, but accept valid ones
- Focus on ACTIONABLE feedback — "what experiment would fix this?"
- Document the agent id for potential future resumption
- The review document should be self-contained (readable without the conversation)
Prompt Templates
For initial review:
"I'm going to present a complete ML research project for your critical review. Please act as a senior ML reviewer (NeurIPS/ICML level)..."
For experiment design:
"Please design the minimal additional experiment package that gives the highest acceptance lift per GPU week. Our compute: [describe]. Be very specific about configurations."
For paper structure:
"Please turn this into a concrete paper outline with section-by-section claims and figure plan."
For claims matrix:
"Please give me a results-to-claims matrix: what claim is allowed under each possible outcome of experiments X and Y?"
For mock review:
"Please write a mock NeurIPS review with: Summary, Strengths, Weaknesses, Questions for Authors, Score, Confidence, and What Would Move Toward Accept."