Auto-claude-code-research-in-sleep research-review

Get a deep critical review of research from GPT using a secondary Codex agent. Use when the user says "review my research", "help me review", "get external review", or wants critical feedback on research ideas, papers, or experimental results.

install
source · Clone the upstream repo
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/skills-codex/research-review" ~/.claude/skills/wanshuiyin-auto-claude-code-research-in-sleep-research-review-2a4810 && rm -rf "$T"
manifest: skills/skills-codex/research-review/SKILL.md
source content

Research Review via a secondary Codex agent (xhigh reasoning)

Get a multi-round critical review of research work from an external LLM with maximum reasoning depth.

Constants

  • REVIEWER_MODEL = gpt-5.4 (model used via a secondary Codex agent; must be an OpenAI model, e.g., gpt-5.4, o3, gpt-4o)

Context: $ARGUMENTS

Prerequisites

  • Use spawn_agent and send_input when the user has explicitly allowed delegation or subagents.
  • If delegation is not allowed, run the same review loop locally and preserve the same deliverable structure.

Workflow

Step 1: Gather Research Context

Before calling the external reviewer, compile a comprehensive briefing:

  1. Read project narrative documents (e.g., STORY.md, README.md, paper drafts)
  2. Read any memory/notes files for key findings and experiment history
  3. Identify: core claims, methodology, key results, known weaknesses
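The file-gathering part of this step can be sketched in Python. This is a minimal illustration, not part of the skill itself: the function name build_briefing, the candidate list (NOTES.md in particular), and the max_chars cap are assumptions made for the example. Identifying core claims and weaknesses still has to be done by reading the result.

```python
from pathlib import Path

# Hypothetical candidate files; adjust to the project's actual layout.
CANDIDATE_FILES = ("STORY.md", "README.md", "NOTES.md")


def build_briefing(project_root, candidates=CANDIDATE_FILES, max_chars=20000):
    """Concatenate whichever narrative/notes files exist into one briefing string."""
    root = Path(project_root)
    sections = []
    for name in candidates:
        path = root / name
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            # Cap each file (an assumed limit) so one long draft cannot crowd out the rest.
            sections.append(f"## {name}\n{text[:max_chars].strip()}")
    return "\n\n".join(sections)
```

Missing files are skipped silently, so the same call works across projects that keep only some of these documents.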

Step 2: Initial Review (Round 1)

Send a detailed prompt with xhigh reasoning:

spawn_agent:
  reasoning_effort: xhigh
  message: |
    [Full research context + specific questions]
    Please act as a senior ML reviewer (NeurIPS/ICML level). Identify:
    1. Logical gaps or unjustified claims
    2. Missing experiments that would strengthen the story
    3. Narrative weaknesses
    4. Whether the contribution is sufficient for a top venue
    Please be brutally honest.
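
One way to assemble the Round 1 request programmatically is sketched below. spawn_agent is a tool call rather than a Python function, and its exact argument schema is not specified here, so the sketch only builds a dict that mirrors the two fields shown above (reasoning_effort and message). The names build_round1_request and extra_questions are hypothetical.

```python
REVIEW_QUESTIONS = [
    "Logical gaps or unjustified claims",
    "Missing experiments that would strengthen the story",
    "Narrative weaknesses",
    "Whether the contribution is sufficient for a top venue",
]


def build_round1_request(context, extra_questions=()):
    """Return a dict mirroring the spawn_agent fields used in this skill."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(REVIEW_QUESTIONS, start=1))
    # Project-specific questions go right after the context (assumed placement).
    extras = "".join(f"\nSpecific question: {q}" for q in extra_questions)
    message = (
        f"{context.strip()}{extras}\n\n"
        "Please act as a senior ML reviewer (NeurIPS/ICML level). Identify:\n"
        f"{numbered}\n"
        "Please be brutally honest."
    )
    return {"reasoning_effort": "xhigh", "message": message}
```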

Step 3: Iterative Dialogue (Rounds 2-N)

Use send_input with the returned agent id to continue the conversation.

For each round:

  1. Respond to criticisms with evidence/counterarguments
  2. Ask targeted follow-ups on the most actionable points
  3. Request specific deliverables: experiment designs, paper outlines, claims matrices

Key follow-up patterns:

  • "If we reframe X as Y, does that change your assessment?"
  • "What's the minimum experiment to satisfy concern Z?"
  • "Please design the minimal additional experiment package (highest acceptance lift per GPU week)"
  • "Please write a mock NeurIPS/ICML review with scores"
  • "Give me a results-to-claims matrix for possible experimental outcomes"
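
The follow-up patterns above can be kept as reusable templates. The sketch below is illustrative only: the dictionary keys and the follow_up helper are hypothetical names, and the resulting string would be passed as the message to send_input.

```python
FOLLOW_UPS = {
    "reframe": "If we reframe {x} as {y}, does that change your assessment?",
    "minimum_experiment": "What's the minimum experiment to satisfy concern {z}?",
    "experiment_package": (
        "Please design the minimal additional experiment package "
        "(highest acceptance lift per GPU week)"
    ),
    "mock_review": "Please write a mock NeurIPS/ICML review with scores",
    "claims_matrix": "Give me a results-to-claims matrix for possible experimental outcomes",
}


def follow_up(kind, **fields):
    """Fill in one follow-up template; raises KeyError for an unknown kind or missing field."""
    return FOLLOW_UPS[kind].format(**fields)
```

For example, follow_up("reframe", x="a scaling result", y="a robustness result") produces the first pattern with both placeholders filled in.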

Step 4: Convergence

Stop iterating when:

  • Both sides agree on the core claims and their evidence requirements
  • A concrete experiment plan is established
  • The narrative structure is settled
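
A stopping check along these lines might look like the sketch below. Two things here are assumptions rather than part of the skill: that all three criteria must hold together, and the max_rounds safety cap. ReviewState and should_stop are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ReviewState:
    claims_agreed: bool = False          # both sides agree on claims and evidence
    experiment_plan_ready: bool = False  # a concrete experiment plan exists
    narrative_settled: bool = False      # the narrative structure is settled


def should_stop(state, round_number, max_rounds=6):
    """Stop when all criteria hold, or when the assumed round cap is reached."""
    converged = (
        state.claims_agreed
        and state.experiment_plan_ready
        and state.narrative_settled
    )
    return converged or round_number >= max_rounds
```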

Step 5: Document Everything

Save the full interaction and conclusions to a review document in the project root:

  • Round-by-round summary of criticisms and responses
  • Final consensus on claims, narrative, and experiments
  • Claims matrix (what claims are allowed under each possible outcome)
  • Prioritized TODO list with estimated compute costs
  • Paper outline if discussed

Update project memory/notes with key review conclusions.
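
A self-contained review document could be rendered with a helper like the one below. This is a sketch under assumptions: the section headings, the dict keys criticism and response, and GPU hours as the compute-cost unit are illustrative choices, and any file name used when saving (for example REVIEW.md) is hypothetical.

```python
def render_review_doc(agent_id, rounds, consensus, todos):
    """Render the review as Markdown text.

    rounds: list of dicts with 'criticism' and 'response' keys (assumed shape).
    todos:  list of (task, estimated_gpu_hours) tuples, highest priority first.
    """
    lines = ["# Research Review", "", f"Agent id: {agent_id}", ""]
    for i, r in enumerate(rounds, start=1):
        lines += [
            f"## Round {i}",
            "",
            f"Criticism: {r['criticism']}",
            f"Response: {r['response']}",
            "",
        ]
    lines += ["## Final consensus", "", consensus, "", "## Prioritized TODO", ""]
    for task, gpu_hours in todos:
        lines.append(f"- {task} (est. {gpu_hours} GPU hours)")
    return "\n".join(lines) + "\n"
```

Recording the agent id at the top keeps the document usable for resuming the conversation later. The returned string can be written to the project root with pathlib.Path.write_text.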

Key Rules

  • ALWAYS use reasoning_effort: xhigh for reviews
  • Send comprehensive context in Round 1 — the external model cannot read your files
  • Be honest about weaknesses — hiding them leads to worse feedback
  • Push back on criticisms you disagree with, but accept valid ones
  • Focus on ACTIONABLE feedback — "what experiment would fix this?"
  • Document the agent id for potential future resumption
  • The review document should be self-contained (readable without the conversation)

Prompt Templates

For initial review:

"I'm going to present a complete ML research project for your critical review. Please act as a senior ML reviewer (NeurIPS/ICML level)..."

For experiment design:

"Please design the minimal additional experiment package that gives the highest acceptance lift per GPU week. Our compute: [describe]. Be very specific about configurations."

For paper structure:

"Please turn this into a concrete paper outline with section-by-section claims and figure plan."

For claims matrix:

"Please give me a results-to-claims matrix: what claim is allowed under each possible outcome of experiments X and Y?"
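
To make the idea of a results-to-claims matrix concrete, the sketch below enumerates every combination of experiment outcomes and maps each one to the claim it would permit. The outcome labels and the mapping rule are invented for illustration; in practice the reviewer's answer supplies the rule.

```python
from itertools import product


def claims_matrix(experiments, rule):
    """Return one (outcome, allowed_claim) row per combination of outcomes.

    experiments: dict mapping experiment name -> list of possible outcomes.
    rule:        function taking an outcome dict and returning the allowed claim.
    """
    names = list(experiments)
    rows = []
    for combo in product(*(experiments[n] for n in names)):
        outcome = dict(zip(names, combo))
        rows.append((outcome, rule(outcome)))
    return rows


def example_rule(outcome):
    # Hypothetical mapping, for illustration only.
    if outcome["X"] == "positive" and outcome["Y"] == "positive":
        return "strong claim"
    if outcome["X"] == "positive":
        return "claim limited to X's setting"
    return "no claim; report as negative result"
```

Calling claims_matrix({"X": ["positive", "negative"], "Y": ["positive", "negative"]}, example_rule) yields four rows, one per outcome combination.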

For mock review:

"Please write a mock NeurIPS review with: Summary, Strengths, Weaknesses, Questions for Authors, Score, Confidence, and What Would Move Toward Accept."