Auto-claude-code-research-in-sleep rebuttal
Workflow 4: Submission rebuttal pipeline. Parses external reviews, enforces coverage and grounding, drafts a safe text-only rebuttal under venue limits, and manages follow-up rounds. Use when the user says "rebuttal", "reply to reviewers", "ICML rebuttal", "OpenReview response", or wants to answer external reviews safely.
```shell
# Full clone:
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep

# Or copy just this skill into ~/.claude/skills:
T=$(mktemp -d) && git clone --depth=1 https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/rebuttal" ~/.claude/skills/wanshuiyin-auto-claude-code-research-in-sleep-rebuttal && rm -rf "$T"
```
`skills/rebuttal/SKILL.md`

Workflow 4: Rebuttal
Prepare and maintain a grounded, venue-compliant rebuttal for: $ARGUMENTS
Scope
This skill is optimized for:
- ICML-style text-only rebuttal
- strict character limits
- multiple reviewers
- follow-up rounds after the initial rebuttal
- safe drafting with no fabrication, no overpromise, and full issue coverage
This skill does not:
- run new experiments automatically
- generate new theorem claims automatically
- edit or upload a revised PDF
- submit to OpenReview / CMT / HotCRP
If the user already has new results, derivations, or approved commitments, the skill can incorporate them as user-confirmed evidence.
Lifecycle Position
Workflow 1: idea-discovery → Workflow 1.5: experiment-bridge → Workflow 2: auto-review-loop (pre-submission) → Workflow 3: paper-writing → Workflow 4: rebuttal (post-submission external reviews)
Constants
- VENUE = `ICML` — Default venue. Override if needed.
- RESPONSE_MODE = `TEXT_ONLY` — v1 default.
- REVIEWER_MODEL = `gpt-5.4` — Used via Codex MCP for internal stress-testing.
- REVIEWER_BACKEND = `codex` — Default: Codex MCP (xhigh). Override with `— reviewer: oracle-pro` for GPT-5.4 Pro via Oracle MCP. See `shared-references/reviewer-routing.md`.
- MAX_INTERNAL_DRAFT_ROUNDS = 2 — draft → lint → revise.
- MAX_STRESS_TEST_ROUNDS = 1 — One Codex MCP critique round.
- MAX_FOLLOWUP_ROUNDS = 3 — per reviewer thread.
- AUTO_EXPERIMENT = false — When `true`, automatically invoke `/experiment-bridge` to run supplementary experiments when the strategy plan identifies reviewer concerns that require new empirical evidence. When `false` (default), pause and present the evidence gap to the user for manual handling.
- QUICK_MODE = false — When `true`, only run Phases 0-3 (parse reviews, atomize concerns, build strategy). Outputs `ISSUE_BOARD.md` + `STRATEGY_PLAN.md` and stops — no drafting, no stress test. Useful for quickly understanding what reviewers want before deciding how to respond.
- REBUTTAL_DIR = `rebuttal/`
Override:
`/rebuttal "paper/" — venue: NeurIPS, character limit: 5000`
Required Inputs
- Paper source — PDF, LaTeX directory, or narrative summary
- Raw reviews — pasted text, markdown, or PDF with reviewer IDs
- Venue rules — venue name, character/word limit, text-only or revised PDF allowed
- Current stage — initial rebuttal or follow-up round
If venue rules or limit are missing, stop and ask before drafting.
Safety Model
Three hard gates — if any fails, do NOT finalize:
- Provenance gate — every factual statement maps to: `paper`, `review`, `user_confirmed_result`, `user_confirmed_derivation`, or `future_work`. No source = blocked.
- Commitment gate — every promise maps to: `already_done`, `approved_for_rebuttal`, or `future_work_only`. Not approved = blocked.
- Coverage gate — every reviewer concern ends in: `answered`, `deferred_intentionally`, or `needs_user_input`. No issue disappears.
Workflow
Phase 0: Resume or Initialize
- If `rebuttal/REBUTTAL_STATE.md` exists → resume from recorded phase
- Otherwise → create `rebuttal/`, initialize all output documents
- Load paper, reviews, venue rules, any user-confirmed evidence
Phase 1: Validate Inputs and Normalize Reviews
- Validate venue rules are explicit
- Normalize all reviewer text into `rebuttal/REVIEWS_RAW.md` (verbatim)
- Record metadata in `rebuttal/REBUTTAL_STATE.md`
- If ambiguous, pause and ask
Phase 2: Atomize and Classify Reviewer Concerns
Create `rebuttal/ISSUE_BOARD.md`.
For each atomic concern:
- `issue_id` (e.g. R1-C2)
- `reviewer`, `round`
- `raw_anchor` (short quote)
- `issue_type`: assumptions / theorem_rigor / novelty / empirical_support / baseline_comparison / complexity / practical_significance / clarity / reproducibility / other
- `severity`: critical / major / minor
- `reviewer_stance`: positive / swing / negative / unknown
- `response_mode`: direct_clarification / grounded_evidence / nearest_work_delta / assumption_hierarchy / narrow_concession / future_work_boundary
- `status`: open / answered / deferred / needs_user_input
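Put together, a single issue card in `ISSUE_BOARD.md` might look like the following sketch (the concern text and paper details are invented for illustration):

```markdown
### R1-C2
- reviewer: R1, round: 1
- raw_anchor: "the convexity assumption seems too strong for the claimed setting"
- issue_type: assumptions
- severity: major
- reviewer_stance: swing
- response_mode: assumption_hierarchy
- status: open
```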
Phase 3: Build Strategy Plan
Create `rebuttal/STRATEGY_PLAN.md`.
- Identify 2-4 global themes resolving shared concerns
- Choose response mode per issue
- Build character budget (10-15% opener, 75-80% per-reviewer, 5-10% closing)
- Identify blocked claims (ungrounded or unapproved)
- If unresolved blockers → pause and present to user
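The character-budget step can be sketched as simple integer arithmetic; the 5000-character limit below is a hypothetical venue value, and the exact split percentages are a judgment call within the stated ranges:

```shell
# Illustrative character-budget split using the percentages above.
LIMIT=5000
OPENER=$((LIMIT * 12 / 100))                 # opener: ~10-15%
PER_REVIEWER=$((LIMIT * 78 / 100))           # per-reviewer responses: ~75-80%
CLOSING=$((LIMIT - OPENER - PER_REVIEWER))   # closing takes the remainder (~5-10%)
echo "opener=$OPENER per-reviewer=$PER_REVIEWER closing=$CLOSING"
# → opener=600 per-reviewer=3900 closing=500
```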
QUICK_MODE exit: If `QUICK_MODE = true`, stop here. Present `ISSUE_BOARD.md` + `STRATEGY_PLAN.md` to the user and summarize: how many issues per reviewer, shared vs unique concerns, recommended priorities, and evidence gaps. The user can then decide to continue with the full rebuttal (`/rebuttal — quick mode: false`) or write manually.
Phase 3.5: Evidence Sprint (when AUTO_EXPERIMENT = true)
Skip entirely if `AUTO_EXPERIMENT` is `false` — instead, pause and present the evidence gaps to the user.
If the strategy plan identifies issues that require new empirical evidence (tagged `response_mode: grounded_evidence` with `evidence_source: needs_experiment`):
1. Generate a mini experiment plan from the reviewer concerns:
   - What to run (ablation, baseline comparison, scale-up, condition check)
   - Success criterion (what result would satisfy the reviewer)
   - Estimated GPU-hours
2. Invoke `/experiment-bridge` with the mini plan: `/experiment-bridge "rebuttal/REBUTTAL_EXPERIMENT_PLAN.md"`
3. Wait for results, then update `ISSUE_BOARD.md`:
   - Tag completed experiments as `user_confirmed_result`
   - Update evidence source for relevant issue cards
4. If experiments fail or are inconclusive:
   - Switch response mode to `narrow_concession` or `future_work_boundary`
   - Do NOT fabricate positive results
5. Save experiment results to `rebuttal/REBUTTAL_EXPERIMENTS.md` for provenance tracking.
Time guard: If estimated GPU-hours exceed rebuttal deadline, skip and flag for manual handling.
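The time guard above reduces to a single comparison; the numbers here are illustrative, not defaults from the skill:

```shell
# Minimal sketch of the time guard: skip the evidence sprint if the
# estimated GPU-hours cannot fit before the rebuttal deadline.
time_guard() {
  local est_gpu_hours="$1" hours_to_deadline="$2"
  if [ "$est_gpu_hours" -gt "$hours_to_deadline" ]; then
    echo "skip: flag for manual handling"
  else
    echo "proceed"
  fi
}
time_guard 12 8   # → skip: flag for manual handling
```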
Phase 4: Draft Initial Rebuttal
Create `rebuttal/REBUTTAL_DRAFT_v1.md`.
Structure:
- Short opener — thank reviewers + 2-4 global resolutions
- Per-reviewer numbered responses — answer → evidence → implication
- Short closing — resolved / remaining / acceptance case
Default reply pattern per issue:
- Sentence 1: direct answer
- Sentences 2-4: grounded evidence
- Last sentence: implication for the paper
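As a sketch, a reply following this pattern might read as below; the paper details (theorem, assumption, table) are invented purely for illustration:

```markdown
**R1-C2.** Convexity is not required for our main result. Theorem 2 relies only on the
weaker PL condition (Assumption 3.1), Appendix C.2 proves the bound under that condition
alone, and Table 4 shows the same convergence trend on the non-convex benchmark.
The setting claimed in Section 1 is therefore unaffected.
```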
Heuristics from 5 successful rebuttals:
- Evidence > assertion
- Global narrative first, per-reviewer detail second
- Concrete numbers for counter-intuitive points
- Name closest prior work + exact delta for novelty disputes
- Concede narrowly when reviewer is right
- For theory: separate core vs technical assumptions
- Answer friendly reviewers too
Hard rules:
- NEVER invent experiments, numbers, derivations, citations, or links
- NEVER promise what user hasn't approved
- If no strong evidence exists, say less, not more
Also generate `rebuttal/PASTE_READY.txt` (plain text, exact character count).
Also generate `rebuttal/REVISION_PLAN.md` — the overall revision checklist.
This document is the single source of truth for every paper revision promised (explicitly or implicitly) in the rebuttal draft. It exists so the author can track follow-through after the rebuttal is submitted, and so the commitment gate in Phase 5 has a concrete artifact to validate against.
Structure:
- Header
  - Paper title, venue, character limit, rebuttal round
  - Links back to `ISSUE_BOARD.md`, `STRATEGY_PLAN.md`, `REBUTTAL_DRAFT_v1.md`
- Overall checklist — a single flat GitHub-style checklist covering every revision item, so the author can tick items off as they land in the camera-ready / revised PDF:

  ```markdown
  ## Overall Checklist
  - [ ] (R1-C2) Add assumption hierarchy table to Section 3.1 — commitment: `approved_for_rebuttal` — owner: author — status: pending
  - [ ] (R2-C1) Clarify novelty delta vs. Smith'24 in Section 2 related work — commitment: `already_done` — status: verify wording
  - [ ] (R3-C4) Add runtime breakdown figure to Appendix B — commitment: `future_work_only` — status: deferred, note in camera-ready
  - ...
  ```

  Checklist items must be atomic (one paper edit per line) and each must reference its `issue_id` so it maps back to `ISSUE_BOARD.md`.
- Grouped view — the same items regrouped by (a) paper section/location and (b) severity, so the author can plan the revision pass efficiently.
- Commitment summary — counts of `already_done` / `approved_for_rebuttal` / `future_work_only`, plus any `needs_user_input` items that are blocking.
- Out-of-scope log — reviewer concerns that will not trigger a paper revision (e.g. `deferred_intentionally`, or `narrow_concession` with no edit), with a one-line reason each. This keeps the checklist honest: nothing silently disappears.
Rules for `REVISION_PLAN.md`:
- Every checklist item must map to at least one `issue_id` from `ISSUE_BOARD.md`.
- Every promise in `REBUTTAL_DRAFT_v1.md` that implies a paper edit must appear as a checklist item — if it is not in the plan, it is a commitment-gate violation.
- Never add items that are not backed by the draft or by user-confirmed evidence.
- On rerun / follow-up rounds, update checkbox state in place rather than regenerating from scratch.
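The issue_id rule lends itself to a mechanical check. The sketch below is an assumption about the checklist format (GitHub task-list lines tagged `(R#-C#)`, as in the example above), not a tool shipped by the skill:

```shell
# Rough lint sketch: print checklist items that lack an (R#-C#) issue_id tag.
# Prints nothing when every item is tagged.
lint_revision_plan() {
  grep '^- \[' "$1" | grep -v 'R[0-9][0-9]*-C[0-9][0-9]*' || true
}
```

Run as `lint_revision_plan rebuttal/REVISION_PLAN.md`; any output is a rule violation to fix before finalizing.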
Phase 5: Safety Validation
Run all lints:
- Coverage — every issue maps to draft anchor
- Provenance — every factual sentence has source
- Commitment — promises are approved AND every paper-edit promise in the draft appears as a checklist item in `REVISION_PLAN.md` (and vice versa — no orphan items in the plan)
- Tone — flag aggressive/submissive/evasive phrases
- Consistency — no contradictions across reviewer replies
- Limit — exact character count, compress if over (redundancy → friendly → opener → wording, never drop critical answers)
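The limit lint can be approximated with a one-line byte count (note `wc -c` counts bytes, which equals characters for ASCII text; the path and limit below are this workflow's defaults):

```shell
# Sketch of the limit lint: exact byte count vs. the venue limit.
check_limit() {
  local file="$1" limit="$2"
  local count
  count=$(( $(wc -c < "$file") ))   # normalize any leading whitespace from wc
  if [ "$count" -gt "$limit" ]; then
    echo "OVER by $((count - limit)) chars"
  else
    echo "OK: $count/$limit"
  fi
}
# Example: check_limit rebuttal/PASTE_READY.txt 5000
```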
Phase 6: Codex MCP Stress Test
```yaml
mcp__codex__codex:
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    Stress-test this rebuttal draft:
    [raw reviews + issue board + draft + venue rules]
    1. Unanswered or weakly answered concerns?
    2. Unsupported factual statements?
    3. Risky or unapproved promises?
    4. Tone problems?
    5. Paragraph most likely to backfire with meta-reviewer?
    6. Minimal grounded fixes only. Do NOT invent evidence.
    Verdict: safe to submit / needs revision
```
Save full response to `rebuttal/MCP_STRESS_TEST.md`. If hard safety blocker → revise before finalizing.
Phase 7: Finalize — Two Versions
Produce two outputs for different purposes:
1. `rebuttal/PASTE_READY.txt` — the strict version
   - Plain text, exact character count, fits venue limit
   - Ready to paste directly into OpenReview / CMT / HotCRP
   - No markdown formatting, no extras
2. `rebuttal/REBUTTAL_DRAFT_rich.md` — the extended version
   - Same structure but with more detail: fuller explanations, additional evidence, optional paragraphs
   - Marked with `[OPTIONAL — cut if over limit]` for sections that exceed the strict version
   - Author can read this to understand the full reasoning, then manually decide what to keep/cut/rewrite
   - Useful for follow-up rounds — the extra material is pre-written
3. Update `rebuttal/REBUTTAL_STATE.md`
4. Refresh `rebuttal/REVISION_PLAN.md` so the overall checklist matches the final draft (add items, mark `already_done` as checked, carry forward any `pending` items)
5. Present to user:
   - `PASTE_READY.txt` character count vs venue limit
   - `REBUTTAL_DRAFT_rich.md` for review and manual editing
   - `REVISION_PLAN.md` checklist — counts of pending / approved / deferred
   - Remaining risks + lines needing manual approval
Phase 8: Follow-Up Rounds
When new reviewer comments arrive:
- Append verbatim to `rebuttal/FOLLOWUP_LOG.md`
- Link to existing issues or create new ones
- Draft delta reply only (not full rewrite)
- Update `rebuttal/REVISION_PLAN.md` in place — add any new checklist items introduced by the follow-up, tick off items the author has already completed, and keep existing items' status current
- Re-run safety lints
- Use Codex MCP reply for continuity if useful
- Rules: escalate technically not rhetorically; concede if reviewer is correct; stop arguing if reviewer is immovable and no new evidence exists
Key Rules
- Large file handling: If Write fails, retry with Bash heredoc silently.
- Never fabricate. No invented evidence, numbers, derivations, citations, or links.
- Never overpromise. Only promise what user explicitly approved.
- Full coverage. Every reviewer concern tracked and accounted for.
- Preserve raw records. Reviews and MCP outputs stored verbatim.
- Global + per-reviewer structure. Shared concerns in opener.
- Answer friendly reviewers too. Reinforce supportive framing.
- Meta-reviewer closing. Summarize resolved/remaining/why accept.
- Evidence > rhetoric. Derivations and numbers over prose.
- Concede selectively. Narrow honest concessions > broad denials.
- Don't waste space on unwinnable arguments. Answer once, move on.
- Respect the limit. Character budget is a hard constraint.
- Resume cleanly. Continue from REBUTTAL_STATE.md on rerun.
- Anti-hallucination citations. Any reference added must go through DBLP → CrossRef → [VERIFY].
Review Tracing
After each `mcp__codex__codex` or `mcp__codex__codex-reply` reviewer call, save the trace following `shared-references/review-tracing.md`. Use `tools/save_trace.sh` or write files directly to `.aris/traces/<skill>/<date>_run<NN>/`. Respect the `--- trace:` parameter (default: `full`).
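When saving traces by hand rather than via `tools/save_trace.sh`, the layout above can be sketched as follows; the file name `response.md` is an assumption for illustration, not something the skill specifies:

```shell
# Hypothetical manual trace save matching the .aris layout described above.
save_trace() {
  local skill="$1" run="$2" content="$3"
  local dir=".aris/traces/$skill/$(date +%Y-%m-%d)_run$run"
  mkdir -p "$dir"
  printf '%s\n' "$content" > "$dir/response.md"  # assumed file name
  echo "$dir"
}
```

Usage: `save_trace rebuttal 01 "$REVIEWER_RESPONSE"` prints the directory it wrote to.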