Awesome-Agent-Skills-for-Empirical-Research rebuttal
Workflow 4: Submission rebuttal pipeline. Parses external reviews, enforces coverage and grounding, drafts a safe text-only rebuttal under venue limits, and manages follow-up rounds. Use when the user says "rebuttal", "reply to reviewers", "ICML rebuttal", "OpenReview response", or wants to answer external reviews safely.
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/42-wanshuiyin-ARIS/skills/rebuttal" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-rebuttal && rm -rf "$T"
skills/42-wanshuiyin-ARIS/skills/rebuttal/SKILL.md
Workflow 4: Rebuttal
Prepare and maintain a grounded, venue-compliant rebuttal for: $ARGUMENTS
Scope
This skill is optimized for:
- ICML-style text-only rebuttal
- strict character limits
- multiple reviewers
- follow-up rounds after the initial rebuttal
- safe drafting with no fabrication, no overpromise, and full issue coverage
This skill does not:
- run new experiments automatically
- generate new theorem claims automatically
- edit or upload a revised PDF
- submit to OpenReview / CMT / HotCRP
If the user already has new results, derivations, or approved commitments, the skill can incorporate them as user-confirmed evidence.
Lifecycle Position
Workflow 1: idea-discovery Workflow 1.5: experiment-bridge Workflow 2: auto-review-loop (pre-submission) Workflow 3: paper-writing Workflow 4: rebuttal (post-submission external reviews)
Constants
- VENUE = ICML — Default venue. Override if needed.
- RESPONSE_MODE = TEXT_ONLY — v1 default.
- REVIEWER_MODEL = gpt-5.4 — Used via Codex MCP for internal stress-testing.
- MAX_INTERNAL_DRAFT_ROUNDS = 2 — draft → lint → revise.
- MAX_STRESS_TEST_ROUNDS = 1 — One Codex MCP critique round.
- MAX_FOLLOWUP_ROUNDS = 3 — per reviewer thread.
- AUTO_EXPERIMENT = false — When true, automatically invoke /experiment-bridge to run supplementary experiments when the strategy plan identifies reviewer concerns that require new empirical evidence. When false (default), pause and present the evidence gap to the user for manual handling.
- QUICK_MODE = false — When true, only run Phase 0-3 (parse reviews, atomize concerns, build strategy). Outputs ISSUE_BOARD.md + STRATEGY_PLAN.md and stops — no drafting, no stress test. Useful for quickly understanding what reviewers want before deciding how to respond.
- REBUTTAL_DIR = rebuttal/
Override:
/rebuttal "paper/" — venue: NeurIPS, character limit: 5000
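A minimal sketch of how these defaults and a per-invocation override could be combined. The dictionary shape and the `resolve_config` helper are assumptions for illustration; they mirror the constant names above but are not real code from the skill.

```python
# Hypothetical representation of the skill's constants; names mirror the spec.
DEFAULTS = {
    "VENUE": "ICML",
    "RESPONSE_MODE": "TEXT_ONLY",
    "REVIEWER_MODEL": "gpt-5.4",
    "MAX_INTERNAL_DRAFT_ROUNDS": 2,
    "MAX_STRESS_TEST_ROUNDS": 1,
    "MAX_FOLLOWUP_ROUNDS": 3,
    "AUTO_EXPERIMENT": False,
    "QUICK_MODE": False,
    "REBUTTAL_DIR": "rebuttal/",
}

def resolve_config(overrides: dict) -> dict:
    """Merge user overrides (e.g. venue or character limit) over the defaults."""
    cfg = dict(DEFAULTS)
    cfg.update(overrides)
    return cfg

# The NeurIPS override example above would resolve to:
cfg = resolve_config({"VENUE": "NeurIPS", "CHAR_LIMIT": 5000})
```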
Required Inputs
- Paper source — PDF, LaTeX directory, or narrative summary
- Raw reviews — pasted text, markdown, or PDF with reviewer IDs
- Venue rules — venue name, character/word limit, text-only or revised PDF allowed
- Current stage — initial rebuttal or follow-up round
If venue rules or limit are missing, stop and ask before drafting.
Safety Model
Three hard gates — if any fails, do NOT finalize:
- Provenance gate — every factual statement maps to: paper, review, user_confirmed_result, user_confirmed_derivation, or future_work. No source = blocked.
- Commitment gate — every promise maps to: already_done, approved_for_rebuttal, or future_work_only. Not approved = blocked.
- Coverage gate — every reviewer concern ends in: answered, deferred_intentionally, or needs_user_input. No issue disappears.
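The three gates can be sketched as one check that returns blockers. The tag vocabularies come from the spec above; the function name and the data shapes (dicts with `source`, `approval`, `status` keys) are assumptions.

```python
# Illustrative gate check; an empty blocker list means all three gates pass.
ALLOWED_SOURCES = {"paper", "review", "user_confirmed_result",
                   "user_confirmed_derivation", "future_work"}
ALLOWED_COMMITMENTS = {"already_done", "approved_for_rebuttal", "future_work_only"}
TERMINAL_STATUSES = {"answered", "deferred_intentionally", "needs_user_input"}

def run_gates(statements, promises, issues):
    """Return a list of (gate, item) blockers for the draft."""
    blockers = []
    for s in statements:                          # provenance gate
        if s.get("source") not in ALLOWED_SOURCES:
            blockers.append(("provenance", s["text"]))
    for p in promises:                            # commitment gate
        if p.get("approval") not in ALLOWED_COMMITMENTS:
            blockers.append(("commitment", p["text"]))
    for i in issues:                              # coverage gate
        if i.get("status") not in TERMINAL_STATUSES:
            blockers.append(("coverage", i["issue_id"]))
    return blockers
```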
Workflow
Phase 0: Resume or Initialize
- If rebuttal/REBUTTAL_STATE.md exists → resume from recorded phase
- Otherwise → create rebuttal/, initialize all output documents
- Load paper, reviews, venue rules, any user-confirmed evidence
Phase 1: Validate Inputs and Normalize Reviews
- Validate venue rules are explicit
- Normalize all reviewer text into rebuttal/REVIEWS_RAW.md (verbatim)
- Record metadata in rebuttal/REBUTTAL_STATE.md
- If ambiguous, pause and ask
Phase 2: Atomize and Classify Reviewer Concerns
Create rebuttal/ISSUE_BOARD.md.
For each atomic concern record:
- issue_id (e.g., R1-C2)
- reviewer, round
- raw_anchor (short quote)
- issue_type: assumptions / theorem_rigor / novelty / empirical_support / baseline_comparison / complexity / practical_significance / clarity / reproducibility / other
- severity: critical / major / minor
- reviewer_stance: positive / swing / negative / unknown
- response_mode: direct_clarification / grounded_evidence / nearest_work_delta / assumption_hierarchy / narrow_concession / future_work_boundary
- status: open / answered / deferred / needs_user_input
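An issue card using this field vocabulary might look like the sketch below. The enumerations are taken from the spec; the dict representation and `valid_card` helper are hypothetical.

```python
# Allowed vocabularies for ISSUE_BOARD.md fields, as listed in the spec.
ISSUE_TYPES = {"assumptions", "theorem_rigor", "novelty", "empirical_support",
               "baseline_comparison", "complexity", "practical_significance",
               "clarity", "reproducibility", "other"}
SEVERITIES = {"critical", "major", "minor"}
STANCES = {"positive", "swing", "negative", "unknown"}
RESPONSE_MODES = {"direct_clarification", "grounded_evidence", "nearest_work_delta",
                  "assumption_hierarchy", "narrow_concession", "future_work_boundary"}
STATUSES = {"open", "answered", "deferred", "needs_user_input"}

def valid_card(card: dict) -> bool:
    """Check that an atomized concern uses only the allowed vocabulary."""
    return (card["issue_type"] in ISSUE_TYPES
            and card["severity"] in SEVERITIES
            and card["reviewer_stance"] in STANCES
            and card["response_mode"] in RESPONSE_MODES
            and card["status"] in STATUSES)

# Hypothetical card for reviewer 1, concern 2:
card = {
    "issue_id": "R1-C2", "reviewer": "R1", "round": 1,
    "raw_anchor": "the baseline comparison seems unfair",
    "issue_type": "baseline_comparison", "severity": "major",
    "reviewer_stance": "swing", "response_mode": "grounded_evidence",
    "status": "open",
}
```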
Phase 3: Build Strategy Plan
Create rebuttal/STRATEGY_PLAN.md.
- Identify 2-4 global themes resolving shared concerns
- Choose response mode per issue
- Build character budget (10-15% opener, 75-80% per-reviewer, 5-10% closing)
- Identify blocked claims (ungrounded or unapproved)
- If unresolved blockers → pause and present to user
QUICK_MODE exit: If QUICK_MODE = true, stop here. Present ISSUE_BOARD.md + STRATEGY_PLAN.md to the user and summarize: how many issues per reviewer, shared vs unique concerns, recommended priorities, and evidence gaps. The user can then decide to continue with the full rebuttal (/rebuttal — quick mode: false) or write manually.
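The character-budget split can be sketched as below. The midpoints chosen (12.5% opener, 10% closing, remainder split evenly per reviewer) are one choice within the ranges given above, not values prescribed by the skill.

```python
# Sketch: split a venue character limit across opener, per-reviewer, closing.
def character_budget(limit: int, n_reviewers: int) -> dict:
    opener = int(limit * 0.125)    # within the 10-15% opener range
    closing = int(limit * 0.10)    # within the 5-10% closing range
    per_reviewer = (limit - opener - closing) // n_reviewers
    return {"opener": opener, "per_reviewer": per_reviewer, "closing": closing}

# e.g. a 5000-character limit with three reviewers:
budget = character_budget(5000, 3)
```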
Phase 3.5: Evidence Sprint (when AUTO_EXPERIMENT = true)
Skip entirely if AUTO_EXPERIMENT is false — instead, pause and present the evidence gaps to the user.
If the strategy plan identifies issues that require new empirical evidence (tagged response_mode: grounded_evidence with evidence_source: needs_experiment):
- Generate a mini experiment plan from the reviewer concerns:
  - What to run (ablation, baseline comparison, scale-up, condition check)
  - Success criterion (what result would satisfy the reviewer)
  - Estimated GPU-hours
- Invoke /experiment-bridge with the mini plan: /experiment-bridge "rebuttal/REBUTTAL_EXPERIMENT_PLAN.md"
- Wait for results, then update ISSUE_BOARD.md:
  - Tag completed experiments as user_confirmed_result
  - Update evidence source for relevant issue cards
- If experiments fail or are inconclusive:
  - Switch response mode to narrow_concession or future_work_boundary
  - Do NOT fabricate positive results
- Save experiment results to rebuttal/REBUTTAL_EXPERIMENTS.md for provenance tracking.
Time guard: If estimated GPU-hours exceed the rebuttal deadline, skip and flag for manual handling.
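The time guard amounts to a single feasibility check. The function name and the 20% safety margin (time reserved for drafting and validation after the run) are assumptions; the spec only requires comparing estimated GPU-hours to the deadline.

```python
# Minimal sketch of the Phase 3.5 time guard.
def within_deadline(gpu_hours: float, hours_until_deadline: float,
                    margin: float = 0.2) -> bool:
    """True if the experiment can finish with a margin left before the deadline."""
    return gpu_hours <= hours_until_deadline * (1 - margin)
```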
Phase 4: Draft Initial Rebuttal
Create rebuttal/REBUTTAL_DRAFT_v1.md.
Structure:
- Short opener — thank reviewers + 2-4 global resolutions
- Per-reviewer numbered responses — answer → evidence → implication
- Short closing — resolved / remaining / acceptance case
Default reply pattern per issue:
- Sentence 1: direct answer
- Sentence 2-4: grounded evidence
- Last sentence: implication for the paper
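The reply pattern is mechanical enough to sketch as string assembly: direct answer first, grounded evidence in the middle, implication last. The function and the sample sentences are hypothetical.

```python
# Sketch of the per-issue reply pattern: answer → evidence → implication.
def render_reply(answer: str, evidence: list, implication: str) -> str:
    return " ".join([answer, *evidence, implication])

reply = render_reply(
    "Yes, the bound holds without Assumption 3.",
    ["Lemma 2 only uses Assumptions 1-2 (paper, Sec. 4.1)."],
    "So the concern does not affect the main theorem.",
)
```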
Heuristics from 5 successful rebuttals:
- Evidence > assertion
- Global narrative first, per-reviewer detail second
- Concrete numbers for counter-intuitive points
- Name closest prior work + exact delta for novelty disputes
- Concede narrowly when reviewer is right
- For theory: separate core vs technical assumptions
- Answer friendly reviewers too
Hard rules:
- NEVER invent experiments, numbers, derivations, citations, or links
- NEVER promise what user hasn't approved
- If no strong evidence exists, say less not more
Also generate rebuttal/PASTE_READY.txt (plain text, exact character count).
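Producing PASTE_READY.txt could look like the sketch below: flatten the markdown draft to plain text, then report the exact character count against the venue limit. The regexes cover only a few markdown constructs (headings, bold, list markers); that narrow scope is an assumption, not the skill's actual implementation.

```python
import re

def to_paste_ready(markdown: str) -> str:
    """Strip basic markdown so the text can be pasted into a plain-text form."""
    text = re.sub(r"^#+\s*", "", markdown, flags=re.M)    # strip headings
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)          # strip bold markers
    text = re.sub(r"^\s*[-*]\s+", "", text, flags=re.M)   # strip list markers
    return text.strip()

def check_limit(text: str, limit: int):
    """Return (exact character count, fits-within-limit flag)."""
    return len(text), len(text) <= limit
```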
Phase 5: Safety Validation
Run all lints:
- Coverage — every issue maps to draft anchor
- Provenance — every factual sentence has source
- Commitment — promises are approved
- Tone — flag aggressive/submissive/evasive phrases
- Consistency — no contradictions across reviewer replies
- Limit — exact character count; if over, compress in order (cut redundancy → trim friendly-reviewer padding → shorten opener → tighten wording), never drop critical answers
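The coverage lint reduces to: every issue_id on the board must appear as an anchor in the draft. The bracketed anchor format `[R1-C2]` is an assumption for illustration; the spec only requires a mapping from issue to draft location.

```python
# Sketch of the Phase 5 coverage lint.
def coverage_lint(issue_ids, draft: str):
    """Return the issues that have no anchor in the draft."""
    return [iid for iid in issue_ids if f"[{iid}]" not in draft]
```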
Phase 6: Codex MCP Stress Test
mcp__codex__codex:
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    Stress-test this rebuttal draft:
    [raw reviews + issue board + draft + venue rules]
    1. Unanswered or weakly answered concerns?
    2. Unsupported factual statements?
    3. Risky or unapproved promises?
    4. Tone problems?
    5. Paragraph most likely to backfire with meta-reviewer?
    6. Minimal grounded fixes only. Do NOT invent evidence.
    Verdict: safe to submit / needs revision
Save the full response to rebuttal/MCP_STRESS_TEST.md. If it raises a hard safety blocker → revise before finalizing.
Phase 7: Finalize — Two Versions
Produce two outputs for different purposes:
- rebuttal/PASTE_READY.txt — the strict version
  - Plain text, exact character count, fits venue limit
  - Ready to paste directly into OpenReview / CMT / HotCRP
  - No markdown formatting, no extras
- rebuttal/REBUTTAL_DRAFT_rich.md — the extended version
  - Same structure but with more detail: fuller explanations, additional evidence, optional paragraphs
  - Sections that exceed the strict version are marked [OPTIONAL — cut if over limit]
  - The author can read this to understand the full reasoning, then manually decide what to keep, cut, or rewrite
  - Useful for follow-up rounds — the extra material is pre-written
- Update rebuttal/REBUTTAL_STATE.md
- Present to the user:
  - PASTE_READY.txt character count vs venue limit
  - REBUTTAL_DRAFT_rich.md for review and manual editing
  - Remaining risks + lines needing manual approval
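Because optional material is marked inline, deriving strict text from the rich draft can be as simple as dropping marked lines. Treating the marker as line-level is an assumption; the spec only requires the optional sections to be clearly marked.

```python
# Sketch: derive strict paste-ready text from the rich draft by removing
# lines carrying the optional marker.
OPTIONAL_MARK = "[OPTIONAL — cut if over limit]"

def strip_optional(rich: str) -> str:
    kept = [ln for ln in rich.splitlines() if OPTIONAL_MARK not in ln]
    return "\n".join(kept)
```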
Phase 8: Follow-Up Rounds
When new reviewer comments arrive:
- Append verbatim to rebuttal/FOLLOWUP_LOG.md
- Link to existing issues or create new ones
- Draft delta reply only (not full rewrite)
- Re-run safety lints
- Use Codex MCP on the reply for a continuity check if useful
- Rules: escalate technically, not rhetorically; concede if the reviewer is correct; stop arguing if the reviewer is immovable and no new evidence exists
Key Rules
- Large file handling: If Write fails, retry with Bash heredoc silently.
- Never fabricate. No invented evidence, numbers, derivations, citations, or links.
- Never overpromise. Only promise what user explicitly approved.
- Full coverage. Every reviewer concern tracked and accounted for.
- Preserve raw records. Reviews and MCP outputs stored verbatim.
- Global + per-reviewer structure. Shared concerns in opener.
- Answer friendly reviewers too. Reinforce supportive framing.
- Meta-reviewer closing. Summarize resolved/remaining/why accept.
- Evidence > rhetoric. Derivations and numbers over prose.
- Concede selectively. Narrow honest concessions > broad denials.
- Don't waste space on unwinnable arguments. Answer once, move on.
- Respect the limit. Character budget is a hard constraint.
- Resume cleanly. Continue from REBUTTAL_STATE.md on rerun.
- Anti-hallucination citations. Any reference added must go through DBLP → CrossRef → [VERIFY].
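The DBLP → CrossRef → [VERIFY] fallback can be sketched as an ordered resolver chain. The resolvers here are injected stubs so the sketch is self-contained; a real implementation would query the DBLP and CrossRef search APIs, and the function name is an assumption.

```python
# Sketch of the anti-hallucination citation chain: try each resolver in
# order, and tag the citation [VERIFY] if none can confirm it.
def verify_citation(title: str, resolvers: list) -> str:
    for resolve in resolvers:
        hit = resolve(title)
        if hit:
            return hit
    return f"{title} [VERIFY]"

# Stub resolver standing in for a DBLP lookup:
dblp_stub = {"Some Verified Paper": "verified:dblp-key"}.get
```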