Awesome-Agent-Skills-for-Empirical-Research rebuttal
Workflow 4: Submission rebuttal pipeline. Parses external reviews, enforces coverage and grounding, drafts a safe text-only rebuttal under venue limits, and manages follow-up rounds. Use when the user says "rebuttal", "reply to reviewers", "ICML rebuttal", "OpenReview response", or wants to answer external reviews safely.
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/42-wanshuiyin-ARIS/skills/rebuttal" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-rebuttal && rm -rf "$T"
skills/42-wanshuiyin-ARIS/skills/rebuttal/SKILL.md
Workflow 4: Rebuttal
Prepare and maintain a grounded, venue-compliant rebuttal for: $ARGUMENTS
Scope
This skill is optimized for:
- ICML-style text-only rebuttal
- strict character limits
- multiple reviewers
- follow-up rounds after the initial rebuttal
- safe drafting with no fabrication, no overpromise, and full issue coverage
This skill does not:
- run new experiments automatically
- generate new theorem claims automatically
- edit or upload a revised PDF
- submit to OpenReview / CMT / HotCRP
If the user already has new results, derivations, or approved commitments, the skill can incorporate them as user-confirmed evidence.
Lifecycle Position
Workflow 1: idea-discovery Workflow 1.5: experiment-bridge Workflow 2: auto-review-loop (pre-submission) Workflow 3: paper-writing Workflow 4: rebuttal (post-submission external reviews)
Constants
- VENUE = ICML — Default venue. Override if needed.
- RESPONSE_MODE = TEXT_ONLY — v1 default.
- REVIEWER_MODEL = gpt-5.4 — Used via Codex MCP for internal stress-testing.
- MAX_INTERNAL_DRAFT_ROUNDS = 2 — draft → lint → revise.
- MAX_STRESS_TEST_ROUNDS = 1 — One Codex MCP critique round.
- MAX_FOLLOWUP_ROUNDS = 3 — per reviewer thread.
- AUTO_EXPERIMENT = false — When true, automatically invoke /experiment-bridge to run supplementary experiments when the strategy plan identifies reviewer concerns that require new empirical evidence. When false (default), pause and present the evidence gap to the user for manual handling.
- QUICK_MODE = false — When true, only run Phase 0-3 (parse reviews, atomize concerns, build strategy). Outputs ISSUE_BOARD.md + STRATEGY_PLAN.md and stops — no drafting, no stress test. Useful for quickly understanding what reviewers want before deciding how to respond.
- REBUTTAL_DIR = rebuttal/
Override:
/rebuttal "paper/" — venue: NeurIPS, character limit: 5000
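A minimal sketch of how these defaults and a per-invocation override could be combined. The dictionary shape and the `resolve_config` helper are assumptions for illustration; they mirror the constant names above but are not real code from the skill.

```python
# Hypothetical representation of the skill's constants; names mirror the spec.
DEFAULTS = {
    "VENUE": "ICML",
    "RESPONSE_MODE": "TEXT_ONLY",
    "REVIEWER_MODEL": "gpt-5.4",
    "MAX_INTERNAL_DRAFT_ROUNDS": 2,
    "MAX_STRESS_TEST_ROUNDS": 1,
    "MAX_FOLLOWUP_ROUNDS": 3,
    "AUTO_EXPERIMENT": False,
    "QUICK_MODE": False,
    "REBUTTAL_DIR": "rebuttal/",
}

def resolve_config(overrides: dict) -> dict:
    """Merge user overrides (e.g. venue or character limit) over the defaults."""
    cfg = dict(DEFAULTS)
    cfg.update(overrides)
    return cfg

# The NeurIPS override example above would resolve to:
cfg = resolve_config({"VENUE": "NeurIPS", "CHAR_LIMIT": 5000})
```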
Required Inputs
- Paper source — PDF, LaTeX directory, or narrative summary
- Raw reviews — pasted text, markdown, or PDF with reviewer IDs
- Venue rules — venue name, character/word limit, text-only or revised PDF allowed
- Current stage — initial rebuttal or follow-up round
If venue rules or limit are missing, stop and ask before drafting.
Safety Model
Three hard gates — if any fails, do NOT finalize:
- Provenance gate — every factual statement maps to: paper, review, user_confirmed_result, user_confirmed_derivation, or future_work. No source = blocked.
- Commitment gate — every promise maps to: already_done, approved_for_rebuttal, or future_work_only. Not approved = blocked.
- Coverage gate — every reviewer concern ends in: answered, deferred_intentionally, or needs_user_input. No issue disappears.
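The three gates can be sketched as one check that returns blockers. The tag vocabularies come from the spec above; the function name and the data shapes (dicts with `source`, `approval`, `status` keys) are assumptions.

```python
# Illustrative gate check; an empty blocker list means all three gates pass.
ALLOWED_SOURCES = {"paper", "review", "user_confirmed_result",
                   "user_confirmed_derivation", "future_work"}
ALLOWED_COMMITMENTS = {"already_done", "approved_for_rebuttal", "future_work_only"}
TERMINAL_STATUSES = {"answered", "deferred_intentionally", "needs_user_input"}

def run_gates(statements, promises, issues):
    """Return a list of (gate, item) blockers for the draft."""
    blockers = []
    for s in statements:                          # provenance gate
        if s.get("source") not in ALLOWED_SOURCES:
            blockers.append(("provenance", s["text"]))
    for p in promises:                            # commitment gate
        if p.get("approval") not in ALLOWED_COMMITMENTS:
            blockers.append(("commitment", p["text"]))
    for i in issues:                              # coverage gate
        if i.get("status") not in TERMINAL_STATUSES:
            blockers.append(("coverage", i["issue_id"]))
    return blockers
```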
Workflow
Phase 0: Resume or Initialize
- If rebuttal/REBUTTAL_STATE.md exists → resume from recorded phase
- Otherwise → create rebuttal/, initialize all output documents
- Load paper, reviews, venue rules, any user-confirmed evidence
Phase 1: Validate Inputs and Normalize Reviews
- Validate venue rules are explicit
- Normalize all reviewer text into rebuttal/REVIEWS_RAW.md (verbatim)
- Record metadata in rebuttal/REBUTTAL_STATE.md
- If ambiguous, pause and ask
Phase 2: Atomize and Classify Reviewer Concerns
Create rebuttal/ISSUE_BOARD.md.
For each atomic concern record:
- issue_id (e.g., R1-C2)
- reviewer, round
- raw_anchor (short quote)
- issue_type: assumptions / theorem_rigor / novelty / empirical_support / baseline_comparison / complexity / practical_significance / clarity / reproducibility / other
- severity: critical / major / minor
- reviewer_stance: positive / swing / negative / unknown
- response_mode: direct_clarification / grounded_evidence / nearest_work_delta / assumption_hierarchy / narrow_concession / future_work_boundary
- status: open / answered / deferred / needs_user_input
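An issue card using this field vocabulary might look like the sketch below. The enumerations are taken from the spec; the dict representation and `valid_card` helper are hypothetical.

```python
# Allowed vocabularies for ISSUE_BOARD.md fields, as listed in the spec.
ISSUE_TYPES = {"assumptions", "theorem_rigor", "novelty", "empirical_support",
               "baseline_comparison", "complexity", "practical_significance",
               "clarity", "reproducibility", "other"}
SEVERITIES = {"critical", "major", "minor"}
STANCES = {"positive", "swing", "negative", "unknown"}
RESPONSE_MODES = {"direct_clarification", "grounded_evidence", "nearest_work_delta",
                  "assumption_hierarchy", "narrow_concession", "future_work_boundary"}
STATUSES = {"open", "answered", "deferred", "needs_user_input"}

def valid_card(card: dict) -> bool:
    """Check that an atomized concern uses only the allowed vocabulary."""
    return (card["issue_type"] in ISSUE_TYPES
            and card["severity"] in SEVERITIES
            and card["reviewer_stance"] in STANCES
            and card["response_mode"] in RESPONSE_MODES
            and card["status"] in STATUSES)

# Hypothetical card for reviewer 1, concern 2:
card = {
    "issue_id": "R1-C2", "reviewer": "R1", "round": 1,
    "raw_anchor": "the baseline comparison seems unfair",
    "issue_type": "baseline_comparison", "severity": "major",
    "reviewer_stance": "swing", "response_mode": "grounded_evidence",
    "status": "open",
}
```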
Phase 3: Build Strategy Plan
Create rebuttal/STRATEGY_PLAN.md.
- Identify 2-4 global themes resolving shared concerns
- Choose response mode per issue
- Build character budget (10-15% opener, 75-80% per-reviewer, 5-10% closing)
- Identify blocked claims (ungrounded or unapproved)
- If unresolved blockers → pause and present to user
QUICK_MODE exit: If QUICK_MODE = true, stop here. Present ISSUE_BOARD.md + STRATEGY_PLAN.md to the user and summarize: how many issues per reviewer, shared vs unique concerns, recommended priorities, and evidence gaps. The user can then decide to continue with the full rebuttal (/rebuttal — quick mode: false) or write manually.
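The character-budget split can be sketched as below. The midpoints chosen (12.5% opener, 10% closing, remainder split evenly per reviewer) are one choice within the ranges given above, not values prescribed by the skill.

```python
# Sketch: split a venue character limit across opener, per-reviewer, closing.
def character_budget(limit: int, n_reviewers: int) -> dict:
    opener = int(limit * 0.125)    # within the 10-15% opener range
    closing = int(limit * 0.10)    # within the 5-10% closing range
    per_reviewer = (limit - opener - closing) // n_reviewers
    return {"opener": opener, "per_reviewer": per_reviewer, "closing": closing}

# e.g. a 5000-character limit with three reviewers:
budget = character_budget(5000, 3)
```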
Phase 3.5: Evidence Sprint (when AUTO_EXPERIMENT = true)
Skip entirely if AUTO_EXPERIMENT is false — instead, pause and present the evidence gaps to the user.
If the strategy plan identifies issues that require new empirical evidence (tagged response_mode: grounded_evidence with evidence_source: needs_experiment):
- Generate a mini experiment plan from the reviewer concerns:
  - What to run (ablation, baseline comparison, scale-up, condition check)
  - Success criterion (what result would satisfy the reviewer)
  - Estimated GPU-hours
- Invoke /experiment-bridge with the mini plan: /experiment-bridge "rebuttal/REBUTTAL_EXPERIMENT_PLAN.md"
- Wait for results, then update ISSUE_BOARD.md:
  - Tag completed experiments as user_confirmed_result
  - Update evidence source for relevant issue cards
- If experiments fail or are inconclusive:
  - Switch response mode to narrow_concession or future_work_boundary
  - Do NOT fabricate positive results
- Save experiment results to rebuttal/REBUTTAL_EXPERIMENTS.md for provenance tracking.
Time guard: If estimated GPU-hours exceed the rebuttal deadline, skip and flag for manual handling.
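The time guard amounts to a single feasibility check. The function name and the 20% safety margin (time reserved for drafting and validation after the run) are assumptions; the spec only requires comparing estimated GPU-hours to the deadline.

```python
# Minimal sketch of the Phase 3.5 time guard.
def within_deadline(gpu_hours: float, hours_until_deadline: float,
                    margin: float = 0.2) -> bool:
    """True if the experiment can finish with a margin left before the deadline."""
    return gpu_hours <= hours_until_deadline * (1 - margin)
```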
Phase 4: Draft Initial Rebuttal
Create rebuttal/REBUTTAL_DRAFT_v1.md.
Structure:
- Short opener — thank reviewers + 2-4 global resolutions
- Per-reviewer numbered responses — answer → evidence → implication
- Short closing — resolved / remaining / acceptance case
Default reply pattern per issue:
- Sentence 1: direct answer
- Sentence 2-4: grounded evidence
- Last sentence: implication for the paper
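The reply pattern is mechanical enough to sketch as string assembly: direct answer first, grounded evidence in the middle, implication last. The function and the sample sentences are hypothetical.

```python
# Sketch of the per-issue reply pattern: answer → evidence → implication.
def render_reply(answer: str, evidence: list, implication: str) -> str:
    return " ".join([answer, *evidence, implication])

reply = render_reply(
    "Yes, the bound holds without Assumption 3.",
    ["Lemma 2 only uses Assumptions 1-2 (paper, Sec. 4.1)."],
    "So the concern does not affect the main theorem.",
)
```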
Heuristics from 5 successful rebuttals:
- Evidence > assertion
- Global narrative first, per-reviewer detail second
- Concrete numbers for counter-intuitive points
- Name closest prior work + exact delta for novelty disputes
- Concede narrowly when reviewer is right
- For theory: separate core vs technical assumptions
- Answer friendly reviewers too
Hard rules:
- NEVER invent experiments, numbers, derivations, citations, or links
- NEVER promise what user hasn't approved
- If no strong evidence exists, say less not more
Also generate rebuttal/PASTE_READY.txt (plain text, exact character count).
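Producing PASTE_READY.txt could look like the sketch below: flatten the markdown draft to plain text, then report the exact character count against the venue limit. The regexes cover only a few markdown constructs (headings, bold, list markers); that narrow scope is an assumption, not the skill's actual implementation.

```python
import re

def to_paste_ready(markdown: str) -> str:
    """Strip basic markdown so the text can be pasted into a plain-text form."""
    text = re.sub(r"^#+\s*", "", markdown, flags=re.M)    # strip headings
    text = re.sub(r"\*\*(.+?)\*\*", r"\1", text)          # strip bold markers
    text = re.sub(r"^\s*[-*]\s+", "", text, flags=re.M)   # strip list markers
    return text.strip()

def check_limit(text: str, limit: int):
    """Return (exact character count, fits-within-limit flag)."""
    return len(text), len(text) <= limit
```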
Phase 5: Safety Validation
Run all lints:
- Coverage — every issue maps to draft anchor
- Provenance — every factual sentence has source
- Commitment — promises are approved
- Tone — flag aggressive/submissive/evasive phrases
- Consistency — no contradictions across reviewer replies
- Limit — exact character count; if over, compress in order (cut redundancy → trim friendly-reviewer padding → shorten opener → tighten wording), never drop critical answers
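The coverage lint reduces to: every issue_id on the board must appear as an anchor in the draft. The bracketed anchor format `[R1-C2]` is an assumption for illustration; the spec only requires a mapping from issue to draft location.

```python
# Sketch of the Phase 5 coverage lint.
def coverage_lint(issue_ids, draft: str):
    """Return the issues that have no anchor in the draft."""
    return [iid for iid in issue_ids if f"[{iid}]" not in draft]
```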
Phase 6: Codex MCP Stress Test
mcp__codex__codex:
  config: {"model_reasoning_effort": "xhigh"}
  prompt: |
    Stress-test this rebuttal draft:
    [raw reviews + issue board + draft + venue rules]
    1. Unanswered or weakly answered concerns?
    2. Unsupported factual statements?
    3. Risky or unapproved promises?
    4. Tone problems?
    5. Paragraph most likely to backfire with meta-reviewer?
    6. Minimal grounded fixes only. Do NOT invent evidence.
    Verdict: safe to submit / needs revision
Save the full response to rebuttal/MCP_STRESS_TEST.md. If it raises a hard safety blocker → revise before finalizing.
Phase 7: Finalize — Two Versions
Produce two outputs for different purposes:
- rebuttal/PASTE_READY.txt — the strict version
  - Plain text, exact character count, fits venue limit
  - Ready to paste directly into OpenReview / CMT / HotCRP
  - No markdown formatting, no extras
- rebuttal/REBUTTAL_DRAFT_rich.md — the extended version
  - Same structure but with more detail: fuller explanations, additional evidence, optional paragraphs
  - Sections that exceed the strict version are marked [OPTIONAL — cut if over limit]
  - The author can read this to understand the full reasoning, then manually decide what to keep, cut, or rewrite
  - Useful for follow-up rounds — the extra material is pre-written
- Update rebuttal/REBUTTAL_STATE.md
- Present to the user:
  - PASTE_READY.txt character count vs venue limit
  - REBUTTAL_DRAFT_rich.md for review and manual editing
  - Remaining risks + lines needing manual approval
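Because optional material is marked inline, deriving strict text from the rich draft can be as simple as dropping marked lines. Treating the marker as line-level is an assumption; the spec only requires the optional sections to be clearly marked.

```python
# Sketch: derive strict paste-ready text from the rich draft by removing
# lines carrying the optional marker.
OPTIONAL_MARK = "[OPTIONAL — cut if over limit]"

def strip_optional(rich: str) -> str:
    kept = [ln for ln in rich.splitlines() if OPTIONAL_MARK not in ln]
    return "\n".join(kept)
```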
Phase 8: Follow-Up Rounds
When new reviewer comments arrive:
- Append verbatim to rebuttal/FOLLOWUP_LOG.md
- Link to existing issues or create new ones
- Draft delta reply only (not full rewrite)
- Re-run safety lints
- Use Codex MCP on the reply for a continuity check if useful
- Rules: escalate technically, not rhetorically; concede if the reviewer is correct; stop arguing if the reviewer is immovable and no new evidence exists
Key Rules
- Large file handling: If Write fails, retry with Bash heredoc silently.
- Never fabricate. No invented evidence, numbers, derivations, citations, or links.
- Never overpromise. Only promise what user explicitly approved.
- Full coverage. Every reviewer concern tracked and accounted for.
- Preserve raw records. Reviews and MCP outputs stored verbatim.
- Global + per-reviewer structure. Shared concerns in opener.
- Answer friendly reviewers too. Reinforce supportive framing.
- Meta-reviewer closing. Summarize resolved/remaining/why accept.
- Evidence > rhetoric. Derivations and numbers over prose.
- Concede selectively. Narrow honest concessions > broad denials.
- Don't waste space on unwinnable arguments. Answer once, move on.
- Respect the limit. Character budget is a hard constraint.
- Resume cleanly. Continue from REBUTTAL_STATE.md on rerun.
- Anti-hallucination citations. Any reference added must go through DBLP → CrossRef → [VERIFY].
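The DBLP → CrossRef → [VERIFY] fallback can be sketched as an ordered resolver chain. The resolvers here are injected stubs so the sketch is self-contained; a real implementation would query the DBLP and CrossRef search APIs, and the function name is an assumption.

```python
# Sketch of the anti-hallucination citation chain: try each resolver in
# order, and tag the citation [VERIFY] if none can confirm it.
def verify_citation(title: str, resolvers: list) -> str:
    for resolve in resolvers:
        hit = resolve(title)
        if hit:
            return hit
    return f"{title} [VERIFY]"

# Stub resolver standing in for a DBLP lookup:
dblp_stub = {"Some Verified Paper": "verified:dblp-key"}.get
```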