Awesome-Agent-Skills-for-Empirical-Research analyze
End-to-end data analysis dispatching Coder and Data-engineer for implementation, coder-critic for review. Supports R, Stata, Python, Julia. Replaces /data-analysis.
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/16-hsantanna88-clo-author/dot-claude/skills/analyze" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-analyze && rm -rf "$T"
skills/16-hsantanna88-clo-author/dot-claude/skills/analyze/SKILL.md
Analyze
Run end-to-end data analysis by dispatching the Coder (analysis), Data-engineer (cleaning + figures), and coder-critic (code review).
Input:
$ARGUMENTS — dataset path or description of analysis goal.
Workflow
Step 1: Context Gathering
- Read .claude/references/domain-profile.md for field conventions
- Read strategy memo in quality_reports/ if it exists
- Check CLAUDE.md for language preference (R/Stata/Python/Julia)
- Scan existing scripts in scripts/ for project patterns
Step 2: Data Preparation (if needed)
If raw data provided, dispatch Data-engineer first:
- Clean and wrangle raw data
- Handle missing values, construct variables per strategy memo
- Generate summary statistics table
- Create publication-quality descriptive figures
- Save cleaned data, codebook, and figures
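The summary-statistics part of this step can be sketched language-agnostically. Below is a minimal, hedged Python sketch (the `summarize` function name, the record layout, and the `wage` variable are hypothetical illustrations, not part of the skill itself):

```python
import statistics

def summarize(rows, variables):
    """Build a small summary-statistics table (N, mean, median, sd)
    from a list of dict records. Records missing a value for a given
    variable are dropped for that variable only. This is a sketch of
    the Data-engineer's summary step, not its actual implementation."""
    table = {}
    for var in variables:
        values = [r[var] for r in rows if r.get(var) is not None]
        table[var] = {
            "n": len(values),
            "mean": statistics.fmean(values),
            "median": statistics.median(values),
            # Sample standard deviation; undefined for a single observation
            "sd": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
    return table
```

A real run would render this table to the descriptive-statistics output alongside the cleaned data and codebook.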
Step 3: Main Analysis
Dispatch Coder agent:
- Stage 0: Data loading (from cleaned data or raw)
- Stage 1: Main specification (from strategy memo or user description)
- Stage 2: Robustness checks
- Stage 3: Publication-ready output (tables to paper/tables/, figures to paper/figures/)
- Produce results_summary.md with all estimates, SEs, and key statistics (MANDATORY)
- Save scripts to scripts/R/ (or appropriate language directory)
The Coder follows these principles:
- Script structure: Use the Script Structure Template below
- Packages: fixest for panel data, modelsummary for tables, ggplot2 for figures
- Standard errors: Cluster at appropriate level (match treatment assignment)
- Output: .tex tables for LaTeX, .pdf/.png figures, .rds for intermediate objects
- No hardcoded paths. All paths relative to repository root.
- saveRDS everything. Every computed object (estimates, model fits, data frames, summary statistics) gets serialized to .rds for downstream use by the writer and other agents.
Step 4: Code Review
Dispatch coder-critic agent — run the full 12-category checklist:
Strategic (categories 1-3):
- Code-strategy alignment — Does the code implement the strategy memo faithfully? Correct dependent variable, treatment, controls, fixed effects, sample restrictions?
- Sanity checks — Are summary statistics printed before regressions? Do coefficient signs match economic intuition? Are sample sizes reasonable?
- Robustness sufficiency — Are required robustness checks present? Alternative specifications, placebo tests, sensitivity analysis per strategy memo?
Code Quality (categories 4-12):
4. Structure — Does the script follow the standard template? Clear section headers, logical flow from setup to export?
5. Console hygiene — No spurious print() statements polluting output. Intentional output only.
6. Reproducibility — set.seed() at top if any stochastic elements. No absolute paths. All packages loaded at top. Directory creation with showWarnings = FALSE.
7. Functions — Repeated logic extracted into functions. No copy-paste code blocks with minor variations.
8. Figure quality — Publication-ready: proper axis labels, titles, legends, font sizes. Consistent theme across all figures.
9. RDS pattern — Every computed object (models, data frames, summary stats) saved via saveRDS() for downstream use. Not just final outputs — intermediate objects too.
10. Comments — Section headers present. Non-obvious code commented. No commented-out dead code left behind.
11. Error handling — Graceful handling of missing files, empty data subsets, convergence failures. Informative error messages.
12. Polish — Consistent naming conventions. No magic numbers. Clean whitespace. Professional quality ready for replication package.
If strategy memo exists, cross-reference code against stated design. Save report to quality_reports/[script]_code_review.md.
Step 5: Fix Issues
If coder-critic finds Critical or Major issues:
- Re-dispatch Coder with specific fixes (max 3 rounds)
- Re-run coder-critic to verify fixes
Step 6: Present Results
- Results summary — key estimates with SEs and interpretation (from results_summary.md)
- Scripts created — paths and descriptions
- Output files — tables in paper/tables/, figures in paper/figures/
- Code review score — from coder-critic
- TODO items — missing data, additional specifications needed
Script Structure Template
```r
# ============================================================
# [Descriptive Title]
# Author: [from project context]
# Purpose: [What this script does]
# Inputs: [Data files]
# Outputs: [Figures, tables, RDS files]
# ============================================================

# 0. Setup ----
library(tidyverse)
library(fixest)
library(modelsummary)

set.seed(42)
dir.create("paper/tables", recursive = TRUE, showWarnings = FALSE)
dir.create("paper/figures", recursive = TRUE, showWarnings = FALSE)

# 1. Data Loading ----

# 2. Exploratory Analysis ----

# 3. Main Analysis ----

# 4. Tables and Figures ----

# 5. Export ----
# saveRDS(model_fit, "scripts/R/output/model_fit.rds")
# saveRDS(main_results, "scripts/R/output/main_results.rds")
```
Results Summary (Mandatory Artifact)
Every analysis run MUST produce results_summary.md containing:
- All point estimates with standard errors and significance levels
- Sample sizes for each specification
- Key summary statistics (means, medians, standard deviations of main variables)
- Robustness check results (brief table or comparison)
- Any flags or anomalies discovered during analysis
This file is the primary handoff artifact to the writer agent. Without it, the writer cannot draft the results section.
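A generator for this artifact might look like the following Python sketch. The skill does not specify the file's exact layout, so the table format, the `render_results_summary` name, and the star convention here are assumptions for illustration only:

```python
def render_results_summary(estimates, n_obs):
    """Render a minimal results_summary.md body from
    {spec_name: (coef, se)} pairs plus per-spec sample sizes.
    Significance stars use conventional two-sided z thresholds.
    A hedged sketch, not the skill's actual implementation."""
    lines = [
        "# Results Summary",
        "",
        "| Specification | Estimate | SE | N |",
        "|---|---|---|---|",
    ]
    for name, (coef, se) in estimates.items():
        z = abs(coef / se)
        stars = "***" if z > 2.58 else "**" if z > 1.96 else "*" if z > 1.645 else ""
        lines.append(f"| {name} | {coef:.4f}{stars} | {se:.4f} | {n_obs[name]} |")
    return "\n".join(lines)
```

For example, `render_results_summary({"main": (0.05, 0.01)}, {"main": 1200})` yields a one-row table flagging the estimate as significant at the 1% level.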
Dual-Language Mode (--dual)
When --dual [lang1,lang2] is provided (e.g., --dual r,python, --dual r,stata):
- Data-engineer runs once — language-agnostic cleaning, saves to data/cleaned/
- Two Coder agents dispatched in parallel — same strategy memo, different languages
- coder-critic reviews each implementation independently (max 3 rounds each)
- Comparison step — verify numerical alignment per .claude/references/domain-profile.md tolerances:
  - Point estimates must match within declared tolerance
  - Standard errors must match within declared tolerance
  - Flag any divergences with exact values from both languages
- Save comparison report to quality_reports/cross_language_comparison.md
Replication Tolerance Approach
Inspired by Scott Cunningham's replication methodology: if two independent implementations agree, it is very unlikely they share the same bug. This is the core rationale for dual-language mode.
Tolerance thresholds:
- Floating-point differences are normal. Minor numerical differences (e.g., 1e-10) between R and Python/Stata arise from different linear algebra backends, optimizer defaults, and floating-point arithmetic. These are expected, not bugs.
- Point estimates: Must agree within 1e-6 (relative) or as declared in domain-profile.md
- Standard errors: Must agree within 1e-4 (relative) — SE computation varies more across implementations due to degrees-of-freedom corrections and clustering algorithms
- P-values: Must agree on significance at conventional levels (0.01, 0.05, 0.10). If one language says p=0.049 and the other says p=0.051, flag for manual review but do not treat as a bug.
- Sample sizes: Must match exactly. Any discrepancy indicates a data handling difference that must be resolved.
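The tolerance rules above amount to a small alignment checker. Here is a Python sketch, assuming each implementation's results have been collected into dicts keyed by specification name (the `compare_estimates` name and the result shape are assumptions, not part of the skill):

```python
def compare_estimates(a, b, rtol_coef=1e-6, rtol_se=1e-4):
    """Cross-language alignment check (sketch). `a` and `b` map
    spec names to dicts with 'coef', 'se', and 'n'. Returns a list
    of human-readable flags; an empty list means the two
    implementations agree within the declared tolerances."""
    flags = []
    for spec in a:
        ra, rb = a[spec], b[spec]
        # Sample sizes must match exactly — any gap is a data-handling bug
        if ra["n"] != rb["n"]:
            flags.append(f"{spec}: sample sizes differ ({ra['n']} vs {rb['n']})")
        # Point estimates: relative tolerance
        if abs(ra["coef"] - rb["coef"]) > rtol_coef * max(abs(ra["coef"]), abs(rb["coef"])):
            flags.append(f"{spec}: point estimates diverge ({ra['coef']} vs {rb['coef']})")
        # Standard errors: looser relative tolerance
        if abs(ra["se"] - rb["se"]) > rtol_se * max(ra["se"], rb["se"]):
            flags.append(f"{spec}: standard errors diverge ({ra['se']} vs {rb['se']})")
    return flags
```

Any non-empty flag list would feed directly into quality_reports/cross_language_comparison.md with the exact values from both languages.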
When results diverge beyond tolerance:
- Both Coder agents are re-dispatched to investigate
- Check: different default options (e.g., na.rm handling, convergence criteria)
- Check: different variable coding or factor ordering
- The comparison report includes a side-by-side table of all estimates
- If divergence persists after investigation, escalate to user with exact values from both languages
Principles
- Reproduce, don't guess. If the user specifies a regression, run exactly that.
- Show your work. Print summary statistics before jumping to regressions.
- Strategy alignment. If strategy memo exists, code MUST implement it faithfully.
- Worker-critic pairing. Coder creates, coder-critic critiques. Never skip review.
- saveRDS everything. Every computed object gets saved via saveRDS() for downstream use — model fits, cleaned data frames, summary statistics, not just final tables.
- Publication-ready output. Tables and figures directly includable in the paper.
- Cross-language convergence. When --dual is used, divergence is a bug until proven otherwise.