Awesome-Agent-Skills-for-Empirical-Research review-r
R code review for the sewage project. Checks script structure, reproducibility, function design, figure quality, and professional polish against project conventions (here::here, arrow/parquet, fixest, modelsummary, native pipe). This skill should be used when asked to "review the code", "check my script", or "code review".
install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/41-sticerd-eee-sewage-econometrics-check/skills/review-r" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-review-r-fe5743 && rm -rf "$T"
manifest: skills/41-sticerd-eee-sewage-econometrics-check/skills/review-r/SKILL.md
Review R Code
Run a code quality review on R scripts in the sewage project. Produces a report — does NOT edit source files.
Input:
$ARGUMENTS — a .R filename, directory path, or all.
Project-Specific Conventions
Script Structure
Pipeline scripts (layers 01-06):
Header block (########) → roxygen description → initialise_environment() → setup_logging() → CONFIG list → functions → main() → conditional execution (sys.nframe() == 0)
Analysis scripts (09_analysis):
Numbered sections with # === separators, inline package loading, direct execution (no main() wrapper)
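The pipeline-script layout above can be sketched as the skeleton below. It is an illustration only: initialise_environment() and setup_logging() are project helpers whose real definitions are not shown here, so they are replaced by placeholder stubs, and the file name, CONFIG entries, and filter_years() are hypothetical.

```r
########################################################################
# 01_example_step.R (hypothetical file name)
########################################################################

#' Example pipeline step
#'
#' Keeps only the years listed in CONFIG and returns the filtered data.

# Placeholder stubs standing in for the project's real helpers.
initialise_environment <- function() invisible(TRUE)
setup_logging <- function() invisible(TRUE)

initialise_environment()
setup_logging()

# Tunable values live in one place instead of being hardcoded below.
CONFIG <- list(
  min_year = 2010L,  # hypothetical parameter
  max_year = 2020L   # hypothetical parameter
)

filter_years <- function(df, config) {
  keep <- df$year >= config$min_year & df$year <= config$max_year
  df[keep, , drop = FALSE]
}

main <- function() {
  # Toy input; a real step would read its input via here::here().
  raw <- data.frame(year = 2005:2022, value = seq_along(2005:2022))
  filter_years(raw, CONFIG)
}

# Run main() only when the file is executed at top level (e.g. via Rscript),
# not when it is sourced from inside another call.
if (sys.nframe() == 0) {
  result <- main()
}
```

Guarding the call with sys.nframe() == 0 lets other scripts source the file to reuse its functions without triggering the whole step.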
Required Conventions
- Paths: here::here(); never relative paths or setwd()
- Data formats: arrow (parquet) for intermediate/final; DuckDB for large joins
- Pipe: Native R pipe |> (not %>%)
- Naming: snake_case for functions/variables, UPPER_SNAKE_CASE for constants
- Econometrics: fixest::feols() for regressions
- Tables: modelsummary → LaTeX with tabularray format
- SE: vcov = "hetero" for heteroskedasticity-robust
- Factors: forcats::as_factor() / forcats::fct_drop()
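As a rough illustration of how two of these conventions could be checked mechanically, the sketch below scans the lines of a script for setwd() calls and the magrittr pipe. The function name and rule labels are hypothetical, and the plain pattern matching is an assumption of this example: it will also flag matches inside comments or strings, so a reviewer would still need to read the hits.

```r
# Hypothetical helper: report which lines break the setwd() and pipe rules.
find_convention_violations <- function(lines) {
  rules <- c(
    "setwd() call"      = "\\bsetwd\\s*\\(",
    "magrittr pipe %>%" = "%>%"
  )
  hits <- lapply(names(rules), function(rule) {
    idx <- grep(rules[[rule]], lines, perl = TRUE)
    data.frame(line = idx, rule = rep(rule, length(idx)),
               stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, hits)
  out[order(out$line), , drop = FALSE]
}

# Toy script content; in practice this would come from readLines() on a file.
example_script <- c(
  'setwd("C:/projects/sewage")',
  'df <- raw %>% dplyr::filter(year > 2010)',
  'df <- raw |> dplyr::filter(year > 2010)'
)
violations <- find_convention_violations(example_script)
print(violations)
```

Only the first two lines of the toy script are flagged; the third uses the native pipe and passes.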
Key Files
- Scripts: scripts/R/01_data_ingestion/ through scripts/R/09_analysis/
- Utilities: scripts/R/utils/
- Output: output/{figures,tables,html_plots,regs,log}/
Workflow
Step 1: Identify Scripts
- If $ARGUMENTS is a specific .R file: review that file
- If $ARGUMENTS is a directory: review all .R files in that directory
- If $ARGUMENTS is all: review all scripts in scripts/R/
- If no argument: ask which scripts to review
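The branching in Step 1 could look roughly like the sketch below. resolve_targets() is a hypothetical helper, and root stands in for the project root, which in the project itself would come from here::here(). Treating a directory argument as non-recursive and all as recursive is an assumption of this example, not something the skill specifies.

```r
# Hypothetical helper: turn the skill argument into a vector of .R files.
resolve_targets <- function(arg, root) {
  if (is.null(arg) || !nzchar(arg)) {
    stop("No argument given: ask which scripts to review.", call. = FALSE)
  }
  if (identical(arg, "all")) {
    return(list.files(file.path(root, "scripts", "R"), pattern = "\\.R$",
                      recursive = TRUE, full.names = TRUE))
  }
  path <- file.path(root, arg)
  if (dir.exists(path)) {
    return(list.files(path, pattern = "\\.R$", full.names = TRUE))
  }
  if (file.exists(path) && grepl("\\.R$", path)) {
    return(path)
  }
  stop("Cannot resolve '", arg, "' to an .R file or directory under ", root,
       call. = FALSE)
}

# Demo on a throwaway directory tree so nothing in a real project is touched.
root <- file.path(tempdir(), "sewage_demo")
step_dir <- file.path(root, "scripts", "R", "01_data_ingestion")
dir.create(step_dir, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(step_dir, "01_load.R"))
file.create(file.path(step_dir, "notes.txt"))

all_scripts <- resolve_targets("all", root)
dir_scripts <- resolve_targets("scripts/R/01_data_ingestion", root)
```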
Step 2: Review Each Script (9 Categories)
Category 1: Script Structure
- Pipeline scripts: header block, roxygen, initialise_environment(), CONFIG list, main(), conditional execution
- Analysis scripts: numbered sections with # === separators
- Clear separation of concerns
Category 2: Console Hygiene
- No unnecessary print()/cat() pollution
- Logging via setup_logging() in pipeline scripts
- Clean output: only meaningful messages
Category 3: Reproducibility
- set.seed() where randomness is involved
- All paths via here::here(): no setwd(), no relative paths, no hardcoded absolute paths
- No hardcoded values that should be in CONFIG
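A small example of what the reviewer is looking for under reproducibility: the seed and the number of draws sit in CONFIG rather than being hardcoded, and the random step is seeded so that reruns agree. The CONFIG values and draw_bootstrap_means() are hypothetical, and seeding inside the function is just one option; seeding once near the top of the script is equally common.

```r
CONFIG <- list(
  seed   = 20240101L,  # hypothetical value
  n_boot = 5L          # hypothetical value
)

# Bootstrap means of x; seeded so repeated calls give the same draws.
draw_bootstrap_means <- function(x, config) {
  set.seed(config$seed)
  vapply(
    seq_len(config$n_boot),
    function(i) mean(sample(x, replace = TRUE)),
    numeric(1)
  )
}

x <- c(2.1, 3.4, 1.8, 5.0, 4.2)
first_run  <- draw_bootstrap_means(x, CONFIG)
second_run <- draw_bootstrap_means(x, CONFIG)
identical(first_run, second_run)  # TRUE: same seed, same draws
```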
Category 4: Function Design
- DRY — no duplicated code blocks
- Functions at appropriate abstraction level
- Shared utilities in scripts/R/utils/ where reuse is needed
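To make the DRY check concrete, here is a sketch of the kind of duplication a reviewer might flag and one possible refactor. The data, column names, and standardise() are hypothetical; in the project such a helper would be a candidate for scripts/R/utils/ if more than one script needs it.

```r
# Duplicated pattern a reviewer might flag (same steps repeated per column):
#   df$flow_z <- (df$flow - mean(df$flow)) / sd(df$flow)
#   df$rain_z <- (df$rain - mean(df$rain)) / sd(df$rain)

# Refactored into one helper that is applied to every column of interest.
standardise <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

df <- data.frame(flow = c(1, 2, 3), rain = c(10, 20, 60))
cols <- c("flow", "rain")

# unname() so the new columns take their names from the left-hand side.
df[paste0(cols, "_z")] <- unname(lapply(df[cols], standardise))
```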
Category 5: Figure Quality
- Axis labels present and readable
- Appropriate dimensions (not too wide/narrow)
- Consistent theme across plots
- Alpha transparency for overlapping points
- Saved to output/figures/ via here::here()
Category 6: Data Persistence
- Intermediate results saved as parquet (arrow::write_parquet())
- saveRDS() for R-specific objects
- No orphaned temporary files
Category 7: Comments
- Explain why, not what
- Section headers for navigation
- No commented-out dead code
Category 8: Error Handling
- Graceful failures with informative messages in pipeline scripts
- File existence checks before reading
- Data validation where appropriate
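The three checks above might look like the sketch below in a pipeline script. read_validated() is a hypothetical helper, and it reads an RDS file only to keep the example dependency-free; the project's intermediate data would normally be parquet read through arrow.

```r
# Hypothetical helper: existence check, read, then validate required columns.
read_validated <- function(path, required_cols) {
  if (!file.exists(path)) {
    stop("Input file not found: ", path, call. = FALSE)
  }
  df <- readRDS(path)
  missing_cols <- setdiff(required_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s) in ", basename(path), ": ",
         paste(missing_cols, collapse = ", "), call. = FALSE)
  }
  df
}

# Demo with a temporary file.
tmp <- tempfile(fileext = ".rds")
saveRDS(data.frame(site_id = 1:3, flow = c(0.5, 1.2, 0.9)), tmp)

ok <- read_validated(tmp, c("site_id", "flow"))

# A missing column fails early with a message naming the problem.
msg <- tryCatch(
  read_validated(tmp, c("site_id", "year")),
  error = function(e) conditionMessage(e)
)
unlink(tmp)  # leave no orphaned temporary file behind
```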
Category 9: Polish
- Consistent style (indentation, spacing)
- No dead code or unused imports
- Clean namespace: library() calls at top
- Native pipe |> (not magrittr %>%)
Step 3: Present Summary
## Code Review: [filename/directory]
**Date:** YYYY-MM-DD
**Scripts reviewed:** N

### Issues by Severity
| Script | Critical | Major | Minor |
|--------|----------|-------|-------|
| ... | ... | ... | ... |

### Top 3 Critical Issues
1. ...
2. ...
3. ...

### Conventions Compliance
- [ ] here::here() for all paths
- [ ] Native pipe |>
- [ ] arrow/parquet for data
- [ ] fixest for regressions
- [ ] modelsummary for tables

### Score: XX / 100
Save report to output/log/code_review_[target].md.
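The "Issues by Severity" table in the report can be assembled from a simple log of findings. The sketch below assumes a hypothetical issue log with one row per finding; the script names and build_severity_table() are illustrative, and the scoring rule behind "XX / 100" is left out because the skill does not define it.

```r
# Hypothetical issue log: one row per finding.
issues <- data.frame(
  script   = c("01_load.R", "01_load.R", "02_clean.R"),
  severity = c("Critical", "Minor", "Major"),
  stringsAsFactors = FALSE
)

# Count findings per script and severity, then format as a markdown table.
build_severity_table <- function(issues) {
  severity_levels <- c("Critical", "Major", "Minor")
  counts <- table(issues$script,
                  factor(issues$severity, levels = severity_levels))
  header <- c("| Script | Critical | Major | Minor |",
              "|--------|----------|-------|-------|")
  rows <- vapply(rownames(counts), function(s) {
    paste0("| ", s, " | ", paste(counts[s, ], collapse = " | "), " |")
  }, character(1), USE.NAMES = FALSE)
  c(header, rows)
}

report_lines <- build_severity_table(issues)
cat(report_lines, sep = "\n")
```

The resulting lines can then be written with writeLines() to the report path under output/log/, built with here::here() like every other path in the project.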
Step 4: IMPORTANT
Do NOT edit any source files. Only produce reports. Fixes are applied after user review.
Principles
- Report only, never edit. The reviewer is a critic, not a creator.
- Project conventions matter. Flag deviations from the conventions above, not personal preferences.
- Proportional severity. A missing set.seed() is Major. A missing comment is Minor. Using setwd() is Critical.
- Language-specific. Check R idioms: vectorised operations over loops, tidyverse patterns, proper use of factors.
- Pipeline vs analysis distinction. Different structure expectations for pipeline scripts vs analysis scripts.