Awesome-Agent-Skills-for-Empirical-Research r-econometrics
Run IV, DiD, and RDD analyses in R with proper diagnostics
install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/31-thalysandratos-claude-code-skills/_skills/analysis/r-econometrics" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-r-econometrics-938c32 && rm -rf "$T"
manifest: skills/31-thalysandratos-claude-code-skills/_skills/analysis/r-econometrics/SKILL.md
source content
R Econometrics
Purpose
This skill helps economists run rigorous econometric analyses in R, including Instrumental Variables (IV), Difference-in-Differences (DiD), and Regression Discontinuity Design (RDD). It generates publication-ready code with proper diagnostics and robust standard errors.
When to Use
- Running causal inference analyses
- Estimating treatment effects with panel data
- Creating publication-ready regression tables
- Implementing modern econometric methods (two-way fixed effects, event studies)
Instructions
Step 1: Understand the Research Design
Before generating code, ask the user:
- What is your identification strategy? (IV, DiD, RDD, or simple regression)
- What is the unit of observation? (individual, firm, country-year, etc.)
- What fixed effects do you need? (entity, time, two-way)
- How should standard errors be clustered?
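The answers to these questions map fairly directly onto the multi-part formula syntax that fixest uses (outcome ~ regressors | fixed effects | endogenous ~ instruments). The sketch below illustrates that mapping with a small helper; `build_fixest_formula` and its argument names are hypothetical and not part of fixest or of this skill.

```r
# Hypothetical helper: turn the Step 1 answers into a fixest-style formula.
# Parts are separated by "|":
#   outcome ~ regressors | fixed effects | endogenous ~ instruments
build_fixest_formula <- function(outcome, regressors = "1",
                                 fixed_effects = NULL,
                                 endogenous = NULL, instruments = NULL) {
  rhs <- paste(regressors, collapse = " + ")
  if (!is.null(fixed_effects)) {
    rhs <- paste(rhs, "|", paste(fixed_effects, collapse = " + "))
  }
  if (!is.null(endogenous)) {
    stopifnot(!is.null(instruments))  # IV needs at least one instrument
    iv_part <- paste(paste(endogenous, collapse = " + "), "~",
                     paste(instruments, collapse = " + "))
    rhs <- paste(rhs, "|", iv_part)
  }
  as.formula(paste(outcome, "~", rhs))
}

# DiD with state and year fixed effects
did_fml <- build_fixest_formula("outcome", "treat_post", c("state", "year"))
print(did_fml)

# IV: x instrumented by z, no fixed effects and no other controls
iv_fml <- build_fixest_formula("y", "1", endogenous = "x", instruments = "z")
print(iv_fml)
```

The resulting formula can be passed to feols() together with a cluster argument chosen from the last question, for example cluster = ~state.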
Step 2: Generate Analysis Code
Based on the research design, generate R code that:
- Uses the fixest package: modern, fast, and feature-rich for panel data
- Includes proper diagnostics:
  - For IV: First-stage F-statistics, weak instrument tests
  - For DiD: Parallel trends visualization, event study plots
  - For RDD: Bandwidth selection, density tests
- Uses robust/clustered standard errors appropriate for the data structure
- Creates publication-ready output using modelsummary or etable
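As an illustration of the IV diagnostics listed above, here is a minimal sketch on simulated data. The variable names (y, d, z, firm) and the data-generating process are assumptions made up for the example; only the fixest calls (feols, fitstat, summary with stage = 1, etable) come from the package.

```r
library(fixest)

# Simulated data (illustrative only): z is a valid instrument for d,
# u is an unobserved confounder that biases OLS.
set.seed(123)
n <- 2000
firm <- rep(1:200, each = 10)       # hypothetical cluster identifier
z <- rnorm(n)
u <- rnorm(n)
d <- 0.8 * z + u + rnorm(n)         # endogenous regressor
y <- 1.5 * d + 2 * u + rnorm(n)     # true effect of d is 1.5
sim <- data.frame(y, d, z, firm)

# OLS for comparison (biased here because u drives both d and y)
ols_model <- feols(y ~ d, data = sim, cluster = ~firm)

# 2SLS: outcome ~ exogenous controls | endogenous ~ instruments
iv_model <- feols(y ~ 1 | d ~ z, data = sim, cluster = ~firm)

# First-stage F, first-stage Wald, and Wu-Hausman endogeneity test
print(fitstat(iv_model, ~ ivf1 + ivwald1 + wh))

# First-stage regression, to inspect the instrument's coefficient directly
print(summary(iv_model, stage = 1))

# Side-by-side comparison of OLS and IV
print(etable(ols_model, iv_model))
```

In the IV model, the coefficient on the instrumented variable is reported under the name fit_d, which is how fixest labels second-stage fitted regressors.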
Step 3: Structure the Output
Always include:
    # 1. Setup and packages
    # 2. Data loading and preparation
    # 3. Descriptive statistics
    # 4. Main specification
    # 5. Robustness checks
    # 6. Visualization
    # 7. Export results
Step 4: Add Documentation
Include comments explaining:
- Why each specification choice was made
- Interpretation of key coefficients
- Limitations and assumptions
Example Prompts
- "Run a DiD analysis with state and year fixed effects, clustering at the state level"
- "Estimate the effect of X on Y using Z as an instrument"
- "Create an event study plot showing treatment effects by year"
- "Run a sharp RDD with optimal bandwidth selection"
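For the sharp RDD prompt, a response might look roughly like the sketch below. The skill does not name an RDD package, so the use of rdrobust and rddensity here is an assumption (both would need to be installed separately), and the data are simulated purely for illustration.

```r
library(rdrobust)    # assumed package: local-polynomial RD estimation
library(rddensity)   # assumed package: manipulation (density) test

# Simulated sharp RDD (illustrative only): treatment switches on at x >= 0
set.seed(42)
n <- 2000
x <- runif(n, -1, 1)                       # running variable, cutoff at 0
treated <- as.integer(x >= 0)
y <- 1 + 0.5 * x + 2 * treated + rnorm(n)  # true jump at the cutoff is 2

# Local-linear estimate; the default bwselect = "mserd" chooses an
# MSE-optimal bandwidth from the data
rd_fit <- rdrobust(y = y, x = x, c = 0)
summary(rd_fit)

# Selected bandwidths (h: estimation, b: bias correction)
print(rd_fit$bws)

# Density test for sorting of the running variable around the cutoff
dens_test <- rddensity(X = x, c = 0)
summary(dens_test)

# Binned scatter with fitted polynomials on each side of the cutoff
rdplot(y = y, x = x, c = 0, title = "RD plot",
       x.label = "Running variable", y.label = "Outcome")
```

A reasonable robustness check is to re-run rdrobust with a manually chosen bandwidth (argument h) and confirm the estimate does not move much.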
Example Output
    # ============================================
    # Difference-in-Differences Analysis
    # ============================================

    # Setup
    library(tidyverse)
    library(fixest)
    library(modelsummary)

    # Load data
    df <- read_csv("data.csv")

    # Prepare treatment variable
    # (treatment_year and treatment_states must be defined before this step)
    df <- df %>%
      mutate(
        post = year >= treatment_year,
        # numeric 0/1 so it can be used as the interacted variable in i()
        treated = as.integer(state %in% treatment_states),
        treat_post = treated * post
      )

    # ----------------------------------------
    # Main DiD Specification
    # ----------------------------------------

    # Two-way fixed effects
    did_model <- feols(
      outcome ~ treat_post | state + year,
      data = df,
      cluster = ~state
    )

    # View results
    summary(did_model)

    # ----------------------------------------
    # Event Study
    # ----------------------------------------

    # Create relative time variable
    df <- df %>%
      mutate(rel_time = year - treatment_year)

    # Event study regression (year before treatment is the reference period)
    event_study <- feols(
      outcome ~ i(rel_time, treated, ref = -1) | state + year,
      data = df,
      cluster = ~state
    )

    # Plot coefficients
    iplot(event_study,
          main = "Event Study: Effect on Outcome",
          xlab = "Years Relative to Treatment")

    # ----------------------------------------
    # Robustness: Alternative Specifications
    # ----------------------------------------

    # Different clustering
    did_robust <- feols(
      outcome ~ treat_post | state + year,
      data = df,
      cluster = ~state + year  # Two-way clustering
    )

    # ----------------------------------------
    # Export Results
    # ----------------------------------------

    # Make sure the output directory exists before writing the table
    dir.create("results", showWarnings = FALSE)

    modelsummary(
      list("Main" = did_model, "Two-way Cluster" = did_robust),
      stars = c('*' = 0.1, '**' = 0.05, '***' = 0.01),
      output = "results/did_table.tex"
    )
Requirements
Software
- R 4.0+
Packages
- fixest: Fast fixed effects estimation
- modelsummary: Publication-ready tables
- tidyverse: Data manipulation
- ggplot2: Visualization (installed as part of tidyverse)
Install with:
install.packages(c("fixest", "modelsummary", "tidyverse"))
Best Practices
- Always cluster standard errors at the level of treatment assignment
- Run pre-trend tests for DiD designs
- Report first-stage F-statistics for IV (a common rule of thumb is F > 10; treat this as a minimum, not as proof that the instrument is strong)
- Use feols over lm for panel data (faster and more features)
- Document all specification choices in your code comments
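One way to carry out the pre-trend test mentioned above is a joint Wald test on the lead coefficients of an event study. The sketch below uses a simulated panel; the variable names and the data-generating process are assumptions for illustration, and a non-rejection does not by itself establish parallel trends.

```r
library(fixest)

# Simulated panel (illustrative only): 40 states, 2000-2010,
# half treated from 2005 on, no differential pre-trend by construction.
set.seed(1)
panel <- expand.grid(state = 1:40, year = 2000:2010)
panel$treated  <- as.integer(panel$state <= 20)
panel$rel_time <- panel$year - 2005
state_effect   <- rnorm(40)
panel$outcome  <- state_effect[panel$state] +
  0.1 * (panel$year - 2000) +
  1.0 * panel$treated * (panel$year >= 2005) +
  rnorm(nrow(panel))

# Event study with the period just before treatment as the reference
es_model <- feols(outcome ~ i(rel_time, treated, ref = -1) | state + year,
                  data = panel, cluster = ~state)

# Joint Wald test that all lead coefficients (rel_time < 0) are zero.
# `keep` is a regular expression matched against coefficient names
# such as "rel_time::-3:treated".
pre_test <- wald(es_model, keep = "rel_time::-")
print(pre_test)
```

The returned vector holds the test statistic, the p-value, and the degrees of freedom; it is worth reporting alongside the event study plot instead of replacing it.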
Common Pitfalls
- ❌ Not clustering standard errors at the right level
- ❌ Ignoring weak instruments in IV estimation
- ❌ Using TWFE with staggered treatment timing (use did or sunab() instead)
- ❌ Not reporting robustness checks
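For the staggered-timing pitfall, the sunab() alternative can be sketched with base_stagg, a simulated staggered-adoption panel that ships with fixest. The specification follows the pattern shown in the fixest documentation; the explicit clustering by id is a choice made for this example.

```r
library(fixest)

# base_stagg: simulated panel with staggered adoption, bundled with fixest.
# year_treated is the cohort (first treated year); never-treated units are
# coded with a far-future year and time_to_treatment = -1000.
data(base_stagg, package = "fixest")

# Naive TWFE event study, kept only for comparison
twfe_es <- feols(
  y ~ x1 + i(time_to_treatment, treated, ref = c(-1, -1000)) | id + year,
  data = base_stagg, cluster = ~id
)

# Sun and Abraham interaction-weighted estimator via sunab(cohort, period)
sunab_es <- feols(
  y ~ x1 + sunab(year_treated, year) | id + year,
  data = base_stagg, cluster = ~id
)

# Compare the dynamic effects from both estimators
iplot(list(twfe_es, sunab_es), sep = 0.5)

# Aggregate the cohort-period effects into a single ATT
att_summary <- summary(sunab_es, agg = "att")
print(att_summary)
```

If the two sets of event-study coefficients diverge noticeably, that is a sign the TWFE estimates are contaminated by treatment-effect heterogeneity across cohorts.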
References
- fixest documentation
- Cunningham (2021) Causal Inference: The Mixtape
- Angrist & Pischke (2009) Mostly Harmless Econometrics
Changelog
v1.0.0
- Initial release with IV, DiD, RDD support