Awesome-Agent-Skills-for-Empirical-Research workflows:work

Execute research implementation plans efficiently while maintaining estimation quality and finishing features

install

source · Clone the upstream repo

git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/11-James-Traina-compound-science/skills/workflows-work" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-workflows-work && rm -rf "$T"

manifest: skills/11-James-Traina-compound-science/skills/workflows-work/SKILL.md

source content

Work Plan Execution Command

Pipeline mode: This command operates fully autonomously. All decisions are made automatically.

Execute a research implementation plan systematically. The focus is on shipping complete, reproducible research code by understanding requirements quickly, following existing patterns, and maintaining estimation quality throughout.

Input Document

<input_document> #$ARGUMENTS </input_document>

If no input document is provided: Look for the most recent plan in

docs/plans/

and use it. If no plans exist, state "No plan found. Run

/workflows:plan

first." and stop.

Execution Workflow

Phase 1: Quick Start

Read Plan
- Read the work document completely
- Review any references, brainstorm origins, or linked code paths
- Identify the estimation method, identification strategy, and key deliverables
- Note any open questions from planning — resolve by picking the conservative default and documenting the choice
- Proceed immediately — do not wait for approval

Setup Environment

First, detect the project environment:

# Detect estimation language
if [ -f "requirements.txt" ] || [ -f "setup.py" ] || [ -f "pyproject.toml" ]; then
  echo "LANG=python"
elif [ -f "DESCRIPTION" ] || [ -f "renv.lock" ] || [ -f ".Rprofile" ]; then
  echo "LANG=R"
elif [ -f "Project.toml" ]; then
  echo "LANG=julia"
elif ls *.do >/dev/null 2>&1; then
  echo "LANG=stata"
fi

# Detect pipeline tools
ls Makefile Snakefile dvc.yaml 2>/dev/null

Then check the current branch:

current_branch=$(git branch --show-current)
default_branch=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@')
if [ -z "$default_branch" ]; then
  default_branch=$(git rev-parse --verify origin/main >/dev/null 2>&1 && echo "main" || echo "master")
fi

If already on a feature branch (not the default branch):

Continue working on it. Proceed to step 3.

If on the default branch:

Option A: Create a new branch (default)

git pull origin $default_branch
git checkout -b <branch-name-from-plan>

Use a meaningful name derived from the plan (e.g.,

feat/callaway-santanna-did

fix/blp-convergence

Option B: Use a worktree (for parallel estimation runs) See

references/worktree-patterns.md

if the plan involves parallel workstreams or the user has multiple active branches.

Automatically choose Option A unless the plan explicitly mentions parallel workstreams.

Activate Research Environment

Read

compound-science.local.md

for environment configuration. Then activate:

Python:

# Activate virtual environment
if [ -d ".venv" ]; then source .venv/bin/activate
elif [ -d "venv" ]; then source venv/bin/activate
elif command -v conda &>/dev/null; then conda activate $(basename $PWD)
fi
# Verify key packages
python -c "import numpy, scipy, pandas; print('Core packages OK')"

# Check renv status
Rscript -e "if (file.exists('renv.lock')) renv::status()"

Verify data paths:

# Check that referenced data files exist
ls data/ 2>/dev/null | head -5

Create Task List
- Use TodoWrite to break plan into actionable tasks
- Include dependencies between tasks
- Prioritize based on the plan's phase structure
- Include estimation-specific quality check tasks:
  - Convergence verification after each estimation step
  - Standard error computation and diagnostic tests
  - Robustness checks specified in the plan
- Keep tasks specific and completable

Phase 2: Execute

Task Execution Loop

For each task in priority order:

while (tasks remain):
  - Mark task as in_progress in TodoWrite
  - Read any referenced files from the plan
  - Look for similar patterns in codebase
  - Implement following existing conventions
  - Write tests for new functionality
  - Run Estimation Quality Check (see below)
  - Run tests after changes
  - Mark task as completed in TodoWrite
  - Mark off the corresponding checkbox in the plan file ([ ] → [x])
  - Evaluate for incremental commit (see below)

Estimation Quality Check — Before marking an estimation task done:

Check	What to verify
Convergence	Did the optimizer converge? Check exit flag, gradient norm, iteration count. Multiple starting values yield consistent results?
Sensible estimates	Are parameter signs correct? Magnitudes economically reasonable? No values at boundary constraints?
Standard errors	Computed with appropriate method (robust, clustered, bootstrap)? Positive definite Hessian? No suspiciously small or large SEs?
Diagnostics	First-stage F > 10 (if IV)? Overidentification test (if overidentified)? Hausman or specification tests where relevant?
Numerical stability	Log-likelihood (not likelihood) used? Condition number of key matrices acceptable? No NaN/Inf in outputs?
Reproducibility	Random seeds set? Results identical across runs? Dependencies pinned?

When to skip: Pure data cleaning, documentation updates, or pipeline configuration changes that don't involve estimation. If the task is purely additive (new utility function, data loading), the check takes 10 seconds and the answer is "no estimation, skip."

When this matters most: Any change that touches estimation routines, moment conditions, likelihood functions, or simulation code.

IMPORTANT: Always update the original plan document by checking off completed items. Use the Edit tool to change

- [ ]

- [x]

for each task you finish.

Incremental Commits

After completing each task, evaluate whether to create an incremental commit:

Commit when...	Don't commit when...
Estimation step complete with verified convergence	Partial estimation code that won't run
Data pipeline stage verified	Incomplete data transformation
Tests pass + meaningful progress	Tests failing
About to switch contexts (data work → estimation)	Purely scaffolding with no behavior
Robustness check complete	Would need a "WIP" commit message

Heuristic: "Can I write a commit message that describes a complete, verifiable change? If yes, commit."

Commit workflow:

# 1. Verify tests pass (use project's test command)
# Examples: pytest, Rscript tests/run_tests.R, etc.

# 2. Stage only files related to this logical unit
git add <files related to this logical unit>

# 3. Commit with conventional message
git commit -m "feat(estimation): description of this unit"

Note: Incremental commits use clean conventional messages. The final Phase 4 commit/PR includes full attribution.

Follow Existing Patterns
- The plan should reference similar code — read those files first
- Match naming conventions exactly (variable names, function signatures, file organization)
- Reuse existing estimation utilities where possible
- Follow project coding standards (see CLAUDE.md)
- When in doubt, grep for similar implementations
Test Continuously
- Run relevant tests after each significant change
- Don't wait until the end to test
- Fix failures immediately
- Add new tests for new functionality
- For estimation code: verify convergence AND test with known-parameter DGP if feasible
Track Progress
- Keep TodoWrite updated as you complete tasks
- Note any convergence issues or unexpected results
- Create new tasks if scope expands (e.g., new robustness check needed)
- Log estimation results at milestones (point estimates, standard errors, diagnostics)

Phase 3: Quality Check

Run Core Quality Checks

Always run before submitting:

# Run full test suite
# Python: pytest
# R: Rscript tests/run_tests.R or testthat::test_dir("tests")

# Run linting (per CLAUDE.md)
# Python: ruff check . or flake8
# R: lintr::lint_dir()

Estimation-Specific Validation

For any work involving estimation:
- Estimation converges with sensible parameters (check all specifications)
- Standard errors computed correctly (appropriate clustering/robustness)
- Diagnostic tests run and results documented
- Multiple starting values checked (if nonlinear estimation)
- Results reproducible with fixed random seed
- No numerical warnings (NaN, overflow, singular matrices)
Consider Reviewer Agents (Optional)

Use for complex or risky changes. Read agents from
```
compound-science.local.md
```
frontmatter (
```
review_agents
```
). If no settings file, create one following the template in
```
workflows-review/references/project-config.md
```
.

Run configured agents in parallel with Task tool. Address critical issues before proceeding.

Default agents for estimation work:
- ```
econometric-reviewer
```
  — identification and inference review
- ```
numerical-auditor
```
  — numerical stability and convergence
- ```
identification-critic
```
  — identification argument completeness
Final Validation
- All TodoWrite tasks marked completed
- All tests pass
- Linting passes
- Estimation converges with sensible results
- Standard errors and diagnostics computed
- Code follows existing patterns
- Random seeds set and documented
- No console errors or warnings

Phase 4: Ship It

Create Commit

git add <relevant files>
git status  # Review what's being committed
git diff --staged  # Check the changes

git commit -m "$(cat <<'EOF'
feat(estimation): description of what and why

Brief explanation if needed.

Co-Authored-By: Claude <noreply@anthropic.com>
EOF
)"

Create Pull Request

git push -u origin <branch-name>

gh pr create --title "feat(estimation): [Description]" --body "$(cat <<'EOF'
## Summary
- What was implemented
- Methodological approach and key decisions
- Estimation results summary (if applicable)

## Estimation Quality
- Convergence: [status]
- Diagnostics: [first-stage F, overid test, specification tests]
- Robustness: [alternative specifications checked]

## Testing
- Tests added/modified
- Estimation verified with [approach]

## Reproducibility
- Random seeds: [set/documented]
- Pipeline: [runs end-to-end / specific steps]
- Dependencies: [pinned in requirements.txt/renv.lock]

## Research Impact
- Identification: [any changes to assumptions]
- Estimation: [computational cost, convergence]
- Robustness: [new checks added/updated]
- Replication: [package changes]
EOF
)"

Update Plan Status

If the input document has YAML frontmatter with a

status

field, update it:

status: active  →  status: completed

Summary
- Display what was completed
- Link to PR
- Summarize estimation results if applicable
- Note any follow-up work needed (additional robustness checks, referee suggestions)

Phase 5: Handoff

Pipeline mode (when invoked from

/lfg

/slfg

Skip the interactive menu
Auto-invoke
```
/workflows:review
```
on the files that were changed

Standalone mode (when invoked directly by the user):

After the Phase 4 summary, present options:
1. Proceed to review (Recommended) — Immediately run
```
/workflows:review
```
  in this session
2. Continue working — Return to Phase 2 task loop for additional implementation
3. End session — Stop here; changes are committed

Swarm Mode (Optional)

For complex plans with multiple independent workstreams, enable swarm mode for parallel execution.

When to Use Swarm Mode

Use Swarm Mode when...	Use Standard Mode when...
Plan has independent estimation specifications	Single estimation pipeline
Multiple robustness checks can run in parallel	Sequential estimation steps
Monte Carlo with independent DGP variants	Simple parameter change
Large replication package with separable components	Small feature or bug fix

Enabling Swarm Mode

To trigger swarm execution, say:

"Make a Task list and launch an army of agent swarm subagents to build the plan"

See

references/orchestration-patterns.md

in the

slfg

skill for detailed swarm patterns and best practices.

Key Principles

Start Fast, Execute Methodically

Read the plan, set up environment, then execute
Don't wait for perfect understanding — resolve ambiguity by picking conservative defaults
The goal is to finish the implementation with verified estimation quality

The Plan is Your Guide

Plans reference existing code, methods papers, and brainstorm decisions — load those references
Follow the plan's phase structure and acceptance criteria
Don't reinvent — match existing patterns in the codebase

Test Estimation Quality Continuously

Verify convergence after each estimation step, not at the end
Check diagnostics as you go — fix issues immediately
Continuous quality checking prevents late-stage surprises

Quality is Built In

Follow existing patterns
Write tests for new code
Verify estimation convergence and diagnostics
Run linting before pushing
Use reviewer agents for complex or risky estimation changes only

Ship Complete, Reproducible Research

Mark all tasks completed before moving on
Don't leave estimation code 80% done — partial results are worse than no results
A finished, reproducible implementation ships; a perfect but incomplete one doesn't
Seeds set, dependencies pinned, pipeline runs end-to-end

Quality Checklist

Before creating PR, verify:

Common Pitfalls to Avoid

Skipping convergence checks — verify estimation converged, don't assume it did
Wrong standard errors — check clustering level, robustness to heteroskedasticity, bootstrap if needed
Missing seeds — set random seeds BEFORE any stochastic computation
Ignoring plan references — the plan has code paths and method citations for a reason
Testing at the end — test continuously or discover convergence failures too late
80% done syndrome — finish the estimation, run diagnostics, compute standard errors
Hardcoded paths — use relative paths, check data directory structure

Routes To

```
/workflows:review
```
— review the implementation
```
/workflows:compound
```
— document solutions discovered during implementation