Awesome-Agent-Skills-for-Empirical-Research research-ideation

Generate structured research questions, hypotheses, and empirical strategies from a topic or dataset within the sewage/environmental economics space. This skill should be used when asked to "brainstorm research questions", "what else can we do with this data", "research ideas", or "ideation".

install

source · Clone the upstream repo

git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/41-sticerd-eee-sewage-econometrics-check/skills/research-ideation" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-research-ideation-f7eb9e && rm -rf "$T"

manifest: skills/41-sticerd-eee-sewage-econometrics-check/skills/research-ideation/SKILL.md

source content

Research Ideation

Generate structured research questions, testable hypotheses, and empirical strategies from a topic, phenomenon, or dataset.

Input:

$ARGUMENTS

— a topic, phenomenon, dataset description, or

extensions

to brainstorm extensions of the current paper.

Project-Specific Context

Available Data

EDM data (2021-2024+): Event-level sewage spill records for all overflow sites in England
Land Registry: Universe of property transactions (prices, dates, postcodes)
Zoopla: Rental listings with prices and characteristics
Met Office: Daily rainfall at LSOA level
River networks: PostGIS topology of English waterways
Annual returns: Water company infrastructure data (sewer capacity, population served)
Geographic boundaries: LSOAs, MSOAs, constituencies, water company areas

Existing Identification Strategies

Hedonic, repeat sales, long difference, DiD with media coverage, upstream/downstream, dry spills, hydraulic capacity IV

Related Literature Areas

Environmental disamenity capitalisation, water quality and property values, UK water industry regulation, infrastructure and house prices, information and price discovery

Workflow

Step 1: Understand the Input

Read

$ARGUMENTS

and any referenced files.

If
```
extensions
```
: read current manuscript sections and analysis scripts to understand what's been done
If a topic: ground it in the available data and methods

Step 2: Generate 3-5 Research Questions

Order from descriptive to causal:

Descriptive: What are the patterns? (e.g. "How do spill frequencies vary across water companies?")
Correlational: What factors are associated? (e.g. "Is sewer age correlated with spill frequency after controlling for rainfall?")
Causal: What is the effect? (e.g. "What is the causal effect of media coverage of sewage spills on house prices?")
Mechanism: Why does the effect exist? (e.g. "Do dry spills affect prices through stigma or physical damage?")
Policy: What are the implications? (e.g. "Would mandated real-time spill disclosure affect property markets?")

Step 3: Develop Each Question

For each:

Hypothesis: Testable prediction with expected sign/magnitude
Identification strategy: How to establish causality
Data requirements: What data is needed — is it available in the project?
Key assumptions: What must hold
Potential pitfalls: Threats to identification
Related literature: 2-3 papers using similar approaches

Step 4: Rank by Feasibility and Contribution

RQ	Feasibility	Contribution	Priority
1	High/Med/Low	High/Med/Low	...

Step 5: Save Output

# Research Ideation: [Topic]
**Date:** YYYY-MM-DD

## Overview
[1-2 paragraphs situating the topic]

## Research Questions

### RQ1: [Question] (Feasibility: High/Medium/Low)
**Type:** Descriptive / Correlational / Causal / Mechanism / Policy
**Hypothesis:** [Prediction]
**Identification:** [Method + key assumption]
**Data:** [What's needed and what's available]
**Pitfalls:** [Top 2 threats]
**Related work:** [2-3 papers]

## Ranking
[Table]

## Suggested Next Steps
1. [Most promising direction]
2. [Data to obtain or analysis to run]
3. [Literature to review]

Save to

output/log/research_ideation_[topic].md

Principles

Be creative but grounded. Every suggestion must be empirically feasible with available or obtainable data.
Think like a referee. For each causal question, immediately identify the identification challenge.
Leverage existing infrastructure. The project already has spatial matching, spill aggregation, and river network tools — build on them.
Consider data availability. Flag when data is already available vs needs to be obtained.
Extensions are valuable. For
```
extensions
```
, focus on questions that use the same data infrastructure but ask different questions.