Awesome-Agent-Skills-for-Empirical-Research c1

install

source · Clone the upstream repo

git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/25-HosungYou-Diverga/skills/c1" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-c1 && rm -rf "$T"

manifest: skills/25-HosungYou-Diverga/skills/c1/SKILL.md

source content

VS Arena Check (v11.1)

Before proceeding with internal VS, check if VS Arena is enabled:

Read

config/diverga-config.json

→

vs_arena.enabled

If
```
true
```
→ delegate to
```
/diverga:vs-arena
```
instead of internal VS process
If
```
false
```
or config unavailable → proceed with internal VS below

⛔ Prerequisites (v8.2 — MCP Enforcement)

diverga_check_prerequisites("c1")

→ must return

approved: true

If not approved → AskUserQuestion for each missing checkpoint (see

.claude/references/checkpoint-templates.md

)

Checkpoints During Execution

🔴 CP_METHODOLOGY_APPROVAL →

diverga_mark_checkpoint("CP_METHODOLOGY_APPROVAL", decision, rationale)

🟠 CP_VS_001 →

diverga_mark_checkpoint("CP_VS_001", decision, rationale)

🟠 CP_VS_003 →

diverga_mark_checkpoint("CP_VS_003", decision, rationale)

Fallback (MCP unavailable)

Read

.research/decision-log.yaml

directly to verify prerequisites. Conversation history is last resort.

Quantitative Design Consultant (C1)

Agent ID: C1 (formerly 09) Category: C - Methodology & Analysis VS Level: Enhanced (3-Phase) Tier: Core Icon: 🧪 Paradigm Focus: Quantitative Research

Overview

Specializes in quantitative research designs - experimental, quasi-experimental, and survey methodologies. Develops specific implementation plans with power analysis, sampling strategies, and validity controls.

Applies VS-Research methodology to go beyond overused standard experimental designs, presenting creative quantitative design options optimized for research questions and constraints.

Scope: Exclusively quantitative paradigm (experimental, survey, correlational designs) Complement: C2-Qualitative Design Consultant handles qualitative methodologies

VS-Research 3-Phase Process (Enhanced)

Phase 1: Modal Research Design Identification

Purpose: Explicitly identify the most predictable "obvious" designs

⚠️ **Modal Warning**: The following are the most predictable designs for [research type]:

| Modal Design | T-Score | Limitation |
|--------------|---------|------------|
| "Pretest-posttest control group design" | 0.90 | Overused, attrition issues |
| "Cross-sectional survey" | 0.88 | Cannot establish causation |
| "Single-site RCT" | 0.85 | Limited external validity |

➡️ This is baseline. Exploring context-optimal designs.

Phase 2: Alternative Design Options

Purpose: Present differentiated design options based on T-Score

**Direction A** (T ≈ 0.7): Enhanced traditional design
- Standard design + additional controls (Solomon 4-group, etc.)
- Suitable for: When internal validity strengthening needed

**Direction B** (T ≈ 0.4): Innovative design
- Interrupted Time Series
- Regression Discontinuity
- Multilevel design
- Suitable for: Randomization impossible, natural experiment situations

**Direction C** (T < 0.3): Cutting-edge methodology
- Adaptive Trial Designs
- SMART (Sequential Multiple Assignment Randomized Trial)
- Platform Trials
- Suitable for: Complex interventions, personalized research

Phase 4: Recommendation Execution

For selected design:

Design structure diagram
Validity threats and control strategies
Sample size calculation
Specific implementation timeline

Research Design Typicality Score Reference Table

T > 0.8 (Modal - Consider Alternatives):
├── Pretest-posttest control group design
├── Cross-sectional survey
├── Simple correlational study
└── Convenience sampling-based study

T 0.5-0.8 (Established - Can Strengthen):
├── Solomon 4-group design
├── Longitudinal panel study
├── Matched comparison group
└── Stratified randomization

T 0.3-0.5 (Emerging - Recommended):
├── Interrupted Time Series (ITS)
├── Regression Discontinuity (RD)
├── Multilevel/Cluster RCT
└── Mixed methods sequential design

T < 0.3 (Innovative - For Leading Research):
├── Adaptive Trial Designs
├── SMART Designs
├── Bayesian Adaptive Designs
└── Platform/Basket Trials

When to Use

When quantitative research question is finalized and methodology needs deciding
When choosing among experimental/survey design options
When design minimizing validity threats is needed (internal/external/construct)
When power analysis and sample size calculation required
When finding optimal quantitative design within resource constraints

Do NOT use for: Qualitative designs (phenomenology, grounded theory, ethnography) → Use C2-Qualitative Design Consultant

Core Functions

Quantitative Design Matching
- Causal inference requirement analysis
- Experimental vs. quasi-experimental vs. survey design selection
- Comparative analysis of pros/cons for quantitative approaches
Experimental Validity Analysis
- Identify internal validity threats (history, maturation, testing, instrumentation, etc.)
- Consider external validity (population, ecological, temporal)
- Construct validity assessment
- Propose control strategies (randomization, matching, statistical control)
Power Analysis & Sample Design
- Power analysis using G*Power, pwr (R), statsmodels (Python)
- Effect size specification (Cohen's d, f, η²)
- Sample size calculation (α=.05, power=.80 defaults)
- Sampling method recommendation (probability vs. non-probability)
- Recruitment strategy for quantitative studies
Quantitative Trade-off Analysis
- Causality vs. generalizability
- Precision vs. feasibility
- Control vs. ecological validity
- Statistical power vs. sample size costs

Quantitative Design Type Library

True Experimental Designs (Random Assignment)

Design	Structure	Strengths	Weaknesses	Validity
Randomized Controlled Trial (RCT)	R O₁ X O₂<br>R O₃ — O₄	High internal validity, causal inference	Cost, ethical constraints, recruitment	Internal: ⭐⭐⭐⭐⭐
Pretest-Posttest Control Group	R O₁ X O₂<br>R O₃ — O₄	Baseline equivalence, change detection	Testing effects, attrition	Internal: ⭐⭐⭐⭐⭐
Posttest-Only Control Group	R X O₁<br>R — O₂	No testing effects, simple	Cannot verify baseline equivalence	Internal: ⭐⭐⭐⭐
Solomon Four-Group	R O₁ X O₂<br>R O₃ — O₄<br>R — X O₅<br>R — — O₆	Controls testing effects, comprehensive	Requires large sample (4 groups), costly	Internal: ⭐⭐⭐⭐⭐
Factorial Design (2x2, 3x2, etc.)	Multiple IVs, interaction effects	Efficiency, interaction testing	Complexity, interpretation challenges	Internal: ⭐⭐⭐⭐
Within-Subjects (Repeated Measures)	Same participants across conditions	Increased power, fewer participants	Order effects, carryover, attrition	Internal: ⭐⭐⭐⭐
Crossover Design	Group A: X→Y<br>Group B: Y→X	Controls individual differences	Carryover effects, washout period needed	Internal: ⭐⭐⭐⭐

Quasi-Experimental Designs (No Random Assignment)

Design	Structure	Strengths	Weaknesses	Validity
Nonequivalent Control Group	O₁ X O₂<br>O₃ — O₄	Field applicability, practical	Selection bias, regression to mean	Internal: ⭐⭐⭐
Interrupted Time Series (ITS)	O₁ O₂ O₃ X O₄ O₅ O₆	Controls history, maturation	Long data collection, seasonal effects	Internal: ⭐⭐⭐⭐
Regression Discontinuity (RD)	Assignment by cutoff score	Ethical, strong causal inference	Requires large N, limited generalization	Internal: ⭐⭐⭐⭐
Matched Comparison Group	Match on covariates, then compare	Reduces selection bias	Difficult to match perfectly	Internal: ⭐⭐⭐
Propensity Score Matching	Match on propensity scores	Statistical equivalence	Unobserved confounders	Internal: ⭐⭐⭐

Pre-Experimental Designs (Weakest Internal Validity)

Design	Structure	Strengths	Weaknesses	Validity
One-Shot Case Study	X O	Quick, inexpensive	No control, no baseline	Internal: ⭐
One-Group Pretest-Posttest	O₁ X O₂	Simple, baseline available	History, maturation, testing	Internal: ⭐⭐
Static-Group Comparison	X O₁<br>— O₂	Quick comparison	No random assignment, selection bias	Internal: ⭐⭐

Survey Designs (Correlational/Descriptive)

Design	Structure	Strengths	Weaknesses	Validity
Cross-Sectional Survey	Single time point	Efficiency, cost-effective	Cannot establish causation	External: ⭐⭐⭐⭐
Longitudinal Panel Study	Same participants, multiple waves	Track individual change	Attrition, cost, long duration	Internal: ⭐⭐⭐
Trend Study	Different samples, same questions	Track population trends	Cannot track individuals	External: ⭐⭐⭐⭐
Cohort Study	Track cohort over time	Incidence estimation	Long duration, attrition	External: ⭐⭐⭐⭐
Survey Experiment (Vignette)	Embedded experiments in surveys	Causal inference + generalizability	Hypothetical scenarios, external validity	Internal: ⭐⭐⭐⭐
Conjoint Analysis	Attribute-based choice experiments	Realistic decision contexts	Complex design, analysis	Internal: ⭐⭐⭐⭐

Power Analysis Parameters

Effect Size	Cohen's d	Interpretation	Typical Sample Size (α=.05, power=.80)
Small	0.2	Subtle difference	~393 per group (2 groups)
Medium	0.5	Noticeable difference	~64 per group
Large	0.8	Obvious difference	~26 per group

Tools:

G*Power (GUI, free, Windows/Mac)
pwr package (R)
statsmodels.stats.power (Python)
Online calculators (e.g., Sample Size Calculator by UCSF)

Common Parameters:

α (alpha): Type I error rate (default .05)
Power (1-β): Probability of detecting true effect (default .80)
Effect size: Expected difference magnitude
Tails: One-tailed vs. two-tailed test

Input Requirements

Required:
  - research_question: "Specific quantitative research question"
  - purpose: "Descriptive/Explanatory/Predictive/Causal"
  - causal_inference_need: "High/Medium/Low"

Optional:
  - available_resources: "Time, budget, personnel"
  - constraints: "Ethical, practical limitations (randomization feasible?)"
  - participant_characteristics: "Accessibility, vulnerability, sample frame"
  - expected_effect_size: "Small (0.2) / Medium (0.5) / Large (0.8) / Unknown"
  - power_requirements: "Power level (default .80), alpha level (default .05)"

Output Format

## Quantitative Research Design Consulting Report

### 1. Research Question Analysis

| Item | Analysis |
|------|----------|
| Question Type | Descriptive/Explanatory/Predictive/Causal |
| Causal Inference Need | High/Medium/Low |
| Comparison Structure | Between-subjects/Within-subjects/Mixed |
| Temporal Dimension | Cross-sectional/Longitudinal |
| Random Assignment Feasible | Yes/No/Partial |

### 2. Recommended Quantitative Designs (Top 3)

#### 🥇 Recommendation 1: [Design Name]

**Design Type:** True Experimental / Quasi-Experimental / Survey

**Design Structure (Campbell-Stanley Notation):**

R O₁ X O₂ R O₃ — O₄

Where: R = Random assignment O = Observation/Measurement X = Treatment/Intervention — = No treatment


**Strengths:**
1. [Strength 1 - validity advantage]
2. [Strength 2 - practical advantage]
3. [Strength 3 - statistical advantage]

**Weaknesses:**
1. [Weakness 1 - validity threat]
2. [Weakness 2 - practical limitation]

**Validity Analysis:**
| Validity Type | Specific Threats | Control Strategy |
|---------------|------------------|------------------|
| **Internal** | History, maturation, testing, instrumentation, regression | Randomization, control group, counterbalancing |
| **External** | Population, ecological, temporal | Representative sampling, multiple settings |
| **Construct** | Mono-operation bias, hypothesis guessing | Multiple measures, blinding |
| **Statistical** | Low power, violated assumptions | Power analysis, assumption checks |

**Power Analysis:**
- **Expected effect size**: d = [0.2/0.5/0.8]
- **Alpha level**: α = .05 (two-tailed)
- **Desired power**: 1-β = .80
- **Required sample size**: N = [total] ([per group] × [groups])
- **Tool**: G*Power / pwr / statsmodels

**Expected Resources:**
- **Duration**: [weeks/months]
- **Cost**: [budget estimate]
- **Personnel**: [researchers, assistants]

#### 🥈 Recommendation 2: [Design Name]
...

#### 🥉 Recommendation 3: [Design Name]
...

### 3. Quantitative Design Comparison Table

| Criterion | Design 1 | Design 2 | Design 3 |
|-----------|----------|----------|----------|
| **Internal validity** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| **External validity** | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| **Statistical power** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| **Feasibility** | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| **Cost efficiency** | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| **Ethical burden** | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |

### 4. Final Recommendation

**Recommended Design**: [Design name]
**Rationale**: [Validity-resource-ethics tradeoff explanation]

### 5. Specific Implementation Plan

**Power Analysis (G*Power Settings):**
- Test family: [t-tests / F-tests / χ² tests / etc.]
- Statistical test: [Independent samples / Repeated measures / ANOVA]
- Effect size: d = [value] or f = [value]
- Alpha: [.05]
- Power: [.80]
- Sample size: N = [total]

**Sampling Strategy:**
- **Population definition**: [Target population]
- **Sampling frame**: [Actual accessible population]
- **Sampling method**: [Simple random / Stratified / Cluster / Convenience]
- **Recruitment strategy**: [Specific procedures]
- **Inclusion criteria**: [List]
- **Exclusion criteria**: [List]

**Randomization Procedures** (if applicable):
- **Method**: [Simple / Block / Stratified randomization]
- **Allocation concealment**: [Sealed envelopes / Central randomization]
- **Blinding**: [Single / Double / None]

**Data Collection Procedures:**
1. **Baseline (Time 1)**: [Measures, duration]
2. **Intervention/Treatment**: [Duration, procedures, fidelity checks]
3. **Post-test (Time 2)**: [Measures, timing]
4. **Follow-up** (if applicable): [Long-term measures]

**Validity Threat Mitigation:**
| Threat | Mitigation Strategy |
|--------|---------------------|
| Attrition | Track retention, intention-to-treat analysis |
| Testing effects | Use parallel forms, extended baseline |
| Instrumentation | Calibrate measures, inter-rater reliability |

**Analysis Strategy:**
- **Primary analysis**: [e.g., Independent samples t-test, 2x2 ANOVA]
- **Secondary analysis**: [e.g., Moderation, mediation, subgroup analyses]
- **Assumptions to check**: [Normality, homogeneity of variance, sphericity]
- **Missing data handling**: [Listwise deletion / Multiple imputation / FIML]

Prompt Template

You are a quantitative research design expert specializing in experimental, quasi-experimental, and survey methodologies.

Please propose optimal quantitative designs for the following research:

[Research Question]: {research_question}
[Causal Inference Need]: {high/medium/low}
[Random Assignment Feasible]: {yes/no/partial}
[Available Resources]: {resources}
[Constraints]: {constraints}
[Expected Effect Size]: {small/medium/large/unknown}

Tasks to perform:

1. **Quantitative Research Question Analysis**
   - Type: Descriptive/Explanatory/Predictive/Causal
   - Comparison structure: Between-subjects/Within-subjects/Mixed
   - Temporal dimension: Cross-sectional/Longitudinal
   - Variables: IV(s), DV(s), Moderators, Mediators, Covariates

2. **Propose 3 Quantitative Designs** (prioritize by validity-feasibility trade-off)
   For each design:
   - **Design name and type** (True experimental / Quasi-experimental / Survey)
   - **Design structure** (Campbell-Stanley notation: R O X)
   - **Strengths** (validity advantages)
   - **Weaknesses** (validity threats, practical limitations)
   - **Validity analysis table**:
     - Internal validity: Specific threats and control strategies
     - External validity: Generalization concerns
     - Construct validity: Measurement issues
     - Statistical validity: Power, assumptions
   - **Power analysis**:
     - Expected effect size (Cohen's d, f, η²)
     - Alpha level (default .05)
     - Desired power (default .80)
     - Required sample size (per group and total)
     - Tool recommendation (G*Power/pwr/statsmodels)
   - **Expected resources** (time, cost, personnel)

3. **Design Comparison Table**
   - Compare across: Internal validity, External validity, Statistical power, Feasibility, Cost efficiency, Ethical burden

4. **Final Recommendation and Rationale**
   - Recommended design with justification
   - Validity-resource-ethics trade-off explanation

5. **Specific Implementation Plan**
   - **Power analysis details** (G*Power settings, effect size rationale)
   - **Sampling strategy** (population, frame, method, recruitment, criteria)
   - **Randomization procedures** (if applicable: method, allocation, blinding)
   - **Data collection procedures** (baseline, intervention, post-test, follow-up)
   - **Validity threat mitigation** (attrition, testing, instrumentation, etc.)
   - **Analysis strategy** (primary, secondary, assumptions, missing data)

IMPORTANT: Focus exclusively on quantitative designs. Do NOT propose qualitative or mixed methods designs.

Quantitative Design Selection Decision Tree

Quantitative Research Question
     │
     ├─── Causal inference needed? (HIGH)
     │         │
     │         ├─── Random assignment feasible? YES
     │         │         │
     │         │         ├─── Between-subjects comparison
     │         │         │         │
     │         │         │         ├─── Testing effects concern? YES → Solomon Four-Group
     │         │         │         └─── Testing effects concern? NO → Pretest-Posttest Control Group
     │         │         │
     │         │         ├─── Within-subjects comparison
     │         │         │         │
     │         │         │         ├─── Crossover feasible? YES → Crossover Design
     │         │         │         └─── Crossover feasible? NO → Repeated Measures Design
     │         │         │
     │         │         └─── Multiple IVs? YES → Factorial Design (2x2, 3x2, etc.)
     │         │
     │         └─── Random assignment feasible? NO (Quasi-experimental)
     │                   │
     │                   ├─── Cutoff score available? YES → Regression Discontinuity
     │                   ├─── Pre-intervention data? YES → Interrupted Time Series
     │                   ├─── Matching possible? YES → Nonequivalent Control Group (matched)
     │                   └─── None of above → Propensity Score Matching / Nonequivalent Control
     │
     ├─── Causal inference needed? MEDIUM
     │         │
     │         └─── Longitudinal data collection
     │                   │
     │                   ├─── Same participants? YES → Panel Study
     │                   ├─── Different samples? YES → Trend Study
     │                   └─── Track cohort? YES → Cohort Study
     │
     └─── Causal inference needed? LOW (Descriptive/Correlational)
               │
               ├─── Variable relationships? YES → Cross-sectional Survey + Regression/SEM
               ├─── Causal mechanisms in survey? YES → Survey Experiment (Vignette/Conjoint)
               └─── Simple description? YES → Descriptive Cross-sectional Survey

Power Analysis Decision Tree

Power Analysis Planning
     │
     ├─── Effect size known from prior research? YES → Use reported effect size
     │
     ├─── Effect size unknown? → Use conventions
     │         │
     │         ├─── Theory-driven hypothesis → Medium (d=0.5, f=0.25)
     │         ├─── Exploratory study → Small-Medium (d=0.3)
     │         └─── Practical significance → Define SESOI (Smallest Effect Size of Interest)
     │
     ├─── Statistical test?
     │         │
     │         ├─── Independent samples t-test → G*Power: t-tests, difference between means
     │         ├─── Paired samples t-test → G*Power: t-tests, difference from constant (matched pairs)
     │         ├─── One-way ANOVA → G*Power: F-tests, ANOVA fixed effects
     │         ├─── Factorial ANOVA → G*Power: F-tests, ANOVA fixed effects (specify factors)
     │         ├─── Repeated measures ANOVA → G*Power: F-tests, ANOVA repeated measures
     │         ├─── Correlation → G*Power: Exact, Correlation: bivariate normal model
     │         ├─── Multiple regression → G*Power: F-tests, Linear multiple regression
     │         └─── Chi-square → G*Power: χ² tests, Goodness-of-fit
     │
     └─── Sample size constraints?
               │
               ├─── N fixed (e.g., N=100) → Calculate detectable effect size (sensitivity analysis)
               └─── N flexible → Calculate required N for desired power

Absorbed Capabilities (v11.0)

From C4 — Experimental Materials Developer

Treatment/Control Condition Design: Develop treatment protocols, design control conditions (no-treatment, placebo, active control, waitlist), specify fidelity measures
Manipulation Checks: Design manipulation check items, pre-test manipulation strength in pilot studies, plan for failed manipulation contingencies
Stimulus Materials: Develop experimental stimuli (vignettes, scenarios, tasks), create parallel forms for counterbalancing, design distractor/filler items
Content Validity: Establish content validity through expert review panels

From D1 — Sampling Strategy Advisor

Probability Sampling Methods: Simple random, stratified random (proportional/disproportionate), cluster sampling, systematic sampling
Non-Probability Sampling Methods: Purposive, convenience with bias assessment, quota sampling, snowball/chain-referral
Sample Size Justification: A priori power analysis (G*Power, pwr), effect size estimation, minimum sample size rules, attrition-adjusted targets
Power Analysis Integration: Required N computation, sensitivity analysis, power curves, ICC-adjusted sample sizes for clustered data

Related Agents

A1-ResearchQuestionRefiner: Refine quantitative research question before design selection
C2-QualitativeDesignConsultant: For qualitative/mixed methods designs
E1-QuantitativeAnalysisGuide: Analysis methods matching quantitative design
D2-DataCollectionSpecialist: Interview and observation protocol development
D4-MeasurementInstrumentDeveloper: Instrument development for quantitative studies

v3.0 Creativity Mechanism Integration

Available Creativity Mechanisms (ENHANCED)

Mechanism	Application Timing	Usage Example
Forced Analogy	Phase 2	Apply research design patterns from other fields by analogy
Iterative Loop	Phase 2	4-round divergence-convergence for design option refinement
Semantic Distance	Phase 2	Discover innovative approaches beyond existing design limitations

Checkpoint Integration

Applied Checkpoints:
  - CP-INIT-002: Select creativity level
  - CP-VS-001: Select research design direction (multiple)
  - CP-VS-003: Final design satisfaction confirmation
  - CP-FA-001: Select analogy source field
  - CP-IL-001: Set iteration round count

Module References

../../research-coordinator/core/vs-engine.md
../../research-coordinator/core/t-score-dynamic.md
../../research-coordinator/creativity/forced-analogy.md
../../research-coordinator/creativity/iterative-loop.md
../../research-coordinator/creativity/semantic-distance.md
../../research-coordinator/interaction/user-checkpoints.md

Detailed Quantitative Design Sections

1. Experimental Designs (Random Assignment)

True Experimental Designs

Randomized Controlled Trial (RCT)

structure:
  notation: "R O₁ X O₂ / R O₃ — O₄"
  components:
    - Random assignment (R)
    - Experimental group receives treatment (X)
    - Control group receives no treatment (—) or placebo
    - Pretest (O₁, O₃) and Posttest (O₂, O₄)

strengths:
  - Maximum internal validity through randomization
  - Controls most threats (history, maturation, selection)
  - Gold standard for causal inference

weaknesses:
  - Expensive (recruitment, retention, monitoring)
  - Ethical constraints (withholding beneficial treatment)
  - External validity concerns (artificial settings)
  - Attrition can undermine randomization

when_to_use:
  - Causal effect of intervention/treatment
  - Resources available for randomization
  - Ethical to randomly assign
  - High internal validity priority

typical_applications:
  - Educational intervention studies
  - Clinical trials (drug efficacy)
  - Training program evaluation
  - Technology-enhanced learning

Solomon Four-Group Design

structure:
  notation: |
    R O₁ X O₂
    R O₃ — O₄
    R — X O₅
    R — — O₆
  components:
    - Group 1: Pretest, Treatment, Posttest
    - Group 2: Pretest, Control, Posttest
    - Group 3: No Pretest, Treatment, Posttest
    - Group 4: No Pretest, Control, Posttest

strengths:
  - Controls testing effects
  - Allows estimation of pretest sensitization
  - Comprehensive validity assessment

weaknesses:
  - Requires 4 groups (large sample)
  - Complex analysis and interpretation
  - Costly and time-consuming
  - Logistically challenging

when_to_use:
  - Testing effects suspected
  - Pretest may interact with treatment
  - Sufficient resources for 4 groups

typical_applications:
  - Attitude change research
  - Knowledge assessment where pretest may teach
  - High-stakes intervention studies

Factorial Design

structure:
  examples:
    - "2×2: Two IVs, each with 2 levels (4 groups)"
    - "3×2: First IV with 3 levels, second IV with 2 levels (6 groups)"
    - "2×2×2: Three IVs, each with 2 levels (8 groups)"

strengths:
  - Test multiple IVs simultaneously (efficiency)
  - Detect interaction effects
  - More realistic (multiple factors)
  - Statistical power advantage

weaknesses:
  - Complexity increases with factors
  - Difficult interpretation with 3+ way interactions
  - Large sample size needed
  - Main effects confounded if interactions present

when_to_use:
  - Multiple factors of interest
  - Interaction effects theoretically important
  - Sufficient sample size available

typical_applications:
  - Teaching method × Student ability
  - Technology type × Instructional design
  - Gender × Age interactions

Quasi-Experimental Designs

Nonequivalent Control Group Design

structure:
  notation: "O₁ X O₂ / O₃ — O₄"
  components:
    - No random assignment (intact groups)
    - Both groups pretested and posttested
    - Treatment group receives intervention

strengths:
  - Practical in field settings
  - Retains some causal inference
  - Pretest allows baseline comparison
  - Less disruptive than randomization

weaknesses:
  - Selection bias threat
  - Regression to the mean
  - Differential maturation possible
  - Cannot fully equate groups

when_to_use:
  - Randomization impossible/unethical
  - Intact groups available (classrooms, organizations)
  - Field-based research

typical_applications:
  - Classroom-based studies (intact classes)
  - Organization-level interventions
  - Community programs

control_strategies:
  - Match groups on key variables
  - Use ANCOVA to control pretest differences
  - Propensity score matching
  - Difference-in-differences analysis

Interrupted Time Series (ITS)

structure:
  notation: "O₁ O₂ O₃ O₄ X O₅ O₆ O₇ O₈"
  components:
    - Multiple observations before intervention
    - Intervention introduced at known time point
    - Multiple observations after intervention
    - Can add control group (non-equivalent comparison series)

strengths:
  - Controls history and maturation (within-subject design)
  - Visual trend analysis
  - No comparison group needed
  - Useful for policy evaluation

weaknesses:
  - Requires long data collection period
  - Seasonal/cyclical effects
  - Cannot control contemporaneous events
  - Statistical assumptions (autocorrelation)

when_to_use:
  - Policy/program implemented at specific time
  - Archival data available
  - Control group unavailable
  - Long-term effects of interest

typical_applications:
  - Policy impact evaluation
  - Curriculum change effects
  - Technology adoption studies
  - Public health interventions

analysis_methods:
  - Segmented regression
  - ARIMA models
  - Visual analysis of level and slope changes

Regression Discontinuity (RD)

structure:
  components:
    - Assignment based on cutoff score
    - Units above cutoff receive treatment
    - Units below cutoff do not
    - Comparison at discontinuity point

strengths:
  - Strong causal inference (quasi-experimental gold standard)
  - Ethical (assign based on need/merit)
  - Transparent assignment rule
  - Local treatment effect well-identified

weaknesses:
  - Requires large sample size (especially near cutoff)
  - Limited generalization (only at cutoff)
  - Sensitive to functional form misspecification
  - Cannot estimate average treatment effect

when_to_use:
  - Assignment rule involves cutoff
  - Random assignment unethical/infeasible
  - Sufficient observations near cutoff

typical_applications:
  - Scholarship eligibility (test score cutoff)
  - Remedial program assignment
  - Grade promotion policies
  - Merit-based program evaluation

design_considerations:
  - Ensure sufficient bandwidth around cutoff
  - Check for manipulation of assignment variable
  - Test sensitivity to functional form
  - Plot raw data to visualize discontinuity

2. Survey Designs

Cross-Sectional Survey

structure:
  components:
    - Single time point data collection
    - Representative or convenience sample
    - Measure multiple variables simultaneously

strengths:
  - Cost-effective and efficient
  - Large sample sizes feasible
  - Wide population coverage
  - Snapshot of current state

weaknesses:
  - Cannot establish temporal precedence
  - Limited causal inference
  - Common method bias
  - Response rate issues

when_to_use:
  - Describe population characteristics
  - Explore variable relationships
  - Hypothesis generation
  - Limited time/resources

typical_applications:
  - Public opinion surveys
  - Needs assessment
  - Correlational research
  - Market research

Longitudinal Panel Study

structure:
  components:
    - Same participants measured repeatedly
    - Multiple waves (2+ time points)
    - Track individual change

strengths:
  - Individual change trajectories
  - Temporal precedence established
  - Within-person comparisons
  - Stronger causal inference than cross-sectional

weaknesses:
  - Attrition threatens validity
  - Long duration and cost
  - Practice effects
  - Cohort effects confounded with age

when_to_use:
  - Individual development/change
  - Causal relationships over time
  - Predictive models

typical_applications:
  - Career development studies
  - Academic achievement trajectories
  - Health behavior change
  - Technology adoption over time

attrition_mitigation:
  - Incentives for continued participation
  - Multiple contact methods
  - Intention-to-treat analysis
  - Attrition analysis (MCAR, MAR, MNAR)

Survey Experiments

vignette_studies:
  description: "Embedded experiments in surveys using hypothetical scenarios"
  structure:
    - Participants randomly assigned to vignette conditions
    - Vignette attributes manipulated
    - Measure responses to scenarios
  strengths:
    - Causal inference + generalizability
    - Control over stimuli
    - Large samples (online surveys)
  weaknesses:
    - Hypothetical scenarios (external validity)
    - Social desirability bias
    - Cognitive burden

conjoint_analysis:
  description: "Choice experiments with multiple attributes"
  structure:
    - Participants evaluate profiles with varying attributes
    - Estimate attribute importance
    - Forced choice or rating tasks
  strengths:
    - Realistic decision contexts
    - Interaction effects
    - Policy simulations
  weaknesses:
    - Complex design and analysis
    - Assumes compensatory decision-making
    - Interpretation challenges

3. Power Analysis

Power Analysis Tools

g_power:
  platform: "Windows, Mac, Linux (GUI)"
  cost: "Free"
  features:
    - Visual interface
    - 25+ statistical tests
    - Graphical power curves
    - Sensitivity analysis
  usage: "Most user-friendly for beginners"

pwr_package_r:
  platform: "R"
  cost: "Free"
  features:
    - Programmatic power analysis
    - Reproducible scripts
    - Integration with R workflow
  functions:
    - "pwr.t.test() - t-tests"
    - "pwr.anova.test() - ANOVA"
    - "pwr.r.test() - Correlation"
    - "pwr.chisq.test() - Chi-square"
  usage: "For R users, reproducible research"

statsmodels_python:
  platform: "Python"
  cost: "Free"
  module: "statsmodels.stats.power"
  features:
    - Python-based power analysis
    - Integrates with pandas/numpy
  classes:
    - "TTestIndPower - Independent t-test"
    - "FTestAnovaPower - ANOVA"
    - "NormalIndPower - z-test"
  usage: "For Python users, data science workflows"

Effect Size Conventions

cohens_d:
  small: 0.2
  medium: 0.5
  large: 0.8
  interpretation: "Standardized mean difference (t-tests)"
  formula: "(M₁ - M₂) / SD_pooled"

cohens_f:
  small: 0.10
  medium: 0.25
  large: 0.40
  interpretation: "Effect size for ANOVA"
  relation_to_eta_squared: "f = √(η² / (1 - η²))"

eta_squared:
  small: 0.01
  medium: 0.06
  large: 0.14
  interpretation: "Proportion of variance explained"
  note: "η² = SS_effect / SS_total"

correlation_r:
  small: 0.10
  medium: 0.30
  large: 0.50
  interpretation: "Strength of linear relationship"

odds_ratio:
  small: 1.5
  medium: 2.5
  large: 4.0
  interpretation: "Ratio of odds (logistic regression)"

Sample Size Examples

independent_t_test:
  effect_size: "d = 0.5 (medium)"
  alpha: 0.05
  power: 0.80
  tails: "two-tailed"
  sample_size_per_group: 64
  total_sample_size: 128

one_way_anova_3_groups:
  effect_size: "f = 0.25 (medium)"
  alpha: 0.05
  power: 0.80
  number_of_groups: 3
  total_sample_size: 159

correlation:
  effect_size: "r = 0.30 (medium)"
  alpha: 0.05
  power: 0.80
  tails: "two-tailed"
  sample_size: 84

multiple_regression_4_predictors:
  effect_size: "f² = 0.15 (medium)"
  alpha: 0.05
  power: 0.80
  number_of_predictors: 4
  sample_size: 85

References

VS Engine v3.0:

../../research-coordinator/core/vs-engine.md

Dynamic T-Score:

../../research-coordinator/core/t-score-dynamic.md

Creativity Mechanisms:

../../research-coordinator/references/creativity-mechanisms.md

Project State v4.0:

../../research-coordinator/core/project-state.md

Pipeline Templates v4.0:

../../research-coordinator/core/pipeline-templates.md

Integration Hub v4.0:

../../research-coordinator/core/integration-hub.md

Guided Wizard v4.0:

../../research-coordinator/core/guided-wizard.md

Auto-Documentation v4.0:

../../research-coordinator/core/auto-documentation.md

Shadish, Cook, & Campbell (2002). Experimental and Quasi-Experimental Designs
Creswell & Creswell (2018). Research Design
Dillman et al. (2014). Internet, Phone, Mail, and Mixed-Mode Surveys