Awesome-Agent-Skills-for-Empirical-Research e1
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/25-HosungYou-Diverga/skills/e1" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-e1 && rm -rf "$T"
skills/25-HosungYou-Diverga/skills/e1/SKILL.md⛔ Prerequisites (v8.2 — MCP Enforcement)
diverga_check_prerequisites("e1") → must return approved: true
If not approved → AskUserQuestion for each missing checkpoint (see .claude/references/checkpoint-templates.md)
Checkpoints During Execution
- 🟠 CP_ANALYSIS_PLAN →
diverga_mark_checkpoint("CP_ANALYSIS_PLAN", decision, rationale)
Fallback (MCP unavailable)
Read
.research/decision-log.yaml directly to verify prerequisites. Conversation history is last resort.
E1-Quantitative Analysis Guide
Agent ID: E1 (formerly 10) Category: E - Publication & Communication (Analysis Methods) VS Level: Full (5-Phase) Tier: Flagship Icon: 📈📊
Overview
Comprehensive guide for both quantitative and qualitative analysis methods appropriate for research design and data characteristics. Applies VS-Research methodology to avoid monotonous analyses like "recommend t-test" or "just do thematic analysis," presenting methodological diversity optimized for research questions across paradigms.
VS-Research 5-Phase Process
Phase 0: Context Collection (MANDATORY)
Must collect before VS application:
Required Context: - research_question: "Relationship/difference to analyze" - independent_variable: "Type (continuous/categorical), number of levels" - dependent_variable: "Type (continuous/categorical), number of levels" - design: "Independent/Repeated/Mixed" Optional Context: - control_variables: "Covariate list" - sample_size: "Current or expected N" - target_journal: "Target journal level"
Phase 1: Modal Analysis Method Identification
Purpose: Explicitly identify the most predictable "obvious" analysis methods
## Phase 1: Modal Analysis Method Identification ⚠️ **Modal Warning**: The following are the most commonly used analyses for this design: | Modal Method | T-Score | Usage Rate | Limitation | |--------------|---------|------------|------------| | [Method1] | 0.92 | 60%+ | [Limitation] | | [Method2] | 0.88 | 25%+ | [Limitation] | ➡️ Confirming if this is optimal and exploring more suitable alternatives.
Phase 2: Long-Tail Analysis Method Sampling
Purpose: Present alternatives at 3 levels based on T-Score
## Phase 2: Long-Tail Analysis Method Sampling **Direction A** (T ≈ 0.7): Standard but enhanced analysis - [Method]: [Description] - Advantages: Familiar to reviewers, slight improvements - Suitable for: Conservative journals **Direction B** (T ≈ 0.45): Modern alternatives - [Method]: [Description] - Advantages: Methodological contribution, more accurate inference - Suitable for: Methodology-oriented journals **Direction C** (T < 0.3): Innovative approaches - [Method]: [Description] - Advantages: Latest methodology, high differentiation - Suitable for: Top-tier journals
Phase 3: Low-Typicality Selection
Purpose: Select method most appropriate for research question and data
Selection Criteria:
- Statistical Fit: Assumption satisfaction, data characteristics
- Research Question Alignment: Optimal for hypothesis testing
- Methodological Contribution: Differentiation potential
- Feasibility: Software, expertise
Phase 4: Execution
Purpose: Provide specific guidance for selected analysis method
## Phase 4: Analysis Execution Guide ### Primary Analysis Method [Specific guidance] ### Assumption Checks [Procedures and code] ### Effect Size [Calculation and interpretation]
Phase 5: Suitability Verification
Purpose: Confirm final selection is optimal for research
## Phase 5: Suitability Verification ✅ Modal Avoidance Check: - [ ] "Was basic t-test/ANOVA sufficient?" → Review complete - [ ] "Are there more suitable modern alternatives?" → Review complete - [ ] "Is methodological contribution possible?" → Confirmed ✅ Quality Check: - [ ] Statistical assumptions satisfied? → YES - [ ] Accurately answers research question? → YES - [ ] Defensible in peer review? → YES
Typicality Score Reference Table
Quantitative Analysis Method T-Score
T > 0.8 (Modal - Explore Alternatives): ├── Independent t-test ├── One-way ANOVA ├── OLS Regression (simple) ├── Pearson correlation └── Chi-square test T 0.5-0.8 (Established - Situational): ├── Factorial ANOVA ├── ANCOVA ├── Multiple regression ├── Hierarchical regression ├── Repeated measures ANOVA ├── Mixed ANOVA └── Traditional Meta-analysis T 0.3-0.5 (Modern - Recommended): ├── Hierarchical Linear Modeling (HLM/MLM) ├── Structural Equation Modeling (SEM) ├── Latent Growth Modeling ├── Bayesian regression ├── Mixed-effects models ├── Meta-Analytic SEM (MASEM) ├── Propensity Score Matching └── Robust methods (bootstrapping) T < 0.3 (Innovative - For Top-tier): ├── Bayesian methods (full) ├── Causal inference (IV, RDD, DiD) ├── Machine Learning + inference (SHAP, causal forests) ├── Network analysis ├── Computational modeling └── Novel hybrid methods (Double ML, Targeted learning)
Qualitative Analysis Method T-Score
T > 0.8 (Modal - Explore Alternatives): ├── Generic thematic analysis ├── Basic content analysis ├── Descriptive coding └── Simple categorization T 0.5-0.8 (Established - Situational): ├── Braun & Clarke thematic analysis (6-phase) ├── Grounded theory (Strauss & Corbin) ├── Directed content analysis ├── Narrative analysis (thematic) ├── Framework analysis └── Template analysis T 0.3-0.5 (Modern - Recommended): ├── Interpretative Phenomenological Analysis (IPA) ├── Constructivist grounded theory (Charmaz) ├── Structural narrative analysis ├── Discourse analysis ├── Reflexive thematic analysis └── Abductive analysis T < 0.3 (Innovative - For Top-tier): ├── Critical discourse analysis (CDA) ├── Foucauldian discourse analysis ├── Situational analysis (Clarke) ├── Dialogic/performance narrative analysis ├── Computational text analysis + qualitative interpretation ├── Visual discourse analysis └── Multimodal analysis
Input Requirements
For Quantitative Analysis
Required: - research_question: "Relationship/difference to analyze" - independent_variable: "Type (continuous/categorical), number of levels" - dependent_variable: "Type (continuous/categorical), number of levels" Optional: - control_variables: "Covariate list" - design: "Independent/Repeated/Mixed" - sample_size: "Current or expected N" - target_journal: "Target journal level"
For Qualitative Analysis
Required: - research_question: "Phenomenon/experience to explore" - data_type: "Interviews/Focus groups/Documents/Visual/Observational" - sample_size: "N participants or texts" Optional: - paradigm: "Interpretive/Critical/Constructivist/Positivist" - prior_theory: "Deductive approach with existing framework?" - software_preference: "NVivo/ATLAS.ti/MAXQDA/Manual" - team_coding: "Multiple coders? Y/N"
Output Format (VS-Enhanced)
## Statistical Analysis Guide (VS-Enhanced) --- ### Phase 1: Modal Analysis Method Identification ⚠️ **Modal Warning**: The following are most commonly recommended analyses for this design: | Modal Method | T-Score | Limitation in This Study | |--------------|---------|--------------------------| | [Method1] | 0.92 | [Specific limitation] | | [Method2] | 0.88 | [Specific limitation] | ➡️ Confirming if this is optimal and exploring more suitable alternatives. --- ### Phase 2: Long-Tail Analysis Method Sampling **Direction A** (T = 0.72): [Standard Enhanced Method] - Method: [Specific method] - Advantages: [Strengths] - Suitable for: [Target] **Direction B** (T = 0.48): [Modern Alternative] - Method: [Specific method] - Advantages: [Strengths] - Suitable for: [Target] **Direction C** (T = 0.28): [Innovative Approach] - Method: [Specific method] - Advantages: [Strengths] - Suitable for: [Target] --- ### Phase 3: Low-Typicality Selection **Selection**: Direction [B] - [Method name] (T = [X.X]) **Selection Rationale**: 1. [Rationale 1 - Statistical fit] 2. [Rationale 2 - Research question alignment] 3. [Rationale 3 - Feasibility] --- ### Phase 4: Analysis Execution Guide #### 1. Analysis Overview | Item | Content | |------|---------| | Research Question | [Question] | | Independent Variable | [Variable name] (Type: [Continuous/Categorical], Levels: [N]) | | Dependent Variable | [Variable name] (Type: [Continuous/Categorical]) | | Control Variables | [Variable name] | | Design | [Independent/Repeated/Mixed] | #### 2. Recommended Analysis Method **Primary Analysis**: [Method name] **Selection Rationale**: - [Rationale 1] - [Rationale 2] **Alternative** (if assumptions violated): [Alternative method] #### 3. Assumption Check Procedures ##### Normality - **Test**: Shapiro-Wilk (N < 50) / K-S (N ≥ 50) - **Visualization**: Q-Q plot, histogram ```r # R code shapiro.test(data$DV) qqnorm(data$DV); qqline(data$DV)
- Interpretation: p > .05 → Normality satisfied
- If violated: [Non-parametric alternative] or bootstrapping
Homogeneity of Variance
- Test: Levene's test
library(car) leveneTest(DV ~ Group, data = data)
- Interpretation: p > .05 → Homogeneity satisfied
- If violated: Welch's correction / robust SE
[Additional assumptions...]
4. Power Analysis
A Priori Analysis
| Parameter | Value |
|---|---|
| Expected effect size | [d = / η² = / f² = ] |
| Significance level (α) | .05 |
| Power (1-β) | .80 |
| Required sample size | N = [calculated value] |
# G*Power or R pwr package library(pwr) pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80, type = "two.sample")
Sensitivity Analysis
- Minimum detectable effect size with current N: [d = ]
5. Analysis Code
# R code - Primary analysis library(tidyverse) library(effectsize) # 1. Load data data <- read_csv("data.csv") # 2. Descriptive statistics data %>% group_by(Group) %>% summarise( n = n(), mean = mean(DV), sd = sd(DV) ) # 3. Primary analysis model <- [analysis function] # 4. Effect size [effect size calculation code]
# Python code (alternative) import pandas as pd import scipy.stats as stats import pingouin as pg # [Same analysis in Python]
6. Effect Size Interpretation
| Effect Size | Value | Interpretation (Cohen's criteria) | Practical Meaning |
|---|---|---|---|
| [Metric] | [Value] | [Small/Medium/Large] | [Interpretation] |
Interpretation Criteria (Cohen, 1988):
| Metric | Small | Medium | Large |
|---|---|---|---|
| d | 0.2 | 0.5 | 0.8 |
| η² | .01 | .06 | .14 |
| r | .10 | .30 | .50 |
| f² | .02 | .15 | .35 |
7. Multiple Comparisons (if applicable)
Correction Method: [Bonferroni / Tukey / FDR]
- Number of comparisons: [k]
- Corrected α: [α/k or FDR adjusted]
# R code - Multiple comparison correction p.adjust(p_values, method = "BH") # Benjamini-Hochberg FDR
8. Results Reporting Format (APA 7th)
[Analysis method] results showed [statistic] was statistically significant[/not significant], [statistic = X.XX, p = .XXX, effect size = X.XX, 95% CI [X.XX, X.XX]].
Example (selected analysis): "[Method name] results showed that [variable]'s effect on [variable] was statistically significant, [statistic], [effect size], 95% CI [X.XX, X.XX]."
Phase 5: Suitability Verification
✅ Modal Avoidance Check:
- Confirmed selection rationale for [selected analysis] over basic analysis
- Reviewed more suitable modern alternatives
- Confirmed methodological contribution potential
✅ Quality Assurance:
- Assumption check procedures included
- Effect size and confidence interval calculations
- APA format results reporting prepared
--- ## Qualitative Analysis Methods (NEW in v5.0) ### Thematic Analysis **Approach**: Braun & Clarke 6-Phase Framework ```yaml thematic_analysis: phases: phase_1_familiarization: activities: - "Read and re-read data" - "Note initial ideas" - "Immerse in content" output: "Familiarization notes" phase_2_coding: activities: - "Generate initial codes systematically" - "Code interesting features" - "Collate data relevant to each code" output: "Coded data extracts" tools: ["NVivo", "ATLAS.ti", "MAXQDA", "Dedoose"] phase_3_searching_themes: activities: - "Collate codes into potential themes" - "Gather data relevant to each theme" output: "List of candidate themes" phase_4_reviewing_themes: activities: - "Check themes work with coded extracts" - "Generate thematic map" output: "Refined themes and thematic map" phase_5_defining_naming: activities: - "Define and refine each theme" - "Generate clear definitions" - "Name themes" output: "Theme definitions and names" phase_6_writing: activities: - "Final analysis" - "Select vivid extracts" - "Relate to research question and literature" output: "Scholarly report" quality_criteria: - "Theoretical coherence" - "Richness of interpretation" - "Member checking (optional)" - "Audit trail" software_comparison: nvivo: strengths: ["Rich visualization", "Matrix coding", "Framework matrices"] best_for: "Large qualitative datasets" atlas_ti: strengths: ["Hermeneutic unit", "Network views", "Query tools"] best_for: "Grounded theory and complex theory building" maxqda: strengths: ["Mixed methods", "Visual tools", "TeamCloud"] best_for: "Mixed methods research" dedoose: strengths: ["Web-based", "Collaboration", "Mixed methods"] best_for: "Team-based coding"
Grounded Theory Analysis
grounded_theory_analysis: approaches: strauss_corbin: paradigm_model: - "Causal conditions" - "Phenomenon" - "Context" - "Intervening conditions" - "Action/interaction strategies" - "Consequences" coding_process: "Systematic and structured" charmaz_constructivist: focus: "Social construction of meaning" coding_process: "Flexible and emergent" emphasis: "Researcher reflexivity" glaser_classic: focus: "Theory emergence from data" coding_process: "Minimally structured" emphasis: "Theoretical sensitivity" coding_types: open_coding: purpose: "Breaking down, examining, comparing, conceptualizing data" output: "Concepts and categories" techniques: - "Line-by-line coding" - "Incident-by-incident coding" - "Constant comparison" axial_coding: purpose: "Relating categories to subcategories" output: "Paradigm model relationships" techniques: - "Linking categories" - "Identifying conditions-actions-consequences" selective_coding: purpose: "Integrating and refining theory" output: "Core category and theoretical framework" techniques: - "Storyline development" - "Theory integration" memo_writing: purpose: "Develop theoretical sensitivity and capture analytic thinking" types: - "Code notes (what code means)" - "Theoretical notes (conceptual thinking)" - "Operational notes (procedures)" frequency: "Continuous throughout coding" theoretical_saturation: definition: "No new themes/categories emerging from data" indicators: - "New data fits existing categories" - "Categories well-developed" - "Relationships between categories clear"
Content Analysis
content_analysis: approaches: deductive: process: "Theory-driven coding scheme applied to data" use_when: "Testing existing theory or frameworks" steps: - "Develop coding scheme from theory" - "Define categories and rules" - "Train coders" - "Code data" - "Calculate reliability" inductive: process: "Coding scheme emerges from data" use_when: "Exploratory research" steps: - "Immerse in data" - "Identify patterns" - "Create categories" - "Define coding rules" - "Code data" directed: process: "Hybrid - start with theory, allow emergence" use_when: "Extending existing theory" units_of_analysis: analysis_unit: definition: "What to count (theme, word, paragraph, entire text)" examples: ["Sentence", "Paragraph", "Entire article", "Tweet"] coding_unit: definition: "Smallest element counted" examples: ["Word", "Phrase", "Sentence"] context_unit: definition: "Boundary for interpreting coding unit" examples: ["Paragraph surrounding sentence", "Entire article"] reliability_measures: krippendorff_alpha: use: "Multiple coders, any level of measurement" interpretation: - "α ≥ 0.80: Acceptable" - "α ≥ 0.67: Tentatively acceptable (exploratory)" formula: "1 - (Observed disagreement / Expected disagreement)" cohen_kappa: use: "Two coders, nominal/ordinal data" interpretation: - "κ < 0.40: Poor" - "κ 0.40-0.59: Fair" - "κ 0.60-0.74: Good" - "κ ≥ 0.75: Excellent" percent_agreement: use: "Simple reliability estimate (not recommended alone)" interpretation: "≥ 80% often used, but doesn't account for chance"
Narrative Analysis
narrative_analysis: approaches: structural: focus: "Organization and structure of narratives" frameworks: - "Labov's narrative structure (abstract, orientation, complication, evaluation, resolution, coda)" - "Burke's dramatistic pentad (act, scene, agent, agency, purpose)" analysis_focus: "How story is told" thematic: focus: "What is told (content)" approach: "Identify themes across narratives" similarity_to: "Thematic analysis of narrative data" dialogic_performance: focus: "Interactive context of storytelling" emphasis: - "Who tells to whom" - "When and why" - "Co-construction of narrative" visual_narrative: focus: "Visual storytelling (photos, videos, drawings)" methods: - "Visual discourse analysis" - "Multimodal analysis" analytical_elements: plot: definition: "Sequence of events and how connected" questions: - "What is the main storyline?" - "How are events causally linked?" temporality: definition: "How time is constructed in narrative" aspects: - "Chronology vs. flashbacks" - "Duration and frequency" - "Temporal markers" character: definition: "Roles and development of actors" analysis: - "Protagonist/antagonist" - "Character agency" - "Transformation over time" setting: definition: "Physical, temporal, social context" importance: "How setting shapes narrative"
Advanced Quantitative Methods (NEW in v5.0)
Bayesian Analysis
bayesian_analysis: core_concept: "Update beliefs with data using Bayes' theorem" packages: r_packages: brms: description: "Bayesian Regression Models using Stan" strengths: ["Flexible syntax", "Multilevel models", "Great documentation"] example: | library(brms) fit <- brm(y ~ x + (1|group), data = data, family = gaussian(), prior = c(prior(normal(0, 10), class = b))) rstanarm: description: "Applied Regression Modeling via Stan" strengths: ["Easy syntax", "Pre-compiled models", "Fast"] python_packages: pymc: description: "Probabilistic programming in Python" strengths: ["Flexible", "Large community", "Integration with ArviZ"] example: | import pymc as pm with pm.Model() as model: beta = pm.Normal('beta', mu=0, sigma=10) sigma = pm.HalfNormal('sigma', sigma=1) y_obs = pm.Normal('y_obs', mu=beta*x, sigma=sigma, observed=y) trace = pm.sample(2000) use_cases: prior_incorporation: description: "Incorporate existing knowledge as priors" example: "Meta-analysis results as priors for new study" small_samples: description: "Better uncertainty quantification with limited data" advantage: "Regularization prevents overfitting" complex_hierarchical: description: "Natural fit for multilevel/hierarchical models" advantage: "Partial pooling and shrinkage" advantages: - "Quantifies uncertainty via posterior distributions" - "Incorporates prior knowledge formally" - "No p-values or significance testing" - "Intuitive probability statements (e.g., '95% probability effect > 0')" reporting: elements: - "Prior specification and justification" - "Posterior distributions (median, 95% credible intervals)" - "Convergence diagnostics (Rhat, ESS)" - "Posterior predictive checks"
Machine Learning for Inference
machine_learning: paradigm_shift: "Prediction-focused, but can support causal inference" techniques: random_forest: use_for: "Variable importance, non-linear relationships" interpretation: ["Feature importance via Gini/permutation", "Partial dependence plots"] packages: ["randomForest (R)", "scikit-learn (Python)"] support_vector_machines: use_for: "Classification with complex boundaries" kernels: ["Linear", "Polynomial", "RBF"] packages: ["e1071 (R)", "scikit-learn (Python)"] neural_networks: use_for: "Complex non-linear patterns, image/text data" architectures: ["Feedforward", "CNN", "RNN/LSTM"] packages: ["keras/tensorflow", "pytorch"] gradient_boosting: use_for: "High-performance prediction, structured data" implementations: ["XGBoost", "LightGBM", "CatBoost"] advantage: "State-of-the-art performance on tabular data" validation_strategies: cross_validation: k_fold: description: "Split data into k folds, rotate train/test" typical_k: "5 or 10" stratified: description: "Preserve class proportions in each fold" use_when: "Imbalanced outcome variable" leave_one_out: description: "Use n-1 observations to predict 1" use_when: "Very small sample sizes" holdout: description: "Single train/test split (e.g., 80/20)" use_when: "Large datasets" bootstrap: description: "Resample with replacement" use_for: "Uncertainty estimation, small samples" interpretation_tools: shap_values: description: "Shapley Additive Explanations" advantage: "Game-theoretic, consistent feature attribution" packages: ["shap (Python)", "fastshap (R)"] use: "Explain individual predictions and global patterns" feature_importance: methods: - "Permutation importance (model-agnostic)" - "Gini importance (tree-based)" - "Coefficient magnitude (linear models)" partial_dependence: description: "Marginal effect of feature on prediction" packages: ["pdp (R/Python)", "iml (R)"] lime: description: "Local Interpretable Model-agnostic Explanations" use: "Explain individual predictions via local linear approximation" causal_ml: double_machine_learning: description: "Use ML for nuisance parameters, preserve inference" packages: ["DoubleML (Python/R)"] causal_forests: description: "Estimate heterogeneous treatment effects" packages: ["grf (R)", "EconML (Python)"] targeted_learning: description: "Efficient estimation of causal parameters" packages: ["tmle (R)", "tmle3 (R)"]
Analysis Method Selection Flowchart (VS Enhanced - Expanded)
Research Paradigm? │ ├── Quantitative │ │ │ └── Dependent Variable Type? │ │ │ ├── Continuous │ │ │ │ │ └── Independent Variable Type? │ │ │ │ │ ├── Categorical (2 levels) │ │ │ ├── T > 0.8: t-test (modal) │ │ │ ├── T ≈ 0.6: Welch's t-test / Bayesian t-test │ │ │ ├── T ≈ 0.4: Mixed-effects / Bootstrap │ │ │ └── T < 0.3: ML classification + SHAP │ │ │ │ │ ├── Categorical (3+ levels) │ │ │ ├── T > 0.8: ANOVA (modal) │ │ │ ├── T ≈ 0.6: Welch ANOVA / Bayesian ANOVA │ │ │ ├── T ≈ 0.4: Mixed-effects / HLM │ │ │ └── T < 0.3: Random forests + variable importance │ │ │ │ │ └── Continuous │ │ ├── T > 0.8: OLS Regression (modal) │ │ ├── T ≈ 0.6: Robust / Bayesian regression │ │ ├── T ≈ 0.4: SEM / Causal inference (PSM, IV) │ │ └── T < 0.3: Causal forests / Double ML │ │ │ └── Categorical │ │ │ └── T > 0.8: Chi-square/Logistic (modal) │ T ≈ 0.5: Multinomial/Ordinal logistic │ T < 0.3: Bayesian logistic / Neural networks │ └── Qualitative │ ├── Interpretive Goal? │ │ │ ├── Describe experiences/meanings │ │ ├── T > 0.8: Basic thematic analysis (modal) │ │ ├── T ≈ 0.5: Interpretative Phenomenological Analysis (IPA) │ │ └── T < 0.3: Hermeneutic phenomenology │ │ │ ├── Build theory │ │ ├── T > 0.8: Generic grounded theory (modal) │ │ ├── T ≈ 0.5: Charmaz constructivist GT │ │ └── T < 0.3: Situational analysis / Critical GT │ │ │ ├── Analyze narratives/stories │ │ ├── T > 0.8: Thematic narrative analysis (modal) │ │ ├── T ≈ 0.5: Structural narrative analysis │ │ └── T < 0.3: Dialogic/performance analysis │ │ │ └── Count/quantify content │ ├── T > 0.8: Descriptive content analysis (modal) │ ├── T ≈ 0.5: Directed content analysis │ └── T < 0.3: Computational text analysis + ML
Qualitative Analysis Output Template
## Qualitative Analysis Guide ### Research Context | Element | Details | |---------|---------| | Research Question | {Question} | | Data Type | {Interviews / Focus groups / Documents / Visual} | | Sample Size | {N participants / texts} | | Paradigm | {Interpretive / Critical / Constructivist} | --- ### Recommended Analysis Method **Primary Method**: {Thematic Analysis / Grounded Theory / Content Analysis / Narrative Analysis} **Selection Rationale**: - {Fit with research question} - {Paradigmatic alignment} - {Data characteristics} **Software Recommendation**: {NVivo / ATLAS.ti / MAXQDA / Dedoose / Manual} - **Rationale**: {Why this software} --- ### Analysis Process #### Phase 1: {Phase name} **Activities**: 1. {Activity 1} 2. {Activity 2} **Output**: {Expected output} **Quality Check**: - [ ] {Quality criterion 1} - [ ] {Quality criterion 2} #### Phase 2: {Phase name} [Repeat for all phases] --- ### Coding Framework #### Initial Coding Scheme (if deductive) | Code | Definition | Inclusion Criteria | Example | |------|------------|-------------------|---------| | {Code 1} | {Definition} | {When to apply} | {Quote example} | | {Code 2} | {Definition} | {When to apply} | {Quote example} | #### Coding Process **Approach**: {Inductive / Deductive / Abductive} **Coder Training** (if multiple coders): - Training materials: {Description} - Practice rounds: {N rounds} - Disagreement resolution: {Process} **Inter-coder Reliability Target**: - Measure: {Krippendorff's α / Cohen's κ / % agreement} - Target: {≥ 0.80 / ≥ 0.70} --- ### Trustworthiness Criteria | Criterion | Strategy | Implementation | |-----------|----------|----------------| | Credibility | {Member checking / Prolonged engagement} | {Specific plan} | | Transferability | {Thick description} | {Specific plan} | | Dependability | {Audit trail / Reflexive journal} | {Specific plan} | | Confirmability | {Reflexivity / External audit} | {Specific plan} | --- ### Results Reporting #### Theme Structure **Theme 1**: "{Theme name}" - **Definition**: {What this theme represents} - **Sub-themes**: {If applicable} - **Illustrative quotes**: - "{Quote 1}" (Participant X) - "{Quote 2}" (Participant Y) #### Thematic Map
[Visual representation of theme relationships]
#### Narrative Account [How themes relate to research question, existing theory, and broader context] --- ### Quality Assurance Checklist - [ ] Analysis process clearly documented - [ ] Coding scheme defined and applied consistently - [ ] Inter-coder reliability assessed (if multiple coders) - [ ] Audit trail maintained - [ ] Reflexivity addressed - [ ] Sufficient data extracts provided - [ ] Interpretation goes beyond description
Absorbed Capabilities (v11.0)
From E4 — Analysis Code Generator
- R Code Generation: metafor (rma, forest, funnel), lavaan (sem, cfa, growth), lme4 (lmer, glmer), tidyverse pipelines, psych package
- Python Code Generation: statsmodels (OLS, logit, MixedLM, GLM), pymeta/PythonMeta, scikit-learn, pingouin
- SPSS Syntax Generation: COMPUTE, RECODE, GLM, REGRESSION, MIXED, EXAMINE, OUTPUT EXPORT
- Stata Do-File Generation: regress, mixed, melogit, meta set/summarize/forestplot, sem, estout/esttab
- Mplus Input Generation: MODEL specification for CFA/SEM, ANALYSIS options (MLR, WLSMV, Bayesian), multi-group and longitudinal syntax
From E5 — Sensitivity Analysis (Primary Study)
- Specification Curve Analysis: Define all defensible analytical choices, run all plausible specifications, visualize sorted results
- Multiverse Analysis: Map full decision tree, identify branch points, compute all paths, report proportion of significant results
- Robustness Checks: Alternative operationalizations, with/without covariates, different estimation methods, sample variations, alternative missing data treatments
- Sensitivity to Outliers: Cook's distance, leverage, DFBETAS, robust regression (M-estimation, MM-estimation), case removal sensitivity
Related Agents
- C1-QuantitativeDesignConsultant: Verify design before analysis
- C2-QualitativeDesignConsultant: Qualitative design support
- E2-QualitativeCodingSpecialist: Specialized qualitative coding
Self-Critique Requirements (Full VS Mandatory)
This self-evaluation section must be included in all outputs.
--- ## 🔍 Self-Critique ### Strengths Advantages of this statistical analysis recommendation: - [ ] {Fit with research question} - [ ] {Statistical assumption satisfaction} - [ ] {Power adequacy} ### Weaknesses Potential limitations: - [ ] {Causation vs correlation confusion risk}: {Mitigation approach} - [ ] {Context-dependency of effect size interpretation}: {Mitigation approach} - [ ] {Multiple comparison issues}: {Mitigation approach} ### Alternative Perspectives Pros and cons of alternative methodologies: - **Alternative 1**: "{Alternative method}" - **Advantages**: "{Advantages}" - **Reason not selected**: "{Reason}" - **Alternative 2**: "{Alternative method}" - **Advantages**: "{Advantages}" - **Reason not selected**: "{Reason}" ### Improvement Suggestions Suggestions for analysis improvement: 1. {Additional analysis recommendations} 2. {Robustness verification methods} ### Confidence Assessment | Area | Confidence | Rationale | |------|------------|-----------| | Method selection appropriateness | {High/Medium/Low} | {Rationale} | | Assumption satisfaction | {High/Medium/Low} | {Rationale} | | Results interpretation accuracy | {High/Medium/Low} | {Rationale} | **Overall Confidence**: {Score}/100 ---
v3.0 Creativity Mechanism Integration
Available Creativity Mechanisms
This agent has FULL upgrade level, utilizing all 5 creativity mechanisms:
| Mechanism | Application Timing | Usage Example |
|---|---|---|
| Forced Analogy | Phase 2 | Apply analysis methodology patterns from other fields by analogy (e.g., Physics → Social Science) |
| Iterative Loop | Phase 2-3 | 4-round analysis method refinement cycle |
| Semantic Distance | Phase 2 | Discover semantically distant analysis technique combinations |
| Temporal Reframing | Phase 1 | Review methodology development from past/future perspectives |
| Community Simulation | Phase 4-5 | Methodology feedback from 7 virtual statisticians |
Checkpoint Integration
Applied Checkpoints: - CP-INIT-002: Select creativity level (conservative/innovative analysis) - CP-VS-001: Select analysis method direction (multiple) - CP-VS-002: Innovative methodology warning (T < 0.3) - CP-VS-003: Analysis method satisfaction confirmation - CP-FA-001: Select analogy source field - CP-IL-001~004: Analysis refinement round progress - CP-SD-001: Methodology combination distance threshold - CP-CS-001: Select statistician personas
References
System References
- VS Engine v3.0:
../../research-coordinator/core/vs-engine.md - Dynamic T-Score:
../../research-coordinator/core/t-score-dynamic.md - Creativity Mechanisms:
../../research-coordinator/references/creativity-mechanisms.md - Project State v4.0:
../../research-coordinator/core/project-state.md - Pipeline Templates v4.0:
../../research-coordinator/core/pipeline-templates.md - Integration Hub v4.0:
../../research-coordinator/core/integration-hub.md - Guided Wizard v4.0:
../../research-coordinator/core/guided-wizard.md - Auto-Documentation v4.0:
../../research-coordinator/core/auto-documentation.md
Quantitative Methods References
- Field, A. (2018). Discovering Statistics Using IBM SPSS Statistics (5th ed.). SAGE.
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Routledge.
- McElreath, R. (2020). Statistical Rethinking: A Bayesian Course with Examples in R and Stan (2nd ed.). CRC Press.
- Gelman, A., & Hill, J. (2006). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press.
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning (2nd ed.). Springer.
Qualitative Methods References
- Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77-101.
- Charmaz, K. (2014). Constructing Grounded Theory (2nd ed.). SAGE.
- Strauss, A., & Corbin, J. (1998). Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory (2nd ed.). SAGE.
- Riessman, C. K. (2008). Narrative Methods for the Human Sciences. SAGE.
- Krippendorff, K. (2018). Content Analysis: An Introduction to Its Methodology (4th ed.). SAGE.
- Smith, J. A., Flowers, P., & Larkin, M. (2009). Interpretative Phenomenological Analysis. SAGE.
- Saldaña, J. (2021). The Coding Manual for Qualitative Researchers (4th ed.). SAGE.
Software References
- NVivo: https://www.qsrinternational.com/nvivo-qualitative-data-analysis-software/home
- ATLAS.ti: https://atlasti.com/
- MAXQDA: https://www.maxqda.com/
- Dedoose: https://www.dedoose.com/