Awesome-Agent-Skills-for-Empirical-Research: i0

Clone the full collection:

```bash
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
```

Or install just this skill into Claude Code:

```bash
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/skills/25-HosungYou-Diverga/skills/i0" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-i0 \
  && rm -rf "$T"
```
skills/25-HosungYou-Diverga/skills/i0/SKILL.md

⛔ Prerequisites (v8.2 — MCP Enforcement)
No prerequisites required for this agent.
Checkpoints During Execution
- 🔴 SCH_DATABASE_SELECTION → diverga_mark_checkpoint("SCH_DATABASE_SELECTION", decision, rationale)
- 🔴 SCH_SCREENING_CRITERIA → diverga_mark_checkpoint("SCH_SCREENING_CRITERIA", decision, rationale)
- 🟠 SCH_RAG_READINESS → diverga_mark_checkpoint("SCH_RAG_READINESS", decision, rationale)
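For example, recording a Stage 2 approval might look like this (the checkpoint ID and call signature come from this skill; the decision and rationale values are illustrative):

```python
# Hypothetical example: recording the Stage 2 database-selection decision.
# Checkpoint ID and call shape are from this skill; argument values are illustrative.
diverga_mark_checkpoint(
    "SCH_DATABASE_SELECTION",
    decision="semantic_scholar, openalex, arxiv",
    rationale="Open APIs cover the domain; Scopus/WoS access unavailable.",
)
```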
Fallback (MCP unavailable)
Read .research/decision-log.yaml directly to verify prerequisites. Conversation history is the last resort.
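A minimal sketch of that fallback check follows; the file path comes from this skill, but the record layout (a mapping of checkpoint IDs to decision entries) is an assumption:

```python
# Fallback sketch: verify checkpoints from .research/decision-log.yaml when MCP is down.
# ASSUMPTION: the log is a YAML mapping of checkpoint IDs to decision records;
# the actual Diverga schema may differ.
import yaml

def checkpoint_passed(checkpoint_id: str, log_path: str = ".research/decision-log.yaml") -> bool:
    try:
        with open(log_path) as f:
            log = yaml.safe_load(f) or {}
    except FileNotFoundError:
        return False  # no log yet -> checkpoint cannot have been approved
    entry = log.get(checkpoint_id)
    return bool(entry and entry.get("decision"))
```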
I0-ReviewPipelineOrchestrator
- Agent ID: I0
- Category: I - Systematic Review Automation
- Tier: HIGH (Opus)
- Icon: 📚🔄
Overview
Orchestrates the complete 7-stage PRISMA 2020 systematic literature review pipeline. Acts as the conductor, delegating to specialized agents (I1, I2, I3) while managing checkpoints and ensuring human approval at critical decision points.
Role
- Primary: Total pipeline coordination from research question to RAG system
- Secondary: Checkpoint enforcement and human decision tracking
- Authority: Decision authority for pipeline flow; delegates execution to I1, I2, I3
Pipeline Stages
- Stage 1: Research Domain Setup → config.yaml, project initialization
- Stage 2: Query Strategy → Boolean search strings, database selection
- Stage 3: Paper Retrieval → I1-paper-retrieval-agent
- Stage 4: Deduplication → 02_deduplicate.py
- Stage 5: PRISMA Screening → I2-screening-assistant (Groq LLM)
- Stage 6: PDF Download + RAG → I3-rag-builder
- Stage 7: Documentation → PRISMA diagram generation
Input Schema
```yaml
required:
  - research_question: "string"
  - domain: "string"
optional:
  - project_type: "enum[knowledge_repository, systematic_review]"
  - databases: "list[string]"
  - year_range: "list[int, int]"
  - language: "string"
```
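A concrete invocation might look like this (all values are illustrative, not taken from the skill):

```yaml
# Hypothetical input instance (values are illustrative)
research_question: "How does retrieval-augmented generation affect factual accuracy in LLMs?"
domain: "natural language processing"
project_type: "systematic_review"
databases: ["semantic_scholar", "openalex", "arxiv"]
year_range: [2019, 2025]
language: "en"
```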
Output Schema
```yaml
main_output:
  pipeline_status: "enum[completed, in_progress, error]"
  stages_completed: "list[int]"
  checkpoints_passed: "list[string]"
  statistics:
    papers_identified: "int"
    papers_after_dedup: "int"
    papers_screened: "int"
    papers_included: "int"
    pdfs_downloaded: "int"
    rag_chunks: "int"
  outputs:
    prisma_diagram: "string"
    rag_database: "string"
    statistics_report: "string"
```
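A completed run might report something like the following (all figures and paths are made up for illustration):

```yaml
# Hypothetical output for a completed systematic_review run (all values illustrative)
main_output:
  pipeline_status: "completed"
  stages_completed: [1, 2, 3, 4, 5, 6, 7]
  checkpoints_passed: ["SCH_DATABASE_SELECTION", "SCH_SCREENING_CRITERIA", "SCH_RAG_READINESS"]
  statistics:
    papers_identified: 2431
    papers_after_dedup: 1876
    papers_screened: 1876
    papers_included: 142
    pdfs_downloaded: 118
    rag_chunks: 9640
  outputs:
    prisma_diagram: "outputs/prisma_2020_flow.png"
    rag_database: "data/rag/chroma"
    statistics_report: "outputs/statistics.md"
```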
Human Checkpoint Protocol
| Checkpoint | Level | Stage | What Happens |
|---|---|---|---|
| SCH_DATABASE_SELECTION | 🔴 REQUIRED | 2 | Present database options (Semantic Scholar, OpenAlex, arXiv, Scopus, WoS), WAIT |
| SCH_SCREENING_CRITERIA | 🔴 REQUIRED | 5 | Present inclusion/exclusion criteria, WAIT for approval |
| SCH_RAG_READINESS | 🟠 RECOMMENDED | 6 | Confirm PDF count and RAG readiness |
| - | 🟡 OPTIONAL | 7 | Generate PRISMA diagram |
Project Types
I0 must ask the user to select a project type at Stage 1 (see the threshold sketch after this list):
knowledge_repository:
- Stage 5 PRISMA: 50% confidence threshold (lenient)
- Typical result: ~5,000-15,000 papers
- Use case: Teaching materials, AI research assistant, domain exploration
systematic_review:
- Stage 5 PRISMA: 90% confidence threshold (strict)
- Typical result: ~50-300 papers
- Use case: Meta-analysis, journal publication, clinical guidelines
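The thresholds above translate directly into screening configuration. A minimal sketch, assuming a simple lookup (the 50%/90% thresholds come from this skill; the function and constant names are hypothetical):

```python
# Sketch: map project type to the Stage 5 screening confidence threshold.
# The 0.50 / 0.90 values are from this skill; names here are hypothetical.
SCREENING_THRESHOLDS = {
    "knowledge_repository": 0.50,  # lenient: keep ambiguous papers
    "systematic_review": 0.90,     # strict: publication-grade inclusion
}

def screening_threshold(project_type: str) -> float:
    try:
        return SCREENING_THRESHOLDS[project_type]
    except KeyError:
        raise ValueError(f"Unknown project_type: {project_type!r}")
```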
Agent Delegation Pattern
```python
# Stage 3: Paper Retrieval
Task(
    subagent_type="diverga:i1",
    model="sonnet",
    prompt="""
    [Paper Retrieval]
    Project: {project_path}
    Query: {boolean_query}
    Databases: {selected_databases}
    Execute: python scripts/01_fetch_papers.py
    Then: python scripts/02_deduplicate.py
    Report: Papers retrieved and deduplicated counts.
    """,
)

# Stage 5: PRISMA Screening
Task(
    subagent_type="diverga:i2",
    model="sonnet",
    prompt="""
    [PRISMA Screening]
    Project: {project_path}
    Project Type: {project_type}
    Research Question: {research_question}
    🔴 CHECKPOINT: SCH_SCREENING_CRITERIA
    Present inclusion/exclusion criteria and WAIT for approval.
    Execute: python scripts/03_screen_papers.py
    LLM Provider: groq (100x cheaper than Claude)
    """,
)

# Stage 6: RAG Building
Task(
    subagent_type="diverga:i3",
    model="haiku",
    prompt="""
    [RAG Building]
    Project: {project_path}
    Execute in sequence:
    1. python scripts/04_download_pdfs.py
    2. python scripts/05_build_rag.py
    🟠 CHECKPOINT: SCH_RAG_READINESS
    Report: PDFs downloaded, vector DB built.
    """,
)
```
LLM Provider Strategy (Cost Optimization)
| Stage | Task | Recommended Provider | Cost/100 papers |
|---|---|---|---|
| 5 | PRISMA Screening | Groq (llama-3.3-70b) | $0.01 |
| 6 | RAG Queries | Groq (llama-3.3-70b) | $0.02 |
| - | Fallback | Claude Haiku | $0.15 |
Total cost for 500-paper systematic review: ~$0.07 (vs $7.50 with Claude only)
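A sketch of this provider strategy, trying the cheap provider first and falling back on failure; the provider order and model names come from the table above, while the injected `call_llm` wrapper is a hypothetical stand-in for the real API clients:

```python
# Sketch: cost-optimized screening with provider fallback (Groq first, Claude Haiku second).
# Provider order and model names are from this skill's cost table; `call_llm` is hypothetical.
from typing import Callable

PROVIDERS: list[tuple[str, str]] = [
    ("groq", "llama-3.3-70b"),  # ~$0.01 per 100 papers
    ("claude", "haiku"),        # fallback, ~$0.15 per 100 papers
]

def screen_paper(abstract: str, criteria: str,
                 call_llm: Callable[[str, str, str], dict]) -> dict:
    """Try each provider in cost order; return the first successful screening result."""
    last_err: Exception | None = None
    for provider, model in PROVIDERS:
        try:
            return call_llm(provider, model, f"{criteria}\n\nABSTRACT:\n{abstract}")
        except Exception as err:  # rate limit, outage, missing API key, ...
            last_err = err
    raise RuntimeError(f"All providers failed: {last_err}")
```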
Auto-Trigger Keywords
| Keywords (EN) | Keywords (KR) | Action |
|---|---|---|
| systematic review, PRISMA | 체계적 문헌고찰, 프리즈마 | Activate I0 orchestrator |
| literature review automation, systematic review automation | 문헌고찰 자동화 | Activate I0 orchestrator |
| build knowledge repository | 지식 저장소 구축 | Activate I0 (knowledge_repository mode) |
Integration with Diverga
I0 can invoke existing Diverga agents for enhanced functionality:
```python
# Literature review strategy
Task(subagent_type="diverga:b1", ...)  # B1-systematic-literature-scout

# Quality appraisal
Task(subagent_type="diverga:b2", ...)  # B2-evidence-quality-appraiser

# Meta-analysis (if project type allows)
Task(subagent_type="diverga:c5", ...)  # C5-meta-analysis-master
```
Error Handling
- If I1 fails (paper retrieval): retry with rate limiting and check API keys (see the backoff sketch below)
- If I2 fails (screening): switch to the Claude fallback if Groq is unavailable
- If I3 fails (RAG): check PDF availability and retry failed downloads
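A minimal retry-with-backoff sketch for the I1 case; the skill only prescribes "retry with rate limiting", so the exponential schedule and the `fetch` callable (a stand-in for one database API call) are assumptions:

```python
# Sketch: retry a flaky paper-retrieval call with exponential backoff.
# `fetch` is a hypothetical stand-in for a database API call (e.g. one page of results).
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_backoff(fetch: Callable[[], T], retries: int = 5, base_delay: float = 1.0) -> T:
    for attempt in range(retries):
        try:
            return fetch()
        except Exception as err:  # e.g. HTTP 429 rate limit
            if attempt == retries - 1:
                raise
            delay = base_delay * (2 ** attempt)
            print(f"Retry {attempt + 1}/{retries} after {delay:.0f}s: {err}")
            time.sleep(delay)
    raise RuntimeError("unreachable")
```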
Dependencies
```yaml
requires: []
sequential_next: ["I1-paper-retrieval-agent"]
parallel_compatible: ["B1-literature-review-strategist"]
```
Related Agents
- I1-paper-retrieval-agent: Multi-database paper fetching
- I2-screening-assistant: PRISMA 2020 screening with configurable LLM
- I3-rag-builder: Vector database construction and indexing
- B1-literature-review-strategist: Search strategy enhancement
- C5-meta-analysis-master: Meta-analysis integration
Agent Teams Mode (v8.5 Pilot)
When running in Claude Code with Agent Teams support (CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1):
Team Lead Protocol
I0 acts as Team Lead for the scholarag-pipeline team:

1. Initialize Team

```python
TeamCreate(team_name="scholarag-pipeline", description="PRISMA 2020 systematic review pipeline")
```

2. Create Tasks with Dependencies

```python
TaskCreate(subject="I1: Fetch from Semantic Scholar")          # → task-1
TaskCreate(subject="I1: Fetch from OpenAlex")                  # → task-2
TaskCreate(subject="I1: Fetch from arXiv")                     # → task-3
TaskCreate(subject="Deduplicate papers", blockedBy=[1, 2, 3])  # → task-4
TaskCreate(subject="I2: AI-PRISMA screening", blockedBy=[4])   # → task-5
TaskCreate(subject="I3: Build RAG vector DB", blockedBy=[5])   # → task-6
```

3. Spawn Parallel Fetchers

```python
Task(team_name="scholarag-pipeline", name="fetcher-ss", subagent_type="diverga:i1",
     prompt="Fetch papers from Semantic Scholar for query: {query}. Save to data/raw/semantic_scholar/")
Task(team_name="scholarag-pipeline", name="fetcher-oa", subagent_type="diverga:i1",
     prompt="Fetch papers from OpenAlex for query: {query}. Save to data/raw/openalex/")
Task(team_name="scholarag-pipeline", name="fetcher-arxiv", subagent_type="diverga:i1",
     prompt="Fetch papers from arXiv for query: {query}. Save to data/raw/arxiv/")
```

4. Checkpoint Integration
   - At SCH_DATABASE_SELECTION: use AskUserQuestion, then SendMessage approval to fetchers
   - At SCH_SCREENING_CRITERIA: use AskUserQuestion, then SendMessage to screener
   - At SCH_RAG_READINESS: use AskUserQuestion, then SendMessage to RAG builder

5. Cleanup: TeamDelete() after pipeline completion or on error
Fallback (Non-Teams Mode)
If Agent Teams is not available, fall back to sequential Task() calls (current behavior).
Performance
| Mode | DB Fetch Time | Total Pipeline |
|---|---|---|
| Sequential | ~90 min | ~4-6 hours |
| Teams (3 parallel) | ~30 min | ~2.5-4 hours |
Cost Warning
Teams mode spawns N independent sessions. Each session consumes separate API tokens. For budget-conscious runs, sequential mode is recommended.