Awesome-Agent-Skills-for-Empirical-Research open-researcher-guide

Open pipeline for generating deep research trajectories with LLMs

install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/research/deep-research/open-researcher-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-open-researcher-g && rm -rf "$T"
manifest: skills/43-wentorai-research-plugins/skills/research/deep-research/open-researcher-guide/SKILL.md

OpenResearcher Guide

Overview

OpenResearcher is a fully open pipeline for synthesizing long-horizon deep research trajectories. It breaks a complex research question into sub-questions, iteratively searches and reads the literature, builds an internal knowledge representation, and synthesizes a comprehensive answer. Unlike single-shot approaches, it models the researcher's thought process (reading, questioning, connecting, and refining understanding) over multiple rounds.

Pipeline Stages

1. Question Decomposition

from open_researcher import OpenResearcher

researcher = OpenResearcher(llm_provider="anthropic")

# Complex research question
result = researcher.research(
    "How do retrieval-augmented generation systems handle "
    "knowledge conflicts between parametric and retrieved knowledge, "
    "and what are the current mitigation strategies?"
)

# Automatically decomposes into sub-questions:
# SQ1: What types of knowledge conflicts occur in RAG?
# SQ2: How are conflicts detected?
# SQ3: What resolution strategies exist?
# SQ4: How effective are these strategies?
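
The decomposition above can be pictured as a small tree of sub-questions that grows as follow-ups are generated. The sketch below is a minimal illustration under that assumption, not the library's internal representation: `SubQuestion`, `flatten`, and the `SQ2.1` follow-up text are hypothetical names and data introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class SubQuestion:
    """One node in a hypothetical decomposition tree (not the library's type)."""
    id: str
    text: str
    follow_ups: list["SubQuestion"] = field(default_factory=list)


def flatten(questions: list[SubQuestion]) -> list[SubQuestion]:
    """Return sub-questions and their follow-ups in depth-first order."""
    ordered = []
    for q in questions:
        ordered.append(q)
        ordered.extend(flatten(q.follow_ups))
    return ordered


plan = [
    SubQuestion("SQ1", "What types of knowledge conflicts occur in RAG?"),
    SubQuestion("SQ2", "How are conflicts detected?"),
    SubQuestion("SQ3", "What resolution strategies exist?"),
    SubQuestion("SQ4", "How effective are these strategies?"),
]
# Illustrative follow-up attached while researching SQ2 (made-up text).
plan[1].follow_ups.append(
    SubQuestion("SQ2.1", "Which signals indicate a conflict at inference time?")
)

queue = [q.id for q in flatten(plan)]
print(queue)
```

Depth-first ordering keeps each follow-up next to the sub-question that produced it, which makes the resulting research order easier to read back later.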

2. Iterative Search and Reading

# Each sub-question triggers:
# - Academic search (OpenAlex, arXiv)
# - Paper reading (abstract + key sections)
# - Evidence extraction
# - Follow-up question generation

# Configuration
researcher = OpenResearcher(
    search_backends=["openalex", "arxiv"],
    max_iterations=5,           # Research rounds per sub-question
    papers_per_iteration=10,    # Papers to read per round
    follow_up_questions=True,   # Generate follow-up questions
)
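
To make the search step concrete, the sketch below builds the HTTP request URLs one sub-question might produce against the public OpenAlex and arXiv APIs, bounded by `max_iterations` and `papers_per_iteration`. It is an illustration only: `openalex_url`, `arxiv_url`, and `plan_requests` are hypothetical helpers, the example queries are made up, and nothing here is taken from OpenResearcher's own search code. No network request is made.

```python
from urllib.parse import urlencode


def openalex_url(query: str, per_page: int) -> str:
    # OpenAlex works endpoint with a search term and page size.
    return "https://api.openalex.org/works?" + urlencode(
        {"search": query, "per-page": per_page}
    )


def arxiv_url(query: str, max_results: int) -> str:
    # arXiv export API; the "all:" prefix searches every metadata field.
    return "http://export.arxiv.org/api/query?" + urlencode(
        {"search_query": f"all:{query}", "start": 0, "max_results": max_results}
    )


def plan_requests(sub_question, follow_ups, max_iterations=5, papers_per_iteration=10):
    """Yield (round, url) pairs. Round 1 uses the sub-question itself;
    later rounds use follow-up queries until they run out or the cap is hit."""
    queries = [sub_question] + list(follow_ups)
    for round_no, query in enumerate(queries[:max_iterations], start=1):
        yield round_no, openalex_url(query, papers_per_iteration)
        yield round_no, arxiv_url(query, papers_per_iteration)


requests_planned = list(
    plan_requests("knowledge conflicts in RAG", ["conflict detection RAG"])
)
for round_no, url in requests_planned:
    print(round_no, url)
```

With one follow-up query, the plan contains two rounds and two backends per round, so four requests in total.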

3. Knowledge Graph Building

# Internally builds a knowledge representation:
# - Claims linked to source papers
# - Relationships between concepts
# - Contradictions flagged

# Access the knowledge graph
kg = result.knowledge_graph
print(f"Concepts: {len(kg.nodes)}")
print(f"Relations: {len(kg.edges)}")
print(f"Contradictions: {len(kg.contradictions)}")
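
One simple way to think about claims linked to sources and flagged contradictions is sketched below. It assumes a deliberately coarse model in which each claim either supports or refutes a concept; the `Claim` and `KnowledgeGraph` classes and the `paperA`/`paperB` keys are hypothetical placeholders, not the structure OpenResearcher actually builds.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    concept: str  # what the claim is about
    stance: str   # "supports" or "refutes" (coarse, illustrative label)
    source: str   # citation key of the paper the claim came from


@dataclass
class KnowledgeGraph:
    nodes: set = field(default_factory=set)             # concepts
    edges: list = field(default_factory=list)           # (source, stance, concept)
    contradictions: list = field(default_factory=list)  # (source_a, source_b, concept)

    def add(self, claim: Claim) -> None:
        # Flag a contradiction when an earlier claim about the same
        # concept takes the opposite stance.
        for src, stance, concept in self.edges:
            if concept == claim.concept and stance != claim.stance:
                self.contradictions.append((src, claim.source, concept))
        self.nodes.add(claim.concept)
        self.edges.append((claim.source, claim.stance, claim.concept))


kg_demo = KnowledgeGraph()
kg_demo.add(Claim("context overrides parametric memory", "supports", "paperA"))
kg_demo.add(Claim("context overrides parametric memory", "refutes", "paperB"))
kg_demo.add(Claim("conflict detection via confidence", "supports", "paperA"))

print(len(kg_demo.nodes), len(kg_demo.edges), len(kg_demo.contradictions))
```

Keeping the source key on every edge is what lets a contradiction be reported as a pair of papers rather than as an unattributed disagreement.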

4. Synthesis and Report

# Multi-section synthesis
report = result.report

# Sections:
# 1. Introduction and scope
# 2. Sub-question answers with evidence
# 3. Cross-cutting themes
# 4. Open questions and future directions
# 5. Full bibliography

report.save("research_report.md")
report.export_bibliography("refs.bib")
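
For readers unfamiliar with the `.bib` format, the sketch below shows roughly what a bibliography export has to produce. It is a minimal stand-in under stated assumptions, not the behavior of `export_bibliography`: the `to_bibtex` and `render_bibliography` helpers are hypothetical, and the entry is placeholder data rather than a real reference.

```python
def to_bibtex(entry: dict) -> str:
    """Render one paper record as a BibTeX @article entry."""
    fields = ",\n".join(
        f"  {name} = {{{entry[name]}}}"
        for name in ("title", "author", "year")
        if name in entry
    )
    return f"@article{{{entry['key']},\n{fields}\n}}"


def render_bibliography(entries: list[dict]) -> str:
    """Join entries with blank lines, keeping the first record per citation key."""
    seen, rendered = set(), []
    for entry in entries:
        if entry["key"] not in seen:
            seen.add(entry["key"])
            rendered.append(to_bibtex(entry))
    return "\n\n".join(rendered)


# Placeholder record for illustration only.
placeholder = {
    "key": "placeholder2024",
    "title": "Placeholder Title",
    "author": "Doe, Jane",
    "year": 2024,
}
bib_text = render_bibliography([placeholder, placeholder])
print(bib_text)
```

Deduplicating by citation key matters here because the same paper can be read in several rounds or under several sub-questions.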

Configuration

researcher = OpenResearcher(
    llm_provider="anthropic",
    model="claude-sonnet-4-20250514",
    search_config={
        "backends": ["openalex", "arxiv"],
        "max_results_per_query": 20,
    },
    reading_config={
        "sections": ["abstract", "introduction", "methods", "conclusion"],
        "max_tokens_per_paper": 3000,
    },
    synthesis_config={
        "style": "academic",           # academic, technical, accessible
        "include_contradictions": True,
        "cite_inline": True,
    },
)
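
These limits multiply, so it helps to estimate the reading volume before a run. The sketch below computes a rough upper bound, assuming every round reads its full paper quota, no paper is read twice, and follow-up questions add no extra rounds; `reading_budget` is a hypothetical helper, and `max_iterations` and `papers_per_iteration` are the values from the stage 2 example rather than from this configuration block.

```python
def reading_budget(sub_questions: int, max_iterations: int,
                   papers_per_iteration: int, max_tokens_per_paper: int):
    """Return (papers, tokens) as an upper bound on what gets read."""
    papers = sub_questions * max_iterations * papers_per_iteration
    return papers, papers * max_tokens_per_paper


# 4 sub-questions, 5 rounds each, 10 papers per round, 3000 tokens per paper.
papers, tokens = reading_budget(4, 5, 10, 3000)
print(papers, tokens)  # 200 600000
```

Under these assumptions the example run reads at most 200 papers and 600,000 tokens of paper text, before counting any prompt or synthesis tokens.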

Trajectory Inspection

# Inspect the research trajectory
trajectory = result.trajectory

for step in trajectory:
    print(f"Round {step.round}: {step.action}")
    print(f"  Query: {step.query}")
    print(f"  Papers read: {step.papers_read}")
    print(f"  Key findings: {step.findings[:100]}...")
    print(f"  Follow-ups: {step.follow_up_questions}")
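
Since the pipeline's stated purpose is generating research trajectories, a natural next step is to persist them, for example as JSON Lines. The sketch below assumes each step carries the fields printed above and that `papers_read` is a count; the `Step` dataclass, the `to_jsonl` helper, and the sample values are hypothetical and do not reflect the library's actual types.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Step:
    """Hypothetical stand-in for one trajectory step."""
    round: int
    action: str
    query: str
    papers_read: int
    findings: str
    follow_up_questions: list


def to_jsonl(steps) -> str:
    """Serialize steps as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(s), ensure_ascii=False) for s in steps)


# Made-up sample steps for illustration.
steps = [
    Step(1, "search", "knowledge conflicts in RAG", 10,
         "Conflicts arise between retrieved and parametric knowledge.",
         ["How are conflicts detected?"]),
    Step(2, "read", "conflict detection RAG", 8,
         "Detection approaches vary across papers.", []),
]
jsonl = to_jsonl(steps)
print(jsonl)
```

One object per line keeps each step independently parseable, so a long trajectory can be streamed or filtered without loading the whole file.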

Use Cases

  1. Literature surveys: Comprehensive multi-round research
  2. Research proposals: Evidence gathering for grant applications
  3. State-of-the-art reports: Current landscape analysis
  4. Tutorial generation: Deep topic explanations with citations
