Awesome-Agent-Skills-for-Empirical-Research llm-scientific-discovery-guide

Survey of LLM agents for biomedical scientific discovery

install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/research/deep-research/llm-scientific-discovery-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-llm-scientific-di && rm -rf "$T"
manifest: skills/43-wentorai-research-plugins/skills/research/deep-research/llm-scientific-discovery-guide/SKILL.md

LLM Agents for Scientific Discovery Guide

Overview

A curated survey of how LLM-based agents are being applied to scientific discovery, with a focus on biomedical research. Covers hypothesis generation, experiment design, lab automation, literature synthesis, and multi-agent scientific collaboration. Tracks papers, tools, and frameworks across the spectrum from fully autonomous to human-in-the-loop systems.

Landscape

LLM Agents for Scientific Discovery
├── Hypothesis Generation
│   ├── Literature-based (gap identification)
│   ├── Data-driven (pattern discovery)
│   └── Analogy-based (cross-domain transfer)
├── Experiment Design
│   ├── Protocol generation
│   ├── Parameter optimization
│   └── Control selection
├── Lab Automation
│   ├── Robot control (self-driving labs)
│   ├── Equipment programming
│   └── Data collection orchestration
├── Analysis & Interpretation
│   ├── Statistical analysis
│   ├── Visualization
│   └── Result interpretation
└── Communication
    ├── Paper writing
    ├── Presentation generation
    └── Peer review simulation

Key Systems

| System | Domain | Capability |
|---|---|---|
| AI Scientist | ML/AI | Full paper generation pipeline |
| ChemCrow | Chemistry | Tool-augmented chemical reasoning |
| Coscientist | Chemistry | Autonomous experiment execution |
| BioPlanner | Biology | Experiment protocol generation |
| MedAgent | Medicine | Clinical trial analysis |
| GenAgent | Genomics | Gene expression analysis |
| DrugAgent | Pharma | Drug interaction prediction |

Hypothesis Generation

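# LLM-based hypothesis generation pattern.
# Note: `scientific_agent` appears to be an illustrative interface rather than
# an installable package; map the class and argument names onto your own framework.
from scientific_agent import HypothesisGenerator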

generator = HypothesisGenerator(
    llm_provider="anthropic",
    knowledge_sources=["pubmed", "openalex"],
)

hypotheses = generator.generate(
    domain="oncology",
    context="Recent findings show that gut microbiome "
            "composition correlates with immunotherapy response",
    constraints=[
        "Must be testable in vitro",
        "Should involve specific bacterial species",
        "Must have measurable endpoints",
    ],
    num_hypotheses=5,
)

for h in hypotheses:
    print(f"\nHypothesis: {h.statement}")
    print(f"  Rationale: {h.rationale}")
    print(f"  Supporting evidence: {len(h.evidence)} papers")
    print(f"  Novelty score: {h.novelty_score:.2f}")
    print(f"  Feasibility: {h.feasibility}")
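
Once hypotheses carry an evidence list, a novelty score, and a feasibility label, a common next step is to filter and rank them before spending lab time. The sketch below is one possible way to do that in plain Python. The `Hypothesis` dataclass, the 0 to 1 novelty range, the `low`/`medium`/`high` feasibility labels, the weights, and the sample entries are all assumptions made for illustration; none of them come from a specific library.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    # Hypothetical record mirroring the fields printed above.
    statement: str
    rationale: str
    evidence: list = field(default_factory=list)  # supporting paper IDs
    novelty_score: float = 0.0                    # assumed range: 0.0 to 1.0
    feasibility: str = "medium"                   # assumed labels: low/medium/high


# Assumed weights for turning the feasibility label into a number.
FEASIBILITY_WEIGHT = {"low": 0.2, "medium": 0.6, "high": 1.0}


def rank_hypotheses(hypotheses, min_evidence=1):
    """Drop hypotheses with too little evidence, then sort best-first."""
    kept = [h for h in hypotheses if len(h.evidence) >= min_evidence]
    return sorted(
        kept,
        key=lambda h: h.novelty_score * FEASIBILITY_WEIGHT[h.feasibility],
        reverse=True,
    )


# Placeholder entries, invented purely to exercise the function.
candidates = [
    Hypothesis("novel but hard to test", "r1", ["p1", "p2"], 0.9, "low"),
    Hypothesis("moderately novel, easy to test", "r2", ["p3"], 0.6, "high"),
    Hypothesis("very novel, no evidence yet", "r3", [], 0.95, "high"),
]

ranked = rank_hypotheses(candidates)
for h in ranked:
    print(f"{h.statement}: {h.novelty_score * FEASIBILITY_WEIGHT[h.feasibility]:.2f}")
```

Multiplying novelty by a feasibility weight is only one scoring choice. A human reviewer would normally still inspect the ranked list, including the entries that the evidence filter removed.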

Self-Driving Lab Integration

# Agent controlling automated experiments
from scientific_agent import LabAgent

agent = LabAgent(
    llm_provider="anthropic",
    equipment=["plate_reader", "liquid_handler", "incubator"],
    safety_constraints=["bsl2", "max_volume_1ml"],
)

# Design and run experiment
result = agent.run_experiment(
    objective="Determine IC50 of compound X against cell line Y",
    protocol_type="dose_response",
    parameters={
        "compound": "Compound_X",
        "cell_line": "HeLa",
        "concentrations": "serial_dilution",
        "replicates": 3,
        "readout": "cell_viability",
    },
)

print(f"IC50: {result.ic50:.2f} uM")
print(f"R-squared: {result.r_squared:.3f}")
result.plot_dose_response("dose_response.pdf")
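
To make the dose-response step concrete, here is a minimal standard-library sketch of what an agent's analysis stage might compute: concentrations from a serial dilution, replicate averaging, and an IC50 estimate. It uses log-linear interpolation between the two concentrations that bracket 50% viability, which is a simpler stand-in for the four-parameter logistic fit that would normally produce the IC50 and R-squared reported above. The function names and the viability readings are invented for illustration.

```python
import math
from statistics import mean


def serial_dilution(top_conc, factor, steps):
    """Return concentrations for a serial dilution, highest first."""
    return [top_conc / factor**i for i in range(steps)]


def estimate_ic50(concs, viability, threshold=50.0):
    """Interpolate, on a log10 concentration axis, where viability crosses
    the threshold. Returns None if the data never bracket the threshold."""
    points = sorted(zip(concs, viability))  # ascending concentration
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= threshold >= v_hi and v_lo != v_hi:
            frac = (v_lo - threshold) / (v_lo - v_hi)
            log_ic50 = math.log10(c_lo) + frac * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10**log_ic50
    return None


# 10-fold dilution series starting at 100 uM: 100, 10, 1, 0.1, 0.01
concs = serial_dilution(100.0, 10, 5)

# Made-up % viability readings, three replicates per concentration,
# listed in the same order as `concs` (highest concentration first).
replicates = [
    [4.0, 5.0, 6.0],
    [19.0, 20.0, 21.0],
    [79.0, 80.0, 81.0],
    [94.0, 95.0, 96.0],
    [100.0, 100.0, 100.0],
]
viability = [mean(r) for r in replicates]

ic50 = estimate_ic50(concs, viability)
print(f"Estimated IC50: {ic50:.2f} uM")
```

Interpolation needs no curve-fitting dependency, but it ignores the sigmoidal curve shape and gives no goodness-of-fit measure, so a proper logistic fit is preferable when the data allow it.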

Multi-Agent Scientific Collaboration

# Agents with different scientific roles
from scientific_agent import ScientificTeam

team = ScientificTeam(
    agents={
        "PI": {"role": "research_director",
               "expertise": "oncology"},
        "Experimentalist": {"role": "experiment_design",
                           "expertise": "cell_biology"},
        "Analyst": {"role": "data_analysis",
                   "expertise": "biostatistics"},
        "Writer": {"role": "manuscript_writing",
                  "expertise": "scientific_communication"},
    },
)

# Collaborative research cycle
project = team.start_project(
    title="Microbiome-immunotherapy interaction study",
    timeline_weeks=12,
)

# Agents collaborate: PI directs → Experimentalist designs →
# Analyst processes → Writer documents
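
The hand-off order in the comment above (PI, then Experimentalist, then Analyst, then Writer) can be sketched as a simple sequential pipeline. In the sketch below each role is a stub function that would, in a real system, wrap an LLM call; the `Project` dataclass, the role functions, and `run_cycle` are hypothetical names chosen for illustration and are not part of any particular framework.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    title: str
    log: list = field(default_factory=list)  # (role, output) records, in order


# Stub role functions: each takes the previous role's output and returns its own.
# A real implementation would replace these bodies with LLM calls.
def direct(title):
    return f"Aim for '{title}'"


def design(aim):
    return f"Protocol addressing: {aim}"


def analyze(protocol):
    return f"Analysis plan for: {protocol}"


def document(analysis):
    return f"Draft manuscript covering: {analysis}"


# Fixed hand-off order: PI -> Experimentalist -> Analyst -> Writer.
PIPELINE = [
    ("PI", direct),
    ("Experimentalist", design),
    ("Analyst", analyze),
    ("Writer", document),
]


def run_cycle(project):
    """Pass one artifact through every role and record each hand-off."""
    artifact = project.title
    for role, step in PIPELINE:
        artifact = step(artifact)
        project.log.append((role, artifact))
    return artifact


project = Project(title="Microbiome-immunotherapy interaction study")
final = run_cycle(project)
for role, output in project.log:
    print(f"{role}: {output}")
```

Keeping a per-role log makes every hand-off auditable, which matters for human-in-the-loop review. A less linear design could let the Analyst send work back to the Experimentalist instead of always moving forward.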

Reading Roadmap

### Foundational Papers
1. "The AI Scientist" (Lu et al., 2024): fully automated ML research
2. "ChemCrow" (Bran et al., 2023): chemistry tool-use agent
3. "Coscientist" (Boiko et al., 2023): autonomous chemical research
4. "BioPlanner" (O'Donoghue et al., 2023): biology protocol generation

### Surveys
5. "Scientific Discovery in the Age of AI" (Wang et al., 2023)
6. "On the Opportunities and Risks of Foundation Models" (Bommasani et al., 2021)
7. "LLM Agents: A Survey" (multiple, 2024)

### Ethics & Limitations
8. "Dual-use concerns of AI in biology" (Sandbrink, 2023)
9. "Can LLMs Generate Novel Research Ideas?" (Si et al., 2024)

Use Cases

  1. Literature mining: Automated hypothesis from research gaps
  2. Experiment automation: Self-driving lab orchestration
  3. Drug discovery: Multi-agent screening and optimization
  4. Research planning: Protocol and proposal generation
  5. Scientific writing: Paper drafting with verified claims

References