Awesome-Agent-Skills-for-Empirical-Research genomas-guide

Automate gene expression analysis with the GenoMAS multi-agent system

install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/domains/biomedical/genomas-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-genomas-guide && rm -rf "$T"
manifest: skills/43-wentorai-research-plugins/skills/domains/biomedical/genomas-guide/SKILL.md
source content

GenoMAS Guide

Overview

GenoMAS (Genomics Multi-Agent System) is a minimalist multi-agent framework for automating scientific analysis workflows, particularly gene expression analysis. It orchestrates specialized agents for data retrieval, preprocessing, differential expression analysis, pathway enrichment, and visualization, turning a natural-language research question into a complete bioinformatics pipeline.

Installation

pip install genomas
# Or from source
git clone https://github.com/futianfan/GenoMAS.git
cd GenoMAS && pip install -e .
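
After either install route, it can be worth confirming that the package is importable from the active environment. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is illustrative, and the package name genomas is taken from the install command above.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located for import."""
    return importlib.util.find_spec(package_name) is not None


# "genomas" is the import name used throughout this guide.
print("genomas importable:", is_installed("genomas"))
```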

Core Workflow

Natural Language to Pipeline

from genomas import GenoMAS

geno = GenoMAS(llm_provider="anthropic")

# Describe analysis in natural language
result = geno.analyze(
    "Compare gene expression between tumor and normal tissue "
    "in the TCGA breast cancer dataset. Identify differentially "
    "expressed genes and run pathway enrichment analysis."
)

# GenoMAS automatically:
# 1. Retrieves TCGA-BRCA data via GDC API
# 2. Normalizes and filters expression data
# 3. Runs DESeq2-style differential expression
# 4. Performs GO and KEGG pathway enrichment
# 5. Generates volcano plots and heatmaps
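
To make step 2 (normalize and filter) concrete, here is a minimal standard-library sketch of counts-per-million (CPM) scaling followed by a low-expression filter. This is not GenoMAS code: the function names, the thresholds, and the toy counts are assumptions for illustration. CPM is also a simpler stand-in for the median-of-ratios normalization that DESeq2 applies internally.

```python
def cpm(counts):
    """Scale raw counts to counts per million within each sample.

    `counts` maps gene name -> list of raw read counts, one per sample.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size of a sample = sum of counts over all genes.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
        for gene, row in counts.items()
    }


def filter_low_expression(cpm_values, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM reaches min_cpm in at least min_samples samples."""
    return {
        gene: row
        for gene, row in cpm_values.items()
        if sum(v >= min_cpm for v in row) >= min_samples
    }


# Toy count matrix (hypothetical values): three genes, three samples.
raw = {
    "GENE_A": [500, 700, 300],
    "GENE_B": [0, 1, 0],
    "GENE_C": [499_500, 699_299, 299_700],
}

normalized = cpm(raw)
kept = filter_low_expression(normalized)
print(sorted(kept))  # GENE_B is dropped: it reaches 1 CPM in only one sample
```

Filtering on CPM rather than on raw counts keeps the threshold comparable across samples with different sequencing depths.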

Agent Roles

Agent               | Responsibility
Data Agent          | Retrieves datasets from GEO, TCGA, ArrayExpress
Preprocessing Agent | Quality control, normalization, filtering
Analysis Agent      | Differential expression, clustering, PCA
Enrichment Agent    | GO, KEGG, MSigDB pathway analysis
Visualization Agent | Plots, heatmaps, volcano plots
Report Agent        | Generates methods section and results summary
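
The division of labor above can be pictured as a chain of stages that read from and write to a shared state. The sketch below is a hypothetical illustration of that pattern only; PipelineState, Agent, run_pipeline, and the stand-in stage functions are invented for this example and are not GenoMAS classes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class PipelineState:
    """Shared context handed from one agent to the next."""
    question: str
    artifacts: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


@dataclass
class Agent:
    name: str
    run: Callable[[PipelineState], None]


def run_pipeline(agents, state):
    """Run agents in order; each one reads and extends state.artifacts."""
    for agent in agents:
        agent.run(state)
        state.log.append(agent.name)
    return state


# Hypothetical stand-ins for the real work each role would do.
def fetch_data(state):
    state.artifacts["counts"] = {"GENE_A": [10, 12, 50, 55], "GENE_B": [0, 0, 0, 0]}


def preprocess(state):
    counts = state.artifacts["counts"]
    state.artifacts["filtered"] = {g: v for g, v in counts.items() if sum(v) > 0}


def analyze(state):
    state.artifacts["de_genes"] = sorted(state.artifacts["filtered"])


def report(state):
    n = len(state.artifacts["de_genes"])
    state.artifacts["summary"] = f"{n} gene(s) passed to reporting"


AGENTS = [
    Agent("Data Agent", fetch_data),
    Agent("Preprocessing Agent", preprocess),
    Agent("Analysis Agent", analyze),
    Agent("Report Agent", report),
]

final = run_pipeline(AGENTS, PipelineState(question="tumor vs normal"))
print(final.log)
print(final.artifacts["summary"])
```

Keeping every intermediate result in one state object means a later stage (for example, reporting) can cite exactly what an earlier stage produced.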

Step-by-Step Usage

from genomas import DataAgent, AnalysisAgent, EnrichmentAgent

# Step 1: Retrieve data
data_agent = DataAgent()
dataset = data_agent.fetch("GSE12345", platform="RNA-seq")

# Step 2: Differential expression
analysis = AnalysisAgent()
de_results = analysis.differential_expression(
    dataset,
    group_col="condition",
    case="tumor",
    control="normal",
    method="deseq2",
)

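# Step 3: Filter significant genes
# (adjusted p-value < 0.05 and |log2 fold change| > 1;
#  the boolean-mask indexing assumes de_results is a pandas DataFrame)
sig_genes = de_results[
    (de_results["padj"] < 0.05) &
    (de_results["log2FoldChange"].abs() > 1)
]
print(f"Found {len(sig_genes)} differentially expressed genes")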

# Step 4: Pathway enrichment
enrichment = EnrichmentAgent()
pathways = enrichment.run(
    gene_list=sig_genes["gene_symbol"].tolist(),
    databases=["GO_BP", "KEGG", "Reactome"],
)

# Step 5: Visualize
from genomas.viz import volcano_plot, pathway_barplot
volcano_plot(de_results, output="volcano.png")
pathway_barplot(pathways, top_n=20, output="pathways.png")
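
Step 3 filters on padj, which in DESeq2 output is the Benjamini-Hochberg adjusted p-value. As a minimal sketch of what that adjustment does and why it matters for the filter, the following standard-library example computes BH-adjusted values for a toy result table and applies the same two thresholds. The rows and gene names are made up for illustration; this does not reproduce GenoMAS or DESeq2 internals.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy differential expression results (hypothetical values).
rows = [
    {"gene_symbol": "GENE_A", "pvalue": 0.01, "log2FoldChange": 2.3},
    {"gene_symbol": "GENE_B", "pvalue": 0.04, "log2FoldChange": 0.4},
    {"gene_symbol": "GENE_C", "pvalue": 0.03, "log2FoldChange": -1.8},
    {"gene_symbol": "GENE_D", "pvalue": 0.60, "log2FoldChange": 3.0},
]

padj = benjamini_hochberg([r["pvalue"] for r in rows])
for row, value in zip(rows, padj):
    row["padj"] = value

significant = [
    r["gene_symbol"]
    for r in rows
    if r["padj"] < 0.05 and abs(r["log2FoldChange"]) > 1
]
print(significant)
```

In this toy table GENE_C has a raw p-value of 0.03 and a large fold change, yet its adjusted p-value is above 0.05, so it is excluded. Filtering on raw p-values instead of padj would have kept it.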

Supported Analyses

Analysis                | Method
Differential expression | DESeq2, edgeR, limma-voom
Clustering              | Hierarchical, k-means, UMAP
PCA                     | Principal component analysis
GO enrichment           | Gene Ontology term enrichment
KEGG pathway            | KEGG pathway mapping
GSEA                    | Gene Set Enrichment Analysis
Survival analysis       | Kaplan-Meier, Cox regression
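
GO and KEGG term enrichment is commonly assessed with an over-representation test based on the hypergeometric distribution. The sketch below shows that calculation with the standard library only. Whether GenoMAS uses this exact test is an assumption; the function name and the example numbers are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    n_universe: number of background genes
    n_pathway:  background genes annotated to the pathway or term
    n_selected: genes in the significant list
    n_overlap:  significant genes that are also in the pathway
    """
    total = comb(n_universe, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 300 significant genes out of a 20,000-gene
# background, 12 of which fall in a 150-gene pathway.
p = hypergeom_enrichment_p(20_000, 150, 300, 12)
print(f"enrichment p-value: {p:.3g}")
```

When many pathways are tested at once, the resulting p-values would normally be corrected for multiple testing before being reported.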

Data Sources

Source       | Data type
GEO (NCBI)   | Microarray, RNA-seq
TCGA         | Cancer genomics
GTEx         | Normal tissue expression
ArrayExpress | European expression data

References

  • GenoMAS GitHub
  • Love, M.I., Huber, W., Anders, S. (2014). "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2." Genome Biology 15(12), 550.