BioSkills bio-pathway-reactome
Reactome pathway enrichment using ReactomePA package. Use when analyzing gene lists against Reactome's curated peer-reviewed pathway database. Performs over-representation analysis and GSEA with visualization and pathway hierarchy exploration.
git clone https://github.com/GPTomics/bioSkills
T=$(mktemp -d) && git clone --depth=1 https://github.com/GPTomics/bioSkills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/pathway-analysis/reactome-pathways" ~/.claude/skills/gptomics-bioskills-bio-pathway-reactome && rm -rf "$T"
Version Compatibility
Reference examples tested with: R stats (base), ReactomePA 1.46+, clusterProfiler 4.10+
Before using code patterns, verify installed versions match. If versions differ:
- R: `packageVersion('<pkg>')` then `?function_name` to verify parameters
If code fails with an R error such as "could not find function" or "unused argument", introspect the installed package (for example with `ls('package:ReactomePA')` or `args(enrichPathway)`) and adapt the example to match the actual API rather than retrying.
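The version check described above can be scripted. The following is a minimal sketch, assuming the minimum versions from the compatibility note (written here as 1.46.0 and 4.10.0); `check_version` is a hypothetical helper name, not part of any package.

```r
# Minimum versions taken from the compatibility note (assumed patch level .0)
required <- c(ReactomePA = '1.46.0', clusterProfiler = '4.10.0')

# Hypothetical helper: TRUE if the package is installed and new enough,
# FALSE if it is too old, NA if it is not installed at all
check_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(NA)
  packageVersion(pkg) >= min_version
}

status <- vapply(names(required),
                 function(pkg) check_version(pkg, required[[pkg]]),
                 logical(1))
print(status)
```

Any `FALSE` or `NA` entry is a signal to install or upgrade the package, or to check `?function_name` before relying on the examples.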
Reactome Pathway Enrichment
When to Use Reactome
| Scenario | Reactome? | Alternative |
|---|---|---|
| Signaling pathway detail (reaction-level) | Yes -- best choice | KEGG (pathway-level only) |
| Metabolic pathway focus | Supplement | KEGG has stronger metabolic coverage |
| Reproducibility / open license required | Yes (CC0) | WikiPathways (CC0) |
| Non-model organism (bacteria, plants) | No (7 species only) | KEGG (8,000+ species) |
| Non-human model organism (mouse, rat, fly) | Caution | Annotations are computationally inferred via orthology from human; may contain errors |
Reactome pathways are curated by PhD-level biologists and externally peer-reviewed, which makes Reactome one of the most rigorously curated pathway databases. Human is the primary species; annotations for all other species are computationally inferred from the human pathways via orthology.
Core Pattern - Over-Representation Analysis
Goal: Identify Reactome pathways over-represented in a gene list from differential expression or other analyses.
Approach: Test for enrichment using the hypergeometric test via ReactomePA enrichPathway against curated peer-reviewed pathways.
"Run pathway enrichment against Reactome" → Test whether genes in curated Reactome pathways are over-represented among significant genes.
```r
library(ReactomePA)
library(org.Hs.eg.db)

pathway_result <- enrichPathway(
  gene = entrez_ids,       # Character vector of Entrez IDs
  organism = 'human',      # human, rat, mouse, celegans, yeast, zebrafish, fly
  pvalueCutoff = 0.05,
  pAdjustMethod = 'BH',
  readable = TRUE          # Convert to gene symbols
)

head(as.data.frame(pathway_result))
```
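To make the statistic concrete, the base-R sketch below computes an over-representation p-value for a single pathway by hand. All four counts are hypothetical and chosen only for illustration, and enrichPathway may differ in detail, for example in how it restricts the background to annotated genes.

```r
# Hypothetical counts for one pathway (illustrative, not real data)
N <- 20000  # genes in the background universe
K <- 100    # background genes annotated to the pathway
n <- 500    # genes in the significant list
k <- 10     # significant genes annotated to the pathway

# P(X >= k) under the hypergeometric null: upper tail starting at k
p_value <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# The same test written as a one-sided Fisher's exact test on the 2x2 table
# (rows: in pathway / not in pathway; columns: in list / not in list)
tab <- matrix(c(k, n - k, K - k, N - K - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = 'greater')$p.value

print(c(hypergeometric = p_value, fisher = p_fisher))
```

Both calls give the same p-value. With these counts only 2.5 pathway genes are expected in the list by chance (n * K / N), so observing 10 is unlikely under the null.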
Prepare Gene List from DE Results
Goal: Extract significant Entrez gene IDs from differential expression results for Reactome enrichment.
Approach: Filter by significance and fold change, then convert symbols to Entrez IDs using bitr.
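```r
library(clusterProfiler)
library(org.Hs.eg.db)  # OrgDb used by bitr() for the ID mapping

de_results <- read.csv('de_results.csv')

# which() drops rows where padj or log2FoldChange is NA
is_sig <- which(de_results$padj < 0.05 & abs(de_results$log2FoldChange) > 1)
sig_genes <- de_results[is_sig, 'gene_symbol']

# Map gene symbols to Entrez IDs; symbols that fail to map are dropped
gene_ids <- bitr(sig_genes, fromType = 'SYMBOL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)
entrez_ids <- gene_ids$ENTREZID
```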
GSEA on Reactome Pathways
Goal: Detect coordinated expression changes in Reactome pathways using all genes ranked by a statistic.
Approach: Create a sorted named vector from DE results and run gsePathway for rank-based enrichment.
```r
# Create ranked gene list (named vector sorted by statistic)
gene_list <- de_results$log2FoldChange
names(gene_list) <- de_results$entrez_id
gene_list <- sort(gene_list, decreasing = TRUE)

gsea_result <- gsePathway(
  geneList = gene_list,
  organism = 'human',
  pvalueCutoff = 0.05,
  pAdjustMethod = 'BH',
  verbose = FALSE
)

head(as.data.frame(gsea_result))
```
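Real DE tables often contain missing IDs, missing statistics, or several rows that map to the same Entrez ID, all of which can cause trouble for a ranked list. The base-R sketch below shows one way to clean the vector before ranking. The toy data frame is made up, and keeping the row with the largest absolute statistic per gene is an assumption (one of several reasonable choices), not a rule from ReactomePA.

```r
# Toy DE table (made-up values) with a duplicated ID and missing entries
de_toy <- data.frame(
  entrez_id = c('7157', '1956', '7157', NA, '672'),
  log2FoldChange = c(2.1, -1.4, 0.3, 0.9, NA)
)

# Drop rows with a missing ID or a missing statistic
keep <- !is.na(de_toy$entrez_id) & !is.na(de_toy$log2FoldChange)
de_clean <- de_toy[keep, ]

# For duplicated IDs keep the row with the largest absolute statistic
de_clean <- de_clean[order(abs(de_clean$log2FoldChange), decreasing = TRUE), ]
de_clean <- de_clean[!duplicated(de_clean$entrez_id), ]

# Named numeric vector sorted in decreasing order
ranked <- setNames(de_clean$log2FoldChange, de_clean$entrez_id)
ranked <- sort(ranked, decreasing = TRUE)
print(ranked)
```

The resulting `ranked` vector has unique names and no missing values, so it can be passed as `geneList`.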
With Background Universe
Goal: Restrict enrichment testing to only genes that were actually measured in the experiment.
Approach: Pass all tested gene IDs as the universe parameter to enrichPathway.
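```r
# All tested genes; IDs read from CSV are usually numeric, so convert to
# character to match the type of the gene vector
all_genes <- as.character(de_results$entrez_id)

pathway_result <- enrichPathway(
  gene = entrez_ids,
  universe = all_genes,    # Background gene set
  organism = 'human',
  pvalueCutoff = 0.05,
  readable = TRUE
)
```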
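The effect of the background can be illustrated with the hypergeometric tail probability directly. This is a sketch with hypothetical counts: the same overlap of 10 genes is tested once against a genome-wide background and once against only the genes measured in the experiment.

```r
k <- 10   # significant genes in the pathway (hypothetical)
n <- 500  # significant genes in total (hypothetical)

# Genome-wide background: 20000 genes, 100 of them in the pathway
p_genome <- phyper(k - 1, 100, 20000 - 100, n, lower.tail = FALSE)

# Measured background: 12000 genes detected, 90 of them in the pathway
p_measured <- phyper(k - 1, 90, 12000 - 90, n, lower.tail = FALSE)

print(c(genome = p_genome, measured = p_measured))
```

With these numbers the expected overlap rises from 2.5 to 3.75 genes when the background shrinks to the measured set, so the p-value against the measured background is larger. Using the whole genome would make the pathway look more significant than the experiment supports.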
Visualization
Goal: Create publication-quality plots of Reactome enrichment results.
Approach: Use enrichplot functions (dotplot, barplot, emapplot, cnetplot, gseaplot2) on enrichment result objects.
```r
library(enrichplot)

# Dot plot
dotplot(pathway_result, showCategory = 15)

# Bar plot
barplot(pathway_result, showCategory = 15)

# Enrichment map (requires pairwise_termsim first)
pathway_result <- pairwise_termsim(pathway_result)
emapplot(pathway_result)

# Gene-concept network
cnetplot(pathway_result, categorySize = 'pvalue')

# GSEA plot
gseaplot2(gsea_result, geneSetID = 1:3)
```
View Pathway in Browser
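```r
# Plot a pathway as a gene network. viewPathway() looks the pathway up by
# name (the Description column), not by stable ID, and needs the graphite package.
top_pathway_name <- pathway_result@result$Description[1]
viewPathway(top_pathway_name, organism = 'human', readable = TRUE)

# Open the same pathway in the Reactome Pathway Browser using its stable ID
top_pathway_id <- pathway_result@result$ID[1]   # e.g. 'R-HSA-109582'
browseURL(paste0('https://reactome.org/PathwayBrowser/#/', top_pathway_id))
```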
Export Results
```r
results_df <- as.data.frame(pathway_result)
write.csv(results_df, 'reactome_enrichment.csv', row.names = FALSE)

# Key columns: ID, Description, GeneRatio, BgRatio, pvalue, p.adjust, geneID, Count
```
Different Organisms
```r
# Mouse
pathway_mouse <- enrichPathway(gene = mouse_entrez, organism = 'mouse', readable = TRUE)

# Rat
pathway_rat <- enrichPathway(gene = rat_entrez, organism = 'rat', readable = TRUE)

# Zebrafish
pathway_zfish <- enrichPathway(gene = zfish_entrez, organism = 'zebrafish', readable = TRUE)

# Supported: human, rat, mouse, celegans, yeast, zebrafish, fly
```
Compare Clusters
Goal: Compare Reactome pathway enrichment across multiple gene lists (e.g., upregulated vs downregulated).
Approach: Use compareCluster with enrichPathway to run enrichment per group and visualize side by side.
```r
# Compare pathways across multiple gene lists
gene_clusters <- list(
  upregulated = up_genes,
  downregulated = down_genes
)

compare_result <- compareCluster(
  geneClusters = gene_clusters,
  fun = 'enrichPathway',
  organism = 'human',
  pvalueCutoff = 0.05
)

dotplot(compare_result)
```
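The `up_genes` and `down_genes` vectors used above are not defined elsewhere in this document. A minimal base-R sketch of how they might be derived from a DE table follows; the toy values, the column names (`entrez_id`, `log2FoldChange`, `padj`), and the thresholds are assumptions that mirror the earlier filtering step.

```r
# Toy DE table (made-up values); column names are assumed
de_toy <- data.frame(
  entrez_id = c(7157, 1956, 672, 675, 5290),
  log2FoldChange = c(2.3, -1.8, 0.4, -2.5, 1.6),
  padj = c(0.001, 0.02, 0.01, NA, 0.2)
)

# Significant rows only; rows with a missing padj are treated as not significant
is_sig <- !is.na(de_toy$padj) & de_toy$padj < 0.05

# Split by direction and convert IDs to character
up_genes <- as.character(de_toy$entrez_id[is_sig & de_toy$log2FoldChange > 1])
down_genes <- as.character(de_toy$entrez_id[is_sig & de_toy$log2FoldChange < -1])

gene_clusters <- list(upregulated = up_genes, downregulated = down_genes)
str(gene_clusters)
```

Each list element is a character vector of Entrez IDs, which is the shape `geneClusters` is given in the example above.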
Key Parameters
| Parameter | Default | Description |
|---|---|---|
| gene | required | Vector of Entrez IDs |
| organism | human | Species name |
| pvalueCutoff | 0.05 | P-value threshold |
| pAdjustMethod | BH | Adjustment method |
| universe | NULL | Background genes |
| minGSSize | 10 | Min genes per pathway |
| maxGSSize | 500 | Max genes per pathway |
| readable | FALSE | Convert to symbols |
Supported Organisms
| Organism | Name | OrgDb |
|---|---|---|
| Human | human | org.Hs.eg.db |
| Mouse | mouse | org.Mm.eg.db |
| Rat | rat | org.Rn.eg.db |
| Zebrafish | zebrafish | org.Dr.eg.db |
| Fly | fly | org.Dm.eg.db |
| C. elegans | celegans | org.Ce.eg.db |
| Yeast | yeast | org.Sc.sgd.db |
Interpretation Notes
- Reactome is very granular -- some pathways contain only 2-3 genes. Use `minGSSize = 10` to filter these out.
- The deep hierarchy means parent pathways will often appear alongside child pathways. Look for the most specific (deepest) enriched pathway.
- Always specify a background universe (all tested genes) to avoid inflated significance.
- Examine fold enrichment (GeneRatio / BgRatio), not just p-values.
- For non-human species, note that annotations are orthology-inferred and may not capture species-specific pathway biology.
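Fold enrichment can be computed from the exported table. The sketch below assumes GeneRatio and BgRatio are stored as text fractions such as `10/200`, uses a made-up two-row table, and introduces `ratio_to_number` as a hypothetical helper name.

```r
# Hypothetical helper: convert 'a/b' strings into numeric fractions
ratio_to_number <- function(x) {
  parts <- strsplit(x, '/', fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) / as.numeric(p[2]), numeric(1))
}

# Toy result table (made-up values and placeholder IDs)
toy <- data.frame(
  ID = c('pathway_A', 'pathway_B'),
  GeneRatio = c('10/200', '4/200'),
  BgRatio = c('100/10000', '400/10000')
)

# Fold enrichment = (fraction of the list in the pathway) /
#                   (fraction of the background in the pathway)
toy$FoldEnrichment <- ratio_to_number(toy$GeneRatio) / ratio_to_number(toy$BgRatio)
print(toy)
```

Here `pathway_A` is enriched fivefold, while `pathway_B` has a fold enrichment of 0.5, meaning it is depleted relative to the background. The same two lines can be applied to `as.data.frame(pathway_result)` if its ratio columns have this format.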
Related Skills
- go-enrichment - Gene Ontology enrichment
- kegg-pathways - KEGG pathway enrichment
- wikipathways - WikiPathways enrichment
- gsea - Gene Set Enrichment Analysis
- enrichment-visualization - Visualization functions