install
source · Clone the upstream repo
git clone https://github.com/openclaw/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/aipoch-ai/multi-omics-integration-strategist" ~/.claude/skills/clawdbot-skills-multi-omics-integration-strategist && rm -rf "$T"
manifest: skills/aipoch-ai/multi-omics-integration-strategist/SKILL.md
Skill: Multi-Omics Integration Strategist (ID: 204)
Overview
Designs joint analysis schemes for multi-omics data (transcriptomics, RNA; proteomics, Pro; metabolomics, Met), cross-validates findings at the pathway level, and provides integrated analysis strategies at the systems-biology level.
Use Cases
- Systems biology mechanism research for complex diseases
- Biomarker discovery and validation
- Drug target identification and pathway validation
- Multi-omics data quality assessment and consistency analysis
Directory Structure
    .
    ├── SKILL.md                  # This file - Skill documentation
    ├── config/
    │   └── pathways.json         # Pathway database configuration
    ├── scripts/
    │   └── main.py               # Main analysis script
    ├── templates/
    │   └── report_template.md    # Analysis report template
    └── examples/
        └── sample_data/          # Sample datasets
Input
Required Files
| File | Format | Description |
|---|---|---|
| rna_data.csv | CSV | Transcriptomics data: Gene ID, expression value, differential analysis results |
| pro_data.csv | CSV | Proteomics data: Protein ID, abundance value, differential analysis results |
| met_data.csv | CSV | Metabolomics data: Metabolite ID, concentration value, differential analysis results |
Input Format Specifications
RNA Data (rna_data.csv)
    gene_id,gene_name,log2fc,pvalue,padj,sample_A,sample_B,...
    ENSG00000012048,BRCA1,1.23,0.001,0.005,12.5,13.2,...
Protein Data (pro_data.csv)
    protein_id,gene_name,log2fc,pvalue,padj,sample_A,sample_B,...
    P38398,BRCA1,0.85,0.002,0.008,2450,2890,...
Metabolite Data (met_data.csv)
    metabolite_id,metabolite_name,kegg_id,log2fc,pvalue,padj,...
    C00187,Cholesterol,C00187,-1.45,0.003,0.012,...
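The column layouts above can be checked before any integration step runs. The following is a minimal sketch using only the standard library; the helper name `missing_columns`, the `REQUIRED` table, and the inline sample rows are illustrative assumptions and are not taken from scripts/main.py.

```python
import csv
import io

# Leading columns documented for each input file. The per-sample columns
# that follow them vary by experiment, so they are not checked here.
REQUIRED = {
    "rna": ["gene_id", "gene_name", "log2fc", "pvalue", "padj"],
    "pro": ["protein_id", "gene_name", "log2fc", "pvalue", "padj"],
    "met": ["metabolite_id", "metabolite_name", "kegg_id",
            "log2fc", "pvalue", "padj"],
}


def missing_columns(csv_text, layer):
    """Return the documented columns that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    return [col for col in REQUIRED[layer] if col not in header]


# Hypothetical sample inputs: one well-formed, one with columns dropped.
sample_pro = (
    "protein_id,gene_name,log2fc,pvalue,padj,sample_A,sample_B\n"
    "P38398,BRCA1,0.85,0.002,0.008,2450,2890\n"
)
sample_bad = "protein_id,log2fc,pvalue\nP38398,0.85,0.002\n"

print(missing_columns(sample_pro, "pro"))  # []
print(missing_columns(sample_bad, "pro"))  # ['gene_name', 'padj']
```

Reporting the missing column names, rather than failing on the first one, lets a user fix a malformed export in a single pass.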
Integration Strategy
1. ID Mapping Layer
- RNA → Protein: Mapping through Gene Symbol / UniProt ID
- Protein → Metabolite: Association through KEGG/Reactome enzyme-reaction-metabolite links
- RNA → Metabolite: Indirect association through KEGG pathway
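As a rough illustration of the RNA → Protein step, the sketch below pairs records that share a gene symbol and reports the fraction of RNA records that found a partner. The function name `map_rna_to_protein`, the placeholder gene IDs, and the definition of the mapping rate are assumptions made for this example; the skill's actual mapping logic is not shown in this document.

```python
def map_rna_to_protein(rna_rows, pro_rows):
    """Pair RNA and protein records that share a gene symbol.

    Returns the matched pairs and the fraction of RNA records that found
    a protein partner (one possible definition of a mapping success rate).
    """
    pro_by_symbol = {row["gene_name"]: row for row in pro_rows}
    pairs = [
        (rna, pro_by_symbol[rna["gene_name"]])
        for rna in rna_rows
        if rna["gene_name"] in pro_by_symbol
    ]
    rate = len(pairs) / len(rna_rows) if rna_rows else 0.0
    return pairs, rate


# Hypothetical records; "g1" and "g2" are placeholder gene IDs.
rna_rows = [
    {"gene_id": "g1", "gene_name": "BRCA1", "log2fc": 1.23},
    {"gene_id": "g2", "gene_name": "TP53", "log2fc": -0.40},
]
pro_rows = [{"protein_id": "P38398", "gene_name": "BRCA1", "log2fc": 0.85}]

pairs, rate = map_rna_to_protein(rna_rows, pro_rows)
print(len(pairs), rate)  # 1 0.5
```

A dictionary keyed on gene symbol keeps only one protein per symbol; data with several isoforms or protein groups per gene would need a one-to-many structure instead.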
2. Pathway Mapping
Supported databases:
- KEGG (Kyoto Encyclopedia of Genes and Genomes)
- Reactome
- WikiPathways
- GO (Gene Ontology) - Biological Process
3. Cross-Validation Methods
3.1 Directional Consistency Validation
- Checks whether genes, proteins, and metabolites in the same pathway change in the same direction
- Score: +1 (consistent), -1 (opposite), 0 (no data)
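The scoring rule above can be sketched as follows. The per-pair rule (+1, -1, 0) comes from the text; averaging the per-pair scores into a pathway score is an assumption, as is treating a log2 fold change of exactly zero as uninformative. The function names are hypothetical.

```python
def direction_score(log2fc_a, log2fc_b):
    """+1 if both changes share a sign, -1 if opposite, 0 if data is missing."""
    if log2fc_a is None or log2fc_b is None:
        return 0
    if log2fc_a == 0 or log2fc_b == 0:
        return 0  # assumption: "no change" is treated like missing data
    return 1 if (log2fc_a > 0) == (log2fc_b > 0) else -1


def pathway_score(pairs):
    """Average the per-pair scores (assumed aggregation, not from the source)."""
    scores = [direction_score(a, b) for a, b in pairs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical (RNA log2fc, protein log2fc) pairs within one pathway.
pairs = [(1.23, 0.85), (-0.6, -0.2), (0.9, -1.1), (0.4, None)]
print(pathway_score(pairs))  # 0.25
```

Under this aggregation a pathway score lies between -1 and +1, and pairs with missing data pull the score toward zero rather than being ignored.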
3.2 Correlation Validation
- Pearson/Spearman correlation analysis
- Cross-omics expression profile clustering
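For one gene-protein pair measured in the same samples, the two correlation measures can be sketched with the standard library alone. This is an illustrative implementation under the assumption that only samples present in both omics layers are compared; with SciPy installed, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` compute the same coefficients. The sample values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # undefined when one profile is constant
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical RNA expression and protein abundance in four shared samples.
rna = [12.5, 13.2, 11.8, 14.0]
pro = [2450, 2890, 2300, 3900]
print(round(pearson(rna, pro), 3))
print(round(spearman(rna, pro), 3))  # 1.0, because the rank orders match
```

Spearman is less sensitive than Pearson to the scale difference between expression values and raw abundances, which is one reason to report both.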
3.3 Pathway Enrichment Concordance
- Independent enrichment analysis for each omics
- Common enriched pathway identification
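One simple way to realise these two steps is an over-representation test per omics layer followed by a set intersection. The sketch below uses a hypergeometric tail probability, which is a plain over-representation test and not the GSEA method; the function names, the 0.05 cutoff, and the p-values are assumptions for illustration, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pvalue(hits, set_size, pathway_size, universe):
    """P(X >= hits) when set_size items are drawn from a universe that
    contains pathway_size pathway members."""
    upper = min(set_size, pathway_size)
    total = comb(universe, set_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, set_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def shared_enriched(pvalues_by_omics, alpha=0.05):
    """Pathways whose p-value is below alpha in every omics layer."""
    enriched = [
        {name for name, p in pvalues.items() if p < alpha}
        for pvalues in pvalues_by_omics.values()
    ]
    return set.intersection(*enriched) if enriched else set()


# Hypothetical per-omics enrichment p-values.
pvalues_by_omics = {
    "rna": {"Glycolysis": 0.001, "TCA cycle": 0.20},
    "pro": {"Glycolysis": 0.010, "TCA cycle": 0.03},
}
print(shared_enriched(pvalues_by_omics))  # {'Glycolysis'}
```

Running each layer against its own background universe matters here, because the measurable proteome and metabolome are usually much smaller than the transcriptome.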
3.4 Network Topology Validation
- Construct cross-omics regulatory network
- Identify key nodes (Hub genes/proteins/metabolites)
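Hub identification can be illustrated by counting node degree over a cross-omics edge list. This is a minimal sketch, assuming degree is the hub criterion (the source does not say which centrality measure is used); the node labels and edges are hypothetical. With networkx, `networkx.degree_centrality` gives a normalised version of the same quantity.

```python
from collections import Counter


def hub_nodes(edges, top_n=1):
    """Return the top_n nodes by degree in an undirected edge list."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree.most_common(top_n)


# Hypothetical edges; the prefix marks the omics layer of each node.
edges = [
    ("rna:HK2", "pro:HK2"),
    ("pro:HK2", "met:glucose"),
    ("pro:HK2", "met:G6P"),
    ("pro:GPI", "met:G6P"),
    ("rna:GPI", "pro:GPI"),
]
print(hub_nodes(edges))  # [('pro:HK2', 3)]
```

Prefixing node names with the omics layer keeps an RNA and a protein that share a gene symbol as distinct nodes.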
Output
1. Integration Report (integration_report.md)
    # Multi-Omics Integration Analysis Report
    ## Executive Summary
    - Sample count: RNA=30, Pro=28, Met=25
    - Mapping success rate: RNA-Pro=85%, Pro-Met=62%
    - Pathway coverage: 342 KEGG pathways
    ## Cross-Validation Results
    ### Highly Consistent Pathways (Score > 0.8)
    1. Glycolysis/Gluconeogenesis (Score=0.92)
    2. Citrate cycle (TCA cycle) (Score=0.88)
    ### Conflicting Pathways (Score < -0.3)
    1. Fatty acid biosynthesis (Score=-0.45)
    ## Recommendations
    - Focus on: Energy metabolism-related pathways
    - Needs verification: Lipid metabolism pathway data quality
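The two report sections can be derived from a table of pathway scores by applying the thresholds shown in the sample report (above 0.8, below -0.3). The sketch below is illustrative; the function name is hypothetical, and the "Purine metabolism" entry is an invented mid-range value added only to show a pathway that lands in neither list.

```python
def classify_pathways(scores, high=0.8, low=-0.3):
    """Split pathway scores into consistent and conflicting lists,
    each ordered from most to least extreme."""
    consistent = sorted(
        (name for name, s in scores.items() if s > high),
        key=lambda name: -scores[name],
    )
    conflicting = sorted(
        (name for name, s in scores.items() if s < low),
        key=lambda name: scores[name],
    )
    return consistent, conflicting


scores = {
    "Citrate cycle (TCA cycle)": 0.88,
    "Glycolysis/Gluconeogenesis": 0.92,
    "Fatty acid biosynthesis": -0.45,
    "Purine metabolism": 0.10,  # invented value, in neither list
}
consistent, conflicting = classify_pathways(scores)
print(consistent)   # ['Glycolysis/Gluconeogenesis', 'Citrate cycle (TCA cycle)']
print(conflicting)  # ['Fatty acid biosynthesis']
```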
2. External Visualization Tools (Not Included)
This tool generates analysis results that can be visualized using external tools. Users may export results to:
| Chart Type | Purpose | External Tool Required |
|---|---|---|
| Circos Plot | Cross-omics relationship panorama | matplotlib/circlize (user-installed) |
| Pathway Heatmap | Pathway-level changes | seaborn/ComplexHeatmap (user-installed) |
| Sankey Diagram | Data flow mapping | plotly (user-installed) |
| Network Graph | Molecular interaction network | networkx/Cytoscape (networkx is included) |
| Correlation Matrix | Cross-omics correlation | seaborn (user-installed) |
| Bubble Plot | Integrated enrichment analysis | ggplot2/plotly (user-installed) |
Note: This skill focuses on data integration and analysis. Visualization requires separate installation of plotting libraries by the user.
3. Output Files
| File | Description |
|---|---|
| | ID mapping results |
| | Pathway cross-validation scores |
| | Cross-omics consistency matrix |
| | Network edge list |
| | Interactive HTML report |
Usage
Basic Usage
    python scripts/main.py \
      --rna rna_data.csv \
      --pro pro_data.csv \
      --met met_data.csv \
      --output ./results
Advanced Options
    python scripts/main.py \
      --rna rna_data.csv \
      --pro pro_data.csv \
      --met met_data.csv \
      --pathway-db KEGG,Reactome \
      --id-mapping config/mapping.json \
      --method correlation+enrichment+network \
      --output ./results \
      --format html,csv,json
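The list-valued options in the command above use two separators: commas for --pathway-db and --format, and a plus sign for --method. How scripts/main.py parses them is not documented here, so the following is only a sketch of one way to do it with argparse. The set of known methods is taken from the example command alone, the default for --method is a placeholder, the defaults for --output and --pathway-db assume the './results' and 'KEGG' rows of the Parameters table belong to those flags, and --id-mapping is omitted for brevity.

```python
import argparse

# Method names seen in the example command; the full accepted set is an
# assumption, since the source does not list it.
KNOWN_METHODS = {"correlation", "enrichment", "network"}


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Hypothetical CLI sketch")
    parser.add_argument("--rna", required=True)
    parser.add_argument("--pro", required=True)
    parser.add_argument("--met", required=True)
    parser.add_argument("--output", default="./results")
    parser.add_argument("--pathway-db", default="KEGG")
    parser.add_argument("--method", default="correlation")  # placeholder
    parser.add_argument("--format", default=None)  # default left unset
    args = parser.parse_args(argv)

    # Turn the separator-delimited strings into Python lists.
    args.pathway_db = args.pathway_db.split(",")
    args.method = args.method.split("+")
    args.format = args.format.split(",") if args.format else []

    unknown = set(args.method) - KNOWN_METHODS
    if unknown:
        parser.error("unknown method(s): " + ", ".join(sorted(unknown)))
    return args


args = parse_args([
    "--rna", "rna_data.csv", "--pro", "pro_data.csv", "--met", "met_data.csv",
    "--pathway-db", "KEGG,Reactome",
    "--method", "correlation+enrichment+network",
    "--format", "html,csv,json",
])
print(args.pathway_db)  # ['KEGG', 'Reactome']
print(args.method)      # ['correlation', 'enrichment', 'network']
print(args.format)      # ['html', 'csv', 'json']
```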
Configuration
config/pathways.json
    {
      "databases": {
        "KEGG": {
          "enabled": true,
          "organism": "hsa",
          "min_genes": 3
        },
        "Reactome": {
          "enabled": true,
          "min_genes": 5
        }
      },
      "mapping": {
        "rna_to_protein": "gene_symbol",
        "protein_to_metabolite": "enzyme_commission"
      }
    }
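A configuration of this shape can be read with the standard json module. The sketch below assumes that `min_genes` is the minimum number of mapped genes a pathway needs in order to be kept, which the source does not state explicitly; the helper names are hypothetical.

```python
import json

# Same content as config/pathways.json, inlined so the example runs as-is.
CONFIG_TEXT = """
{
  "databases": {
    "KEGG": {"enabled": true, "organism": "hsa", "min_genes": 3},
    "Reactome": {"enabled": true, "min_genes": 5}
  },
  "mapping": {
    "rna_to_protein": "gene_symbol",
    "protein_to_metabolite": "enzyme_commission"
  }
}
"""

config = json.loads(CONFIG_TEXT)


def enabled_databases(config):
    """Names of the databases whose "enabled" flag is true."""
    return [
        name
        for name, options in config["databases"].items()
        if options.get("enabled")
    ]


def passes_min_genes(config, database, mapped_gene_count):
    """Assumed meaning of min_genes: keep a pathway only if at least
    this many of its genes were mapped."""
    return mapped_gene_count >= config["databases"][database]["min_genes"]


print(enabled_databases(config))               # ['KEGG', 'Reactome']
print(passes_min_genes(config, "KEGG", 3))     # True
print(passes_min_genes(config, "Reactome", 4)) # False
```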
Dependencies
- Python >= 3.8
- pandas >= 1.3.0
- numpy >= 1.21.0
- scipy >= 1.7.0
- scikit-learn >= 1.0.0
- networkx >= 2.6.0
- matplotlib >= 3.4.0
- seaborn >= 0.11.0
- gseapy >= 1.0.0 (Pathway enrichment analysis)
References
- Subramanian et al. (2005) PNAS - GSEA method
- Kamburov et al. (2011) NAR - ConsensusPathDB
- Chin et al. (2018) Nature Communications - Multi-omics integration methods review
Version
- Version: 1.0.0
- Last Updated: 2026-02-06
- Author: OpenClaw Bioinformatics Team
Risk Assessment
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python/R scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |
Security Checklist
- No hardcoded credentials or API keys
- No unauthorized file system access (../)
- Output does not expose sensitive information
- Prompt injection protections in place
- Input file paths validated (no ../ traversal)
- Output directory restricted to workspace
- Script execution in sandboxed environment
- Error messages sanitized (no stack traces exposed)
- Dependencies audited
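The two path-related items in the checklist (no ../ traversal, output restricted to the workspace) can be enforced by resolving every user-supplied path and checking that it stays under the workspace root. This is a minimal sketch, not the skill's actual validation code; the function name and the workspace path are hypothetical.

```python
from pathlib import Path


def resolve_inside(workspace, user_path):
    """Resolve user_path against workspace and reject anything outside it."""
    root = Path(workspace).resolve()
    candidate = (root / user_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"path escapes workspace: {user_path}") from None
    return candidate


workspace = "/tmp/omics_workspace"  # hypothetical workspace root
print(resolve_inside(workspace, "results/report.md").name)  # report.md

try:
    resolve_inside(workspace, "../secrets.csv")
except ValueError as err:
    print(err)  # path escapes workspace: ../secrets.csv
```

Resolving before comparing also rejects absolute paths and symlinks that point outside the workspace, which a plain substring check for "../" would miss.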
Prerequisites
    # Python dependencies
    pip install -r requirements.txt
Evaluation Criteria
Success Metrics
- Successfully executes main functionality
- Output meets quality standards
- Handles edge cases gracefully
- Performance is acceptable
Test Cases
- Basic Functionality: Standard input → Expected output
- Edge Case: Invalid input → Graceful error handling
- Performance: Large dataset → Acceptable processing time
Lifecycle Status
- Current Stage: Draft
- Next Review Date: 2026-03-06
- Known Issues: None
- Planned Improvements:
  - Performance optimization
  - Additional feature support
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| --rna | str | Required | |
| --pro | str | Required | |
| --met | str | Required | |
| --output | str | './results' | |
| --pathway-db | str | 'KEGG' | |
| | str | Required | Create sample data for testing |
| str | 'md |