LLMs-Universal-Life-Science-and-Clinical-Skills- bulkrna-coexpression

install

source · Clone the upstream repo

git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Transcriptomics/bulkrna-coexpression" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-bulkrna-coexpressi && rm -rf "$T"

manifest: Skills/Transcriptomics/bulkrna-coexpression/SKILL.md

Bulk RNA-seq Co-expression Network Analysis

WGCNA-style weighted gene co-expression network analysis. Detects gene modules via soft thresholding, topological overlap, and hierarchical clustering, then identifies hub genes per module.

CLI Reference

python omicsclaw.py run bulkrna-coexpression --demo
python omicsclaw.py run bulkrna-coexpression --input <counts.csv> --output <dir>
python bulkrna_coexpression.py --input counts.csv --output results/
python bulkrna_coexpression.py --demo --output /tmp/coexpression_demo
python bulkrna_coexpression.py --input counts.csv --output results/ --power 6 --min-module-size 15

Why This Exists

Without it: Researchers must install the WGCNA R package, manually tune soft-thresholding power, interpret topological overlap matrices, and write custom scripts to extract hub genes from each module.
With it: A single Python command runs the full WGCNA-style pipeline — soft threshold selection, TOM-based module detection, hub gene extraction — and produces publication-ready figures and tables.
Why OmicsClaw: Implements the core WGCNA methodology in pure Python (numpy/scipy) with no R dependency, integrated into the OmicsClaw reporting framework with automatic scale-free topology fitting.

Workflow

Load: Read a genes-by-samples raw count matrix (CSV with a
```
gene
```
column and sample columns).
Transform: Log2-transform counts (log2(x + 1)) and filter low-variance genes (keep top 80% by variance).
Correlate: Compute Pearson correlation matrix across all retained genes.
Soft Threshold: Test a range of soft-thresholding powers and select the first power achieving scale-free topology fit (R^2 > 0.8).
Module Detection: Compute adjacency matrix, topological overlap matrix (TOM), and apply hierarchical clustering with tree cutting to identify co-expression modules.
Hub Genes: For each module, rank genes by intra-module connectivity and report the top hub genes.
Report: Write markdown report, result.json, module assignment and hub gene tables, and a reproducibility script.

Example Queries

"Run WGCNA on my bulk RNA-seq data"
"Find co-expression modules and hub genes"
"Detect gene co-expression networks from my count matrix"
"What soft threshold power should I use for my RNA-seq data?"
"Identify hub genes in co-expression modules"

Output Structure

output_directory/
├── report.md
├── result.json
├── figures/
│   ├── scale_free_fit.png
│   ├── module_sizes.png
│   └── module_dendrogram.png
├── tables/
│   ├── module_assignments.csv
│   ├── hub_genes.csv
│   └── threshold_fit.csv
└── reproducibility/
    └── commands.sh

Safety

Local-first: All processing runs locally; no data is uploaded to external services.
Disclaimer: Every report includes the standard OmicsClaw disclaimer.
Audit trail: Parameters, method details, and input checksums are recorded in result.json.

Integration with Orchestrator

Trigger conditions:

Automatically invoked when user intent matches co-expression, WGCNA, gene network, or hub gene keywords.

Chaining partners:

```
bulkrna-de
```
-- Upstream: differentially expressed genes can be used as input
```
bulkrna-enrichment
```
-- Downstream: pathway/GO enrichment of module gene sets
```
bulkrna-alignment
```
-- Upstream: count matrix generation from aligned reads

Parameters

Parameter	Default	Description
`--power`	auto	Soft-thresholding power (auto-selected if omitted)
`--min-module-size`	`10`	Minimum number of genes per module

Version Compatibility

Reference examples tested with: scipy 1.11+, pandas 2.0+, numpy 1.24+, matplotlib 3.7+

Dependencies

Required: numpy, pandas, scipy, matplotlib

Citations

WGCNA -- Langfelder & Horvath, BMC Bioinformatics 2008
Scale-free topology -- Zhang & Horvath, Statistical Applications in Genetics and Molecular Biology 2005
Topological Overlap Matrix -- Yip & Horvath, BMC Bioinformatics 2007

Related Skills

```
bulkrna-de
```
-- Differential expression analysis upstream
```
bulkrna-enrichment
```
-- Pathway enrichment of module gene sets downstream
```
bulkrna-alignment
```
-- Count matrix QC and generation upstream