BioSkills bio-workflows-smrna-pipeline

End-to-end small RNA-seq analysis from FASTQ to differential miRNA expression. Use when analyzing miRNA, piRNA, or other small RNA sequencing data.

install
source · Clone the upstream repo
git clone https://github.com/GPTomics/bioSkills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/GPTomics/bioSkills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/workflows/smrna-pipeline" ~/.claude/skills/gptomics-bioskills-bio-workflows-smrna-pipeline && rm -rf "$T"
manifest: workflows/smrna-pipeline/SKILL.md
source content

Version Compatibility

Reference examples tested with: DESeq2 1.42+, cutadapt 4.4+

Before using code patterns, verify installed versions match. If versions differ:

  • R:
    packageVersion('<pkg>')
    then
    ?function_name
    to verify parameters
  • CLI:
    <tool> --version
    then
    <tool> --help
    to confirm flags

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

Small RNA-seq Pipeline

"Analyze my small RNA-seq data from FASTQ to differential miRNAs" → Orchestrate adapter trimming (cutadapt), miRNA quantification (miRDeep2/miRge3), novel miRNA discovery, differential expression (DESeq2), and target prediction (miRanda).

Pipeline Overview

FASTQ → cutadapt trim → miRDeep2 → Quantification → DESeq2 → Target prediction

Step 1: Preprocessing

# Adapter trimming and size selection
cutadapt -a TGGAATTCTCGGGTGCCAAGG \
    --minimum-length 18 --maximum-length 30 \
    -o trimmed.fastq.gz reads.fastq.gz

Step 2: miRDeep2 Analysis

# Align to genome
mapper.pl trimmed.fastq.gz -e -h -i -j -l 18 \
    -m -p genome_index -s reads_collapsed.fa \
    -t reads_collapsed_vs_genome.arf

# miRNA quantification and novel prediction
miRDeep2.pl reads_collapsed.fa genome.fa \
    reads_collapsed_vs_genome.arf \
    mature_ref.fa none hairpin_ref.fa

Step 3: Differential Expression

library(DESeq2)
counts <- read.csv('mirna_counts.csv', row.names = 1)
dds <- DESeqDataSetFromMatrix(counts, colData, ~condition)
dds <- DESeq(dds)
results <- results(dds)

Step 4: Target Prediction

# miRanda for target prediction
miranda mature_mirnas.fa target_3utrs.fa -out targets.txt

QC Checkpoints

  1. After trimming: Size distribution should peak at 21-23nt
  2. After alignment: >70% mapping rate expected
  3. After DE: Check volcano plot and PCA

Related Skills

  • small-rna-seq/mirdeep2-analysis - Detailed miRDeep2
  • small-rna-seq/differential-mirna - DE analysis
  • small-rna-seq/target-prediction - Target analysis
bio-workflows-smrna-pipeline — OpenSkillIndex