install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-read-alignment-star-alignment" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-read-alignment-star-alignment && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-read-alignment-star-alignment" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-read-alignment-star-alignment && rm -rf "$T"
manifest:
skills/bio-read-alignment-star-alignment/SKILL.md
source content
<!--
# COPYRIGHT NOTICE
# This file is part of the "Universal Biomedical Skills" project.
# Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu>
# All Rights Reserved.
#
# This code is proprietary and confidential.
# Unauthorized copying of this file, via any medium is strictly prohibited.
#
# Provenance: Authenticated by MD BABU MIA
-->
name: bio-read-alignment-star-alignment
description: Align RNA-seq reads with STAR (Spliced Transcripts Alignment to a Reference). Supports two-pass mode for novel splice junction discovery. Use when aligning RNA-seq data requiring splice-aware alignment.
tool_type: cli
primary_tool: STAR
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
- read_file
- run_shell_command
STAR RNA-seq Alignment
Generate Genome Index
    # Basic index generation
    STAR --runMode genomeGenerate \
        --runThreadN 8 \
        --genomeDir star_index/ \
        --genomeFastaFiles reference.fa \
        --sjdbGTFfile annotation.gtf \
        --sjdbOverhang 100  # Read length - 1
Index with Specific Read Length
    # For 150bp reads, use sjdbOverhang=149
    STAR --runMode genomeGenerate \
        --runThreadN 8 \
        --genomeDir star_index_150/ \
        --genomeFastaFiles reference.fa \
        --sjdbGTFfile annotation.gtf \
        --sjdbOverhang 149
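The "read length - 1" rule can be computed from the data instead of typed by hand. The following is a minimal sketch, not part of STAR: `sjdb_overhang` is a hypothetical helper name, and it assumes four-line FASTQ records (no wrapped sequences) and that sampling the first 10000 reads is representative.

```shell
# Hypothetical helper: print (max read length - 1) for use as --sjdbOverhang.
# $1: FASTQ path (plain or .gz); $2: number of reads to sample (default 10000).
# Assumes 4-line FASTQ records, so sequence lines are lines 2, 6, 10, ...
sjdb_overhang() {
    case "$1" in
        *.gz) gzip -dc "$1" 2>/dev/null ;;
        *)    cat "$1" ;;
    esac |
        head -n "$(( ${2:-10000} * 4 ))" |
        awk 'NR % 4 == 2 && length($0) > max { max = length($0) }
             END { if (NR == 0) exit 1; print max - 1 }'
}
```

It could then feed index generation as `--sjdbOverhang "$(sjdb_overhang reads_1.fq.gz)"`.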
Basic Alignment
    # Paired-end alignment
    STAR --runThreadN 8 \
        --genomeDir star_index/ \
        --readFilesIn reads_1.fq.gz reads_2.fq.gz \
        --readFilesCommand zcat \
        --outFileNamePrefix sample_ \
        --outSAMtype BAM SortedByCoordinate
Single-End Alignment
    STAR --runThreadN 8 \
        --genomeDir star_index/ \
        --readFilesIn reads.fq.gz \
        --readFilesCommand zcat \
        --outFileNamePrefix sample_ \
        --outSAMtype BAM SortedByCoordinate
Two-Pass Mode
    # Two-pass mode for better novel junction detection
    STAR --runThreadN 8 \
        --genomeDir star_index/ \
        --readFilesIn r1.fq.gz r2.fq.gz \
        --readFilesCommand zcat \
        --outFileNamePrefix sample_ \
        --outSAMtype BAM SortedByCoordinate \
        --twopassMode Basic
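To see which junctions a run discovered beyond the annotation, the SJ.out.tab file can be filtered directly. This is a small sketch: `novel_junctions` is a hypothetical helper, and the default threshold of 3 reads is an arbitrary assumption. It relies on the SJ.out.tab layout described in the STAR manual (column 6 is the annotation flag with 0 meaning unannotated, column 7 is the number of uniquely mapping reads crossing the junction).

```shell
# Hypothetical helper: list unannotated junctions with enough unique support.
# $1: SJ.out.tab path; $2: minimum uniquely mapped reads (default 3, an assumption).
# Column 6 == 0 means "not in the annotation"; column 7 is the unique read count.
novel_junctions() {
    awk -F'\t' -v min="${2:-3}" '$6 == 0 && $7 >= min' "$1"
}
```

The filter is meant for single-pass output. As far as the STAR manual describes it, junctions found in the first pass of a two-pass run are flagged as annotated in the final SJ.out.tab, so the filter would under-report there.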
Quantification Mode
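    # Output per-gene read counts (comparable to htseq-count with default parameters).
    # Requires gene annotation: a GTF supplied at index generation or via --sjdbGTFfile.
    STAR --runThreadN 8 \
        --genomeDir star_index/ \
        --readFilesIn r1.fq.gz r2.fq.gz \
        --readFilesCommand zcat \
        --outFileNamePrefix sample_ \
        --outSAMtype BAM SortedByCoordinate \
        --quantMode GeneCounts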
Output:
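sample_ReadsPerGene.out.tab, a tab-separated file with four columns. The first four rows are summary counts (N_unmapped, N_multimapping, N_noFeature, N_ambiguous); gene rows follow.
- Column 1: Gene ID
- Column 2: Counts for unstranded libraries
- Column 3: Counts where the first read strand matches the RNA (forward-stranded; htseq-count -s yes)
- Column 4: Counts where the second read strand matches the RNA (reverse-stranded; htseq-count -s reverse)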
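Choosing among the three count columns usually comes down to comparing their totals. The sketch below uses two hypothetical helpers, `strand_totals` and `gene_counts`. It assumes the four-column layout of ReadsPerGene.out.tab and skips the four `N_` summary rows. Treat the comparison as a heuristic: if one stranded column is far larger than the other, the library is probably stranded in that direction; if both are similar, the unstranded column is likely the right one.

```shell
# Hypothetical helper: sum each count column over gene rows only.
# The N_unmapped, N_multimapping, N_noFeature and N_ambiguous rows are skipped.
strand_totals() {
    awk -F'\t' '$1 !~ /^N_(unmapped|multimapping|noFeature|ambiguous)$/ {
                    u += $2; f += $3; r += $4
                }
                END { printf "unstranded=%d forward=%d reverse=%d\n", u, f, r }' "$1"
}

# Hypothetical helper: print "gene<TAB>count" for one count column.
# $1: ReadsPerGene.out.tab path; $2: column number 2, 3 or 4 (default 2).
gene_counts() {
    awk -F'\t' -v col="${2:-2}" 'BEGIN { OFS = "\t" }
        $1 !~ /^N_(unmapped|multimapping|noFeature|ambiguous)$/ { print $1, $col }' "$1"
}
```

For a reverse-stranded library, `gene_counts sample_ReadsPerGene.out.tab 4 > sample.counts.tsv` would produce a two-column table for downstream tools.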
ENCODE Options
    # ENCODE recommended settings
    STAR --runThreadN 8 \
        --genomeDir star_index/ \
        --readFilesIn r1.fq.gz r2.fq.gz \
        --readFilesCommand zcat \
        --outFileNamePrefix sample_ \
        --outSAMtype BAM SortedByCoordinate \
        --outSAMunmapped Within \
        --outSAMattributes NH HI AS NM MD \
        --outFilterType BySJout \
        --outFilterMultimapNmax 20 \
        --outFilterMismatchNmax 999 \
        --outFilterMismatchNoverReadLmax 0.04 \
        --alignIntronMin 20 \
        --alignIntronMax 1000000 \
        --alignMatesGapMax 1000000 \
        --alignSJoverhangMin 8 \
        --alignSJDBoverhangMin 1
Fusion Detection
    # For chimeric/fusion detection
    STAR --runThreadN 8 \
        --genomeDir star_index/ \
        --readFilesIn r1.fq.gz r2.fq.gz \
        --readFilesCommand zcat \
        --outFileNamePrefix sample_ \
        --outSAMtype BAM SortedByCoordinate \
        --chimSegmentMin 12 \
        --chimJunctionOverhangMin 8 \
        --chimOutType Junctions WithinBAM SoftClip \
        --chimMainSegmentMultNmax 1
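A quick first look at fusion candidates is to count how many chimeric reads support each breakpoint pair. This sketch uses a hypothetical helper, `fusion_support`. It assumes the Chimeric.out.junction layout of one row per chimeric read, with columns 1 to 3 giving donor chromosome, breakpoint and strand, and columns 4 to 6 giving the same for the acceptor. It also assumes that any header row starts with `chr_donorA` and that comment rows start with `#`. It is no substitute for a dedicated fusion caller.

```shell
# Hypothetical helper: supporting-read count per breakpoint pair, highest first.
# Assumes one row per chimeric read; columns 1-6 identify donor and acceptor.
# Header (first field "chr_donorA") and "#" comment rows are skipped.
fusion_support() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
        /^#/ || $1 == "chr_donorA" { next }
        { n[$1 OFS $2 OFS $3 OFS $4 OFS $5 OFS $6]++ }
        END { for (k in n) print n[k], k }' "$1" |
        sort -k1,1nr -k2
}
```

Running `fusion_support sample_Chimeric.out.junction | head` would list the best-supported candidates.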
Output Files
| File | Description |
|---|---|
| *Aligned.sortedByCoord.out.bam | Sorted BAM file |
| *Log.final.out | Alignment summary statistics |
| *Log.out | Detailed log |
| *SJ.out.tab | Splice junctions |
| *ReadsPerGene.out.tab | Gene counts (only with --quantMode GeneCounts) |
| *Chimeric.out.junction | Fusion candidates (only with --chimOutType Junctions) |
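The alignment summary in Log.final.out is a natural place for an automated sanity check. The sketch below uses a hypothetical helper, `log_metric`. It assumes each metric line has the form `label |<TAB>value`, as in typical Log.final.out files, and that labels such as `Uniquely mapped reads %` are spelled exactly that way.

```shell
# Hypothetical helper: print the value of one metric from Log.final.out.
# $1: Log.final.out path; $2: exact metric label, e.g. "Uniquely mapped reads %".
# Assumes lines of the form "   <label> |<TAB><value>".
log_metric() {
    awk -F'|' -v key="$2" '
        { label = $1; gsub(/^[ \t]+|[ \t]+$/, "", label) }
        label == key {
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            print val
            exit
        }' "$1"
}
```

For example, `log_metric sample_Log.final.out "Uniquely mapped reads %"` prints the percentage string, which a wrapper script could compare against a project-specific threshold.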
Memory Requirements
    # Reduce memory for limited systems (10 GB for BAM sorting)
    STAR --genomeLoad NoSharedMemory \
        --limitBAMsortRAM 10000000000 \
        ...

    # Set the RAM available for index generation (31 GB is the default;
    # very large genomes may need more)
    STAR --runMode genomeGenerate \
        --limitGenomeGenerateRAM 31000000000 \
        ...
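Both limits are given in bytes, and the values above use decimal gigabytes (10 GB = 10000000000). A tiny conversion helper avoids miscounting zeros. `gb_to_bytes` is a hypothetical name, and the sketch assumes a shell with 64-bit integer arithmetic.

```shell
# Hypothetical helper: convert decimal gigabytes to bytes for STAR's --limit* options.
# Assumes 64-bit shell arithmetic, as in bash on common platforms.
gb_to_bytes() {
    echo "$(( $1 * 1000000000 ))"
}
```

It can be used inline, for example `--limitBAMsortRAM "$(gb_to_bytes 10)"`.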
Shared Memory Mode
    # Load genome into shared memory (for multiple samples)
    STAR --genomeLoad LoadAndExit --genomeDir star_index/

    # Run alignments (faster startup)
    STAR --genomeLoad LoadAndKeep --genomeDir star_index/ ...

    # Remove from memory when done
    STAR --genomeLoad Remove --genomeDir star_index/
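The load, align, remove sequence can be wrapped in one function so the genome is released even when a sample fails. This is a sketch under stated assumptions: `align_shared` and the `STAR_BIN` override are hypothetical, and input files are assumed to be named `<sample>_1.fq.gz` and `<sample>_2.fq.gz`. The explicit `--limitBAMsortRAM` is included because, to my understanding, STAR requires it when sorted BAM output is combined with a shared-memory genome.

```shell
# Sketch: align several paired-end samples against one genome in shared memory.
# STAR_BIN is a hypothetical override (default: STAR); STAR_BIN=echo gives a dry run.
# $1: genome index directory; remaining arguments: sample name prefixes.
align_shared() {
    star=${STAR_BIN:-STAR}
    index=$1
    shift
    "$star" --genomeLoad LoadAndExit --genomeDir "$index" || return 1
    for sample in "$@"; do
        "$star" --runThreadN 8 \
            --genomeDir "$index" \
            --genomeLoad LoadAndKeep \
            --readFilesIn "${sample}_1.fq.gz" "${sample}_2.fq.gz" \
            --readFilesCommand zcat \
            --outFileNamePrefix "${sample}_" \
            --outSAMtype BAM SortedByCoordinate \
            --limitBAMsortRAM 10000000000 ||
            echo "alignment failed: $sample" >&2
    done
    # Release the shared genome whether or not every sample succeeded.
    "$star" --genomeLoad Remove --genomeDir "$index"
}
```

A dry run such as `STAR_BIN=echo align_shared star_index/ sampleA sampleB` prints the arguments of each STAR call without running anything.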
Key Parameters
| Parameter | Default | Description |
|---|---|---|
| --runThreadN | 1 | Number of threads |
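| --sjdbOverhang | 100 | Max read length - 1 (set at index generation) |
| --outFilterMultimapNmax | 10 | Max loci a read may map to; reads mapping to more loci are not output |
| --alignIntronMax | 0 | Max intron size; 0 lets STAR derive the limit from its window parameters |
| --outFilterMismatchNmax | 10 | Max mismatches per alignment |
| --outSAMtype | SAM | Output format, e.g. BAM SortedByCoordinate |
| --quantMode | - | GeneCounts for per-gene counting |
| --twopassMode | None | Basic for two-pass mapping |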
Related Skills
- rna-quantification/featurecounts-counting - Alternative counting
- rna-quantification/alignment-free-quant - Salmon/kallisto alternative
- differential-expression/deseq2-basics - Downstream DE analysis
- read-qc/fastp-workflow - Preprocess reads