LLMs-Universal-Life-Science-and-Clinical-Skills- R_Programming
R Programming for Biomedical Science
install
source · Clone the upstream repo
git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Software_Engineering/R_Programming" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-r-programming && rm -rf "$T"
manifest: Skills/Software_Engineering/R_Programming/SKILL.md
Overview
R is a premier language for statistical computing and graphics, serving as the backbone for much of modern bioinformatics, particularly through the Bioconductor project. Mastery of R is essential for genomic data analysis, rigorous statistical testing, and publication-quality visualization.
Core Competencies
1. Base R & Functional Programming
- Data Structures: Vectors, Lists, Data Frames, Matrices, Factors.
- Control Flow: Apply family (lapply, sapply, tapply) vs. loops.
- S3/S4 Classes: Understanding R's object-oriented systems, critical for Bioconductor packages.
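The contrast between loops and the apply family, and the idea behind S3 dispatch, can be sketched in base R as follows. The gene names, values, and the bio_sample class are made up for illustration and are not part of the skill itself.

```r
# Toy expression values for three hypothetical genes (illustrative data only).
expr <- list(
  geneA = c(2.1, 2.4, 1.9),
  geneB = c(5.0, 5.3, 4.8),
  geneC = c(0.2, 0.1, 0.4)
)

# Explicit loop: preallocate the result, then fill one element at a time.
means_loop <- numeric(length(expr))
names(means_loop) <- names(expr)
for (g in names(expr)) {
  means_loop[g] <- mean(expr[[g]])
}

# sapply: same computation, simplified to a named numeric vector.
means_sapply <- sapply(expr, mean)

# lapply: always returns a list with one element per input.
ranges_list <- lapply(expr, range)

# tapply: summarise a vector within groups defined by a factor.
values <- c(1.0, 2.0, 3.0, 4.0)
group <- factor(c("control", "control", "treated", "treated"))
group_means <- tapply(values, group, mean)

# Minimal S3 class (hypothetical): a constructor plus a print method
# that R selects based on the object's class attribute.
new_sample <- function(id, counts) {
  structure(list(id = id, counts = counts), class = "bio_sample")
}
print.bio_sample <- function(x, ...) {
  cat("Sample", x$id, "with", length(x$counts), "counts\n")
  invisible(x)
}
s <- new_sample("S1", c(10L, 0L, 25L))
print(s)
```

S4 classes, which most Bioconductor containers use, add formal slot definitions and validity checks on top of this dispatch idea.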
2. Tidyverse (Modern Data Science)
- dplyr: Data manipulation (filter, select, mutate, group_by, summarize).
- tidyr: Tidy data principles (pivot_longer, pivot_wider).
- readr: Efficient data import.
- purrr: Functional programming tools.
- magrittr: Pipe operator (%>%); base R 4.1+ also provides a native pipe (|>).
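As a rough illustration of how the dplyr and tidyr verbs listed above chain together, the sketch below reshapes a small wide table and summarizes it by group. The table contents and column names are invented, and the native pipe assumes R 4.1 or later (%>% would work the same way here).

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per gene, one column per sample.
wide <- tibble(
  gene   = c("geneA", "geneB"),
  ctrl_1 = c(10, 200),
  ctrl_2 = c(12, 180),
  trt_1  = c(30, 190),
  trt_2  = c(28, 210)
)

# Reshape to long (tidy) form: one row per gene-sample pair,
# then derive the condition from the sample name prefix.
long <- wide |>
  pivot_longer(cols = -gene, names_to = "sample_id", values_to = "count") |>
  mutate(condition = sub("_.*$", "", sample_id))

# Filter, group, and summarize to a mean count per gene and condition.
summary_tbl <- long |>
  filter(count > 0) |>
  group_by(gene, condition) |>
  summarize(mean_count = mean(count), .groups = "drop")

print(summary_tbl)
```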
3. Data Visualization
- ggplot2: Grammar of graphics, layering, themes, and scales.
- ComplexHeatmap: Advanced heatmaps for omics data.
- Patchwork/Cowplot: Composing multi-panel figures.
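The layering idea behind ggplot2 (data, aesthetic mappings, geoms, scales, theme) might look like this in practice. The data are simulated and the styling choices are only one reasonable option, not a prescribed house style.

```r
library(ggplot2)

# Simulated expression values for two conditions (illustrative only).
set.seed(1)
df <- data.frame(
  condition  = rep(c("control", "treated"), each = 20),
  expression = c(rnorm(20, mean = 5), rnorm(20, mean = 6))
)

# Build the plot layer by layer: mappings, geoms, scale, labels, theme.
p <- ggplot(df, aes(x = condition, y = expression, fill = condition)) +
  geom_boxplot(outlier.shape = NA) +        # summary layer
  geom_jitter(width = 0.15, alpha = 0.6) +  # raw points on top
  scale_fill_brewer(palette = "Set2") +
  labs(x = NULL, y = "Expression (arbitrary units)") +
  theme_classic() +
  theme(legend.position = "none")

# Saving is left commented out; the file name is a placeholder.
# ggsave("expression_boxplot.pdf", p, width = 3, height = 3)
```

Plots built this way are ordinary objects, which is what lets patchwork or cowplot combine several of them into one multi-panel figure.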
4. Bioconductor Ecosystem
- Core Classes: SummarizedExperiment, SingleCellExperiment, GRanges.
- Package Management: BiocManager.
- Workflow: Integration with downstream tools (DESeq2, edgeR, limma).
5. Statistics
- Hypothesis Testing: t-tests, ANOVA, linear models (lm, glm).
- Dimensionality Reduction: PCA, t-SNE, UMAP.
- Correction: False Discovery Rate (FDR), Benjamini-Hochberg.
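The pieces above can be combined in a small base R sketch: per-gene t-tests, Benjamini-Hochberg adjustment, a linear model, and PCA. All data are simulated, and a plain Welch t-test per gene stands in for the moderated statistics that dedicated packages such as limma or DESeq2 would use on real data.

```r
set.seed(42)
n_genes <- 100

# Simulated expression: 5 control and 5 treated samples per gene.
control <- matrix(rnorm(n_genes * 5, mean = 5), nrow = n_genes)
treated <- matrix(rnorm(n_genes * 5, mean = 5), nrow = n_genes)
treated[1:10, ] <- treated[1:10, ] + 3  # simulated shift in the first 10 genes

# One Welch t-test per gene, collecting the p-values.
p_values <- vapply(
  seq_len(n_genes),
  function(i) t.test(control[i, ], treated[i, ])$p.value,
  numeric(1)
)

# Benjamini-Hochberg adjustment controls the false discovery rate.
fdr <- p.adjust(p_values, method = "BH")

# The same two-group comparison for one gene, expressed as a linear model.
df <- data.frame(
  expression = c(control[1, ], treated[1, ]),
  condition  = factor(rep(c("control", "treated"), each = 5))
)
fit <- lm(expression ~ condition, data = df)

# PCA with samples as rows; report the share of variance per component.
pca <- prcomp(t(cbind(control, treated)), center = TRUE, scale. = TRUE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
```

Adjusted values are never smaller than the raw p-values, so filtering on fdr is always at least as strict as filtering on p_values at the same threshold.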
6. Package Development
- Structure: DESCRIPTION, NAMESPACE, R/ directory.
- Documentation: roxygen2.
- Testing: testthat.
- Check: R CMD check compliance.
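A documented function and its unit test might look like the sketch below. The function normalize_cpm and the file paths in the comments are hypothetical, chosen only to show where roxygen2 comments and testthat tests usually live in a package.

```r
# Would live in R/normalize_cpm.R of a hypothetical package.

#' Convert raw counts to counts per million
#'
#' @param counts Numeric matrix of raw counts, genes in rows, samples in columns.
#' @return A matrix of the same dimensions in which each column sums to 1e6.
#' @export
normalize_cpm <- function(counts) {
  stopifnot(is.matrix(counts), is.numeric(counts))
  lib_sizes <- colSums(counts)
  if (any(lib_sizes == 0)) {
    stop("every sample needs at least one count")
  }
  # Divide each column by its library size, then scale to one million.
  sweep(counts, 2, lib_sizes, "/") * 1e6
}

# Would live in tests/testthat/test-normalize_cpm.R.
library(testthat)

test_that("normalize_cpm scales each sample to one million", {
  m <- matrix(c(1, 3, 2, 2), nrow = 2)
  expect_equal(colSums(normalize_cpm(m)), c(1e6, 1e6))
  expect_error(normalize_cpm(matrix(0, 2, 2)))
})
```

In a real package, roxygen2 turns the #' comments into the help page and the NAMESPACE export, and R CMD check runs the testthat suite along with its other checks.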
Learning Path
- Novice: Master dplyr verbs and ggplot2 basics. Understand factors.
- Intermediate: Learn to use Bioconductor classes (SummarizedExperiment). Perform differential expression analysis.
- Advanced: Build reusable R packages. Optimize code with Rcpp. Create interactive Shiny apps.