OpenClaw-Medical-Skills bio-proteomics-quantification
Protein quantification from mass spectrometry data including label-free (LFQ, intensity-based), isobaric labeling (TMT, iTRAQ), and metabolic labeling (SILAC) approaches. Use when extracting protein abundances from MS data for differential analysis.
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-proteomics-quantification" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-proteomics-quantification && rm -rf "$T"
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-proteomics-quantification" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-proteomics-quantification && rm -rf "$T"
skills/bio-proteomics-quantification/SKILL.md
Version Compatibility
Reference examples tested with: MSnbase 2.28+, numpy 1.26+, pandas 2.2+
Before using code patterns, verify installed versions match. If versions differ:
- Python: pip show <package>, then help(module.function) to check signatures
- R: packageVersion('<pkg>'), then ?function_name to verify parameters
If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
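The Python side of this check can be scripted. The following is a minimal sketch using only the standard library; the helper name report_versions is hypothetical and not part of the skill.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return {package name: installed version string, or None if missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in report_versions(["numpy", "pandas"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

Compare the reported versions against the ones listed above before adapting any example.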
Protein Quantification
"Quantify proteins from my mass spec data" → Extract protein abundances from MS data using label-free (LFQ, spectral counting), isobaric labeling (TMT, iTRAQ), or metabolic labeling (SILAC) approaches.
- R: MSstats::dataProcess() for feature-to-protein summarization
- Python: pandas for MaxLFQ-style normalization and ratio calculation
- R: MSnbase for isobaric tag reporter ion extraction
Label-Free Quantification (LFQ)
Intensity-Based (MaxLFQ Algorithm)
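import numpy as np
import pandas as pd


def maxlfq_normalize(intensities):
    '''Simplified MaxLFQ-style normalization: log2 transform, then median-center each sample.

    intensities: DataFrame with proteins as rows and samples as columns.
    '''
    # Treat zeros as missing so they do not become -inf after the log transform
    log_int = np.log2(intensities.replace(0, np.nan))
    # Median centering per sample
    sample_medians = log_int.median(axis=0)
    global_median = sample_medians.median()
    normalized = log_int - sample_medians + global_median
    return normalized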
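The effect of the median centering above can be checked on a toy matrix. This is a sketch with made-up intensities (the protein and sample names are hypothetical). It illustrates only the per-sample centering step, not the pairwise peptide-ratio procedure of the full MaxLFQ algorithm.

```python
import numpy as np
import pandas as pd

# Hypothetical toy matrix: rows are proteins, columns are samples.
# Sample s2 was loaded at twice the amount of s1; s3 has one missing value (0).
intensities = pd.DataFrame(
    {
        "s1": [100.0, 200.0, 400.0],
        "s2": [200.0, 400.0, 800.0],
        "s3": [50.0, 0.0, 200.0],
    },
    index=["P1", "P2", "P3"],
)

# Same steps as the function above: log2, then shift each sample to a shared median
log_int = np.log2(intensities.replace(0, np.nan))
sample_medians = log_int.median(axis=0)
normalized = log_int - sample_medians + sample_medians.median()

# After centering, every sample has the same median log2 intensity
centered_medians = normalized.median(axis=0)
print(centered_medians)
```

The zero in s3 stays missing (NaN) after normalization rather than being treated as a measured low intensity.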
Spectral Counting
def spectral_count_normalize(counts, protein_lengths):
    '''Normalized spectral abundance factor (NSAF)'''
    # Divide each count by protein length (SAF), then by the sum over all proteins
    saf = counts / protein_lengths
    return saf / saf.sum()
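A short worked example may make the NSAF arithmetic concrete: each spectral count is divided by protein length, and the results are rescaled to sum to 1. The counts, lengths, and the function name nsaf below are made up for illustration.

```python
import pandas as pd


def nsaf(counts, lengths):
    """Spectral count per residue (SAF), rescaled so all proteins sum to 1."""
    saf = counts / lengths
    return saf / saf.sum()


# Hypothetical spectral counts and protein lengths (in residues)
counts = pd.Series({"P1": 10, "P2": 20, "P3": 30})
lengths = pd.Series({"P1": 100, "P2": 400, "P3": 300})

values = nsaf(counts, lengths)
print(values)
```

Here P2 has twice the spectral count of P1 but a lower NSAF, because it is four times longer.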
TMT/iTRAQ Quantification
library(MSnbase)

# Load reporter ion data
tmt_data <- readMSnSet('tmt_data.txt')

# Center each channel on its median intensity
tmt_normalized <- normalize(tmt_data, method = 'center.median')

# Summarize to protein level (the 'fun' argument is deprecated in favor of 'method')
protein_data <- combineFeatures(tmt_normalized,
                                groupBy = fData(tmt_normalized)$protein,
                                method = 'median')
Python TMT Processing
import numpy as np


def extract_tmt_intensities(spectrum, reporter_mz, tolerance=0.003):
    '''Extract TMT reporter ion intensities'''
    # spectrum must provide get_peaks() returning (m/z array, intensity array)
    mz, intensity = spectrum.get_peaks()
    tmt_intensities = {}
    for channel, target_mz in reporter_mz.items():
        # Peaks within +/- tolerance (Da) of the reporter m/z
        mask = np.abs(mz - target_mz) < tolerance
        if mask.any():
            tmt_intensities[channel] = intensity[mask].max()
        else:
            tmt_intensities[channel] = 0
    return tmt_intensities


TMT_10PLEX = {
    '126': 126.127726,
    '127N': 127.124761, '127C': 127.131081,
    '128N': 128.128116, '128C': 128.134436,
    '129N': 129.131471, '129C': 129.137790,
    '130N': 130.134825, '130C': 130.141145,
    '131': 131.138180,
}
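The N and C variants of a channel sit only about 0.006 Da apart, which is why the default tolerance is so tight. The sketch below repeats the matching logic on plain NumPy arrays with synthetic peaks (the function name match_reporters and all peak values are hypothetical) to show what happens when the tolerance is too wide.

```python
import numpy as np


def match_reporters(mz, intensity, reporter_mz, tolerance=0.003):
    """Return the most intense peak within +/- tolerance (Da) of each reporter m/z."""
    matched = {}
    for channel, target in reporter_mz.items():
        mask = np.abs(mz - target) < tolerance
        matched[channel] = float(intensity[mask].max()) if mask.any() else 0.0
    return matched


# Two neighbouring channels from the 10-plex table above
channels = {"127N": 127.124761, "127C": 127.131081}

# Synthetic spectrum: one peak per channel plus an unrelated fragment
mz = np.array([127.1248, 127.1310, 300.2])
intensity = np.array([1000.0, 2500.0, 9999.0])

tight = match_reporters(mz, intensity, channels, tolerance=0.003)
loose = match_reporters(mz, intensity, channels, tolerance=0.01)
print(tight)  # each channel picks up its own peak
print(loose)  # 127N wrongly picks up the stronger 127C peak
```

With the wider window both channels report the 127C intensity, so a tolerance well below the N/C spacing is needed to keep the channels separate.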
SILAC Quantification
import numpy as np


def calculate_silac_ratio(heavy_intensity, light_intensity):
    '''Calculate SILAC H/L ratio (log2)'''
    if light_intensity > 0 and heavy_intensity > 0:
        return np.log2(heavy_intensity / light_intensity)
    return np.nan


# Typical mass shifts (Da)
SILAC_SHIFTS = {
    'Arg10': 10.008269,  # 13C6 15N4 Arginine
    'Lys8': 8.014199,    # 13C6 15N2 Lysine
    'Arg6': 6.020129,    # 13C6 Arginine
    'Lys6': 6.020129     # 13C6 Lysine
}
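For a whole table of heavy/light pairs, the same rule can be applied column-wise. The following is a sketch assuming a pandas DataFrame with hypothetical heavy and light columns; the function name silac_log2_ratios and the peptide names are illustrative.

```python
import numpy as np
import pandas as pd


def silac_log2_ratios(heavy, light):
    """Vectorized log2(H/L); pairs with a non-positive intensity become NaN."""
    heavy = heavy.where(heavy > 0)
    light = light.where(light > 0)
    return np.log2(heavy / light)


# Hypothetical peptide-level intensity pairs
pairs = pd.DataFrame(
    {"heavy": [2000.0, 500.0, 0.0], "light": [1000.0, 1000.0, 300.0]},
    index=["pep1", "pep2", "pep3"],
)

ratios = silac_log2_ratios(pairs["heavy"], pairs["light"])
print(ratios)
```

A positive value means the peptide is more abundant in the heavy-labeled condition, a negative value the reverse. The missing heavy signal for pep3 yields NaN rather than an infinite ratio.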
MSstats Workflow (R)
Goal: Convert MaxQuant output into normalized protein-level abundance estimates using MSstats feature-to-protein summarization.
Approach: Reformat MaxQuant evidence and proteinGroups files into MSstats input format, then apply median equalization normalization with Tukey's median polish for protein-level summarization.
library(MSstats)

# Prepare input from MaxQuant
maxquant_input <- MaxQtoMSstatsFormat(
    evidence = read.table('evidence.txt', sep = '\t', header = TRUE),
    proteinGroups = read.table('proteinGroups.txt', sep = '\t', header = TRUE),
    annotation = read.csv('annotation.csv')
)

# Process and normalize
processed <- dataProcess(maxquant_input,
                         normalization = 'equalizeMedians',
                         summaryMethod = 'TMP',
                         censoredInt = 'NA')

# Protein-level summary
protein_summary <- quantification(processed)
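To show what Tukey's median polish does during summarization, here is a NumPy sketch for a single protein. It is an illustration under simplifying assumptions (no all-missing rows or columns, a simple stopping rule), not the MSstats implementation; the function name median_polish and the toy data are hypothetical.

```python
import numpy as np


def median_polish(x, max_iter=10, tol=1e-6):
    """Tukey's median polish on a features x runs matrix of log intensities.

    Returns (overall, row_effects, col_effects, residuals). NaNs are ignored.
    """
    resid = np.array(x, dtype=float)
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(max_iter):
        # Sweep out row (feature) medians
        row_med = np.nanmedian(resid, axis=1)
        resid -= row_med[:, None]
        row += row_med
        delta = np.nanmedian(col)
        col -= delta
        overall += delta
        # Sweep out column (run) medians
        col_med = np.nanmedian(resid, axis=0)
        resid -= col_med[None, :]
        col += col_med
        delta = np.nanmedian(row)
        row -= delta
        overall += delta
        # Stop once a full sweep no longer changes anything meaningful
        if np.abs(row_med).max() < tol and np.abs(col_med).max() < tol:
            break
    return overall, row, col, resid


# Toy protein: 3 features (rows) x 3 runs (columns), purely additive effects
feature_effect = np.array([-1.0, 0.0, 1.0])
run_effect = np.array([0.0, 1.0, 3.0])
log_intensity = 20.0 + feature_effect[:, None] + run_effect[None, :]

overall, row, col, resid = median_polish(log_intensity)
run_abundance = overall + col  # one summarized value per run
print(run_abundance)
```

The per-run value is the overall level plus the run effect, so feature-specific offsets such as ionization efficiency do not distort the protein-level estimate.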
Related Skills
- data-import - Load MS data before quantification
- differential-abundance - Statistical testing after quantification
- expression-matrix/counts-ingest - Similar matrix handling