OpenClaw-Medical-Skills tooluniverse-image-analysis

Production-ready microscopy image analysis and quantitative imaging data skill for colony morphometry, cell counting, fluorescence quantification, and statistical analysis of imaging-derived measurements. Processes ImageJ/CellProfiler output (area, circularity, intensity, cell counts), performs Dunnett's test, Cohen's d effect size, power analysis, Shapiro-Wilk normality tests, two-way ANOVA, polynomial regression, natural spline regression with confidence intervals, and comparative morphometry. Supports CSV/TSV measurement tables, multi-channel fluorescence data, colony swarming assays, and neuron counting datasets. Use when analyzing microscopy measurement data, colony area/circularity, cell count statistics, swarming assays, co-culture ratio optimization, or answering questions about imaging-derived quantitative data.

install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/tooluniverse-image-analysis" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-tooluniverse-image-analysis && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/tooluniverse-image-analysis" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-tooluniverse-image-analysis && rm -rf "$T"
manifest: skills/tooluniverse-image-analysis/SKILL.md
safety · automated scan (low risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
  • pip install
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content

Microscopy Image Analysis and Quantitative Imaging Data

Production-ready skill for analyzing microscopy-derived measurement data using pandas, numpy, scipy, statsmodels, and scikit-image. Designed for BixBench imaging questions covering colony morphometry, cell counting, fluorescence quantification, regression modeling, and statistical comparisons.

IMPORTANT: This skill handles complex multi-workflow analysis. Most implementation details have been moved to references/ for progressive disclosure; this document focuses on high-level decision-making and workflow orchestration.


When to Use This Skill

Apply when users:

  • Have microscopy measurement data (area, circularity, intensity, cell counts) in CSV/TSV
  • Ask about colony morphometry (bacterial swarming, biofilm, growth assays)
  • Need statistical comparisons of imaging measurements (t-test, ANOVA, Dunnett's, Mann-Whitney)
  • Ask about cell counting statistics (NeuN, DAPI, marker counts)
  • Need effect size calculations (Cohen's d) and power analysis
  • Want regression models (polynomial, spline) fitted to dose-response or ratio data
  • Ask about model comparison (R-squared, F-statistic, AIC/BIC)
  • Need Shapiro-Wilk normality testing on imaging data
  • Want confidence intervals for peak predictions from fitted models
  • Questions mention imaging software output (ImageJ, CellProfiler, QuPath)
  • Need fluorescence intensity quantification or colocalization analysis
  • Ask about image segmentation results (counts, areas, shapes)

BixBench Coverage: 21 questions across 4 projects (bix-18, bix-19, bix-41, bix-54)

NOT for (use other skills instead):

  • Phylogenetic analysis → Use tooluniverse-phylogenetics
  • RNA-seq differential expression → Use tooluniverse-rnaseq-deseq2
  • Single-cell scRNA-seq → Use tooluniverse-single-cell
  • Statistical regression only (no imaging context) → Use tooluniverse-statistical-modeling

Core Principles

  1. Data-first approach - Load and inspect all CSV/TSV measurement data before analysis
  2. Question-driven - Parse the exact statistic, comparison, or model requested
  3. Statistical rigor - Proper effect sizes, multiple comparison corrections, model selection
  4. Imaging-aware - Understand ImageJ/CellProfiler measurement columns (Area, Circularity, Round, Intensity)
  5. Workflow flexibility - Support both pre-quantified data (CSV) and raw image processing
  6. Precision - Match expected answer format (integer, range, decimal places)
  7. Reproducible - Use standard Python/scipy equivalents to R functions

Required Python Packages

# Core (MUST be installed)
import pandas as pd
import numpy as np
from scipy import stats
from scipy.interpolate import BSpline, make_interp_spline
import statsmodels.api as sm
from statsmodels.formula.api import ols
from statsmodels.stats.power import TTestIndPower
from patsy import dmatrix, bs, cr

# Optional (for raw image processing)
import skimage
import cv2
import tifffile

Installation:

pip install pandas numpy scipy statsmodels patsy scikit-image opencv-python-headless tifffile

High-Level Workflow Decision Tree

START: User question about microscopy data
│
├─ Q1: What type of data is available?
│  │
│  ├─ PRE-QUANTIFIED DATA (CSV/TSV with measurements)
│  │  └─ Workflow: Load → Parse question → Statistical analysis
│  │     Pattern: Most common BixBench pattern (bix-18, bix-19, bix-41, bix-54)
│  │     See: Section "Quantitative Data Analysis" below
│  │
│  └─ RAW IMAGES (TIFF, PNG, multi-channel)
│     └─ Workflow: Load → Segment → Measure → Analyze
│        See: references/image_processing.md
│
├─ Q2: What type of analysis is needed?
│  │
│  ├─ STATISTICAL COMPARISON
│  │  ├─ Two groups → t-test or Mann-Whitney
│  │  ├─ Multiple groups → ANOVA or Dunnett's test
│  │  ├─ Two factors → Two-way ANOVA
│  │  └─ Effect size → Cohen's d, power analysis
│  │  See: references/statistical_analysis.md
│  │
│  ├─ REGRESSION MODELING
│  │  ├─ Dose-response → Polynomial (quadratic, cubic)
│  │  ├─ Ratio optimization → Natural spline
│  │  └─ Model comparison → R-squared, F-statistic, AIC/BIC
│  │  See: references/statistical_analysis.md
│  │
│  ├─ CELL COUNTING
│  │  ├─ Fluorescence (DAPI, NeuN) → Threshold + watershed
│  │  ├─ Brightfield → Adaptive threshold
│  │  └─ High-density → CellPose or StarDist (external)
│  │  See: references/cell_counting.md
│  │
│  ├─ COLONY SEGMENTATION
│  │  ├─ Swarming assays → Otsu threshold + morphology
│  │  ├─ Biofilms → Li threshold + fill holes
│  │  └─ Growth assays → Time-lapse tracking
│  │  See: references/segmentation.md
│  │
│  └─ FLUORESCENCE QUANTIFICATION
│     ├─ Intensity measurement → regionprops
│     ├─ Colocalization → Pearson/Manders
│     └─ Multi-channel → Channel-wise quantification
│     See: references/fluorescence_analysis.md
│
└─ Q3: When to use scikit-image vs OpenCV?
   ├─ scikit-image: Scientific analysis, measurements, regionprops
   ├─ OpenCV: Fast processing, real-time, large batches
   └─ Both: Often interchangeable for basic operations
   See: references/image_processing.md "Library Selection Guide"

Quantitative Data Analysis Workflow

Phase 0: Question Parsing and Data Discovery

CRITICAL FIRST STEP: Before writing ANY code, identify what data files are available and what the question is asking for.

import os, glob, pandas as pd

# Discover data files
data_dir = "."
csv_files = glob.glob(os.path.join(data_dir, '**', '*.csv'), recursive=True)
tsv_files = glob.glob(os.path.join(data_dir, '**', '*.tsv'), recursive=True)
img_files = glob.glob(os.path.join(data_dir, '**', '*.tif*'), recursive=True)

# Load and inspect first measurement file
if csv_files:
    df = pd.read_csv(csv_files[0])
    print(f"Shape: {df.shape}")
    print(f"Columns: {list(df.columns)}")
    print(df.head())
    print(df.describe())

Common Column Names:

  • Area: Colony or cell area in pixels or calibrated units
  • Circularity: 4π·area/perimeter², range [0, 1], 1.0 = perfect circle
  • Round: Roundness = 4·area/(π·major_axis²)
  • Genotype/Strain: Biological grouping variable
  • Ratio: Co-culture mixing ratio (e.g., "1:3", "5:1")
  • NeuN/DAPI/GFP: Cell marker counts or intensities
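These shape definitions translate directly to pandas. A minimal sketch on hypothetical ImageJ-style columns (the names `Area`, `Perim.`, and `Major` and all values here are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical ImageJ-style measurements (column names and values are illustrative)
df = pd.DataFrame({
    "Area":   [400.0, 900.0],   # object area in pixels
    "Perim.": [75.0, 110.0],    # object perimeter
    "Major":  [24.0, 36.0],     # major axis length
})

# Circularity = 4*pi*area / perimeter^2  (1.0 = perfect circle)
df["Circularity"] = 4 * np.pi * df["Area"] / df["Perim."] ** 2

# Roundness = 4*area / (pi * major_axis^2)
df["Round"] = 4 * df["Area"] / (np.pi * df["Major"] ** 2)

print(df[["Circularity", "Round"]].round(3))
```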

Phase 1: Grouped Statistics

import numpy as np

def grouped_summary(df, group_cols, measure_col):
    """Calculate summary statistics by group."""
    summary = df.groupby(group_cols)[measure_col].agg(
        Mean='mean',
        SD='std',
        Median='median',
        Min='min',
        Max='max',
        N='count'
    ).reset_index()
    summary['SEM'] = summary['SD'] / np.sqrt(summary['N'])
    return summary

# Example: Colony morphometry by genotype
area_summary = grouped_summary(df, 'Genotype', 'Area')
circ_summary = grouped_summary(df, 'Genotype', 'Circularity')

For detailed statistical functions, see: references/statistical_analysis.md

Phase 2: Statistical Testing

Decision guide:

  • Normality test needed? → Shapiro-Wilk
  • Two groups comparison? → t-test or Mann-Whitney
  • Multiple groups vs control? → Dunnett's test
  • Multiple groups, all comparisons? → Tukey HSD
  • Two factors? → Two-way ANOVA
  • Effect size? → Cohen's d
  • Sample size planning? → Power analysis

See: references/statistical_analysis.md for complete implementations
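The two-group branch of this decision guide can be sketched as a normality-gated test choice (the data here is synthetic and the α = 0.05 gate is illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic area measurements for two hypothetical genotypes
wt  = rng.normal(100, 10, 30)
mut = rng.normal(115, 10, 30)

# Shapiro-Wilk on each group; fall back to Mann-Whitney if either looks non-normal
if stats.shapiro(wt).pvalue > 0.05 and stats.shapiro(mut).pvalue > 0.05:
    stat, p = stats.ttest_ind(wt, mut)  # Student's t; equal_var=False for Welch
    test = "t-test"
else:
    stat, p = stats.mannwhitneyu(wt, mut)
    test = "Mann-Whitney U"
print(f"{test}: statistic={stat:.3f}, p={p:.4g}")
```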

Phase 3: Regression Modeling

When to use each model:

  • Polynomial (quadratic/cubic): Smooth dose-response, clear peak
  • Natural spline: Flexible, non-parametric, handles complex patterns
  • Linear: Simple relationships, checking for trends

Model comparison metrics:

  • R-squared: Overall fit (higher = better)
  • Adjusted R-squared: Penalizes complexity
  • F-statistic p-value: Model significance
  • AIC/BIC: Compare non-nested models

See: references/statistical_analysis.md for complete implementations
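A minimal comparison of a quadratic polynomial against a natural spline on synthetic dose-response data, using the metrics listed above (`cr` is patsy's natural cubic spline, available inside statsmodels formulas; the data is illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
# Synthetic dose-response with a peak at x = 0.5 plus noise
x = np.linspace(0, 1, 40)
y = 5 * x - 5 * x ** 2 + rng.normal(0, 0.1, x.size)
df = pd.DataFrame({"x": x, "y": y})

# Quadratic polynomial vs natural cubic spline, compared by R^2 / adj R^2 / AIC
quad   = smf.ols("y ~ x + I(x**2)", data=df).fit()
spline = smf.ols("y ~ cr(x, df=4)", data=df).fit()

for name, m in [("quadratic", quad), ("spline df=4", spline)]:
    print(f"{name}: R2={m.rsquared:.3f}, adjR2={m.rsquared_adj:.3f}, AIC={m.aic:.1f}")
```

Lower AIC wins when the models are non-nested; for nested models an F-test on the extra terms is the usual route.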


Raw Image Processing Workflow

When Processing Raw Images

Workflow: Load → Preprocess → Segment → Measure → Export

# Quick start for cell counting
from scripts.segment_cells import count_cells_in_image

result = count_cells_in_image(
    image_path="cells.tif",
    channel=0,  # DAPI channel
    min_area=50
)
print(f"Found {result['count']} cells")

Segmentation Method Selection

Decision guide:

| Cell Type     | Density        | Best Method        | Notes                       |
|---------------|----------------|--------------------|-----------------------------|
| Nuclei (DAPI) | Low-Medium     | Otsu + watershed   | Standard approach           |
| Nuclei (DAPI) | High           | CellPose/StarDist  | Handles touching            |
| Colonies      | Well-separated | Otsu threshold     | Fast, reliable              |
| Colonies      | Touching       | Watershed          | Edge detection              |
| Cells (phase) | Any            | Adaptive threshold | Handles uneven illumination |
| Fluorescence  | Low signal     | Li threshold       | More sensitive              |

See: references/segmentation.md and references/cell_counting.md for detailed protocols
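The standard Otsu → distance transform → watershed route can be sketched end-to-end on a synthetic image (three Gaussian blobs standing in for DAPI nuclei; thresholds and sizes are illustrative):

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import threshold_otsu
from skimage.segmentation import watershed
from skimage.feature import peak_local_max

# Synthetic "nuclei": three Gaussian blobs on a dark background
img = np.zeros((100, 100))
rr, cc = np.mgrid[0:100, 0:100]
for r, c in [(25, 25), (50, 60), (75, 30)]:
    img += np.exp(-((rr - r) ** 2 + (cc - c) ** 2) / 40)

# Otsu threshold -> distance transform -> watershed to split touching objects
mask = img > threshold_otsu(img)
dist = ndi.distance_transform_edt(mask)
coords = peak_local_max(dist, labels=mask, min_distance=5)
markers = np.zeros_like(mask, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
labels = watershed(-dist, markers, mask=mask)
print("count:", labels.max())
```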

Library Selection: scikit-image vs OpenCV

Use scikit-image when:

  • Scientific measurements needed (area, perimeter, intensity)
  • regionprops for object properties
  • Publication-quality analysis
  • Easier syntax for scientists

Use OpenCV when:

  • Processing large image batches
  • Speed is critical
  • Real-time processing
  • Advanced computer vision features

Both work for:

  • Thresholding, filtering, morphological operations
  • Basic image transformations
  • Most segmentation tasks

See: references/image_processing.md "Library Selection Guide"


Common BixBench Patterns

Pattern 1: Colony Morphometry (bix-18)

Question type: "Mean circularity of genotype with largest area?"

Data: CSV with Genotype, Area, Circularity columns

Workflow:

  1. Load CSV → group by Genotype
  2. Calculate mean Area per genotype
  3. Identify genotype with max mean Area
  4. Report mean Circularity for that genotype

See: references/segmentation.md "Colony Morphometry Analysis"
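The four steps above, as a minimal pandas sketch (genotype names and measurements are hypothetical):

```python
import pandas as pd

# Hypothetical colony table with the Genotype/Area/Circularity columns from this pattern
df = pd.DataFrame({
    "Genotype":    ["WT", "WT", "mutA", "mutA", "mutB", "mutB"],
    "Area":        [100.0, 120.0, 300.0, 280.0, 150.0, 160.0],
    "Circularity": [0.90, 0.88, 0.55, 0.60, 0.80, 0.82],
})

means = df.groupby("Genotype")[["Area", "Circularity"]].mean()
largest = means["Area"].idxmax()            # genotype with largest mean area
answer = means.loc[largest, "Circularity"]  # its mean circularity
print(largest, round(answer, 3))
```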

Pattern 2: Cell Counting Statistics (bix-19)

Question type: "Cohen's d for NeuN counts between conditions?"

Data: CSV with Condition, NeuN_count, Sex, Hemisphere columns

Workflow:

  1. Load CSV → filter by hemisphere/sex if needed
  2. Split by Condition (KD vs CTRL)
  3. Calculate Cohen's d with pooled SD
  4. Report effect size

See: references/statistical_analysis.md "Effect Size Calculations"
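A sketch of the pooled-SD Cohen's d from step 3 (the NeuN counts are illustrative):

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d with pooled standard deviation (ddof=1 within each group)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    pooled_sd = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1))
                        / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled_sd

# Hypothetical NeuN counts per animal, KD vs CTRL
kd   = [52, 48, 50, 47, 49]
ctrl = [60, 58, 62, 59, 61]
print(f"Cohen's d = {cohens_d(kd, ctrl):.3f}")
```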

Pattern 3: Multi-Group Comparison (bix-41)

Question type: "Dunnett's test: How many ratios equivalent to control?"

Data: CSV with multiple co-culture ratios, Area, Circularity

Workflow:

  1. Create Strain_Ratio labels
  2. Run Dunnett's test for Area (vs control)
  3. Run Dunnett's test for Circularity (vs control)
  4. Count groups NOT significant in BOTH tests

See: references/statistical_analysis.md "Dunnett's Test"

Pattern 4: Regression Optimization (bix-54)

Question type: "Peak frequency from natural spline model?"

Data: CSV with co-culture frequencies and Area measurements

Workflow:

  1. Convert ratio strings to frequencies
  2. Fit natural spline model (df=4)
  3. Find peak via grid search
  4. Report peak frequency + confidence interval

See: references/statistical_analysis.md "Regression Modeling"
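A sketch of the spline-peak workflow on synthetic data with a known optimum near 0.6 (column names and values are assumptions; `cr` is patsy's natural spline, usable directly in statsmodels formulas):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
# Synthetic frequency/area data peaking near freq = 0.6
freq = np.tile(np.array([0.1, 0.25, 0.5, 0.6, 0.75, 0.9]), 5)
area = 100 - 200 * (freq - 0.6) ** 2 + rng.normal(0, 2, freq.size)
df = pd.DataFrame({"freq": freq, "area": area})

# Natural spline with df=4, then grid-search the fitted curve for its peak
model = smf.ols("area ~ cr(freq, df=4)", data=df).fit()
grid = pd.DataFrame({"freq": np.linspace(freq.min(), freq.max(), 801)})
pred = model.get_prediction(grid).summary_frame(alpha=0.05)
peak_idx = pred["mean"].idxmax()
print(f"peak at freq={grid.loc[peak_idx, 'freq']:.3f} "
      f"(95% CI on fitted mean at peak: {pred.loc[peak_idx, 'mean_ci_lower']:.1f}"
      f" to {pred.loc[peak_idx, 'mean_ci_upper']:.1f})")
```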


Quick Reference Table

| Task                        | Primary Tool                    | Reference                |
|-----------------------------|---------------------------------|--------------------------|
| Load measurement CSV        | pandas.read_csv()               | This file                |
| Group statistics            | df.groupby().agg()              | This file                |
| T-test                      | scipy.stats.ttest_ind()         | statistical_analysis.md  |
| ANOVA                       | statsmodels.ols + anova_lm()    | statistical_analysis.md  |
| Dunnett's test              | scipy.stats.dunnett()           | statistical_analysis.md  |
| Cohen's d                   | Custom function (pooled SD)     | statistical_analysis.md  |
| Power analysis              | statsmodels TTestIndPower       | statistical_analysis.md  |
| Polynomial regression       | statsmodels.OLS + poly features | statistical_analysis.md  |
| Natural spline              | patsy.cr() + statsmodels.OLS    | statistical_analysis.md  |
| Cell segmentation           | skimage.filters + watershed     | cell_counting.md         |
| Colony segmentation         | skimage.filters.threshold_otsu  | segmentation.md          |
| Fluorescence quantification | skimage.measure.regionprops     | fluorescence_analysis.md |
| Colocalization              | Pearson/Manders                 | fluorescence_analysis.md |
| Image loading               | tifffile, skimage.io            | image_processing.md      |
| Batch processing            | scripts/batch_process.py        | scripts/                 |

Example Scripts

Ready-to-use scripts in the scripts/ directory:

  1. segment_cells.py - Cell/nuclei counting with watershed
  2. measure_fluorescence.py - Multi-channel intensity quantification
  3. batch_process.py - Process folders of images
  4. colony_morphometry.py - Measure colony area/circularity
  5. statistical_comparison.py - Group comparison statistics

Usage:

# Count cells in image
python scripts/segment_cells.py cells.tif --channel 0 --min-area 50

# Batch process folder
python scripts/batch_process.py input_folder/ output.csv --analysis cell_count

Detailed Reference Guides

For complete implementations and protocols:

  1. references/statistical_analysis.md - All statistical tests, regression models
  2. references/cell_counting.md - Cell/nuclei counting protocols
  3. references/segmentation.md - Colony and object segmentation
  4. references/fluorescence_analysis.md - Intensity quantification, colocalization
  5. references/image_processing.md - Image loading, preprocessing, library selection
  6. references/troubleshooting.md - Common issues and solutions

Important Notes

Matching R Statistical Functions

Some BixBench questions use R for analysis. Python equivalents:

  • R's Dunnett test (multcomp::glht) → scipy.stats.dunnett() (requires SciPy ≥ 1.11)
  • R's natural spline (ns(x, df=4)) → patsy.cr(x, knots=...) with explicit quantile knots
  • R's t-test (t.test(), Welch by default) → scipy.stats.ttest_ind(equal_var=False)
  • R's ANOVA (aov()) → statsmodels.formula.api.ols() + sm.stats.anova_lm()

See: references/statistical_analysis.md for exact parameter matching

Answer Formatting

BixBench expects specific formats:

  • "to the nearest thousand":
    int(round(val, -3))
  • Percentages: Usually integer or 1-2 decimal places
  • Cohen's d: 3 decimal places
  • Sample sizes: Always integer (ceiling)
  • Ratios: String format "5:1"

Completeness Checklist

Before returning your answer, verify:

  • Loaded all data files and inspected column names
  • Identified the specific statistic or model requested
  • Used correct grouping variables and filter conditions
  • Applied correct rounding or format
  • For "how many" questions: counted correctly based on criteria
  • For statistical tests: used appropriate multiple comparison correction
  • For regression: properly prepared and transformed data
  • Double-checked direction of comparisons
  • Verified answer falls within expected range

Getting Help

  • Start with decision tree at top of this file
  • Check relevant reference guide for detailed protocol
  • Use example scripts as templates
  • See troubleshooting guide for common issues
  • All statistical implementations in statistical_analysis.md