# medical-research-skills: neuropixels-analysis
End-to-end Neuropixels extracellular electrophysiology analysis (SpikeGLX/Open Ephys/NWB) including preprocessing, motion correction, Kilosort4 spike sorting, QC metrics, and Allen/IBL-style curation; use when processing Neuropixels recordings or when users mention Neuropixels, SpikeGLX, Open Ephys, Kilosort, quality metrics, drift/motion correction, or unit curation.
```shell
git clone https://github.com/aipoch/medical-research-skills

# Or copy only this skill into ~/.claude/skills:
T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/scientific-skills/Data Analysis/neuropixels-analysis" ~/.claude/skills/aipoch-medical-research-skills-neuropixels-analysis && rm -rf "$T"
```
Skill path: `scientific-skills/Data Analysis/neuropixels-analysis/SKILL.md`

## When to Use
Use this skill in any of the following situations:
- You need to load and standardize Neuropixels recordings from SpikeGLX (`.ap.bin`/`.lf.bin`/`.meta`), Open Ephys (`.continuous`/`.oebin`), or NWB (`.nwb`) into a consistent analysis pipeline.
- You are preparing raw extracellular data for spike sorting, including high-pass filtering, phase shift correction (NP1.0), bad channel detection/removal, and common average referencing (CAR).
- You suspect probe drift or tissue motion and need to estimate and correct motion before sorting (especially when drift is > ~10 µm).
- You want to run spike sorting (Kilosort4 recommended; CPU alternatives supported) and then compute post-processing products (waveforms, templates, amplitudes, correlograms, unit locations).
- You need quality control and curation using Allen/IBL-style thresholds, plus optional AI-assisted visual review for borderline units, and exports to Phy/NWB.
## Key Features
- Multi-format ingestion: SpikeGLX, Open Ephys, and NWB readers via SpikeInterface.
- Neuropixels-aware preprocessing:
- High-pass filtering for spike band
- Phase shift correction for Neuropixels 1.0
- Bad channel detection and removal
- Median CAR / referencing
- Motion/drift workflow:
- Motion estimation presets (e.g., “Kilosort-like”)
- Optional rigid/non-rigid correction presets
- Drift visualization outputs
- Spike sorting orchestration:
- Kilosort4 (GPU) recommended
- CPU alternatives (e.g., SpykingCircus2, Mountainsort5, Tridesclous2)
- Post-processing and QC:
- SortingAnalyzer-based computation of waveforms, templates, amplitudes, correlograms, unit locations, and quality metrics
- Curation:
- Allen/IBL-style automated labeling
- Optional AI-assisted visual analysis for uncertain units
- Reporting and export:
- HTML report generation
- Export to Phy and NWB
- Save metrics tables (CSV)
Reference guides (if present in the repository) can be used for deeper explanations:
- `reference/standard_workflow.md`
- `reference/api_reference.md`
- `reference/plotting_guide.md`
- `reference/PREPROCESSING.md`
- `reference/MOTION_CORRECTION.md`
- `reference/SPIKE_SORTING.md`
- `reference/QUALITY_METRICS.md`
- `reference/AUTOMATED_CURATION.md`
- `reference/AI_CURATION.md`
## Dependencies
Python dependencies (typical versions known to work; adjust to your environment):
- `python >= 3.9`
- `spikeinterface[full] >= 0.99`
- `probeinterface >= 0.2`
- `neo >= 0.13`
- Spike sorters (optional, depending on what you run):
  - `kilosort >= 4.0` (Kilosort4; GPU required)
  - `spykingcircus >= 1.1` (SpykingCircus2; CPU)
  - `mountainsort5 >= 0.5` (CPU)
- Optional (AI-assisted curation): `anthropic >= 0.20`
- Optional (IBL tooling): `ibllib >= 2.0`, `ibl-neuropixel >= 1.0`
## Example Usage
The following example is designed to be a complete, runnable script (assuming dependencies and a valid dataset path). It loads SpikeGLX data, preprocesses, estimates/corrects motion, runs Kilosort4, computes metrics, curates units, generates a report, and exports to Phy and NWB.
```python
import spikeinterface.full as si
import neuropixels_analysis as npa


def main():
    # Parallelization / chunking settings used by SpikeInterface functions
    job_kwargs = dict(n_jobs=-1, chunk_duration="1s", progress_bar=True)

    # 1) Load data (SpikeGLX example)
    # For Open Ephys: si.read_openephys("/path/to/Record_Node_101/")
    # For NWB: si.read_nwb("/path/to/file.nwb")
    recording = si.read_spikeglx("/path/to/spikeglx_folder", stream_id="imec0.ap")

    # Optional: slice first 60 seconds for a quick test
    fs = recording.get_sampling_frequency()
    recording = recording.frame_slice(0, int(60 * fs))

    # 2) Preprocess (recommended chain; wrapper may include the same steps)
    # Note: phase_shift is mandatory for Neuropixels 1.0 and not needed for 2.0.
    rec = npa.preprocess(recording)

    # 3) Estimate drift/motion and correct if needed
    motion_info = npa.estimate_motion(rec, preset="kilosort_like", **job_kwargs)
    npa.plot_drift(rec, motion_info, output="drift_map.png")
    # Example threshold: correct if max drift exceeds 10 µm
    if float(motion_info["motion"].max()) > 10.0:
        rec = npa.correct_motion(rec, preset="nonrigid_accurate", **job_kwargs)

    # 4) Spike sorting (Kilosort4 recommended; requires GPU)
    sorting = si.run_sorter("kilosort4", rec, folder="ks4_output", **job_kwargs)

    # 5) Post-processing + metrics
    analyzer = si.create_sorting_analyzer(sorting, rec, sparse=True)
    analyzer.compute("random_spikes", max_spikes_per_unit=500, **job_kwargs)
    analyzer.compute("waveforms", ms_before=1.0, ms_after=2.0, **job_kwargs)
    analyzer.compute("templates", operators=["average", "std"], **job_kwargs)
    analyzer.compute("spike_amplitudes", **job_kwargs)
    analyzer.compute("correlograms", window_ms=50.0, bin_ms=1.0, **job_kwargs)
    analyzer.compute("unit_locations", method="monopolar_triangulation", **job_kwargs)
    analyzer.compute("quality_metrics", **job_kwargs)
    metrics = analyzer.get_extension("quality_metrics").get_data()
    metrics.to_csv("quality_metrics.csv")

    # 6) Automated curation (Allen/IBL-style)
    labels = npa.curate(metrics, method="allen")  # e.g., "allen", "ibl", "strict"

    # 7) Report
    results = {"sorting": sorting, "metrics": metrics, "labels": labels, "analyzer": analyzer}
    npa.generate_analysis_report(results, "output_report/")
    npa.print_analysis_summary(results)

    # 8) Export
    si.export_to_phy(
        analyzer,
        output_folder="phy_export/",
        compute_pc_features=True,
        compute_amplitudes=True,
    )
    from spikeinterface.exporters import export_to_nwb
    export_to_nwb(rec, sorting, "output.nwb")


if __name__ == "__main__":
    main()
```
## Implementation Details
1) Data I/O and supported formats
- SpikeGLX: `si.read_spikeglx(path, stream_id="imec0.ap")`
- Open Ephys: `si.read_openephys(path)`
- NWB: `si.read_nwb(path)`
Neuropixels probe types commonly encountered:
- Neuropixels 1.0: requires phase shift correction to align channels.
- Neuropixels 2.0: denser geometries; phase shift correction typically not required.
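The per-channel sampling offsets that phase shift correction removes can be illustrated with a toy model. The `channels_per_adc` grouping and the offset formula below are illustrative assumptions about sequential ADC multiplexing, not the exact NP1.0 hardware layout:

```python
def inter_sample_offsets(n_channels, channels_per_adc, fs):
    """Toy model: each ADC digitizes its channels sequentially within one
    sampling period, so channel k within a group is delayed by a fraction
    (k / channels_per_adc) of a sample relative to the group's first channel."""
    period = 1.0 / fs
    return [
        (ch % channels_per_adc) / channels_per_adc * period
        for ch in range(n_channels)
    ]

# Example: 8 channels, 4 channels per ADC, 30 kHz sampling
offsets = inter_sample_offsets(8, 4, 30_000.0)
```

Phase shift correction (`si.phase_shift`) interpolates each channel back onto a common time grid so these sub-sample delays do not smear spatially localized events such as spikes.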
2) Preprocessing chain (typical)
A standard spike-band preprocessing sequence is:
- High-pass filter (commonly 300–400 Hz) to isolate spikes.
- Phase shift correction (`si.phase_shift`) for NP1.0.
- Bad channel detection (`si.detect_bad_channels`) and removal.
- Common reference (often median CAR) to reduce shared noise.
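The median CAR step can be made concrete with a minimal pure-Python sketch (illustration only, not the SpikeInterface implementation; in the library this is handled by `si.common_reference` with a median operator):

```python
def median_car(frames):
    """Subtract the per-sample median across channels from every channel.

    `frames` is a list of samples, each sample a list of per-channel values.
    Noise shared by all channels (e.g. a common offset) is removed, while
    a spike on one channel barely moves the median and so survives.
    """
    def median(values):
        s = sorted(values)
        n = len(s)
        mid = n // 2
        return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2.0

    return [[v - median(sample) for v in sample] for sample in frames]

# A shared offset across channels is removed sample by sample:
referenced = median_car([[12, 10, 11], [9, 10, 10]])
```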
Key parameters:
- `freq_min` (high-pass cutoff): typically 300–400 Hz
- Bad channel detection sensitivity (implementation-dependent; often exposed as thresholds/presets)
3) Motion estimation and correction
- Motion/drift can strongly degrade sorting quality; a practical rule is to inspect drift before sorting.
- Presets:
- `preset="kilosort_like"`: faster estimation aligned with common sorter assumptions
- `preset="nonrigid_accurate"`: more robust correction for severe drift
Operational threshold often used in practice:
- If estimated drift exceeds ~10 µm, apply correction before sorting.
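That rule of thumb reduces to a small gate. This sketch uses peak-to-peak displacement, a slightly stricter variant of the `motion.max()` check in the example script; the function name is ours:

```python
def needs_motion_correction(motion_um, threshold_um=10.0):
    """Return True if peak-to-peak estimated displacement (µm) exceeds the
    threshold, i.e. motion correction is recommended before sorting."""
    peak_to_peak = max(motion_um) - min(motion_um)
    return peak_to_peak > threshold_um

# 15 µm of drift over the recording -> correct; ~4 µm -> proceed as-is.
```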
4) Spike sorting
- Kilosort4 is recommended for Neuropixels due to speed and quality, but requires a GPU.
- CPU alternatives can be used when GPU is unavailable (at the cost of runtime and sometimes quality).
Sorter parameters to tune (Kilosort4 examples):
- `batch_size`: samples per batch (often ~30000 by default)
- `nblocks`: number of drift blocks (increase for long recordings)
- `Th_learned`: detection threshold (lower → more spikes, potentially more false positives)
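The threshold trade-off can be seen with a toy crossing detector (illustration only; Kilosort's detection is template-based, not a simple amplitude crossing):

```python
def count_threshold_crossings(trace, threshold):
    """Count downward crossings of -threshold. Extracellular spikes are
    typically negative-going, so events are detected on the negative side."""
    count = 0
    below = False
    for v in trace:
        if v < -threshold and not below:
            count += 1       # new event: trace just dipped past the threshold
            below = True
        elif v >= -threshold:
            below = False    # trace recovered; next dip counts as a new event
    return count

trace = [0, -3, 0, -6, 0, -9, 0]
# Lowering the threshold from 8 to 2 turns 1 detection into 3 — more spikes,
# but the smallest events are the most likely to be noise.
```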
5) Post-processing and quality metrics
Using `SortingAnalyzer`, the pipeline typically computes:
- waveforms (window: `ms_before`, `ms_after`)
- templates (average/std)
- spike amplitudes
- correlograms (e.g., `window_ms=50`, `bin_ms=1`)
- unit locations (e.g., `monopolar_triangulation`)
- quality metrics (e.g., SNR, ISI violations, presence ratio, amplitude cutoff)
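One of these metrics can be computed directly from spike times. The sketch below is the raw violation fraction; note that the `isi_violations_ratio` reported by SpikeInterface uses a rate-based estimator, which is not the same quantity:

```python
def isi_violation_fraction(spike_times_s, refractory_s=0.0015):
    """Fraction of inter-spike intervals shorter than the refractory period.

    A well-isolated unit cannot fire twice within its refractory period, so
    short ISIs indicate contamination by spikes from other neurons.
    """
    if len(spike_times_s) < 2:
        return 0.0
    isis = [b - a for a, b in zip(spike_times_s, spike_times_s[1:])]
    violations = sum(1 for isi in isis if isi < refractory_s)
    return violations / len(isis)

# Three ISIs, one of them 0.5 ms (below a 1.5 ms refractory period):
times = [0.000, 0.010, 0.0105, 0.030]
```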
Common QC thresholds (dataset-dependent; document your choices):
- `snr_threshold`: often 3–5
- `isi_violations_ratio`: often 0.01–0.5
- `presence_ratio`: often 0.5–0.95
6) Curation logic (Allen/IBL-style)
A conservative “good unit” selection often combines:
- high presence ratio (stable across recording)
- low ISI violations (refractory period respected)
- low amplitude cutoff (less truncation / missed spikes)
Example rule (illustrative):
- `presence_ratio > 0.9`
- `isi_violations_ratio < 0.5`
- `amplitude_cutoff < 0.1`
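Applied to a per-unit metrics record, the illustrative rule above reduces to a one-line conjunction (function and label names are ours, not a library API):

```python
def label_unit(metrics):
    """Label a unit 'good' if it passes all illustrative thresholds,
    otherwise 'mua' (multi-unit activity / needs further review)."""
    passes = (
        metrics["presence_ratio"] > 0.9
        and metrics["isi_violations_ratio"] < 0.5
        and metrics["amplitude_cutoff"] < 0.1
    )
    return "good" if passes else "mua"

unit = {"presence_ratio": 0.95, "isi_violations_ratio": 0.1, "amplitude_cutoff": 0.02}
```

In practice such a function would be applied row-wise over the metrics table exported as `quality_metrics.csv`, and the thresholds documented alongside the results.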
7) AI-assisted visual analysis (optional)
For borderline units (e.g., moderate SNR), AI-assisted review can be used to interpret:
- waveform shape consistency
- refractory period evidence in autocorrelograms
- amplitude stability and drift effects
- multi-unit contamination indicators
If your repository provides `npa.analyze_unit_visually(...)`, it can be integrated with an API client (e.g., `anthropic`) to generate structured curation suggestions.
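A hypothetical integration might first package the evidence listed above into a structured review request. Everything in this sketch (the function, payload fields, and question text) is an assumption, not a documented API:

```python
def build_review_payload(unit_id, metrics, figure_paths):
    """Assemble what an AI reviewer needs for a borderline unit: the
    relevant metrics plus rendered diagnostics (waveform overlay,
    autocorrelogram, amplitude-over-time plots)."""
    wanted = ("snr", "isi_violations_ratio", "presence_ratio", "amplitude_cutoff")
    return {
        "unit_id": unit_id,
        "metrics": {k: metrics[k] for k in wanted if k in metrics},
        "figures": list(figure_paths),
        "question": "Classify this unit as good / mua / noise and justify briefly.",
    }

payload = build_review_payload(42, {"snr": 3.2, "presence_ratio": 0.88}, ["u42_wf.png"])
```

The payload (with figures attached as images) would then be sent to the model, and the structured answer merged back into the curation labels for human sign-off.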