install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-phasing-imputation-reference-panels" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-phasing-imputation-reference-pan && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-phasing-imputation-reference-panels" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-phasing-imputation-reference-pan && rm -rf "$T"
manifest:
skills/bio-phasing-imputation-reference-panels/SKILL.md
safety · automated scan (medium risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
- pip install
- downloads files (wget)
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content
<!--
# COPYRIGHT NOTICE
# This file is part of the "Universal Biomedical Skills" project.
# Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu>
# All Rights Reserved.
#
# This code is proprietary and confidential.
# Unauthorized copying of this file, via any medium is strictly prohibited.
#
# Provenance: Authenticated by MD BABU MIA
-->
name: bio-phasing-imputation-reference-panels
description: Download, prepare, and manage reference panels for phasing and imputation. Covers 1000 Genomes, HRC, and TOPMed panels. Use when setting up imputation infrastructure or selecting appropriate reference panels for target populations.
tool_type: cli
primary_tool: bcftools
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
- read_file
- run_shell_command
Reference Panels
1000 Genomes Phase 3 (GRCh38)
# Download from IGSR
# Note: this directory holds the high-coverage, 3,202-sample phased release
BASE_URL="http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20201028_3202_phased"

for chr in {1..22}; do
    wget "${BASE_URL}/CCDG_14151_B01_GRM_WGS_2020-08-05_chr${chr}.filtered.shapeit2-duohmm-phased.vcf.gz"
    wget "${BASE_URL}/CCDG_14151_B01_GRM_WGS_2020-08-05_chr${chr}.filtered.shapeit2-duohmm-phased.vcf.gz.tbi"
done
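The download loop above can also be expressed as a URL list, which makes it easier to resume interrupted transfers. This is a minimal sketch, assuming the same directory and file naming as above; `panel_urls` is a hypothetical helper name, not part of any tool.

```shell
# Assumption: same directory and file naming as the download loop above.
BASE_URL="http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/working/20201028_3202_phased"

# Print one URL per line: the VCF and its .tbi index for chr1..chr22.
# panel_urls is a hypothetical helper name.
panel_urls() {
    chr=1
    while [ "$chr" -le 22 ]; do
        f="CCDG_14151_B01_GRM_WGS_2020-08-05_chr${chr}.filtered.shapeit2-duohmm-phased.vcf.gz"
        printf '%s/%s\n' "$BASE_URL" "$f"
        printf '%s/%s.tbi\n' "$BASE_URL" "$f"
        chr=$((chr + 1))
    done
}

# Usage (not run here): resume partial downloads into panel/
#   panel_urls > urls.txt
#   wget -c -P panel/ -i urls.txt
```

With `wget -c`, rerunning the command continues partially downloaded files instead of starting over.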
Subset by Population
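# Download sample info
wget http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/data_collections/1000G_2504_high_coverage/1000G_2504_high_coverage.sequence.index

# Create population sample lists
# Assumption: samples.txt is a tab-delimited table prepared from the sample
# metadata, with the sample ID in column 1 and the superpopulation code
# (EUR, AFR, EAS, ...) in another column. The download above is not saved
# under that name.
grep -w "EUR" samples.txt | cut -f1 > european_samples.txt
grep -w "AFR" samples.txt | cut -f1 > african_samples.txt
grep -w "EAS" samples.txt | cut -f1 > east_asian_samples.txt

# Subset reference to specific population
bcftools view -S european_samples.txt \
    1000GP.chr22.vcf.gz \
    -Oz -o 1000GP_EUR.chr22.vcf.gz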
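Matching on a whole line can pick up a population code that appears in an unrelated column. The sketch below matches one column exactly instead. It assumes a tab-delimited table with one header line and columns `sample`, `pop`, `super_pop`; that layout and the helper name `samples_for_superpop` are assumptions, so adjust the column numbers to the file you actually have.

```shell
# samples_for_superpop TABLE CODE
# Print sample IDs (column 1) whose superpopulation column (column 3) equals CODE.
# Assumed layout: tab-delimited, one header line, columns: sample, pop, super_pop.
samples_for_superpop() {
    awk -F '\t' -v code="$2" 'NR > 1 && $3 == code { print $1 }' "$1"
}

# Usage (not run here):
#   samples_for_superpop samples.txt EUR > european_samples.txt
```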
Convert to Beagle Format
# Beagle uses VCF directly, but ensure proper format
bcftools view -m2 -M2 -v snps reference.vcf.gz | \
    bcftools annotate --set-id '%CHROM:%POS:%REF:%ALT' | \
    bgzip > reference_beagle.vcf.gz
bcftools index reference_beagle.vcf.gz
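To preview which IDs the `--set-id` template above would produce, the same selection can be approximated on plain VCF text. This is only a sketch: it treats a site as a biallelic SNP when REF and ALT are each a single A/C/G/T base, which is simpler than the variant typing bcftools performs. `preview_ids` is a hypothetical helper name.

```shell
# preview_ids: read uncompressed VCF text on stdin, keep sites whose REF and ALT
# are each a single base (approximating biallelic SNPs), and print
# CHROM:POS:REF:ALT for each kept site.
preview_ids() {
    awk -F '\t' '
        /^#/ { next }                                   # skip header lines
        $4 ~ /^[ACGT]$/ && $5 ~ /^[ACGT]$/ {
            print $1 ":" $2 ":" $4 ":" $5
        }
    '
}

# Usage (not run here):
#   bcftools view reference.vcf.gz | preview_ids | head
```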
Convert to IMPUTE5 Format
# IMPUTE5 uses its own format
imp5Converter \
    --h reference.vcf.gz \
    --r chr22 \
    --o reference.chr22.imp5
HRC Reference Panel
# HRC requires registration at EGA
# After access granted:

# Download from EGA using pyega3
pip install pyega3
pyega3 -cf credentials.json fetch EGAD00001002729

# HRC contains 32,470 samples (mostly European)
TOPMed Reference Panel
# TOPMed available through imputation servers
# Or download from dbGaP with appropriate access

# Use via the TOPMed Imputation Server (runs the Michigan Imputation Server software):
# 1. Upload study VCF
# 2. Select "TOPMed r2" as reference
# 3. Download imputed results
Genetic Maps
# Beagle format (GRCh38) - from Browning lab
wget https://faculty.washington.edu/browning/beagle/genetic_maps/plink.GRCh38.map.zip
unzip plink.GRCh38.map.zip -d genetic_maps/

# SHAPEIT5 format (recommended for SHAPEIT5)
wget https://github.com/odelaneau/shapeit5/raw/main/maps/genetic_maps.b38.tar.gz
tar xzf genetic_maps.b38.tar.gz
Check Reference Panel
# Basic stats
bcftools stats reference.vcf.gz | head -50

# Sample count
bcftools query -l reference.vcf.gz | wc -l

# Variant count
bcftools view -H reference.vcf.gz | wc -l

# Check chromosomes
bcftools index -s reference.vcf.gz
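The checks above inspect one file at a time. For a per-chromosome panel, it can help to first confirm that every autosome has a non-empty VCF and index. A minimal sketch follows; the helper name `missing_panel_files` and the prefix/suffix naming scheme are assumptions for illustration.

```shell
# missing_panel_files DIR PREFIX SUFFIX
# For chr 1..22, print each expected file DIR/PREFIX<chr>SUFFIX (and its .tbi
# index) that is missing or empty. Prints nothing when the panel is complete.
missing_panel_files() {
    chr=1
    while [ "$chr" -le 22 ]; do
        f="$1/$2${chr}$3"
        [ -s "$f" ] || echo "$f"
        [ -s "$f.tbi" ] || echo "$f.tbi"
        chr=$((chr + 1))
    done
}

# Usage (not run here):
#   missing_panel_files panel 1000GP.chr .vcf.gz
```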
Lift Over Reference Panel
# GRCh37 to GRCh38

# Using Picard
java -jar picard.jar LiftoverVcf \
    I=reference_hg19.vcf.gz \
    O=reference_hg38.vcf.gz \
    CHAIN=hg19ToHg38.over.chain.gz \
    REJECT=rejected.vcf \
    R=hg38.fa

# Or using CrossMap (same chain file as above)
CrossMap.py vcf hg19ToHg38.over.chain.gz reference_hg19.vcf hg38.fa reference_hg38.vcf
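Picard writes records it cannot lift over to the `REJECT` file, so comparing record counts gives a quick sense of how much of the panel was lost. A minimal sketch, with hypothetical helper names `vcf_records` and `rejected_pct`:

```shell
# vcf_records FILE
# Count non-header lines in a plain or gzip/bgzip-compressed VCF.
vcf_records() {
    case "$1" in
        *.gz) gzip -cd "$1" | grep -vc '^#' ;;
        *)    grep -vc '^#' "$1" ;;
    esac
}

# rejected_pct LIFTED REJECTED
# Print the percentage of records (lifted + rejected) that were rejected.
rejected_pct() {
    ok=$(vcf_records "$1")
    bad=$(vcf_records "$2")
    awk -v ok="$ok" -v bad="$bad" 'BEGIN {
        total = ok + bad
        pct = total ? 100 * bad / total : 0
        printf "%.2f\n", pct
    }'
}

# Usage (not run here):
#   rejected_pct reference_hg38.vcf.gz rejected.vcf
```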
Align Study to Reference
# Fix strand: flip non-ambiguous SNPs to match the reference
bcftools +fixref study.vcf.gz -Oz -o study_fixed.vcf.gz -- \
    -f reference.fa \
    -i reference_panel.vcf.gz \
    -m flip

# Check strand concordance (statistics only)
bcftools +fixref study.vcf.gz -- -f reference.fa -m stats
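The `flip` mode acts on non-ambiguous SNPs only. A/T and C/G SNPs read the same on both strands, so a strand flip cannot be detected from the alleles alone. The sketch below counts such sites so you know how many fall outside what flipping can resolve; `is_ambiguous` and `count_ambiguous` are hypothetical helper names.

```shell
# is_ambiguous REF ALT
# Succeed (exit 0) for A/T and C/G SNPs, which are strand-ambiguous.
is_ambiguous() {
    case "$1$2" in
        AT|TA|CG|GC) return 0 ;;
        *) return 1 ;;
    esac
}

# count_ambiguous
# Read "REF<TAB>ALT" lines on stdin and print how many are strand-ambiguous.
count_ambiguous() {
    n=0
    while IFS="$(printf '\t')" read -r ref alt; do
        if is_ambiguous "$ref" "$alt"; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Usage (not run here):
#   bcftools query -f '%REF\t%ALT\n' study.vcf.gz | count_ambiguous
```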
Filter Reference Panel
# Remove singletons (non-reference allele count below 2)
bcftools view -c 2 reference.vcf.gz -Oz -o reference_no_singletons.vcf.gz

# Filter by MAF (keep sites with minor allele frequency >= 0.001)
bcftools view -q 0.001:minor reference.vcf.gz -Oz -o reference_maf001.vcf.gz

# Remove indels (SNPs only)
bcftools view -v snps reference.vcf.gz -Oz -o reference_snps.vcf.gz
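To see which allele counts survive a MAF threshold, the arithmetic can be worked out by hand from `AC` and `AN`. This is a sketch of that arithmetic for a biallelic site, not a reimplementation of bcftools; `minor_af` and `passes_maf` are hypothetical helper names.

```shell
# minor_af AC AN
# Print the minor allele frequency of a biallelic site: min(AC/AN, 1 - AC/AN).
minor_af() {
    awk -v ac="$1" -v an="$2" 'BEGIN {
        af = ac / an
        maf = (af > 0.5) ? 1 - af : af
        printf "%.6f\n", maf
    }'
}

# passes_maf AC AN THRESHOLD
# Succeed (exit 0) when the minor allele frequency is >= THRESHOLD.
passes_maf() {
    awk -v ac="$1" -v an="$2" -v t="$3" 'BEGIN {
        af = ac / an
        maf = (af > 0.5) ? 1 - af : af
        exit ((maf >= t) ? 0 : 1)
    }'
}

# Usage: with 2,504 diploid samples, AN is 5008.
#   minor_af 5 5008          # prints 0.000998
#   passes_maf 6 5008 0.001  # succeeds; AC=5 would fail
```

In a panel of 2,504 diploid samples, a threshold of 0.001 therefore requires at least 6 copies of the minor allele.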
Merge Custom Panel with 1000G
# If you have additional reference samples

# Ensure matching variants first
bcftools isec -n=2 \
    1000GP.chr22.vcf.gz \
    custom_reference.chr22.vcf.gz \
    -p isec_output

# Then merge the panels
bcftools merge \
    1000GP.chr22.vcf.gz \
    custom_reference.chr22.vcf.gz \
    -Oz -o combined_reference.chr22.vcf.gz
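`bcftools merge` stops on duplicate sample names unless `--force-samples` is given, so it is worth checking for overlap before merging. A minimal sketch that compares two sample lists; `shared_samples` is a hypothetical helper name.

```shell
# shared_samples LIST_A LIST_B
# Print sample names present in both files (one name per line in each file).
shared_samples() {
    a=$(mktemp) && b=$(mktemp) || return 1
    sort -u "$1" > "$a"
    sort -u "$2" > "$b"
    comm -12 "$a" "$b"      # lines common to both sorted lists
    rm -f "$a" "$b"
}

# Usage (not run here):
#   bcftools query -l 1000GP.chr22.vcf.gz > panel_samples.txt
#   bcftools query -l custom_reference.chr22.vcf.gz > custom_samples.txt
#   shared_samples panel_samples.txt custom_samples.txt
```

Empty output means the two panels share no sample names.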
Reference Panel Comparison
| Panel | Samples | Variants | Populations |
|---|---|---|---|
| 1000G Phase 3 | 2,504 | 88M | 26 global |
| HRC r1.1 | 32,470 | 40M | European-heavy |
| TOPMed r2 | 97,256 | 308M | 60% European, diverse |
| UK10K | 3,781 | 42M | British |
Related Skills
- phasing-imputation/haplotype-phasing - Use panels for phasing
- phasing-imputation/genotype-imputation - Use panels for imputation
- variant-calling/vcf-manipulation - VCF file operations