Claude-skill-registry hic-compartments-calling
This skill performs PCA-based A/B compartments calling on Hi-C .mcool datasets using pre-defined MCP tools from the cooler-tools, cooltools-tools, and plot-hic-tools servers.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/17-toolbased-hic-compartments-calling" ~/.claude/skills/majiayu000-claude-skill-registry-hic-compartments-calling && rm -rf "$T"
skills/data/17-toolbased-hic-compartments-calling/SKILL.md
Hi-C Compartments Calling (MCP-based)
Overview
This skill provides an automated workflow for compartments calling on .mcool, .cool or .hic Hi-C data.
Main steps include:
- Refer to the Inputs & Outputs section to verify required files and output structure.
- Always prompt user for genome assembly used.
- Always prompt user for resolution used to call compartments. ~50-250 kb is recommended. 100 kb is default.
- Locate the genome FASTA file in the HOMER genome data based on the user-provided assembly.
- Rename chromosomes in the .mcool or .cool file so that every chromosome name carries the "chr" prefix.
- Generate the chromosome-arm view file for compartment calling after the chromosome names have been harmonized.
- Perform PCA-based compartment analysis and extract the first principal component (PC1).
- Generate compartment interaction saddle plots and BigWig outputs for visualization.
When to Use This Skill
Use this skill when:
- You want to identify A/B compartments from Hi-C .mcool or .cool files.
- You need PC1 compartment scores and bigWig tracks for genome browser visualization.
- You want a reproducible, normalized, automated compartment-calling workflow.
Inputs & Outputs
Inputs
- File format: .mcool, .cool, or .hic (Hi-C data file).
- Genome assembly: Prompt the user for genome assembly used.
- Resolution: Prompt the user for resolution used to call compartments. The default resolution is 100 kb.
Outputs
${sample}_Compartments_calling/
    compartments/
        eigs.${resolution}.cis.vecs.tsv            # PC1 compartment scores
        eigs.${resolution}.bw
        eigs.${resolution}.cis.lam.txt
        saddle.cis.${resolution}.digitized.tsv
        saddle.cis.${resolution}.saddledump.npz
    plots/                                         # PC1 track for genome browser
        saddle.cis.${resolution}.pdf               # Saddle plot visualization
    temp/
        expected.${resolution}.cis.tsv
        view_${genome}.tsv                         # Chromosome-arm view definition
        bins.${res}.tsv
        gc.${res}.tsv
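The layout above can be written down as a small path helper, which is handy for checking that a finished run produced every expected file. This is only an illustrative sketch: the function name `expected_outputs` and its labels are hypothetical, and it assumes `${res}` in the listing means the same value as `${resolution}`. It does not replace any MCP tool.

```python
from pathlib import Path


def expected_outputs(sample: str, genome: str, resolution: int, root: str = ".") -> dict:
    """Map a short label to each file named in the output layout.

    Hypothetical helper: the labels are made up for illustration and
    ${res} is assumed to equal ${resolution}.
    """
    proj = Path(root) / f"{sample}_Compartments_calling"
    comp, plots, temp = proj / "compartments", proj / "plots", proj / "temp"
    return {
        "pc1_scores": comp / f"eigs.{resolution}.cis.vecs.tsv",
        "pc1_bigwig": comp / f"eigs.{resolution}.bw",
        "eigenvalues": comp / f"eigs.{resolution}.cis.lam.txt",
        "saddle_digitized": comp / f"saddle.cis.{resolution}.digitized.tsv",
        "saddle_dump": comp / f"saddle.cis.{resolution}.saddledump.npz",
        "saddle_pdf": plots / f"saddle.cis.{resolution}.pdf",
        "expected_cis": temp / f"expected.{resolution}.cis.tsv",
        "view": temp / f"view_{genome}.tsv",
        "bins": temp / f"bins.{resolution}.tsv",
        "gc": temp / f"gc.{resolution}.tsv",
    }


def missing_outputs(sample: str, genome: str, resolution: int, root: str = ".") -> list:
    """Return the labels of expected files that do not exist on disk."""
    paths = expected_outputs(sample, genome, resolution, root)
    return [label for label, path in paths.items() if not path.exists()]
```

Because the resolution is part of every file name, runs at several resolutions can share one project directory without overwriting each other.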
Allowed Tools
When using this skill, restrict yourself to the following MCP tools from the servers cooler-tools, cooltools-tools, plot-hic-tools, project-init-tools, genome-locate-tools, and HiCExplorer-tools:
- mcp__project-init-tools__project_init
- mcp__genome-locate-tools__genome_locate_fasta
- mcp__HiCExplorer-tools__hic_to_mcool
- mcp__cooler-tools__list_mcool_resolutions
- mcp__cooler-tools__harmonize_chrom_names
- mcp__cooler-tools__make_view_chromarms
- mcp__cooler-tools__dump_bins_for_gc
- mcp__cooltools-tools__run_genome_gc
- mcp__cooltools-tools__run_expected_cis
- mcp__cooltools-tools__run_eigs_cis
- mcp__cooltools-tools__run_saddle
- mcp__plot-hic-tools__plot_saddle_pdf
Do NOT fall back to:
- raw shell commands (cooler dump, cooltools eigs-cis, cooltools saddle, etc.)
- ad-hoc Python snippets (e.g. importing cooler, bioframe, matplotlib manually in the reply).
Decision Tree
Step 0 — Gather Required Information from the User
Before calling any tool, ask the user:
- Sample name (sample): used as prefix and for the output directory ${sample}_Compartments_calling.
- Genome assembly (genome): e.g. hg38, mm10, danRer11.
    - Never guess or auto-detect.
- Hi-C matrix path/URI (mcool_uri): e.g. .mcool file path or .hic file path.
    - path/to/sample.mcool::/resolutions/100000 (.mcool file with resolution specified)
    - or .cool file path
    - or .hic file path
- Resolution (resolution): default 100000 (100 kb).
    - If the user does not specify one, use 100000 as default.
    - Must be the same as the resolution used for ${mcool_uri}.
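The URI convention in Step 0 (file path, then `::/resolutions/`, then the bin size in bp) can be illustrated with two small string helpers. This is a sketch for readers, not part of the MCP toolchain; the function names `build_mcool_uri` and `split_mcool_uri` are hypothetical.

```python
DEFAULT_RESOLUTION = 100_000  # 100 kb, the default named in Step 0


def build_mcool_uri(mcool_path: str, resolution=None) -> str:
    """Return '<path>::/resolutions/<resolution>' for a multi-resolution file."""
    res = DEFAULT_RESOLUTION if resolution is None else int(resolution)
    if res <= 0:
        raise ValueError("resolution must be a positive integer (bp)")
    return f"{mcool_path}::/resolutions/{res}"


def split_mcool_uri(uri: str):
    """Split a URI into (path, resolution); resolution is None if absent."""
    path, sep, group = uri.partition("::")
    if not sep:
        return path, None  # plain .cool / .mcool / .hic path, no resolution given
    prefix = "/resolutions/"
    if not group.startswith(prefix):
        raise ValueError(f"unexpected group path: {group!r}")
    return path, int(group[len(prefix):])
```

Splitting the URI again is a simple way to confirm that the `resolution` argument passed to later steps matches the one embedded in `mcool_uri`.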
Step 1 — Initialize Project & Locate Genome FASTA
- Make a directory for this project:
Call:
mcp__project-init-tools__project_init
with:
- sample: the user-provided sample name
- task: Compartments_calling
The tool will:
- Create the ${sample}_Compartments_calling directory.
- Return the full path of the ${sample}_Compartments_calling directory, which will be used as ${proj_dir}.
- If the user provides a .hic file, convert it to a .mcool file using the mcp__HiCExplorer-tools__hic_to_mcool tool:
Call:
mcp__HiCExplorer-tools__hic_to_mcool
with:
- input_hic: the user-provided path (e.g. input.hic)
- sample: the user-provided sample name
- proj_dir: directory to save the view file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init.
The tool will:
- Convert the .hic file to a .mcool file.
- Return the path of the .mcool file.
If the conversion is successful, update ${mcool_uri} to the path of the .mcool file.
- Locate genome fasta file:
Call:
mcp__genome-locate-tools__genome_locate_fasta
with:
- genome: the user-provided genome assembly
The tool will:
- Locate genome FASTA.
- Verify the FASTA exists.
Step 2: List Available Resolutions in the .mcool file & Modify the Chromosome Names if Necessary
- Check the resolutions in mcool_uri:
Call:
mcp__cooler-tools__list_mcool_resolutions
with:
- mcool_path: the user-provided path (e.g. input.mcool) without resolution specified.
The tool will:
- List all resolutions in the .mcool file.
- Return the resolutions as a list.
If the user-defined or default ${resolution} is not found in the list, ask the user to specify the resolution again.
Otherwise, use ${resolution} for the following steps.
- Check whether the chromosome names in the .mcool file start with "chr", and if not, modify them to start with "chr":
Call:
mcp__cooler-tools__harmonize_chrom_names
with:
- sample: the user-provided sample name
- proj_dir: directory to save the expected-cis and eigs-cis files. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution}; must be the same as the resolution used for ${mcool_uri} and must be an integer
The tool will:
- Check whether the chromosome names in the .mcool file start with "chr".
- If not, harmonize the chromosome names in the .mcool file.
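The two checks in this step are simple enough to state as code. The sketch below is illustrative only: the function names are hypothetical, and the renaming rule is the plain "add a chr prefix" rule described above. How the actual harmonize_chrom_names tool treats special cases (for example a mitochondrial contig named MT) is not specified here.

```python
def resolution_available(requested: int, available) -> bool:
    """True if the requested bin size is among those listed for the .mcool file."""
    return int(requested) in {int(r) for r in available}


def chr_prefix_renames(chrom_names) -> dict:
    """Return {old_name: new_name} for every name that lacks the 'chr' prefix.

    An empty dict means the names already satisfy the convention and
    nothing needs to be renamed.
    """
    return {name: f"chr{name}" for name in chrom_names if not name.startswith("chr")}
```

If `resolution_available` returns False, the workflow goes back to the user for a new resolution rather than silently picking the nearest one.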
Step 3 — Create Chromosome-Arm View File
Use bioframe to define chromosome arms based on centromeres:
Call:
mcp__cooler-tools__make_view_chromarms
with:
- proj_dir: directory to save the expected-cis and eigs-cis files. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution}; must be the same as the resolution used for ${mcool_uri} and must be an integer
- genome: genome assembly
The tool will:
- Fetch chromsizes and centromeres via bioframe.
- Generate chromosomal arms and filter them to those present in the cooler.
- Return the path of the view file under the ${proj_dir}/temp/ directory.
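To make the idea of a chromosome-arm view concrete, here is a minimal pure-Python sketch that splits each chromosome at its centromere midpoint into a p arm and a q arm. It is not the tool's implementation: the function name `make_arm_view`, the `_p`/`_q` naming, and the fallback of keeping a chromosome whole when no centromere is known are assumptions made for illustration.

```python
def make_arm_view(chromsizes: dict, centromere_mids: dict, present=None) -> list:
    """Build view rows (chrom, start, end, name), one per chromosome arm.

    chromsizes:      {chrom: length in bp}
    centromere_mids: {chrom: centromere midpoint in bp}
    present:         optional collection of chromosomes found in the cooler;
                     chromosomes outside it are dropped.
    """
    rows = []
    for chrom, size in chromsizes.items():
        if present is not None and chrom not in present:
            continue
        mid = centromere_mids.get(chrom)
        if mid is None or not 0 < mid < size:
            # No usable centromere position: keep the chromosome as one region.
            rows.append((chrom, 0, size, chrom))
            continue
        rows.append((chrom, 0, mid, f"{chrom}_p"))
        rows.append((chrom, mid, size, f"{chrom}_q"))
    return rows
```

Splitting at the centromere keeps the expected-contact and eigenvector calculations within each arm, so the strong contact depletion across the centromere does not dominate PC1.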
Step 4 — Compute GC Track for Bins
- Dump bins for GC track:
Call:
mcp__cooler-tools__dump_bins_for_gc
with:
- sample: the user-provided sample name
- proj_dir: directory to save the GC track file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution}; must be the same as the resolution used for ${mcool_uri} and must be an integer
The tool will:
- Dump bins at the specified resolution from the cooler.
- Return the path of the bins file under the ${proj_dir}/temp/ directory.
- Compute GC track:
Call:
mcp__cooltools-tools__run_genome_gc
with:
- sample: the user-provided sample name
- proj_dir: directory to save the GC track file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution}; must be the same as the resolution used for ${mcool_uri} and must be an integer
- genome: genome assembly
The tool will:
- Compute GC content for each bin.
- Return the path of the GC track file under the ${proj_dir}/temp/ directory.
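What "GC content for each bin" means can be shown on a toy sequence. The sketch below is a simplified illustration with hypothetical function names. It assumes that ambiguous bases such as N are excluded from the denominator, which may differ from what the run_genome_gc tool does internally.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G/C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return float("nan")  # e.g. an all-N bin has no defined GC content
    return (seq.count("G") + seq.count("C")) / acgt


def gc_track(chrom_seq: str, binsize: int) -> list:
    """Return (start, end, gc) for consecutive fixed-size bins of one chromosome."""
    if binsize <= 0:
        raise ValueError("binsize must be positive")
    return [
        (start, min(start + binsize, len(chrom_seq)), gc_fraction(chrom_seq[start:start + binsize]))
        for start in range(0, len(chrom_seq), binsize)
    ]
```

The bin boundaries must match the bins dumped from the cooler at the chosen resolution, which is why the bins file is produced first.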
Step 5 — Run Expected-cis and Eigs-cis (PCA Compartment Calling)
- Calculate expected cis:
Call:
mcp__cooltools-tools__run_expected_cis
with:
- sample: the user-provided sample name
- proj_dir: directory to save the expected-cis and eigs-cis files. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution}; must be the same as the resolution used for ${mcool_uri} and must be an integer
- view_path: the path to the view file (e.g. ${proj_dir}/temp/view_${genome}.tsv)
- clr_weight_name: the name of the weight column (default: weight)
- ignore_diags: the number of diagonals to ignore based on resolution
The tool will:
- Generate expected cis file.
- Return the path of the expected cis file under the ${proj_dir}/temp/ directory.
- Calculate eigs cis:
Call:
mcp__cooltools-tools__run_eigs_cis
with:
- sample: the user-provided sample name
- proj_dir: directory to save the expected-cis and eigs-cis files. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution}; must be the same as the resolution used for ${mcool_uri} and must be an integer
- view_path: the view TSV from Step 3 (e.g. view_${genome}.tsv)
- gc_tsv: GC track TSV from Step 4
- clr_weight_name: balancing column name (default "weight", but can be set based on clr.bins().columns if the user tells you the correct name)
- n_eigs: the number of principal components to compute (default 1)
- make_bigwig: whether to make bigwig file for PC1 track (default True)
This tool will:
- Run cooltools expected-cis to compute expected contact frequencies.
- Run cooltools eigs-cis to perform PCA and extract PC1.
- Return the path of the eigs-cis vecs file under the ${proj_dir}/compartments/ directory.
- Return the path of the bigWig file under the ${proj_dir}/compartments/ directory.
If the user reports an error about balancing weights:
- Ask the user which weight column should be used.
- Re-run mcp__cooltools-tools__run_expected_cis and mcp__cooltools-tools__run_eigs_cis with the correct clr_weight_name.
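Step 5 passes the GC track (gc_tsv) to the eigenvector step. A common reason for this is that the sign of a principal component is arbitrary, so PC1 is usually oriented to correlate positively with GC content, which makes positive values correspond to the A compartment. The sketch below illustrates that orientation rule in plain Python; the function names are hypothetical and it is not the tool's actual code.

```python
from math import isnan, sqrt


def pearson(x, y) -> float:
    """Pearson correlation over positions where both values are defined."""
    pairs = [(a, b) for a, b in zip(x, y) if not (isnan(a) or isnan(b))]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def orient_pc1(pc1, gc) -> list:
    """Flip the sign of PC1 if it is negatively correlated with GC content."""
    r = pearson(pc1, gc)
    if r < 0:  # a NaN correlation compares False, so the vector is left unchanged
        return [-v for v in pc1]
    return list(pc1)
```

With a view that splits chromosomes into arms, this orientation is applied per region, so each arm gets a consistent A/B sign.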
Step 6 — Run Saddle Analysis
Call:
mcp__cooltools-tools__run_saddle
with:
- sample: the user-provided sample name
- proj_dir: directory to save the saddle file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution}; must be the same as the resolution used for ${mcool_uri} and must be an integer
- view_path: the view TSV from Step 3 (e.g. view_${genome}.tsv)
- eigs_vecs_tsv: the eigs-cis vecs TSV from Step 5 (e.g. compartments/eigs.${resolution}.cis.vecs.tsv)
- expected_cis_tsv: the expected-cis TSV from Step 5 (e.g. temp/expected_cis.${resolution}.tsv)
- clr_weight_name: balancing column name (default "weight", but can be set based on clr.bins().columns if the user tells you the correct name)
- qrange_low and qrange_high: default 0.02 and 0.98
The tool will:
- Run cooltools saddle.
- Generate the saddle dump and related outputs.
- Return the path of the saddle dump file under the ${proj_dir}/compartments/ directory.
- Return the paths of the other related outputs under the ${proj_dir}/compartments/ directory.
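The qrange_low and qrange_high parameters control how PC1 values are grouped before the saddle matrix is averaged: bins are sorted into quantile groups between the two cut-offs, and extreme values outside that range are set aside. The following is a rough pure-Python sketch of that digitization idea. The function names, the default of four groups, and the group numbering are assumptions for illustration, not the behavior of run_saddle.

```python
from bisect import bisect_right
from math import isnan


def quantile(sorted_vals, q: float) -> float:
    """Linear-interpolation quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def digitize_pc1(pc1, n_bins: int = 4, qrange=(0.02, 0.98)) -> list:
    """Assign each PC1 value to a quantile group.

    Returns 0 for values below the lower cut-off, 1..n_bins for values inside
    the range, n_bins + 1 for values at or above the upper cut-off, and -1 for NaN.
    """
    vals = sorted(v for v in pc1 if not isnan(v))
    if not vals:
        return [-1] * len(pc1)
    q_lo, q_hi = qrange
    edges = [quantile(vals, q_lo + (q_hi - q_lo) * i / n_bins) for i in range(n_bins + 1)]
    return [-1 if isnan(v) else bisect_right(edges, v) for v in pc1]
```

The saddle matrix is then the mean observed/expected contact frequency for every pair of groups, so enrichment in the corners indicates B-B and A-A preference.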
Step 7 — Plot Saddle as PDF
Call:
mcp__plot-hic-tools__plot_saddle_pdf
with:
- sample: the user-provided sample name
- proj_dir: directory to save the saddle file. In this skill, it is the full path of the ${sample}_Compartments_calling directory returned by mcp__project-init-tools__project_init
- resolution: ${resolution}; must be the same as the resolution used for ${mcool_uri} and must be an integer
- chr_name: the user-provided chromosome name, e.g. chr1
This tool will:
- Load the corresponding .saddledump.npz file.
- Plot the saddle matrix with LogNorm(1e-1, 1e1) and RdBu_r colormap.
- Return the path of the compartment scores distribution PDF file under the ${proj_dir}/plots/ directory.
- Return the path of the saddle plot PDF file under the ${proj_dir}/plots/ directory.
- Return the path of the PC1 track PDF file under the ${proj_dir}/plots/ directory.
If the saddledump file is missing, inform the user to run run_saddle first.
Best Practices
- Always confirm the genome and resolution explicitly with the user.
- Always use the defined MCP tools instead of ad-hoc code.
- If the user asks “how to run this manually”, you may conceptually describe the steps but still prefer to recommend using the MCP pipeline for reproducibility.
- If multiple resolutions are required, re-run the MCP tools with different resolution values and keep outputs in the same ${proj_dir} directory, using resolution in filenames for disambiguation.