Claude-skill-registry hic-matrix-qc
This skill performs standardized quality control (QC) on Hi-C contact matrices stored in .mcool or .cool format. It computes coverage and cis/trans ratios, distance-dependent contact decay (P(s) curves), coverage uniformity, and replicate correlation at a chosen resolution using cooler and cooltools. Use it to assess whether Hi-C data are of sufficient quality for downstream analyses such as TAD calling, loop detection, and compartment analysis.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/30-toolbased-hic-matrix-qc" ~/.claude/skills/majiayu000-claude-skill-registry-hic-matrix-qc && rm -rf "$T"
skills/data/30-toolbased-hic-matrix-qc/SKILL.md
Hi-C Contact Matrix QC for .mcool Files
Overview
This skill performs QC on Hi-C matrices stored in .cool or .mcool format files at a user-selected resolution.
Main steps include:
- Refer to the Inputs & Outputs section to check required inputs and set up the output directory structure.
- Always wait for user feedback if required files are not available in the current working directory, by asking "${files} not available, provide required files or skip and proceed?"
- Inspect the .mcool file to list available resolutions and confirm the analysis resolution with the user.
- Compute coverage and cis/trans ratios.
- Assess coverage uniformity across bins from coverage tables.
- Compute cis expected contact frequency and distance-dependent contact decay (P(s) curves).
- Visualize contact decay and P(s) scaling curves.
- If multiple Hi-C replicates are provided, compute pairwise correlation of balanced matrices at the chosen resolution.
- Summarize QC metrics and plots into a structured output directory.
When to use this skill
Use the hic-matrix-qc skill when you need to evaluate the quality of Hi-C contact matrices that are already stored in .cool, .mcool or .hic format.
Inputs & Outputs
Inputs
- File format: .mcool, .cool, or .hic (Hi-C data file).
- Genome assembly: Prompt the user for genome assembly used.
- Resolution: Choose the desired resolution for matrix QC. ~50-100 kb is recommended. Default is 100 kb.
Optional: Multiple Hi-C matrices for replicate QC
rep1.mcool rep2.mcool rep3.mcool
Outputs
${sample}_hic_matrix_qc/
  logs/
    hic_qc.log                                        # Commands, parameters, and software versions
  metrics/
    coverage.${resolution}.tsv                        # Per-bin cis/total coverage from cooltools coverage
    cis_trans_summary.${resolution}.txt               # Summarized cis, total, trans counts, and ratios
    ps_scaling_summary.${resolution}.txt              # Optional table with P(s) slope(s) in defined distance ranges
    replicate_correlation.${resolution}.tsv           # Pairwise correlation coefficients between replicates
  plots/
    coverage_histogram.${resolution}.pdf              # Coverage uniformity plot
    ps_curve.${resolution}.pdf                        # P(s) curve (contact probability vs distance)
    decay_curve.${resolution}.pdf                     # Contact decay curve (raw/normalized)
    replicate_correlation_heatmap.${resolution}.pdf   # Correlation matrix heatmap (if multiple replicates)
  comparison/
    replicate_vectors_${resolution}.npz               # (Optional) Stored vectors used for replicate correlations
  temp/
    expected_cis.${resolution}.tsv                    # Expected cis contacts vs distance from expected-cis
Allowed Tools
When using this skill, you should restrict yourself to the following MCP tools from server
cooler-tools, cooltools-tools, plot-hic-tools, project-init-tools:
- mcp__project-init-tools__project_init
- mcp__cooler-tools__compute_coverage_and_cis_trans
- mcp__plot-hic-tools__plot_coverage_histogram
- mcp__cooltools-tools__run_expected_cis
- mcp__plot-hic-tools__plot_ps_and_decay
- mcp__plot-hic-tools__replicate_correlation
Do NOT fall back to:
- raw shell commands (cooltools coverage, cooltools expected-cis, cooltools dots, etc.)
- ad-hoc Python snippets (e.g. importing cooler, bioframe, matplotlib manually in the reply).
Decision Tree
Step 0 — Gather Required Information from the User
Before calling any tool, ask the user:
- Sample name (sample): used as prefix and for the output directory ${sample}_hic_matrix_qc.
- Genome assembly (genome): e.g. hg38, mm10, danRer11.
  - Never guess or auto-detect.
- Hi-C matrix path/URI (mcool_uri):
  - path/to/sample.mcool::/resolutions/100000 (.mcool file with resolution specified)
  - or .cool file path
  - or .hic file path
- Resolution (resolution): default 100000 (100 kb).
  - If the user does not specify, use 100000 as default.
  - Must be the same as the resolution used for ${mcool_uri}
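The relationship between the .mcool path, the resolution, and ${mcool_uri} can be sketched as below. This is only an illustration of the URI convention shown above (path::/resolutions/N). The helper names build_mcool_uri and split_mcool_uri are hypothetical and are not part of any MCP tool in this skill.

```python
DEFAULT_RESOLUTION = 100_000  # 100 kb, the default stated in this skill


def build_mcool_uri(path, resolution=DEFAULT_RESOLUTION):
    """Return a cooler URI pointing at one resolution of an .mcool file.

    Hypothetical helper: it only formats the string, it does not open the file.
    """
    if isinstance(resolution, bool) or not isinstance(resolution, int) or resolution <= 0:
        raise ValueError("resolution must be a positive integer (bin size in bp)")
    if "::" in path:
        raise ValueError("pass the bare .mcool path, without a '::' group suffix")
    return f"{path}::/resolutions/{resolution}"


def split_mcool_uri(uri):
    """Split 'file.mcool::/resolutions/N' into (path, N) so both can be cross-checked."""
    path, sep, group = uri.partition("::/resolutions/")
    if not sep or not group.isdigit():
        raise ValueError(f"not a resolution-qualified cooler URI: {uri!r}")
    return path, int(group)


if __name__ == "__main__":
    uri = build_mcool_uri("path/to/sample.mcool")
    print(uri)
    print(split_mcool_uri(uri))
```

Splitting the URI back into its parts is one way to verify that ${resolution} matches the resolution embedded in ${mcool_uri} before any tool is called.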
Step 1 — Initialize Project & Locate Genome FASTA
- Make a directory for this project:
Call:
mcp__project-init-tools__project_init
with:
- sample: the user-provided sample name
- task: hic_matrix_qc
The tool will:
- Create the ${sample}_hic_matrix_qc directory.
- Return the full path of the ${sample}_hic_matrix_qc directory, which will be used as ${proj_dir}.
- If the user provides a .hic file, convert it to a .mcool file using the mcp__HiCExplorer-tools__hic_to_mcool tool:
Call:
mcp__HiCExplorer-tools__hic_to_mcool
with:
- input_hic: the user-provided path (e.g. input.hic)
- sample: the user-provided sample name
- proj_dir: directory to save the view file. In this skill, it is the full path of the ${sample}_hic_matrix_qc directory returned by mcp__project-init-tools__project_init.
The tool will:
- Convert the .hic file to a .mcool file.
- Return the path of the .mcool file.
If the conversion is successful, update ${mcool_uri} to the path of the .mcool file.
- Locate the genome FASTA file:
Call:
mcp__genome-locate-tools__genome_locate_fasta
with:
- genome: the user-provided genome assembly
The tool will:
- Locate genome FASTA.
- Verify the FASTA exists.
Step 2: List Available Resolutions in the .mcool file & Modify the Chromosome Names if Necessary
- Check the resolutions in mcool_uri:
Call:
mcp__cooler-tools__list_mcool_resolutions
with:
- mcool_path: the user-provided path (e.g. input.mcool) without resolution specified.
The tool will:
- List all resolutions in the .mcool file.
- Return the resolutions as a list.
If the user-defined or default ${resolution} is not found in the list, ask the user to specify the resolution again.
Otherwise, use ${resolution} for the following steps.
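The check described above amounts to a simple membership test against the list returned by the resolution-listing tool. The following is a minimal sketch of that decision; confirm_resolution is a hypothetical helper name, and the list of resolutions is assumed to arrive as integers or integer-like strings.

```python
def confirm_resolution(requested, available):
    """Decide whether the requested resolution can be used.

    Returns (ok, message). When ok is False, the message is what the user
    would be asked, including the resolutions that are actually present.
    """
    available = sorted(int(r) for r in available)
    if requested in available:
        return True, f"Using resolution {requested} bp."
    listing = ", ".join(str(r) for r in available)
    return False, (
        f"Resolution {requested} bp not found in the .mcool file. "
        f"Available resolutions: {listing}. Please specify one of these."
    )


if __name__ == "__main__":
    print(confirm_resolution(100000, [1000, 10000, 100000, 1000000]))
    print(confirm_resolution(50000, [1000, 10000, 100000, 1000000]))
```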
- Check whether the chromosome names in the .mcool file start with "chr"; if not, modify them to start with "chr":
Call:
mcp__cooler-tools__harmonize_chrom_names
with:
- sample: the user-provided sample name
- proj_dir: directory to save output files. In this skill, it is the full path of the ${sample}_hic_matrix_qc directory returned by mcp__project-init-tools__project_init
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
The tool will:
- Check whether the chromosome names in the .mcool file start with "chr".
- If not, harmonize the chromosome names in the .mcool file.
- If the chromosome names are modified, return the path of the modified .mcool file under the ${proj_dir}/ directory.
Step 3: Compute coverage and cis/trans ratio
- Quantify cis and total coverage and derive cis/trans ratio at the chosen resolution.
- If the cooler is unbalanced or has a different weight column name, ask the user for the correct weight name or whether to use raw counts (empty --clr_weight_name).
Call:
mcp__cooler-tools__compute_coverage_and_cis_trans
with:
- proj_dir: directory to save the view file. In this skill, it is the full path of the ${sample}_hic_matrix_qc directory returned by mcp__project-init-tools__project_init.
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- clr_weight_name: name of the weight column (default: weight)
- cis_column: name of the cis column (default: cov_cis_weight)
- total_column: name of the total column (default: cov_tot_weight)
The tool will:
- Compute coverage and cis/trans ratio at the chosen resolution.
- Output:
  - coverage.${resolution}.tsv
  - cis_trans_summary.${resolution}.txt
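To make the summary step concrete, here is a minimal sketch of how cis, trans, and total counts and their ratios could be derived from per-bin coverage values. It is an illustration of the arithmetic only, not the tool's implementation. The column names are the defaults listed above; the function name summarize_cis_trans and the in-memory row format are assumptions.

```python
import math


def summarize_cis_trans(rows, cis_column="cov_cis_weight", total_column="cov_tot_weight"):
    """Sum per-bin cis and total coverage and derive trans counts and ratios.

    rows: iterable of dicts, one per bin (e.g. parsed from the coverage table).
    Bins with missing or NaN coverage (such as masked bins) are skipped.
    """
    cis = 0.0
    total = 0.0
    for row in rows:
        c, t = row.get(cis_column), row.get(total_column)
        if c is None or t is None or math.isnan(c) or math.isnan(t):
            continue
        cis += c
        total += t
    trans = total - cis
    return {
        "cis": cis,
        "total": total,
        "trans": trans,
        "cis_fraction": cis / total if total > 0 else math.nan,
        "cis_trans_ratio": cis / trans if trans > 0 else math.inf,
    }


if __name__ == "__main__":
    example = [
        {"cov_cis_weight": 8.0, "cov_tot_weight": 10.0},
        {"cov_cis_weight": 6.0, "cov_tot_weight": 10.0},
        {"cov_cis_weight": math.nan, "cov_tot_weight": math.nan},  # masked bin
    ]
    print(summarize_cis_trans(example))
```

Because both sums run over the same bins, the ratios do not depend on whether each contact is counted once or once per bin end.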
Step 4: Assess coverage uniformity
Assess coverage uniformity by plotting a histogram of per-bin coverage.
Call:
mcp__plot-hic-tools__plot_coverage_histogram
with:
- sample: the user-provided sample name
- proj_dir: directory to save the view file. In this skill, it is the full path of the ${sample}_hic_matrix_qc directory returned by mcp__project-init-tools__project_init.
- resolution: ${resolution}
- column: which column to histogram, e.g. cis, total, or n_valid (default: cis)
- bins: number of histogram bins (default: 50)
The tool will:
- Draw the histogram of per-bin coverage.
- Return the path of the coverage histogram plot.
After the tool runs, inform the user of the following:
- A reasonably broad distribution is expected; a long tail is common.
- Many zero-coverage bins may indicate insufficient depth at this resolution.
- A few bins with extremely high coverage may indicate local artifacts (e.g. centromeres, rDNA, mapping issues).
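The interpretation points above can be backed by a few summary numbers. The sketch below computes the fraction of zero-coverage bins, the coefficient of variation of non-zero bins, and a count of unusually high bins. The function name coverage_uniformity and the cutoff of 5 times the median are illustrative assumptions, not thresholds defined by this skill.

```python
import statistics


def coverage_uniformity(values, high_factor=5.0):
    """Summarize per-bin coverage.

    values: per-bin coverage (one number per bin). NaN bins are treated like
    zero-coverage bins here, because 'v > 0' is False for NaN.
    high_factor: hypothetical cutoff; bins above high_factor * median are
    reported as candidate local artifacts.
    """
    values = [float(v) for v in values]
    if not values:
        raise ValueError("no coverage values provided")
    nonzero = [v for v in values if v > 0]
    summary = {
        "n_bins": len(values),
        "zero_fraction": (len(values) - len(nonzero)) / len(values),
    }
    if nonzero:
        median = statistics.median(nonzero)
        mean = statistics.fmean(nonzero)
        summary["median_nonzero"] = median
        summary["cv_nonzero"] = statistics.pstdev(nonzero) / mean
        summary["n_high_bins"] = sum(1 for v in nonzero if v > high_factor * median)
    return summary


if __name__ == "__main__":
    print(coverage_uniformity([0, 0, 10, 10, 10, 10, 10, 10, 10, 100]))
```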
Step 5: Compute cis expected and P(s) contact decay curve
- Use bioframe to define chromosome arms based on centromeres:
Call:
mcp__cooler-tools__make_view_chromarms
with:
- sample: the user-provided sample name
- proj_dir: directory to save the view file. In this skill, it is the full path of the ${sample}_hic_matrix_qc directory returned by mcp__project-init-tools__project_init.
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- genome: the user-provided genome assembly
The tool will:
- Fetch chromsizes and centromeres via bioframe.
- Generate chromosomal arms and filter them to those present in the cooler.
- Return the path of the view file under the ${proj_dir}/temp/ directory.
- Calculate expected cis:
Call:
mcp__cooltools-tools__run_expected_cis
with:
- sample: the user-provided sample name
- proj_dir: directory to save the view file. In this skill, it is the full path of the ${sample}_hic_matrix_qc directory returned by mcp__project-init-tools__project_init.
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
- view_path: the path to the view file (e.g. ${proj_dir}/temp/view_${genome}.tsv)
- expected_cis_tsv: the path to the expected cis file (e.g. ${proj_dir}/temp/expected_cis.${resolution}.tsv)
- clr_weight_name: the name of the weight column (default: weight)
- ignore_diags: the number of diagonals to ignore based on resolution
The tool will:
- Generate expected cis file.
- Return the path of the expected cis file.
- Plot the P(s) curve (log–log distance vs expected contacts) and decay curve (raw counts vs balanced P(s))
Call:
mcp__plot-hic-tools__plot_ps_and_decay
with:
- sample: the user-provided sample name
- proj_dir: directory to save the view file. In this skill, it is the full path of the ${sample}_hic_matrix_qc directory returned by mcp__project-init-tools__project_init.
- mcool_uri: cooler URI with resolution specified, e.g. input.mcool::/resolutions/${resolution}
- resolution: ${resolution} must be the same as the resolution used for ${mcool_uri} and must be an integer
The tool will:
- Plot the P(s) curve (log–log distance vs expected contacts)
- Plot the decay curve (raw counts vs balanced P(s))
- Return the path of the P(s) curve plot and decay curve plot.
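The optional ps_scaling_summary table records P(s) slopes in defined distance ranges. One common way to estimate such a slope is an ordinary least-squares fit in log-log space, sketched below. This is an assumption about how the slope could be computed, not a description of the plotting tool; the function name ps_slope and the example distance range are hypothetical.

```python
import math


def ps_slope(distances, probs, s_min, s_max):
    """Least-squares slope of log10(P) against log10(s) for s_min <= s <= s_max.

    distances: genomic separations in bp; probs: contact probability at each
    separation. Non-positive or NaN values are dropped before fitting.
    """
    points = [
        (math.log10(s), math.log10(p))
        for s, p in zip(distances, probs)
        if s_min <= s <= s_max and s > 0 and p > 0  # NaN fails 'p > 0'
    ]
    if len(points) < 2:
        raise ValueError("need at least two valid points in the distance range")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all distances in the range are identical")
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic curve with P(s) proportional to 1/s, sampled at 100 kb steps.
    s = [100_000 * i for i in range(1, 51)]
    p = [1.0 / d for d in s]
    print(ps_slope(s, p, 100_000, 5_000_000))
```

On the synthetic 1/s curve the fitted slope comes out at approximately -1, which is a quick sanity check before applying the fit to real expected-cis output.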
Step 6: Replicate correlation of Hi-C matrices (optional)
- Quantify similarity between Hi-C replicates at matrix level.
- Assumes:
- At least two .mcool files (e.g. rep1.mcool, rep2.mcool, etc.).
- Same genome assembly and resolution.
Call:
mcp__plot-hic-tools__replicate_correlation
with:
- mcool_uris: list of cooler URIs with resolution specified, one per replicate, e.g. ['rep1.mcool::/resolutions/${resolution}', 'rep2.mcool::/resolutions/${resolution}']
- output_prefix: prefix for the output files, e.g. hic_qc_replicates_${resolution}
- chroms: list of chromosomes to use for correlation, e.g. ['chr1', 'chr2']
- use_balanced: whether to use balanced matrices (default: True)
The tool will:
- Compute replicate correlation of Hi-C matrices.
- Return the path of the replicate correlation file.
After the tool runs, inform the user of the following:
- Very low correlations (<0.7) between supposed biological replicates may indicate experimental issues or mismatched samples.
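As an illustration of the pairwise comparison and the 0.7 warning level mentioned above, the sketch below computes Pearson correlations between flattened replicate vectors and flags low pairs. Whether the tool uses Pearson or another coefficient is not stated in this skill, so that choice is an assumption, as are the names pearson and replicate_report.

```python
import itertools
import math


def pearson(x, y):
    """Pearson correlation over positions where both values are finite."""
    pairs = [(a, b) for a, b in zip(x, y) if math.isfinite(a) and math.isfinite(b)]
    if len(pairs) < 2:
        raise ValueError("need at least two finite pairs")
    mean_x = sum(a for a, _ in pairs) / len(pairs)
    mean_y = sum(b for _, b in pairs) / len(pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("a vector is constant; correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


def replicate_report(vectors, threshold=0.7):
    """Return (rep_a, rep_b, r, flagged) for every replicate pair.

    vectors: dict mapping replicate name to a flattened matrix vector, all
    taken from the same bins at the same resolution.
    flagged is True when r is below the threshold.
    """
    report = []
    for a, b in itertools.combinations(sorted(vectors), 2):
        r = pearson(vectors[a], vectors[b])
        report.append((a, b, r, r < threshold))
    return report


if __name__ == "__main__":
    demo = {
        "rep1": [1.0, 2.0, 3.0, 4.0],
        "rep2": [2.0, 4.0, 6.0, 8.5],
        "rep3": [4.0, 1.0, 3.0, 2.0],
    }
    for row in replicate_report(demo):
        print(row)
```

Dropping positions where either replicate is NaN matters for balanced matrices, where masked bins carry no value.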
Notes & troubleshooting
- If balancing weights are missing or correlation is calculated on raw counts, explicitly record this in logs and interpret with caution.