# mithril-checkpoint-agent

Build mithril-checkpoint compression for PyTorch models. Use when implementing byte grouping, the compression pipeline, or checkpoint I/O.
## Install

Source: clone the upstream repo.

```shell
git clone https://github.com/majiayu000/claude-skill-registry-data
```

Claude Code: install into `~/.claude/skills/`.

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry-data "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/mithril-checkpoint-agent" ~/.claude/skills/majiayu000-claude-skill-registry-data-mithril-checkpoint-agent && rm -rf "$T"
```
Manifest: `data/mithril-checkpoint-agent/SKILL.md`
## Mithril Checkpoint Agent

Build checkpoint compression for PyTorch models with 10-20x lossless compression.
### Status

Read `crates/mithril-checkpoint/STATUS.md` for current progress.
### Reference Documentation

- `SPEC.md`: full product specification
- `checkpoint/SPEC.md`: detailed implementation spec (if it exists)
- `RESEARCH.md`: papers and prior art (LMC, ZipNN, Check-N-Run)
### Module Responsibilities

#### bytegroup

bfloat16 byte grouping for better compression:

```rust
/// Group bf16 bytes: [h0,l0,h1,l1,...] → [h0,h1,...,l0,l1,...]
pub fn byte_group_bf16(data: &[u8]) -> Vec<u8> {
    let n = data.len() / 2;
    let mut grouped = Vec::with_capacity(data.len());
    for i in 0..n { grouped.push(data[i * 2]); }     // high bytes
    for i in 0..n { grouped.push(data[i * 2 + 1]); } // low bytes
    grouped
}

/// Ungroup: [h0,h1,...,l0,l1,...] → [h0,l0,h1,l1,...]
pub fn byte_ungroup_bf16(data: &[u8]) -> Vec<u8>;
```
Why: the high bytes (sign and exponent) vary little across a tensor, so grouping them together compresses better, roughly a 20% improvement in ratio.
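Since `byte_ungroup_bf16` is only declared above, a roundtrip sketch follows; the function names match the module's declared API, but the bodies here are illustrative, not the crate's actual implementation.

```rust
// Sketch: byte_ungroup_bf16 as the inverse permutation of byte_group_bf16.
fn byte_group_bf16(data: &[u8]) -> Vec<u8> {
    let n = data.len() / 2;
    let mut out = Vec::with_capacity(data.len());
    for i in 0..n { out.push(data[i * 2]); }     // high bytes first
    for i in 0..n { out.push(data[i * 2 + 1]); } // then low bytes
    out
}

fn byte_ungroup_bf16(data: &[u8]) -> Vec<u8> {
    let n = data.len() / 2;
    let mut out = Vec::with_capacity(data.len());
    for i in 0..n {
        out.push(data[i]);     // high byte of value i
        out.push(data[n + i]); // low byte of value i
    }
    out
}

fn main() {
    let original: Vec<u8> = (0u8..8).collect(); // four interleaved bf16 values
    let grouped = byte_group_bf16(&original);
    assert_eq!(grouped, vec![0, 2, 4, 6, 1, 3, 5, 7]);
    assert_eq!(byte_ungroup_bf16(&grouped), original); // bit-exact roundtrip
}
```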
#### pipeline

Compression pipeline combining byte grouping + zstd:

```rust
pub struct CheckpointCompressor {
    compressor: ZstdCompressor,
}

impl CheckpointCompressor {
    pub fn compress(&self, data: &[u8], dtype: DType) -> Result<Vec<u8>> {
        let grouped = match dtype {
            DType::BFloat16 | DType::Float16 => byte_group_bf16(data),
            _ => data.to_vec(),
        };
        self.compressor.compress(&grouped)
    }

    pub fn decompress(&self, data: &[u8], dtype: DType, size: usize) -> Result<Vec<u8>>;
}
```
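A self-contained sketch of the dtype dispatch, with the zstd backend replaced by an identity closure so it runs without external crates; in the real pipeline `ZstdCompressor` plugs in where the closure sits, and the `DType` variants shown are assumed from the snippet above.

```rust
#[derive(Clone, Copy)]
enum DType { BFloat16, Float16, Float32 }

fn byte_group_bf16(data: &[u8]) -> Vec<u8> {
    let n = data.len() / 2;
    let mut out = Vec::with_capacity(data.len());
    for i in 0..n { out.push(data[i * 2]); }
    for i in 0..n { out.push(data[i * 2 + 1]); }
    out
}

// Generic over the backend so a real zstd compressor can be plugged in later.
struct Pipeline<C: Fn(&[u8]) -> Vec<u8>> { backend: C }

impl<C: Fn(&[u8]) -> Vec<u8>> Pipeline<C> {
    fn compress(&self, data: &[u8], dtype: DType) -> Vec<u8> {
        let grouped = match dtype {
            DType::BFloat16 | DType::Float16 => byte_group_bf16(data),
            _ => data.to_vec(), // non-16-bit dtypes pass through ungrouped
        };
        (self.backend)(&grouped)
    }
}

fn main() {
    let p = Pipeline { backend: |b: &[u8]| b.to_vec() }; // identity stand-in for zstd
    // 16-bit dtypes are byte-grouped before the backend runs.
    assert_eq!(p.compress(&[1, 2, 3, 4], DType::BFloat16), vec![1, 3, 2, 4]);
    // Other dtypes reach the backend untouched.
    assert_eq!(p.compress(&[1, 2, 3, 4], DType::Float32), vec![1, 2, 3, 4]);
}
```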
#### formats

Read PyTorch checkpoint formats:

- `state_dict`: PyTorch pickle format
- `safetensors`: HuggingFace format (preferred)
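For the safetensors reader, the container layout is a fixed 8-byte little-endian header-length prefix, a JSON header of that length, then the raw tensor bytes. A minimal splitting sketch (a real reader would also decode the JSON header, e.g. with serde_json; the function name is illustrative):

```rust
// Split a safetensors buffer into (JSON header bytes, tensor data bytes).
// Layout: [u64 LE header length][JSON header][tensor data].
fn split_safetensors(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    let len_bytes: [u8; 8] = buf.get(..8)?.try_into().ok()?;
    let n = u64::from_le_bytes(len_bytes) as usize;
    let header = buf.get(8..8 + n)?;
    let data = buf.get(8 + n..)?;
    Some((header, data))
}

fn main() {
    // Build a tiny synthetic file: empty JSON header plus four data bytes.
    let mut file = Vec::new();
    file.extend_from_slice(&2u64.to_le_bytes());
    file.extend_from_slice(b"{}");
    file.extend_from_slice(&[1, 2, 3, 4]);

    let (header, data) = split_safetensors(&file).unwrap();
    assert_eq!(header, &b"{}"[..]);
    assert_eq!(data, &[1u8, 2, 3, 4][..]);
    assert!(split_safetensors(&[0u8; 4]).is_none()); // too short for the prefix
}
```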
### Target Metrics
| Metric | Target |
|---|---|
| Compression ratio | ≥10x (lossless) |
| Throughput | ≥2.5 GiB/s |
| Memory overhead | ≤2x checkpoint size |
### Key Dependencies

```toml
mithril-core = { workspace = true }
zstd = { workspace = true }
rayon = { workspace = true } # Parallel compression
```
### Test Fixtures

- `fixtures/checkpoints/small_model.bin`: 10 MB of bf16 test data
### Testing

```shell
cargo test -p mithril-checkpoint
cargo bench -p mithril-checkpoint
```
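The bit-exact roundtrip criterion can be sketched as a test on deterministic pseudo-random bytes, using a small linear congruential generator as a stand-in for real checkpoint fixtures; the grouping bodies mirror the bytegroup spec and are illustrative.

```rust
fn byte_group_bf16(data: &[u8]) -> Vec<u8> {
    let n = data.len() / 2;
    let mut out = Vec::with_capacity(data.len());
    for i in 0..n { out.push(data[i * 2]); }
    for i in 0..n { out.push(data[i * 2 + 1]); }
    out
}

fn byte_ungroup_bf16(data: &[u8]) -> Vec<u8> {
    let n = data.len() / 2;
    let mut out = Vec::with_capacity(data.len());
    for i in 0..n {
        out.push(data[i]);
        out.push(data[n + i]);
    }
    out
}

fn main() {
    // Knuth-style LCG for reproducible "random" tensor bytes.
    let mut state: u64 = 0x2545_F491;
    let data: Vec<u8> = (0..4096)
        .map(|_| {
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            (state >> 56) as u8 // take the high byte of each state
        })
        .collect();
    let restored = byte_ungroup_bf16(&byte_group_bf16(&data));
    assert_eq!(restored, data, "roundtrip must be bit-exact");
}
```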
### Implementation Order

1. Implement the `bytegroup` module with tests
2. Implement the `pipeline` module
3. Add format readers (safetensors first)
4. Run benchmarks, optimize for throughput
5. Update STATUS.md
### Completion Criteria
- Compress/decompress roundtrip is bit-exact
- ≥10x compression on bf16 data
- ≥2.5 GiB/s throughput
- Unit tests pass
- STATUS.md updated to COMPLETE