# axiom-ios-ml

Use when deploying ANY machine learning model on-device, converting models to CoreML, compressing models, or implementing speech-to-text. Covers CoreML conversion, MLTensor, model compression (quantization/palettization/pruning), stateful models, KV-cache, multi-function models, async prediction, SpeechAnalyzer, SpeechTranscriber.

Install from [claude-skill-registry](https://github.com/majiayu000/claude-skill-registry):

```sh
# Clone the full registry...
git clone https://github.com/majiayu000/claude-skill-registry

# ...or install just this skill:
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/axiom-ios-ml" ~/.claude/skills/majiayu000-claude-skill-registry-axiom-ios-ml && rm -rf "$T"
```

`skills/data/axiom-ios-ml/SKILL.md`:

# iOS Machine Learning Router
You MUST use this skill for ANY on-device machine learning or speech-to-text work.
## When to Use
Use this router when:
- Converting PyTorch/TensorFlow models to CoreML
- Deploying ML models on-device
- Compressing models (quantization, palettization, pruning)
- Working with large language models (LLMs)
- Implementing KV-cache for transformers
- Using MLTensor for model stitching
- Building speech-to-text features
- Transcribing audio (live or recorded)
## Routing Logic
### CoreML Work
Implementation patterns → `/skill coreml`
- Model conversion workflow
- MLTensor for model stitching
- Stateful models with KV-cache
- Multi-function models (adapters/LoRA)
- Async prediction patterns (see the sketch after this list)
- Compute unit selection
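For reference, a minimal async-prediction sketch using the iOS 16+/17+ async Core ML APIs. The `Classifier` model name and its `input` feature are hypothetical placeholders, not part of this skill:

```swift
import CoreML

// Minimal async-prediction sketch. "Classifier" and its "input" feature
// name are hypothetical; substitute your model's own.
func classify() async throws -> MLFeatureProvider {
    let url = Bundle.main.url(forResource: "Classifier", withExtension: "mlmodelc")!

    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine   // prefer the Neural Engine

    // Async load (iOS 16+) keeps first-launch compilation off the main thread.
    let model = try await MLModel.load(contentsOf: url, configuration: config)

    let input = try MLDictionaryFeatureProvider(dictionary: ["input": 1.0])

    // Async prediction (iOS 17+): several calls can be in flight concurrently.
    return try await model.prediction(from: input, options: MLPredictionOptions())
}
```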
API reference → `/skill coreml-ref`
- CoreML Tools Python API
- MLModel lifecycle
- MLTensor operations (see the sketch after this list)
- MLComputeDevice availability
- State management APIs
- Performance reports
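A small MLTensor sketch (iOS 18+), hedged against the published API: tensor operations are dispatched for you, and the work materializes when you await the result.

```swift
import CoreML

// MLTensor sketch (iOS 18+): on-device tensor math for stitching models
// together without hand-written Metal. Results materialize on await.
func tensorDemo() async {
    let a = MLTensor(shape: [2, 2], scalars: [1, 2, 3, 4], scalarType: Float.self)
    let b = MLTensor(shape: [2, 2], scalars: [5, 6, 7, 8], scalarType: Float.self)

    let sum = a + b             // element-wise addition
    let product = a.matmul(b)   // matrix multiplication

    // Awaiting the shaped array is what forces execution.
    let values = await product.shapedArray(of: Float.self)
    print(await sum.shapedArray(of: Float.self).scalars, values.scalars)
}
```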
Diagnostics → `/skill coreml-diag`
- Model won't load
- Slow inference
- Memory issues
- Compression accuracy loss
- Compute unit problems
### Speech Work
Implementation patterns → `/skill speech`
- SpeechAnalyzer setup (iOS 26+)
- SpeechTranscriber configuration
- Live transcription (see the sketch after this list)
- File transcription
- Volatile vs finalized results
- Model asset management
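A rough live-transcription sketch against the iOS 26 SpeechAnalyzer surface. This API is new, so treat the exact names as approximate and verify them against current documentation:

```swift
import Speech

// Live transcription sketch (iOS 26+). Volatile results stream in as the
// user speaks; finalized results replace them once the model commits.
func liveTranscribe() async throws {
    let transcriber = SpeechTranscriber(
        locale: Locale.current,
        transcriptionOptions: [],
        reportingOptions: [.volatileResults],   // opt in to in-progress text
        attributeOptions: []
    )

    // Download the on-device model for this locale if it's not installed.
    if let request = try await AssetInventory.assetInstallationRequest(supporting: [transcriber]) {
        try await request.downloadAndInstall()
    }

    let analyzer = SpeechAnalyzer(modules: [transcriber])
    let (inputSequence, inputBuilder) = AsyncStream<AnalyzerInput>.makeStream()
    try await analyzer.start(inputSequence: inputSequence)

    // Feed AnalyzerInput(buffer:) values into inputBuilder from an audio
    // engine tap; read transcription here as it arrives.
    _ = inputBuilder
    for try await result in transcriber.results {
        print(result.isFinal ? "final:" : "volatile:", String(result.text.characters))
    }
}
```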
## Decision Tree
```
User asks about on-device ML or speech
├─ Machine learning?
│  ├─ Implementing/converting? → coreml
│  ├─ Need API reference?      → coreml-ref
│  └─ Debugging issues?        → coreml-diag
└─ Speech-to-text?
   └─ Any speech work → speech
```
## Critical Patterns
coreml:
- Model conversion (PyTorch → CoreML)
- Compression (palettization, quantization, pruning)
- Stateful KV-cache for LLMs (see the sketch after this list)
- Multi-function models for adapters
- MLTensor for pipeline stitching
- Async concurrent prediction
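A sketch of the stateful KV-cache decode loop (iOS 18+), assuming a model already converted with a state for its cache; the `tokenID` and `logits` feature names are hypothetical:

```swift
import CoreML

// Stateful decode loop (iOS 18+): the KV-cache lives in an MLState that
// Core ML carries between calls, so each step feeds only the new token.
func decode(model: MLModel, tokenIDs: [Int64]) throws {
    // One state per generation session; it accumulates the transformer's cache.
    let state = model.makeState()

    for token in tokenIDs {
        let input = try MLDictionaryFeatureProvider(
            dictionary: ["tokenID": MLFeatureValue(int64: token)]
        )
        // Same model + same state = the cache grows across iterations.
        let output = try model.prediction(from: input, using: state)
        _ = output.featureValue(for: "logits")   // hypothetical output name
    }
}
```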
coreml-diag:
- Load failures and caching (see the sketch after this list)
- Inference performance issues
- Memory pressure from models
- Accuracy degradation from compression
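For load failures and first-launch slowness with downloaded models, a common mitigation is to compile once and cache the compiled bundle; a sketch, with hypothetical paths:

```swift
import CoreML

// First-launch load cost is often on-device compilation. Compile a
// downloaded .mlpackage/.mlmodel once, then reuse the cached .mlmodelc.
func cachedCompiledModel(for downloadedURL: URL) async throws -> URL {
    let caches = try FileManager.default.url(
        for: .cachesDirectory, in: .userDomainMask,
        appropriateFor: nil, create: true
    )
    let cached = caches.appendingPathComponent("Model.mlmodelc")

    if !FileManager.default.fileExists(atPath: cached.path) {
        // Async compile (iOS 16+); keep it off the app's hot path.
        let compiled = try await MLModel.compileModel(at: downloadedURL)
        // The compiled bundle lands in a temporary directory; move to keep it.
        try FileManager.default.moveItem(at: compiled, to: cached)
    }
    return cached
}
```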
speech:
- SpeechAnalyzer + SpeechTranscriber setup
- AssetInventory model management
- Live transcription with volatile results
- Audio format conversion (sketch below)
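A format-conversion sketch with AVAudioConverter: microphone taps deliver the hardware format, while the analyzer usually wants something else. The target format is the caller's assumption here; in real code, query the transcriber for its preferred format.

```swift
import AVFoundation

// Convert one PCM buffer to the analyzer's preferred format.
func convert(_ buffer: AVAudioPCMBuffer, to format: AVAudioFormat) -> AVAudioPCMBuffer? {
    guard let converter = AVAudioConverter(from: buffer.format, to: format) else { return nil }

    // Size the output for any sample-rate change.
    let ratio = format.sampleRate / buffer.format.sampleRate
    let capacity = AVAudioFrameCount(Double(buffer.frameLength) * ratio) + 1
    guard let out = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: capacity) else { return nil }

    var error: NSError?
    var consumed = false
    let status = converter.convert(to: out, error: &error) { _, inputStatus in
        // Hand over the single input buffer exactly once.
        if consumed { inputStatus.pointee = .noDataNow; return nil }
        consumed = true
        inputStatus.pointee = .haveData
        return buffer
    }
    return (status == .haveData && error == nil) ? out : nil
}
```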
## Example Invocations
User: "How do I convert a PyTorch model to CoreML?" → Invoke:
/skill coreml
User: "Compress my model to fit on iPhone" → Invoke:
/skill coreml
User: "Implement KV-cache for my language model" → Invoke:
/skill coreml
User: "Model loads slowly on first launch" → Invoke:
/skill coreml-diag
User: "My compressed model has bad accuracy" → Invoke:
/skill coreml-diag
User: "Add live transcription to my app" → Invoke:
/skill speech
User: "Transcribe audio files with SpeechAnalyzer" → Invoke:
/skill speech
User: "What's MLTensor and how do I use it?" → Invoke:
/skill coreml-ref