Skills · pytorch
install
source · Clone the upstream repo
git clone https://github.com/TerminalSkills/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/TerminalSkills/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/pytorch" ~/.claude/skills/terminalskills-skills-pytorch && rm -rf "$T"
manifest: skills/pytorch/SKILL.md
safety · automated scan (low risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
- eval/exec/Function constructor
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content
PyTorch
Overview
PyTorch is a deep learning framework for building and training neural networks with dynamic computation graphs and automatic differentiation. It provides tensor operations with GPU acceleration,
nn.Module for defining architectures, DataLoader for efficient data loading, mixed precision training for performance, and export tools (TorchScript, ONNX) for production deployment.
Instructions
- When defining models, subclass nn.Module with __init__ for layers and forward for computation, using nn.Sequential for simple stacks and custom forward logic for complex architectures (see the model sketch after this list).
- When training, implement the standard loop: forward pass, loss computation, loss.backward(), optimizer.step(), and optimizer.zero_grad(), with gradient clipping via clip_grad_norm_ for stability (see the training-loop sketch below).
- When loading data, subclass Dataset with __len__ and __getitem__, then use DataLoader with num_workers=4 and pin_memory=True for GPU training throughput (see the data-loading sketch below).
- When optimizing performance, use torch.compile(model) on PyTorch 2.0+ for 20-50% speedup, mixed precision with torch.amp.autocast() for halved memory and doubled throughput, and DistributedDataParallel for multi-GPU training (the training-loop sketch below folds in compile and autocast).
- When doing transfer learning, load pretrained models from torchvision.models or Hugging Face, freeze the backbone, and replace the classifier head for your task (Example 1 below walks through this).
- When deploying, use torch.export() or torch.jit.trace() for production, torch.onnx.export() for cross-framework compatibility, and torch.quantization for INT8 inference speedup (see the export sketch below).
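A minimal sketch of the two model-definition styles from the first instruction. The layer sizes, class name, and residual wiring are illustrative, not part of the skill.

```python
import torch
import torch.nn as nn

# Simple stack: nn.Sequential is enough when data flows straight through.
mlp = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Custom forward logic: subclass nn.Module, define layers in __init__
# and computation in forward (here, an illustrative residual connection).
class ResidualBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.fc2 = nn.Linear(dim, dim)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(x + self.fc2(self.act(self.fc1(x))))

out = ResidualBlock(64)(torch.randn(8, 64))  # shape: (8, 64)
```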
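A sketch of the Dataset/DataLoader pattern from the data-loading instruction, assuming an in-memory tensor dataset; the class name and shapes are invented for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class TensorPairDataset(Dataset):
    """Illustrative Dataset over in-memory tensors. __getitem__ returns
    CPU tensors so the Dataset stays device-agnostic."""
    def __init__(self, features: torch.Tensor, labels: torch.Tensor):
        self.features, self.labels = features, labels

    def __len__(self) -> int:
        return len(self.labels)

    def __getitem__(self, idx: int):
        return self.features[idx], self.labels[idx]

ds = TensorPairDataset(torch.randn(1000, 64), torch.randint(0, 10, (1000,)))
loader = DataLoader(
    ds,
    batch_size=32,
    shuffle=True,
    num_workers=4,    # parallel workers to keep the GPU fed
    pin_memory=True,  # faster host-to-GPU copies when training on CUDA
)
```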
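One possible shape for the training and performance instructions combined: the standard loop with gradient clipping, plus torch.compile and torch.amp mixed precision. The stand-in model, learning rate, and epoch count are placeholders; the torch.amp.GradScaler device-string constructor assumes PyTorch 2.3+; `loader` is the DataLoader from the previous sketch.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(64, 10)).to(device)  # stand-in model
model = torch.compile(model)                         # PyTorch 2.0+
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()
# Loss scaling keeps small fp16 gradients from underflowing; disabled on CPU.
scaler = torch.amp.GradScaler(device, enabled=(device == "cuda"))

for epoch in range(3):
    for x, y in loader:  # `loader` comes from the data-loading sketch above
        x = x.to(device, non_blocking=True)
        y = y.to(device, non_blocking=True)
        optimizer.zero_grad()
        with torch.amp.autocast(device_type=device, enabled=(device == "cuda")):
            loss = loss_fn(model(x), y)
        scaler.scale(loss).backward()
        scaler.unscale_(optimizer)  # clip the true, unscaled gradients
        nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        scaler.step(optimizer)
        scaler.update()
```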
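A sketch of the deployment options named in the last instruction, applied to a stand-in model. The file names are illustrative; torch.export requires PyTorch 2.1+ (the callable is torch.export.export), and quantize_dynamic targets CPU inference.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 10)).eval()  # stand-in trained model
example = torch.randn(1, 64)

# torch.export (PyTorch 2.1+): ahead-of-time graph capture for production.
exported = torch.export.export(model, (example,))

# TorchScript tracing: records the ops executed on the example input.
traced = torch.jit.trace(model, example)
traced.save("model_traced.pt")

# ONNX: cross-framework interchange format.
torch.onnx.export(model, (example,), "model.onnx")

# Dynamic INT8 quantization of Linear layers for faster CPU inference.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
```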
Examples
Example 1: Fine-tune a vision model for image classification
User request: "Fine-tune a pretrained ResNet for classifying product images"
Actions:
- Load resnet50(weights=ResNet50_Weights.DEFAULT) and freeze all layers except the final classifier
- Replace the classifier head with nn.Linear(2048, num_classes)
- Set up DataLoader with image augmentation transforms (RandomCrop, ColorJitter, Normalize)
- Train with AdamW, CosineAnnealingLR scheduler, and mixed precision
Output: A fine-tuned image classifier with production-quality accuracy and efficient mixed-precision training.
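A hedged sketch of the steps above; num_classes, the learning rate, and the schedule length are placeholders for whatever the product dataset needs.

```python
import torch
import torch.nn as nn
from torchvision import transforms
from torchvision.models import resnet50, ResNet50_Weights

device = "cuda" if torch.cuda.is_available() else "cpu"
num_classes = 12  # hypothetical number of product categories

# Load pretrained weights and freeze the backbone.
model = resnet50(weights=ResNet50_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False

# Replace the classifier head; only this layer is trained.
model.fc = nn.Linear(2048, num_classes)
model = model.to(device)

# The augmentation pipeline named in the steps above.
train_tf = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

optimizer = torch.optim.AdamW(model.fc.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
# Training then follows the mixed-precision loop sketched under Instructions,
# calling scheduler.step() once per epoch.
```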
Example 2: Train a text classification model with Hugging Face
User request: "Build a sentiment analysis model using a pretrained transformer"
Actions:
- Load AutoModel.from_pretrained("bert-base-uncased") with a classification head
- Tokenize the dataset using AutoTokenizer and create a DataLoader
- Fine-tune with AdamW, linear warmup scheduler, and gradient clipping
- Export the trained model with torch.export() for production serving
Output: A sentiment analysis model fine-tuned on custom data and exported for production inference.
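A sketch of the steps above, assuming the Hugging Face transformers package is installed. AutoModelForSequenceClassification is used here as the usual way to get "a classification head" on BERT; the two-sample dataset and hyperparameters are placeholders.

```python
import torch
from torch.utils.data import DataLoader
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          get_linear_schedule_with_warmup)

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # binary sentiment head
).to(device)

texts = ["great product", "terrible service"]  # stand-in dataset
labels = torch.tensor([1, 0])
enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
loader = DataLoader(
    list(zip(enc["input_ids"], enc["attention_mask"], labels)), batch_size=2
)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=10, num_training_steps=100
)

model.train()
for input_ids, attention_mask, y in loader:
    optimizer.zero_grad()
    out = model(input_ids=input_ids.to(device),
                attention_mask=attention_mask.to(device),
                labels=y.to(device))      # HF models compute loss from labels
    out.loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    optimizer.step()
    scheduler.step()
```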
Guidelines
- Use torch.compile(model) on PyTorch 2.0+ for a free 20-50% speedup with one line.
- Use AdamW over Adam for a correct weight decay implementation with modern architectures.
- Use mixed precision (torch.amp) for any GPU training to halve memory and double throughput.
- Move data to the device in the training loop, not in the Dataset, to keep the Dataset device-agnostic.
- Use model.eval() to switch off dropout and batch-norm updates, and torch.no_grad() to skip gradient computation, during inference (see the sketch after this list).
- Use pin_memory=True in DataLoader when training on GPU to speed up CPU-to-GPU data transfer.
- Save model.state_dict(), not the full model, since state dicts are portable across code changes.
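A short sketch combining the inference and checkpointing guidelines; the stand-in model and file name are illustrative.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 10))  # stand-in trained model

# Save only the weights; re-instantiate the class to load them later.
torch.save(model.state_dict(), "checkpoint.pt")

restored = nn.Sequential(nn.Linear(64, 10))
restored.load_state_dict(torch.load("checkpoint.pt"))

# Inference: eval() switches off dropout/batch-norm updates,
# no_grad() skips building the autograd graph.
restored.eval()
with torch.no_grad():
    probs = torch.softmax(restored(torch.randn(1, 64)), dim=-1)
```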