Skillsbench modal-gpu
Run Python code on cloud GPUs using the Modal serverless platform. Use when you need T4/A10G/A100 GPU access for training ML models. Covers Modal app setup, GPU selection, data downloading inside functions, and result handling.
install
source · Clone the upstream repo
git clone https://github.com/benchflow-ai/skillsbench
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/benchflow-ai/skillsbench "$T" && mkdir -p ~/.claude/skills && cp -r "$T/tasks/mhc-layer-impl/environment/skills/modal-gpu" ~/.claude/skills/benchflow-ai-skillsbench-modal-gpu && rm -rf "$T"
manifest:
tasks/mhc-layer-impl/environment/skills/modal-gpu/SKILL.md · source content
Modal GPU Training
Overview
Modal is a serverless platform for running Python code on cloud GPUs. It provides:
- Serverless GPUs: On-demand access to T4, A10G, A100 GPUs
- Container Images: Define dependencies declaratively with pip
- Remote Execution: Run functions on cloud infrastructure
- Result Handling: Return Python objects from remote functions
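Result handling works because return values are serialized on the worker and shipped back to the local client, so anything a remote function returns must be picklable (Modal uses cloudpickle under the hood). A minimal local sketch of that constraint, using the plain `pickle` module as a stand-in:

```python
import pickle

# A metrics dict like the one a remote training function returns.
metrics = {"loss": 0.5, "epochs": 3, "gpu": "A100"}

# Round-trip through pickle, mimicking the serialization a remote
# call performs when shipping the return value back to the client.
restored = pickle.loads(pickle.dumps(metrics))
assert restored == metrics
```

Plain dicts, lists, numbers, and strings all survive the round trip; open file handles, live CUDA tensors, or lambdas generally do not, so convert results to simple Python objects before returning them.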
Two patterns:
- Single Function: Simple script with a single `@app.function` decorator
- Multi-Function: Complex workflows with multiple remote calls
Quick Reference
| Topic | Reference |
|---|---|
| Basic Structure | Getting Started |
| GPU Options | GPU Selection |
| Data Handling | Data Download |
| Results & Outputs | Results |
| Troubleshooting | Common Issues |
Installation
```shell
pip install modal
modal token set --token-id <id> --token-secret <secret>
```
Minimal Example
```python
import modal

app = modal.App("my-training-app")

image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "torch",
    "einops",
    "numpy",
)

@app.function(gpu="A100", image=image, timeout=3600)
def train():
    import torch

    device = torch.device("cuda")
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
    # Training code here
    return {"loss": 0.5}

@app.local_entrypoint()
def main():
    results = train.remote()
    print(results)
```
Common Imports
```python
import modal
from modal import Image, App

# Inside the remote function
import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download
```
When to Use What
| Scenario | Approach |
|---|---|
| Quick GPU experiments | `gpu="T4"` (16GB, cheapest) |
| Medium training jobs | `gpu="A10G"` (24GB) |
| Large-scale training | `gpu="A100"` (40/80GB, fastest) |
| Long-running jobs | Set `timeout=3600` or higher |
| Data from HuggingFace | Download inside the function with `hf_hub_download` |
| Return metrics | Return a dict from the function |
Running
```shell
# Run script
modal run train_modal.py

# Run in background
modal run --detach train_modal.py
```
External Resources
- Modal Documentation: https://modal.com/docs
- Modal Examples: https://github.com/modal-labs/modal-examples