Daft daft-udf-tuning
Optimize Daft UDF performance. Invoke when user needs GPU inference, encounters slow UDFs, or asks about async/batch processing.
install
source · Clone the upstream repo
git clone https://github.com/Eventual-Inc/Daft
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Eventual-Inc/Daft "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/daft-udf-tuning" ~/.claude/skills/eventual-inc-daft-daft-udf-tuning && rm -rf "$T"
manifest:
.claude/skills/daft-udf-tuning/SKILL.md
Daft UDF Tuning
Optimize User-Defined Functions for performance.
UDF Types
| Type | Decorator | Use Case |
|---|---|---|
| Stateless | `@daft.func` | Simple transforms. Use for I/O-bound tasks. |
| Stateful | `@daft.cls` | Expensive init (e.g., loading models). Supports `max_concurrency`. |
| Batch | `@daft.method.batch` | Vectorized CPU/GPU ops (NumPy/PyTorch). Faster per-row throughput. |
Quick Recipes
1. Async I/O (Web APIs)
```python
import aiohttp
import daft

@daft.func
async def fetch(url: str) -> str:
    # The response must be awaited inside its own context manager;
    # calling .text() directly on s.get(url) would fail.
    async with aiohttp.ClientSession() as s:
        async with s.get(url) as resp:
            return await resp.text()
```
2. GPU Batch Inference (PyTorch/Models)
```python
@daft.cls(gpus=1)
class Classifier:
    def __init__(self):
        self.model = load_model().cuda()  # Runs once per worker

    @daft.method.batch(batch_size=32)
    def predict(self, images):
        return self.model(images.to_pylist())

# Run with concurrency
df = df.with_column("preds", Classifier(max_concurrency=4).predict(df["img"]))
```
Tuning Keys
- `max_concurrency`: Total parallel UDF instances.
- `gpus=N`: GPU request per instance.
- `batch_size`: Rows per call. Too small = overhead; too big = OOM.
- `into_batches(N)`: Pre-slice partitions if memory is tight.