Daft daft-udf-tuning

Name: daft-udf-tuning
Author: Eventual-Inc

Optimize Daft UDF performance. Invoke when user needs GPU inference, encounters slow UDFs, or asks about async/batch processing.

install

source · Clone the upstream repo

git clone https://github.com/Eventual-Inc/Daft

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/Eventual-Inc/Daft "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/daft-udf-tuning" ~/.claude/skills/eventual-inc-daft-daft-udf-tuning && rm -rf "$T"

manifest: .claude/skills/daft-udf-tuning/SKILL.md

source content

Daft UDF Tuning

Optimize User-Defined Functions for performance.

UDF Types

Type	Decorator	Use Case
Stateless	`@daft.func`	Simple transforms. Use `async` for I/O-bound tasks.
Stateful	`@daft.cls`	Expensive init (e.g., loading models). Supports `gpus=N` .
Batch	`@daft.func.batch`	Vectorized CPU/GPU ops (NumPy/PyTorch). Faster.

Quick Recipes

1. Async I/O (Web APIs)

@daft.func
async def fetch(url: str):
    async with aiohttp.ClientSession() as s:
        return await s.get(url).text()

2. GPU Batch Inference (PyTorch/Models)

@daft.cls(gpus=1)
class Classifier:
    def __init__(self):
        self.model = load_model().cuda() # Run once per worker

    @daft.method.batch(batch_size=32)
    def predict(self, images):
        return self.model(images.to_pylist())

# Run with concurrency
df.with_column("preds", Classifier(max_concurrency=4).predict(df["img"]))

Tuning Keys

max_concurrency
: Total parallel UDF instances.
gpus=N
: GPU request per instance.
batch_size
: Rows per call. Too small = overhead; too big = OOM.
into_batches(N)
: Pre-slice partitions if memory is tight.