# Skillsbench parallel-processing

Parallel processing with joblib for grid search and batch computations. Use when speeding up computationally intensive tasks across multiple CPU cores.
## Install

Source · clone the upstream repo:

```shell
git clone https://github.com/benchflow-ai/skillsbench
```

Claude Code · install into `~/.claude/skills/`:

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/benchflow-ai/skillsbench "$T" && mkdir -p ~/.claude/skills && cp -r "$T/tasks/mars-clouds-clustering/environment/skills/parallel-processing" ~/.claude/skills/benchflow-ai-skillsbench-parallel-processing && rm -rf "$T"
```

Manifest: `tasks/mars-clouds-clustering/environment/skills/parallel-processing/SKILL.md` (source content follows).
## Parallel Processing with joblib

Speed up computationally intensive tasks by distributing work across multiple CPU cores.
### Basic Usage

```python
from joblib import Parallel, delayed

def process_item(x):
    """Process a single item."""
    return x ** 2

# Sequential
results = [process_item(x) for x in range(100)]

# Parallel (uses all available cores)
results = Parallel(n_jobs=-1)(
    delayed(process_item)(x) for x in range(100)
)
```
### Key Parameters

- `n_jobs`: `-1` for all cores, `1` for sequential, or a specific number
- `verbose`: `0` (silent), `10` (progress), `50` (detailed)
- `backend`: `'loky'` (CPU-bound, default) or `'threading'` (I/O-bound)
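The backend choice matters most in practice. A minimal sketch contrasting the two, where `fetch` is a hypothetical I/O-bound task simulated with a short sleep:

```python
import time
from joblib import Parallel, delayed

def fetch(url):
    """Hypothetical I/O-bound task, simulated with time.sleep."""
    time.sleep(0.05)
    return len(url)

urls = [f"https://example.com/item/{i}" for i in range(8)]

# 'threading' suits I/O-bound work: no process start-up or pickling cost,
# and threads that are sleeping or waiting on I/O release the GIL
sizes = Parallel(n_jobs=4, backend='threading')(
    delayed(fetch)(u) for u in urls
)
```

For CPU-bound functions, the default `'loky'` backend runs separate processes, which sidesteps the GIL at the cost of pickling arguments and results.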
### Grid Search Example

```python
from joblib import Parallel, delayed
from itertools import product

def evaluate_params(param_a, param_b):
    """Evaluate one parameter combination."""
    score = expensive_computation(param_a, param_b)
    return {'param_a': param_a, 'param_b': param_b, 'score': score}

# Define parameter grid
params = list(product([0.1, 0.5, 1.0], [10, 20, 30]))

# Parallel grid search
results = Parallel(n_jobs=-1, verbose=10)(
    delayed(evaluate_params)(a, b) for a, b in params
)

# Filter results
results = [r for r in results if r is not None]
best = max(results, key=lambda x: x['score'])
```
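`expensive_computation` above is a stand-in for your own scoring function. The same pattern runs end to end with a toy objective (the quadratic score below is an assumption for illustration only):

```python
from joblib import Parallel, delayed
from itertools import product

def evaluate_params(param_a, param_b):
    """Toy objective that peaks at param_a=0.5, param_b=20."""
    score = -(param_a - 0.5) ** 2 - (param_b - 20) ** 2
    return {'param_a': param_a, 'param_b': param_b, 'score': score}

params = list(product([0.1, 0.5, 1.0], [10, 20, 30]))

results = Parallel(n_jobs=2)(
    delayed(evaluate_params)(a, b) for a, b in params
)

best = max(results, key=lambda r: r['score'])
# best['param_a'] == 0.5 and best['param_b'] == 20
```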
### Pre-computing Shared Data

When all tasks need the same data, pre-compute it once:

```python
# Pre-compute once
shared_data = load_data()

def process_with_shared(params, data):
    return compute(params, data)

# Pass shared data to each task
results = Parallel(n_jobs=-1)(
    delayed(process_with_shared)(p, shared_data) for p in param_list
)
```
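`load_data` and `compute` above are placeholders. A self-contained sketch of the same pattern, using a precomputed lookup table as the shared data:

```python
from joblib import Parallel, delayed

# Pre-compute once (stands in for an expensive load_data())
lookup = {i: i * i for i in range(100)}

def process_with_shared(key, table):
    # Stands in for compute(params, data)
    return table[key] + 1

# Each task receives the same pre-built table
results = Parallel(n_jobs=2)(
    delayed(process_with_shared)(k, lookup) for k in range(10)
)
```

With process-based backends the shared object is still serialized to the workers, so building it once in the parent mainly saves recomputation; joblib also memory-maps large NumPy arrays automatically to reduce copying cost.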
### Performance Tips
- Only worth it for tasks taking >0.1 s per item (parallelization has overhead)
- Watch memory usage: each worker gets its own copy of the data
- Use `verbose=10` to monitor progress
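A quick way to check the >0.1 s rule of thumb is to time both paths. A sketch using a simulated 0.2 s task and the `'threading'` backend (so the comparison is not dominated by process start-up):

```python
import time
from joblib import Parallel, delayed

def chunky(x):
    time.sleep(0.2)   # stands in for >0.1 s of real work
    return x + 1

t0 = time.perf_counter()
serial = [chunky(i) for i in range(4)]
serial_s = time.perf_counter() - t0      # roughly 0.8 s

t0 = time.perf_counter()
parallel = Parallel(n_jobs=4, backend='threading')(
    delayed(chunky)(i) for i in range(4)
)
parallel_s = time.perf_counter() - t0    # roughly 0.2 s plus overhead

assert parallel == serial
```

For sub-millisecond tasks the ordering typically reverses: dispatch overhead swamps the work, and the sequential loop wins.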