# coreweave-hello-world (Skillshub)

## Install

Source: clone the upstream repo

```bash
git clone https://github.com/ComeOnOliver/skillshub
```

Claude Code: install into `~/.claude/skills/`

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/jeremylongshore/claude-code-plugins-plus-skills/coreweave-hello-world" ~/.claude/skills/comeonoliver-skillshub-coreweave-hello-world && rm -rf "$T"
```

Manifest: `skills/jeremylongshore/claude-code-plugins-plus-skills/coreweave-hello-world/SKILL.md`
# CoreWeave Hello World

## Overview
Deploy your first GPU workload on CoreWeave: a simple inference service using vLLM or a batch CUDA job. CoreWeave runs Kubernetes on bare-metal GPU nodes with A100, H100, and L40 GPUs.
## Prerequisites

- Completed setup: `coreweave-install-auth`
- kubectl configured with the CoreWeave kubeconfig
- Namespace with GPU quota (a quick check is sketched below)
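
A quick way to sanity-check these prerequisites before deploying anything (a sketch, assuming your current kubeconfig context points at the CoreWeave cluster, your role is allowed to list nodes, and `my-namespace` stands in for your own namespace):

```bash
# Confirm kubectl is talking to the CoreWeave cluster
kubectl config current-context

# List nodes with the GPU class each one advertises
# (the same gpu.nvidia.com/class label used by the node affinity rules below)
kubectl get nodes -L gpu.nvidia.com/class

# Inspect GPU quota in your namespace (replace "my-namespace")
kubectl describe resourcequota -n my-namespace
```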
## Instructions

### Step 1: Deploy a vLLM Inference Server
```yaml
# vllm-inference.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
  template:
    metadata:
      labels:
        app: vllm-server
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args:
            - "--model"
            - "meta-llama/Llama-3.1-8B-Instruct"
            - "--port"
            - "8000"
          ports:
            - containerPort: 8000
          resources:
            limits:
              nvidia.com/gpu: 1
              memory: 48Gi
              cpu: "8"
            requests:
              nvidia.com/gpu: 1
              memory: 32Gi
              cpu: "4"
          env:
            - name: HUGGING_FACE_HUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: hf-token
                  key: token
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/class
                    operator: In
                    values: ["A100_PCIE_80GB"]
---
apiVersion: v1
kind: Service
metadata:
  name: vllm-server
spec:
  selector:
    app: vllm-server
  ports:
    - port: 8000
      targetPort: 8000
  type: ClusterIP
```
```bash
# Create HuggingFace token secret
kubectl create secret generic hf-token --from-literal=token="${HF_TOKEN}"

# Deploy
kubectl apply -f vllm-inference.yaml
kubectl get pods -w   # Wait for Running state

# Port-forward and test
kubectl port-forward svc/vllm-server 8000:8000 &
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Llama-3.1-8B-Instruct", "messages": [{"role": "user", "content": "Hello!"}]}'
```
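
On first start the pod pulls the container image and downloads the model weights, so it can sit in `Running` for a while before it answers requests. A couple of optional checks (a sketch, assuming the port-forward above is still active; `/v1/models` is part of vLLM's OpenAI-compatible API):

```bash
# Watch the server logs until vLLM reports it is serving on port 8000
kubectl logs deploy/vllm-server -f

# Confirm the model is registered and the API is responding
curl http://localhost:8000/v1/models
```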
### Step 2: Batch GPU Job
```yaml
# gpu-batch-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: gpu-benchmark
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: benchmark
          image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
          command: ["python3", "-c"]
          args:
            - |
              import torch
              print(f"CUDA available: {torch.cuda.is_available()}")
              print(f"GPU: {torch.cuda.get_device_name(0)}")
              x = torch.randn(10000, 10000, device="cuda")
              y = torch.matmul(x, x)
              print(f"Matrix multiply result shape: {y.shape}")
              print("CoreWeave GPU test passed!")
          resources:
            limits:
              nvidia.com/gpu: 1
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/class
                    operator: In
                    values: ["A100_PCIE_80GB"]
```
```bash
kubectl apply -f gpu-batch-job.yaml
kubectl logs job/gpu-benchmark --follow
```
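
Optionally, block until the Job finishes and then clean it up (a sketch; the 600-second timeout is an arbitrary choice):

```bash
# Wait for the Job to complete, or give up after 10 minutes
kubectl wait --for=condition=complete job/gpu-benchmark --timeout=600s

# Remove the finished Job and its pod
kubectl delete -f gpu-batch-job.yaml
```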
## Error Handling
| Error | Cause | Solution |
|---|---|---|
| Pod stuck Pending | No GPU capacity | Try different GPU type or check quota |
| CUDA not found | Wrong base image | Use NVIDIA CUDA images |
| OOMKilled | Insufficient GPU memory | Use larger GPU (80GB A100) |
| Image pull error | Registry auth | Create imagePullSecret |
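
For most of these failures the pod's events explain the cause. Two generic diagnostics (the label selector assumes the vLLM Deployment from Step 1):

```bash
# Scheduling, quota, and image-pull failures show up under "Events"
kubectl describe pod -l app=vllm-server

# Recent namespace events, newest last
kubectl get events --sort-by=.lastTimestamp
```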
## Resources
## Next Steps

Proceed to `coreweave-local-dev-loop` for development workflow setup.