Claude-code-plugins-plus coreweave-upgrade-migration
install
source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/coreweave-pack/skills/coreweave-upgrade-migration" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-coreweave-upgrade-migration && rm -rf "$T"
manifest:
plugins/saas-packs/coreweave-pack/skills/coreweave-upgrade-migration/SKILL.mdsource content
CoreWeave Upgrade & Migration
Overview
CoreWeave is a GPU-specialized cloud provider running Kubernetes-native infrastructure. Migrations involve upgrading between GPU instance types (A100 to H100), updating CUDA driver versions, and handling Kubernetes API version changes across namespaces. Tracking API versions is critical because CoreWeave's instance type labels and resource quotas change between platform releases, and deploying to a deprecated instance class will cause scheduling failures.
Version Detection
import { KubeConfig, CoreV1Api } from "@kubernetes/client-node"; async function detectCoreWeaveVersion(): Promise<void> { const kc = new KubeConfig(); kc.loadFromDefault(); const k8sApi = kc.makeApiClient(CoreV1Api); // Check current namespace GPU allocations const pods = await k8sApi.listNamespacedPod("my-namespace"); for (const pod of pods.body.items) { const gpuClass = pod.spec?.nodeSelector?.["gpu.nvidia.com/class"]; const cudaVersion = pod.metadata?.labels?.["cuda-version"]; console.log(`Pod ${pod.metadata?.name}: GPU=${gpuClass}, CUDA=${cudaVersion}`); } // Detect deprecated instance types const deprecated = ["A100_PCIE_40GB", "V100_PCIE_16GB", "RTX_A5000"]; const activeGpus = pods.body.items .map((p) => p.spec?.nodeSelector?.["gpu.nvidia.com/class"]) .filter(Boolean); const stale = activeGpus.filter((g) => deprecated.includes(g!)); if (stale.length > 0) console.warn(`Deprecated GPU types in use: ${stale.join(", ")}`); }
Migration Checklist
- Review CoreWeave release notes for deprecated instance types
- Audit all deployments for
node selectorsgpu.nvidia.com/class - Verify CUDA version compatibility with target GPU (see matrix below)
- Update container base images to match new CUDA/cuDNN requirements
- Test inference latency on new GPU type in staging namespace
- Update resource requests (
) for new instance memorynvidia.com/gpu - Migrate persistent volumes if switching regions or availability zones
- Update Kubernetes API version in manifests (e.g.,
changes)apps/v1 - Validate HPA scaling behavior on new instance type throughput
- Run canary deployment with traffic split before full cutover
Schema Migration
// CoreWeave instance type labels changed in 2025 platform update // Old: gpu.nvidia.com/class: "A100_PCIE_80GB" // New: gpu.nvidia.com/class: "H100_SXM5_80GB" interface DeploymentMigration { oldSelector: Record<string, string>; newSelector: Record<string, string>; cudaMinVersion: string; } const GPU_MIGRATIONS: DeploymentMigration[] = [ { oldSelector: { "gpu.nvidia.com/class": "A100_PCIE_80GB" }, newSelector: { "gpu.nvidia.com/class": "H100_SXM5_80GB" }, cudaMinVersion: "12.4", }, { oldSelector: { "gpu.nvidia.com/class": "A100_SXM4_80GB" }, newSelector: { "gpu.nvidia.com/class": "H100_SXM5_80GB" }, cudaMinVersion: "12.4", }, ]; function migrateNodeSelector(manifest: any, migration: DeploymentMigration): any { const selector = manifest.spec?.template?.spec?.nodeSelector; if (!selector) return manifest; for (const [key, oldVal] of Object.entries(migration.oldSelector)) { if (selector[key] === oldVal) { selector[key] = migration.newSelector[key]; } } return manifest; }
Rollback Strategy
import { AppsV1Api, KubeConfig } from "@kubernetes/client-node"; async function rollbackDeployment(namespace: string, name: string): Promise<void> { const kc = new KubeConfig(); kc.loadFromDefault(); const appsApi = kc.makeApiClient(AppsV1Api); // Kubernetes rollout undo — reverts to previous revision const deployment = await appsApi.readNamespacedDeployment(name, namespace); const currentRevision = deployment.body.metadata?.annotations?.["deployment.kubernetes.io/revision"]; console.log(`Rolling back ${name} from revision ${currentRevision}`); // Patch to trigger rollback via revision annotation await appsApi.patchNamespacedDeployment(name, namespace, { spec: { template: { metadata: { annotations: { "kubectl.kubernetes.io/restartedAt": new Date().toISOString() } } } }, }, undefined, undefined, undefined, undefined, undefined, { headers: { "Content-Type": "application/strategic-merge-patch+json" } }); console.log(`Rollback initiated for ${name} in ${namespace}`); }
Error Handling
| Migration Issue | Symptom | Fix |
|---|---|---|
| GPU class not schedulable | Pod stuck in with | Verify instance type exists in target region; check quota |
| CUDA version mismatch | Container crashes with | Rebuild container with CUDA matching target GPU driver |
| Namespace quota exceeded | on deployment | Request quota increase for new instance type via CoreWeave dashboard |
| PVC migration failure | timeout on new node | Detach old PVC, recreate in target availability zone |
| API version deprecated | | Update manifest to and adjust spec fields |
Resources
Next Steps
For CI/CD pipeline integration, see
coreweave-ci-integration.