Claude-code-plugins-plus-skills coreweave-data-handling

Install

Source · Clone the upstream repo:

git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills

Claude Code · Install into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/coreweave-pack/skills/coreweave-data-handling" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-coreweave-data-handling && rm -rf "$T"

Manifest: plugins/saas-packs/coreweave-pack/skills/coreweave-data-handling/SKILL.md

Source content

CoreWeave Data Handling

Overview

CoreWeave GPU cloud workloads involve large-scale data artifacts: model weights (multi-GB safetensors/GGUF), training datasets (parquet, TFRecord, WebDataset), checkpoint snapshots, and inference cache volumes. Data flows through Kubernetes PersistentVolumeClaims backed by region-specific storage classes. Compliance requires encryption at rest via the storage driver, namespace-scoped RBAC for volume access, and audit logging for any data egress from GPU nodes.
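Provisioning one of those claims from code looks much like the import example further down; here is a minimal sketch using the same @kubernetes/client-node client (0.x signatures assumed; the storage class name 'encrypted-nvme' is a placeholder for whichever encrypted, region-specific class your CoreWeave region offers):

import { KubeConfig, CoreV1Api } from '@kubernetes/client-node';

async function createEncryptedPvc(name: string, namespace: string, sizeGi: number) {
  const kc = new KubeConfig();
  kc.loadFromDefault();
  const core = kc.makeApiClient(CoreV1Api);
  const pvc = {
    apiVersion: 'v1',
    kind: 'PersistentVolumeClaim',
    metadata: { name, namespace },
    spec: {
      accessModes: ['ReadWriteMany'],
      // Placeholder: substitute the encrypted storage class for your region.
      storageClassName: 'encrypted-nvme',
      resources: { requests: { storage: `${sizeGi}Gi` } },
    },
  };
  await core.createNamespacedPersistentVolumeClaim(namespace, pvc);
}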

Data Classification

| Data Type | Sensitivity | Retention | Encryption |
| --- | --- | --- | --- |
| Model weights | Medium | Until deprecated | AES-256 at rest |
| Training datasets | High (may contain PII) | Per data license | AES-256 + TLS in transit |
| Checkpoint snapshots | Medium | 30 days post-training | AES-256 at rest |
| Inference cache | Low | Session/TTL | Volume-level encryption |
| HuggingFace tokens | Critical | Rotate quarterly | K8s Secret + KMS |
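Tooling can enforce this matrix if it is encoded as data. A minimal sketch (the keys and the DataPolicy shape are illustrative, not part of the skill):

type Sensitivity = 'low' | 'medium' | 'high' | 'critical';

interface DataPolicy {
  sensitivity: Sensitivity;
  retention: string;      // human-readable retention rule
  retentionDays?: number; // set when a fixed window applies
  encryption: string;
}

// Mirrors the classification table above.
const DATA_POLICIES: Record<string, DataPolicy> = {
  'model-weights':   { sensitivity: 'medium',   retention: 'until deprecated',      encryption: 'AES-256 at rest' },
  'training-data':   { sensitivity: 'high',     retention: 'per data license',      encryption: 'AES-256 + TLS in transit' },
  'checkpoints':     { sensitivity: 'medium',   retention: '30 days post-training', retentionDays: 30, encryption: 'AES-256 at rest' },
  'inference-cache': { sensitivity: 'low',      retention: 'session/TTL',           encryption: 'volume-level encryption' },
  'hf-tokens':       { sensitivity: 'critical', retention: 'rotate quarterly',      encryption: 'K8s Secret + KMS' },
};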

Data Import

import { KubeConfig, BatchV1Api } from '@kubernetes/client-node';

async function importDataset(pvcName: string, sourceUrl: string, namespace: string) {
  const kc = new KubeConfig();
  kc.loadFromDefault();
  const batch = kc.makeApiClient(BatchV1Api);
  const job = {
    apiVersion: 'batch/v1',
    kind: 'Job',
    metadata: { name: `import-${Date.now()}`, namespace },
    spec: { template: { spec: {
      restartPolicy: 'Never',
      containers: [{
        name: 'loader',
        image: 'python:3.11-slim',
        // Pass the URL through the environment rather than interpolating it
        // into the script, so a hostile URL cannot inject Python code.
        env: [{ name: 'SOURCE_URL', value: sourceUrl }],
        command: ['python3', '-c', `
import os, urllib.request, hashlib
dest = '/data/dataset.tar.gz'
urllib.request.urlretrieve(os.environ['SOURCE_URL'], dest)
h = hashlib.sha256()
with open(dest, 'rb') as f:
    for chunk in iter(lambda: f.read(1 << 20), b''):  # hash in 1 MiB chunks
        h.update(chunk)
print(f'SHA256: {h.hexdigest()}')`],
        volumeMounts: [{ name: 'storage', mountPath: '/data' }],
      }],
      volumes: [{ name: 'storage', persistentVolumeClaim: { claimName: pvcName } }],
    }}},
  };
  // @kubernetes/client-node 0.x signature; the 1.x client takes ({ namespace, body: job }).
  await batch.createNamespacedJob(namespace, job);
}
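The Job runs asynchronously, so callers should wait for it to complete before mounting the data. A minimal sketch, again assuming the 0.x client (in practice importDataset would return job.metadata.name so it can be passed here):

import { KubeConfig, BatchV1Api } from '@kubernetes/client-node';

async function waitForJob(name: string, namespace: string, timeoutMs = 30 * 60 * 1000) {
  const kc = new KubeConfig();
  kc.loadFromDefault();
  const batch = kc.makeApiClient(BatchV1Api);
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const { body } = await batch.readNamespacedJob(name, namespace);
    if (body.status?.succeeded) return; // Job finished cleanly
    if (body.status?.failed) throw new Error(`Import job ${name} failed`);
    await new Promise((r) => setTimeout(r, 10_000)); // poll every 10s
  }
  throw new Error(`Import job ${name} timed out`);
}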

Data Export

async function exportCheckpoint(pvcName: string, destBucket: string, ns: string) {
  // Data residency: block the export before any bytes leave the cluster
  // unless the destination bucket belongs to an approved region.
  const APPROVED_REGIONS = ['us-east-1', 'us-central-1', 'eu-west-1'];
  const region = APPROVED_REGIONS.find((r) => destBucket.includes(r));
  if (!region) {
    throw new Error(`Export blocked: ${destBucket} is not in an approved region`);
  }
  // Stream from PVC → object storage; gsutil verifies upload integrity by default.
  // The command is returned for execution inside a pod that mounts the PVC.
  const exportCmd = `tar czf - /models | gsutil cp - gs://${destBucket}/export.tar.gz`;
  console.log(`Exporting from PVC ${pvcName} in ${ns} to ${destBucket} (${region})`);
  return exportCmd;
}
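Since exportCheckpoint only builds the command, something still has to run it inside a pod with the PVC mounted. A hypothetical wrapper using kubectl exec (the pod name is illustrative):

import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const run = promisify(execFile);

async function runExportInPod(podName: string, ns: string, cmd: string) {
  // Execute the streaming export inside the pod that mounts the PVC.
  const { stdout, stderr } = await run('kubectl', [
    'exec', podName, '-n', ns, '--', 'sh', '-c', cmd,
  ]);
  if (stderr) console.error(stderr);
  return stdout;
}

// e.g. await runExportInPod('export-helper', 'ml-training',
//        await exportCheckpoint('ckpt-pvc', 'my-bucket-us-east-1', 'ml-training'));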

Data Validation

interface ModelArtifact {
  name: string; format: 'safetensors' | 'gguf' | 'bin' | 'pt';
  sizeBytes: number; sha256: string;
}

function validateArtifact(artifact: ModelArtifact): string[] {
  const errors: string[] = [];
  if (!artifact.name || artifact.name.length > 255) errors.push('Invalid artifact name');
  if (artifact.sizeBytes <= 0) errors.push('Size must be positive');
  if (!/^[a-f0-9]{64}$/.test(artifact.sha256)) errors.push('Invalid SHA-256 hash');
  if (!['safetensors', 'gguf', 'bin', 'pt'].includes(artifact.format)) errors.push('Unsupported format');
  return errors;
}
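A short usage sketch tying validation into the import flow (the artifact values are illustrative):

const artifact: ModelArtifact = {
  name: 'llama-70b.safetensors',
  format: 'safetensors',
  sizeBytes: 140_000_000_000,
  sha256: 'a'.repeat(64), // placeholder digest
};

const problems = validateArtifact(artifact);
if (problems.length > 0) {
  throw new Error(`Artifact rejected: ${problems.join('; ')}`);
}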

Compliance

  • All PVCs use encrypted storage classes (AES-256 at rest)
  • HuggingFace and API tokens stored in Kubernetes Secrets with KMS encryption
  • Namespace-scoped RBAC restricts volume mount access to authorized workloads
  • Data egress from GPU nodes logged via network policy audit
  • Training datasets with PII processed only in approved regions (data residency)
  • Checkpoint retention enforced via CronJob garbage collection (30-day default); see the sketch after this list
  • SOC 2 Type II audit trail for all storage provisioning and deletion events
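A minimal sketch of that retention CronJob, assuming the 0.x client; the schedule, image, and mount path are illustrative:

import { KubeConfig, BatchV1Api } from '@kubernetes/client-node';

async function createRetentionCronJob(pvcName: string, namespace: string) {
  const kc = new KubeConfig();
  kc.loadFromDefault();
  const batch = kc.makeApiClient(BatchV1Api);
  const cron = {
    apiVersion: 'batch/v1',
    kind: 'CronJob',
    metadata: { name: 'checkpoint-gc', namespace },
    spec: {
      schedule: '0 3 * * *', // daily at 03:00
      jobTemplate: { spec: { template: { spec: {
        restartPolicy: 'Never',
        containers: [{
          name: 'gc',
          image: 'busybox:1.36',
          // Delete checkpoint files older than the 30-day retention window.
          command: ['sh', '-c', 'find /checkpoints -type f -mtime +30 -delete'],
          volumeMounts: [{ name: 'ckpt', mountPath: '/checkpoints' }],
        }],
        volumes: [{ name: 'ckpt', persistentVolumeClaim: { claimName: pvcName } }],
      }}}},
    },
  };
  await batch.createNamespacedCronJob(namespace, cron);
}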

Error Handling

| Issue | Cause | Fix |
| --- | --- | --- |
| PVC pending indefinitely | Storage class unavailable in region | Check kubectl get sc and switch to an available class |
| Download job OOMKilled | Dataset exceeds container memory limit | Increase resource limits or use a streaming download |
| Permission denied on volume | RBAC misconfigured for namespace | Verify the ServiceAccount has PVC access via a RoleBinding |
| Checksum mismatch after import | Partial transfer or corruption | Re-run the import job; enable retry with backoff |
| Secret not found | KMS key rotation or namespace mismatch | Verify the secret exists in the target namespace with kubectl get secret |
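The "retry with backoff" fix can be a thin wrapper around the import call; a generic sketch:

async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseDelayMs = 5_000): Promise<T> {
  let lastErr: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (i < attempts - 1) {
        // Exponential backoff: 5s, 10s, 20s, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastErr;
}

// e.g. await withRetry(() => importDataset('dataset-pvc', url, 'ml-training'));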

Next Steps

See coreweave-security-basics.