Claude-code-plugins-plus coreweave-rate-limits
install
source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/coreweave-pack/skills/coreweave-rate-limits" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-coreweave-rate-limits && rm -rf "$T"
manifest:
plugins/saas-packs/coreweave-pack/skills/coreweave-rate-limits/SKILL.mdsource content
CoreWeave Rate Limits
Overview
CoreWeave limits are primarily GPU quota-based rather than API rate limits. Each namespace has allocated GPU quotas per type.
Check GPU Quota
kubectl describe resourcequota -n my-namespace kubectl get resourcequota -o json | jq '.items[].status'
Inference Request Queuing
import asyncio from collections import deque class InferenceQueue: def __init__(self, max_concurrent: int = 10): self.semaphore = asyncio.Semaphore(max_concurrent) self.queue_depth = 0 async def inference(self, client, prompt: str) -> str: self.queue_depth += 1 async with self.semaphore: try: return await asyncio.to_thread(client.generate, prompt) finally: self.queue_depth -= 1
Resources
Next Steps
For security, see
coreweave-security-basics.