Claude-code-plugins-plus-skills openrouter-pricing-basics

install
source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/openrouter-pack/skills/openrouter-pricing-basics" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-openrouter-pricing-basics && rm -rf "$T"
manifest: plugins/saas-packs/openrouter-pack/skills/openrouter-pricing-basics/SKILL.md
source content

OpenRouter Pricing Basics

Overview

OpenRouter charges per token with separate rates for prompt (input) and completion (output) tokens. Prices are listed per token in the models API (multiply by 1M for per-million rates). Credits are prepaid with a 5.5% processing fee ($0.80 minimum). Free models are available for testing and low-volume use.

How Pricing Works

  1. Buy credits at openrouter.ai/credits (5.5% fee, $0.80 minimum)
  2. Each request deducts (prompt_tokens * prompt_rate) + (completion_tokens * completion_rate)
  3. Check your balance via GET /api/v1/auth/key or the dashboard
  4. Enable auto-topup to prevent service interruption
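The deduction in step 2 is plain arithmetic; a worked example using the representative Claude 3.5 Sonnet rates from the tier table below ($3/1M prompt, $15/1M completion):

```python
# Per-token rates (the models API lists prices per token, not per million)
prompt_rate = 3.00 / 1_000_000       # $3 per 1M prompt tokens
completion_rate = 15.00 / 1_000_000  # $15 per 1M completion tokens

# A request with 1,000 prompt tokens and 500 completion tokens deducts:
cost = (1000 * prompt_rate) + (500 * completion_rate)
print(f"${cost:.6f}")  # → $0.010500
```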

Query Model Pricing

# Get pricing for all models
curl -s https://openrouter.ai/api/v1/models | jq '.data[] | select(.id == "anthropic/claude-3.5-sonnet") | {
  id: .id,
  prompt_per_M: ((.pricing.prompt | tonumber) * 1000000),
  completion_per_M: ((.pricing.completion | tonumber) * 1000000),
  context: .context_length
}'
# → { "id": "anthropic/claude-3.5-sonnet", "prompt_per_M": 3, "completion_per_M": 15, "context": 200000 }
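The same endpoint can also rank models by price; a sketch, where cheapest is an illustrative helper (not part of any SDK):

```python
import requests

def cheapest(models, n=5):
    """Return the n paid models with the lowest prompt rate, as (id, $/1M) pairs."""
    paid = [m for m in models if float(m["pricing"]["prompt"]) > 0]
    paid.sort(key=lambda m: float(m["pricing"]["prompt"]))
    return [(m["id"], float(m["pricing"]["prompt"]) * 1_000_000) for m in paid[:n]]

models = requests.get("https://openrouter.ai/api/v1/models", timeout=30).json()["data"]
for model_id, per_million in cheapest(models):
    print(f"{model_id}: ${per_million:.2f}/1M prompt tokens")
```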

Cost Tiers (Representative)

| Tier | Example Model | Prompt/1M | Completion/1M | Use Case |
|------|---------------|-----------|---------------|----------|
| Free | google/gemma-2-9b-it:free | $0.00 | $0.00 | Testing, prototyping |
| Budget | meta-llama/llama-3.1-8b-instruct | $0.06 | $0.06 | Simple Q&A, classification |
| Mid | openai/gpt-4o-mini | $0.15 | $0.60 | General purpose |
| Standard | anthropic/claude-3.5-sonnet | $3.00 | $15.00 | Complex reasoning, code |
| Premium | openai/o1 | $15.00 | $60.00 | Deep reasoning |

Calculate Request Cost

import requests

def estimate_cost(model_id: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of a single request from the public pricing list."""
    models = requests.get("https://openrouter.ai/api/v1/models", timeout=30).json()["data"]
    model = next((m for m in models if m["id"] == model_id), None)
    if model is None:
        raise ValueError(f"Model {model_id} not found")

    prompt_rate = float(model["pricing"]["prompt"])         # USD per prompt token
    completion_rate = float(model["pricing"]["completion"]) # USD per completion token
    return (prompt_tokens * prompt_rate) + (completion_tokens * completion_rate)

# Example: Claude 3.5 Sonnet, 1000 prompt + 500 completion tokens
cost = estimate_cost("anthropic/claude-3.5-sonnet", 1000, 500)
print(f"Estimated cost: ${cost:.6f}")  # ~$0.0105

Track Actual Cost Per Request

import os
import requests

# `client` is assumed to be an OpenAI SDK client with
# base_url="https://openrouter.ai/api/v1"

# Method 1: From response usage (estimate)
response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100,
)
# response.usage.prompt_tokens, response.usage.completion_tokens

# Method 2: Query the generation endpoint (exact cost as billed by OpenRouter)
gen = requests.get(
    f"https://openrouter.ai/api/v1/generation?id={response.id}",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    timeout=30,
).json()
print(f"Exact cost: ${gen['data']['total_cost']}")
print(f"Tokens: {gen['data']['tokens_prompt']} prompt + {gen['data']['tokens_completion']} completion")

Check Credit Balance

curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" | jq '{
    credits_used: .data.usage,
    credit_limit: .data.limit,
    remaining: (if .data.limit == null then null else .data.limit - .data.usage end),
    is_free_tier: .data.is_free_tier
  }'
# credit_limit and remaining are null when the key has no spending limit
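The same check from Python, with the no-limit (null) case handled explicitly; remaining_credit is an illustrative helper:

```python
import os
import requests

def remaining_credit(data: dict):
    """Remaining credit on a key, or None when the key has no spending limit."""
    if data["limit"] is None:
        return None
    return data["limit"] - data["usage"]

key = os.environ.get("OPENROUTER_API_KEY")
if key:
    data = requests.get(
        "https://openrouter.ai/api/v1/auth/key",
        headers={"Authorization": f"Bearer {key}"},
        timeout=30,
    ).json()["data"]
    left = remaining_credit(data)
    print("no per-key limit" if left is None else f"${left:.2f} remaining")
```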

Save Money with Variants

# :floor variant picks the cheapest provider for a model
response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet:floor",  # Cheapest provider
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100,
)

# :free variant uses free providers (where available)
response = client.chat.completions.create(
    model="google/gemma-2-9b-it:free",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100,
)

Special Pricing

| Item | Pricing |
|------|---------|
| Reasoning tokens | Charged as output tokens at the completion rate |
| Image inputs | Per-image charge listed in pricing.image |
| Per-request fee | Some models charge a flat fee per request (pricing.request) |
| BYOK | First 1M requests/month free; then 5% of the normal provider cost |
| Free model limits | 50 req/day (free users), 1000 req/day (with $10+ credits) |
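Where these extra fields are present, they fold into the same per-request arithmetic. A sketch, assuming the field names from the models API's pricing object (absent fields default to zero; the example rates are illustrative, not real):

```python
def request_cost(pricing: dict, prompt_tokens: int, completion_tokens: int,
                 images: int = 0) -> float:
    """Total cost of one request, including optional image and flat per-request fees.

    `pricing` is the per-token pricing object from /api/v1/models.
    """
    cost = prompt_tokens * float(pricing.get("prompt", 0))
    cost += completion_tokens * float(pricing.get("completion", 0))
    cost += images * float(pricing.get("image", 0))       # per-image charge
    cost += float(pricing.get("request", 0))              # flat per-request fee
    return cost

# Illustrative rates: $3/1M prompt, $15/1M completion, $0.004 per image
pricing = {"prompt": "0.000003", "completion": "0.000015", "image": "0.004", "request": "0"}
print(f"${request_cost(pricing, 1000, 500, images=1):.6f}")  # → $0.014500
```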

Error Handling

| HTTP | Cause | Fix |
|------|-------|-----|
| 402 | Insufficient credits | Top up at openrouter.ai/credits or use a :free model |
| 402 | Key credit limit reached | Increase the key limit or use a different key |
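One way to act on a 402 in code is to retry once with a free model. A minimal sketch using raw HTTP; the injectable post parameter is a testing convenience, not an OpenRouter convention:

```python
import os
import requests

def complete_with_fallback(messages, model="anthropic/claude-3.5-sonnet",
                           fallback="google/gemma-2-9b-it:free",
                           post=requests.post):
    """POST a chat completion; on HTTP 402 (insufficient credits), retry once free."""
    url = "https://openrouter.ai/api/v1/chat/completions"
    headers = {"Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}"}
    resp = None
    for attempt in (model, fallback):
        resp = post(url, headers=headers,
                    json={"model": attempt, "messages": messages, "max_tokens": 100})
        if resp.status_code != 402:
            return resp.json()
    resp.raise_for_status()  # both attempts returned 402
```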

Enterprise Considerations

  • Set per-key credit limits via the dashboard or provisioning API to isolate blast radius
  • Query /api/v1/generation?id= after each request for exact cost auditing
  • Use the :floor variant to automatically pick the cheapest provider
  • Route simple tasks to budget models and complex tasks to premium models (see openrouter-model-routing)
  • Set max_tokens on every request to cap completion cost
  • Enable auto-topup to prevent service interruptions in production
