Claude-skill-registry auto-router-patterns
Auto router patterns for this project. Intelligent model selection via task classification, cost tier diversity, high-stakes override, weighted tier selection. Triggers on "auto router", "model selection", "classification", "cost tier", "exploration", "high stakes", "routing", "router".
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/auto-router-patterns" ~/.claude/skills/majiayu000-claude-skill-registry-auto-router-patterns && rm -rf "$T"
skills/data/auto-router-patterns/SKILL.md
Auto Router Patterns
The system routes each user message to a suitable model using task classification, cost-based diversity, and high-stakes safety overrides.
Classification with gpt-oss-120b
The router uses gpt-oss-120b via Cerebras for fast classification (~1000 tokens/sec):
// From autoRouter.ts
import { createOpenAI } from "@ai-sdk/openai"; // assumed import; the excerpt omits it

function getRouterModel() {
  const openai = createOpenAI({
    apiKey: process.env.AI_GATEWAY_API_KEY,
    baseURL: "https://gateway.ai.cloudflare.com/v1/planetaryescape/blah-chat-dev-gateway/openai",
  });
  return openai("gpt-oss-120b"); // Via Cerebras
}

const ROUTER_MODEL_ID = "openai:gpt-oss-120b";
Classification schema with generateObject:
// From autoRouter.ts
import { z } from "zod"; // assumed import; the excerpt omits it

// TASK_CATEGORIES and HIGH_STAKES_DOMAINS are defined elsewhere in the file
const classificationSchema = z.object({
  primaryCategory: z.enum(TASK_CATEGORIES),
  secondaryCategory: z.enum(TASK_CATEGORIES).optional().nullable(),
  complexity: z.enum(["simple", "moderate", "complex"]),
  requiresVision: z.boolean(),
  requiresLongContext: z.boolean(),
  requiresReasoning: z.boolean(),
  confidence: z.number().min(0).max(1),
  isHighStakes: z.boolean(),
  highStakesDomain: z.enum(HIGH_STAKES_DOMAINS).optional().nullable(),
});
Task categories: coding, reasoning, creative, factual, analysis, conversation, multimodal, research.
Cost Tier Categorization
Models are categorized by average price, (input + output) / 2:
// From autoRouter.ts
type CostTier = "cheap" | "mid" | "premium";

function getCostTier(pricing: { input: number; output: number }): CostTier {
  const avgCost = (pricing.input + pricing.output) / 2;
  if (avgCost < 1.0) return "cheap";
  if (avgCost < 5.0) return "mid";
  return "premium";
}
Examples:
- Cheap: gpt-5-nano ($0.04/$0.16), gemini-2.0-flash ($0.1/$0.4), gpt-5-mini ($0.15/$0.6, average $0.375)
- Mid: claude-3.5-haiku ($0.8/$4.0)
- Premium: gpt-5 ($2.5/$10.0), claude-opus-4 ($15.0/$75.0)
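As a quick check, the thresholds can be applied to the prices listed above. This is a minimal sketch: `getCostTier` is copied from the excerpt, while `EXAMPLE_PRICING` and `tiersFor` are hypothetical names introduced only for illustration. Note that gpt-5-mini at $0.15/$0.6 averages $0.375, so these thresholds place it in the cheap tier.

```typescript
type CostTier = "cheap" | "mid" | "premium";

interface Pricing {
  input: number;
  output: number;
}

// Same thresholds as getCostTier in the excerpt above.
function getCostTier(pricing: Pricing): CostTier {
  const avgCost = (pricing.input + pricing.output) / 2;
  if (avgCost < 1.0) return "cheap";
  if (avgCost < 5.0) return "mid";
  return "premium";
}

// Prices copied from the examples list; the constant name is hypothetical.
const EXAMPLE_PRICING: Record<string, Pricing> = {
  "gpt-5-nano": { input: 0.04, output: 0.16 },
  "gemini-2.0-flash": { input: 0.1, output: 0.4 },
  "gpt-5-mini": { input: 0.15, output: 0.6 },
  "claude-3.5-haiku": { input: 0.8, output: 4.0 },
  "gpt-5": { input: 2.5, output: 10.0 },
  "claude-opus-4": { input: 15.0, output: 75.0 },
};

// Hypothetical helper: map every model ID to its computed tier.
function tiersFor(pricing: Record<string, Pricing>): Record<string, CostTier> {
  const result: Record<string, CostTier> = {};
  for (const modelId of Object.keys(pricing)) {
    result[modelId] = getCostTier(pricing[modelId]);
  }
  return result;
}

const exampleTiers = tiersFor(EXAMPLE_PRICING);
console.log(exampleTiers);
```

Computing tiers from pricing rather than labeling models by hand keeps the grouping consistent when prices change.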
Weighted Tier Selection by Complexity
Diversity via weighted random selection, NOT top-N then random:
// From autoRouter.ts
const TIER_WEIGHTS: Record<string, Record<CostTier, number>> = {
  simple: { cheap: 0.6, mid: 0.25, premium: 0.15 },
  moderate: { cheap: 0.5, mid: 0.3, premium: 0.2 },
  complex: { cheap: 0.3, mid: 0.4, premium: 0.3 },
};
Critical: Groups ALL models by tier, not just top N. Simple tasks get cheap models 60% of time, premium 15%.
Selection logic:
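// From autoRouter.ts
function selectWithExploration(
  scoredModels: Array<{ modelId: string; score: number }>,
  classification: { complexity: string; isHighStakes?: boolean },
) {
  // Highest score first. The excerpt used `sorted` without defining it; this sort is assumed.
  const sorted = [...scoredModels].sort((a, b) => b.score - a.score);

  // Group ALL models by tier
  const tiers: Record<CostTier, Array<{ modelId: string; score: number }>> = {
    cheap: [],
    mid: [],
    premium: [],
  };
  for (const model of sorted) {
    const tier = getCostTier(MODEL_CONFIG[model.modelId].pricing);
    tiers[tier].push(model);
  }

  // Get weights for complexity
  const weights = TIER_WEIGHTS[classification.complexity] ?? TIER_WEIGHTS.simple;
  const roll = Math.random();

  // Select tier based on weighted random (declarations assumed; the excerpt omits them)
  let selectedTier: CostTier | undefined;
  let explorationPick = false;
  if (roll < weights.cheap && tiers.cheap.length > 0) {
    selectedTier = "cheap";
  } else if (roll < weights.cheap + weights.mid && tiers.mid.length > 0) {
    selectedTier = "mid";
    explorationPick = true;
  } else if (tiers.premium.length > 0) {
    selectedTier = "premium";
    explorationPick = true;
  }

  // No branch matched (the rolled tier and premium are both empty):
  // fall back to the top-scored model. This fallback is an assumption.
  if (!selectedTier) return { ...sorted[0], explorationPick: false };

  // Random selection within chosen tier
  const pool = tiers[selectedTier];
  const picked = pool[Math.floor(Math.random() * pool.length)];
  return { ...picked, explorationPick };
}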
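To see how the roll maps to tiers, the tier choice can be isolated into a pure function that takes the roll as a parameter. This is a sketch, not the project's code: `pickTier` and `tierShares` are hypothetical helpers, and the weights are copied from `TIER_WEIGHTS` above. Sweeping evenly spaced rolls approximates the long-run share of each tier when all tiers have models.

```typescript
type CostTier = "cheap" | "mid" | "premium";
type Complexity = "simple" | "moderate" | "complex";

const TIER_WEIGHTS: Record<Complexity, Record<CostTier, number>> = {
  simple: { cheap: 0.6, mid: 0.25, premium: 0.15 },
  moderate: { cheap: 0.5, mid: 0.3, premium: 0.2 },
  complex: { cheap: 0.3, mid: 0.4, premium: 0.3 },
};

// Hypothetical helper: same branch order as selectWithExploration, but the
// roll is an argument (instead of Math.random()) so results are reproducible.
function pickTier(
  complexity: Complexity,
  roll: number,
  available: Record<CostTier, boolean>,
): CostTier | undefined {
  const w = TIER_WEIGHTS[complexity];
  if (roll < w.cheap && available.cheap) return "cheap";
  if (roll < w.cheap + w.mid && available.mid) return "mid";
  if (available.premium) return "premium";
  return undefined; // no branch matched; the caller needs its own fallback
}

// Hypothetical helper: estimate tier shares by sweeping evenly spaced rolls.
function tierShares(complexity: Complexity, steps = 1000): Record<CostTier, number> {
  const counts: Record<CostTier, number> = { cheap: 0, mid: 0, premium: 0 };
  const all = { cheap: true, mid: true, premium: true };
  for (let i = 0; i < steps; i++) {
    const tier = pickTier(complexity, (i + 0.5) / steps, all);
    if (tier) counts[tier] += 1;
  }
  return {
    cheap: counts.cheap / steps,
    mid: counts.mid / steps,
    premium: counts.premium / steps,
  };
}

console.log(tierShares("simple"));
console.log(tierShares("complex"));
```

Two properties of the branch order are worth noting. When the rolled tier is empty, the choice shifts toward a more expensive tier, never a cheaper one. And when the roll lands above the cheap range while both mid and premium are empty, no branch matches, so the caller has to handle that case.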
High-Stakes Override
Medical/legal/financial/safety questions force premium tier for accuracy:
// From autoRouter.ts
const HIGH_STAKES_DOMAINS = [
  "medical",
  "legal",
  "financial",
  "safety",
  "mental_health",
  "privacy",
  "immigration",
  "domestic_abuse",
] as const;

// HIGH-STAKES OVERRIDE at top of selectWithExploration
if (classification.isHighStakes) {
  if (tiers.premium.length > 0) {
    const pool = tiers.premium;
    const picked = pool[Math.floor(Math.random() * pool.length)];
    return { ...picked, explorationPick: false };
  }
  // Fallback with warning if no premium models
  logger.warn("High-stakes query but no premium models available");
  return { ...sorted[0], explorationPick: false };
}
Classification prompt emphasizes advice vs information:
// From routerPrompts.ts
RULES:
1. Must seek ADVICE or ACTION, not just information
2. "What is a heart attack?" = NOT high stakes (educational)
3. "Am I having a heart attack?" = HIGH STAKES (medical)
4. "What does liability mean?" = NOT high stakes (definition)
5. "Can my employer fire me for this?" = HIGH STAKES (legal)
Diversity vs Top-Score Trade-off
System balances model quality with cost/speed diversity:
Scoring phase (autoRouter.ts):
- Base: category match score (0-100 from MODEL_PROFILES)
- Secondary category bonus: +30% of secondary score
- Complexity: simple tasks penalized 0.7x (prefer cheap), complex boosted 1.2x (prefer capable)
- Cost bias: -(avgCost / 30) * (costBias / 100) * 20
- Speed bias: +speedBonus * (speedBias / 100)
- Stickiness: +25 if model already selected in conversation
- Reasoning bonus: +15 if task requires thinking and model has it
- Research bonus: +25 for Perplexity models on research tasks
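The scoring terms above can be combined roughly as follows. This is a sketch under stated assumptions, not the code from autoRouter.ts: `ScoreInputs` and `scoreModel` are hypothetical names, the order of operations (multiplier applied to the category scores before the additive terms) is assumed, and the complexity multiplier is passed in by the caller because the list does not say which models receive the 0.7x or 1.2x factor.

```typescript
// Hypothetical input shape; field names are illustrative only.
interface ScoreInputs {
  primaryScore: number; // category match score, 0-100
  secondaryScore?: number; // score for the secondary category, if any
  complexityMultiplier: number; // e.g. 0.7, 1.0, or 1.2, chosen by the caller
  avgCost: number; // (input + output) / 2
  costBias: number; // 0-100 user preference
  speedBonus: number; // per-model speed bonus
  speedBias: number; // 0-100 user preference
  isStickyModel: boolean; // already selected in this conversation
  reasoningMatch: boolean; // task requires thinking and model supports it
  researchMatch: boolean; // Perplexity model on a research task
}

function scoreModel(i: ScoreInputs): number {
  // Base score plus 30% of the secondary category score.
  let score = i.primaryScore + 0.3 * (i.secondaryScore ?? 0);

  // Assumption: the complexity factor scales the category score only.
  score *= i.complexityMultiplier;

  // Cost penalty and speed bonus, using the formulas from the list.
  score -= (i.avgCost / 30) * (i.costBias / 100) * 20;
  score += i.speedBonus * (i.speedBias / 100);

  // Flat bonuses.
  if (i.isStickyModel) score += 25;
  if (i.reasoningMatch) score += 15;
  if (i.researchMatch) score += 25;
  return score;
}

const exampleScore = scoreModel({
  primaryScore: 80,
  secondaryScore: 50,
  complexityMultiplier: 1.0,
  avgCost: 6,
  costBias: 50,
  speedBonus: 10,
  speedBias: 50,
  isStickyModel: true,
  reasoningMatch: false,
  researchMatch: false,
});
console.log(exampleScore);
```

In this example the pieces are 80 + 15 (secondary) - 2 (cost) + 5 (speed) + 25 (stickiness). The cost term is small next to the flat bonuses: with costBias at 50, a model averaging $30 loses only 10 points.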
Selection phase (selectWithExploration):
- NOT greedy (always top score)
- NOT pure random (chaos)
- Weighted probabilistic by cost tier AND complexity
- Ensures variety across conversations without sacrificing appropriateness
Excluded Models Tracking for Retries
Failed models excluded from retry attempts:
// From autoRouter.ts routeMessage args
export const routeMessage = internalAction({
  args: {
    // ...
    excludedModels: v.optional(v.array(v.string())), // Failed models
  },
  handler: async (ctx, args) => {
    // Filter eligible models
    const eligibleModels = getEligibleModels(
      classification,
      args.currentContextTokens ?? 0,
      args.excludedModels, // ← Passed to filter
    );

    // Check if all models exhausted
    if (eligibleModels.length === 0) {
      const fallbackModel = "openai:gpt-5-mini";
      if (args.excludedModels?.includes(fallbackModel)) {
        throw new Error("All models exhausted including fallback");
      }
      return { selectedModelId: fallbackModel, /* ... */ };
    }
  },
});

function getEligibleModels(
  classification: TaskClassification,
  currentContextTokens: number,
  excludedModels?: string[],
): string[] {
  return Object.keys(MODEL_CONFIG).filter((modelId) => {
    // Exclude failed models from retry attempts
    if (excludedModels?.includes(modelId)) return false;
    // ... other filters
  });
}
Caller (chat.ts or generation retry logic) tracks failed models and passes them to router.
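On the caller side, the tracking might look like the sketch below. It is illustrative only: `Route`, `Generate`, and `generateWithRetries` are hypothetical stand-ins for the router call and the generation call, and the functions are synchronous for brevity where the real calls would be async.

```typescript
// Hypothetical stand-ins for routeMessage and the generation call.
type Route = (excludedModels: string[]) => string;
type Generate = (modelId: string) => string;

interface RetryResult {
  modelId: string;
  text: string;
  excludedModels: string[];
}

function generateWithRetries(route: Route, generate: Generate, maxAttempts = 3): RetryResult {
  const excludedModels: string[] = [];
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Pass a copy so the router cannot mutate the caller's list.
    // The router may throw here once every model is exhausted.
    const modelId = route(excludedModels.slice());
    try {
      const text = generate(modelId);
      return { modelId, text, excludedModels };
    } catch (err) {
      // Remember the failure so the next routing call skips this model.
      lastError = err;
      excludedModels.push(modelId);
    }
  }
  throw new Error(`Generation failed after ${maxAttempts} attempts: ${String(lastError)}`);
}

// Demo with fake models: the first candidate always fails.
const candidates = ["model-a", "model-b", "model-c"];
const demoRoute: Route = (excluded) => {
  for (const id of candidates) {
    if (excluded.indexOf(id) === -1) return id;
  }
  throw new Error("All models exhausted");
};
const demoGenerate: Generate = (modelId) => {
  if (modelId === "model-a") throw new Error("provider error");
  return `response from ${modelId}`;
};

const demoResult = generateWithRetries(demoRoute, demoGenerate);
console.log(demoResult);
```

Capping attempts in the caller bounds the number of router calls even when many models remain eligible.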
Key Files
- packages/backend/convex/ai/autoRouter.ts: Main routing action, classification, selection
- packages/backend/convex/ai/modelProfiles.ts: MODEL_CONFIG, MODEL_PROFILES, category scores
- packages/backend/convex/ai/routerPrompts.ts: Classification prompt, reasoning template
- packages/backend/convex/chat.ts: Calls routeMessage when user has "auto" selected
- packages/backend/convex/generation.ts: Retry logic with excludedModels
Avoid
- Don't use top-N greedy selection - breaks diversity
- Don't skip high-stakes override - safety critical
- Don't hardcode tier weights - use complexity-based config
- Don't forget to track router usage for cost monitoring (recordTextGeneration)
- Don't use classification model for generation - gpt-oss-120b is internal-only