git clone https://github.com/ComeOnOliver/skillshub
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/michaelboeding/skills/model-council" ~/.claude/skills/comeonoliver-skillshub-model-council && rm -rf "$T"
Model Council: Multi-Model Consensus
Run the same problem through multiple AI models in parallel, collect their analysis only, then Claude Code synthesizes and decides the best approach.
Unlike code-council (which uses one model with multiple approaches), model-council leverages different model architectures for true ensemble diversity.
Critical: Analysis Only Mode
IMPORTANT: External models provide analysis and recommendations ONLY. They do NOT make code changes.
- External models: Analyze, suggest, reason, compare options
- Claude Code: Synthesizes all inputs, makes final decision, implements changes
This ensures:
- Claude Code remains in control of the codebase
- No conflicting changes from multiple sources
- Best ideas from all models, unified execution
Why Multi-Model?
Different models have different:
- Training data and knowledge cutoffs
- Reasoning patterns and biases
- Strengths (math, code, creativity, etc.)
When multiple independent models agree, you can have higher confidence that the answer is correct.
Execution Modes
Mode 1: CLI Agents (Uses Your Existing Accounts)
Call CLI tools that use your logged-in accounts - leverages existing subscriptions!
| CLI Tool | Model | Status |
|---|---|---|
| `claude` | Claude (this session) | ✅ Already running |
| `codex` | OpenAI Codex | Requires setup |
| `gemini` | Google Gemini | Requires setup |
| `aider` | Multi-model | Requires setup |
CLI Setup Instructions
OpenAI Codex CLI:
    # Install via npm
    npm install -g @openai/codex

    # Login (uses browser auth)
    codex login

    # Verify
    codex --version
Google Gemini CLI:
    # Install via npm
    npm install -g @google/gemini-cli

    # Or use gcloud with Vertex AI
    gcloud auth application-default login

    # Verify
    gemini --version
Aider (Multi-model, Recommended):
    # Install via pip
    pip install aider-chat

    # Configure with your API keys
    export OPENAI_API_KEY="sk-..."
    export ANTHROPIC_API_KEY="sk-ant-..."

    # Run with specific model
    aider --model gpt-4o --message "analyze this code"
Check What's Installed:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/model-council/scripts/detect_clis.py
Mode 2: API Calls (Pay per token)
Direct API calls - more reliable, works without CLI setup, but costs money.
Required environment variables:
- `ANTHROPIC_API_KEY` - For Claude API (https://console.anthropic.com/)
- `OPENAI_API_KEY` - For GPT-4 API (https://platform.openai.com/api-keys)
- `GOOGLE_API_KEY` - For Gemini API (https://aistudio.google.com/apikey)
- `XAI_API_KEY` - For Grok API (https://console.x.ai/)
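A minimal sketch of how these variables could be checked before choosing API mode. The provider names and the `providers_with_keys` helper are hypothetical and are not taken from the skill's scripts; only the variable names come from the list above.

```python
import os

# Provider -> environment variable, as listed above.
# The provider labels on the left are illustrative names, not skill identifiers.
API_KEY_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
    "xai": "XAI_API_KEY",
}


def providers_with_keys(env=None):
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [name for name, var in API_KEY_VARS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    print("Providers with keys:", providers_with_keys() or "none")
```

Passing a plain dict instead of `os.environ` keeps the check easy to test without touching the real environment.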
Configuration
User Model Selection
Users can specify models inline:
    model council with claude, gpt-4o, gemini: solve this problem
    model council (claude + codex): fix this bug
    model council all: use all available models
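One way the three inline forms above could be parsed is sketched below. The `parse_request` function and its regular expression are illustrative assumptions; the skill itself leaves this interpretation to Claude Code and does not document a parser.

```python
import re

# Hypothetical pattern covering the three inline forms:
#   "model council with a, b: task", "model council (a + b): task", "model council all: task"
_REQUEST = re.compile(
    r"^model council\s*"
    r"(?:with\s+(?P<with>[^:]+)|\((?P<paren>[^)]*)\)|(?P<all>all))?"
    r"\s*:\s*(?P<task>.+)$",
    re.IGNORECASE | re.DOTALL,
)


def parse_request(text):
    """Return (models, task). models is None when the user wrote 'all' or named no models."""
    match = _REQUEST.match(text.strip())
    if match is None:
        raise ValueError(f"not a model council request: {text!r}")
    task = match.group("task").strip()
    names = match.group("with") or match.group("paren")
    if names is None:
        return None, task
    # Model names may be separated by commas or plus signs.
    models = [name.strip().lower() for name in re.split(r"[,+]", names) if name.strip()]
    return models, task
```

Returning `None` for the model list lets the caller fall through to the default selection described in the next section.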
Default Models
If not specified, use all available:
- Check which CLI tools are installed
- Check which API keys are set
- Use what's available
Config File (Optional)
Users can create `~/.model-council.yaml`:
    # Preferred models (in order)
    models:
      - claude        # Use Claude Code CLI (current session)
      - codex         # Use Codex CLI if installed
      - gemini-cli    # Use Gemini CLI if installed

    # Fallback to APIs if CLIs not available
    fallback_to_api: true

    # API models to use when falling back
    api_models:
      anthropic: claude-sonnet-4-20250514
      openai: gpt-4o
      google: gemini-2.0-flash
      xai: grok-3

    # Timeout per model (seconds)
    timeout: 120

    # Run in parallel or sequential
    parallel: true
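As a rough sketch, the parsed config could be validated against a small dataclass like the one below. The `CouncilConfig` class is hypothetical, and treating the sample values as defaults is an assumption; reading the YAML file itself would need a third-party parser such as PyYAML, which is not shown here.

```python
from dataclasses import dataclass, field, fields


@dataclass
class CouncilConfig:
    # Defaults mirror the sample file above; whether the skill uses
    # these same defaults when the file is absent is an assumption.
    models: list = field(default_factory=lambda: ["claude", "codex", "gemini-cli"])
    fallback_to_api: bool = True
    api_models: dict = field(default_factory=dict)
    timeout: int = 120
    parallel: bool = True

    @classmethod
    def from_dict(cls, data):
        """Build a config from an already-parsed mapping, rejecting unknown keys."""
        known = {f.name for f in fields(cls)}
        unknown = sorted(set(data) - known)
        if unknown:
            raise ValueError(f"unknown config keys: {unknown}")
        cfg = cls(**data)
        if cfg.timeout <= 0:
            raise ValueError("timeout must be a positive number of seconds")
        return cfg
```

Rejecting unknown keys catches typos such as `timout` early instead of silently ignoring them.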
Workflow
Step 1: Parse Model Selection
Determine which models to use:
- Check user's inline specification (e.g., "with claude, gpt-4o")
- If none specified, check config file
- If no config, detect available CLIs and APIs
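The precedence in the three checks above can be sketched as a simple first-non-empty lookup. `select_models` is a hypothetical helper, and its three arguments are assumed to have been gathered already (from the inline request, the config file, and detection).

```python
def select_models(inline, config_models, detected):
    """Pick models by precedence: inline spec, then config file, then detection.

    Each argument is a list of model names, or None/empty when that source
    has nothing to offer. Returns (source, models).
    """
    candidates = (("inline", inline), ("config", config_models), ("detected", detected))
    for source, models in candidates:
        if models:
            return source, list(models)
    raise RuntimeError("no models available; see the setup instructions")
```

Returning the source alongside the list makes it easy to tell the user why a given set of models was chosen.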
Step 2: Prepare the Prompt
Format the problem for each model with analysis-only instructions:
    Analyze the following problem and provide your recommendations.
    DO NOT output code changes directly. Instead, provide:
    1. Your analysis of the problem
    2. Recommended approach(es)
    3. Potential issues or edge cases to consider
    4. Trade-offs between different solutions

    Problem: [user's problem here]
Key rules:
- Keep the core problem identical across models
- Explicitly request analysis, not implementation
- Include relevant context (code snippets, error messages)
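A minimal sketch of a prompt builder that follows these rules: one template, one problem statement, identical output for every model. The `build_prompt` function and the "Context:" label are illustrative assumptions; the template text is the one shown above.

```python
ANALYSIS_ONLY_TEMPLATE = """Analyze the following problem and provide your recommendations.
DO NOT output code changes directly. Instead, provide:
1. Your analysis of the problem
2. Recommended approach(es)
3. Potential issues or edge cases to consider
4. Trade-offs between different solutions

Problem:
{problem}"""


def build_prompt(problem, context=""):
    """Return the analysis-only prompt; the same string is sent to every model."""
    body = problem.strip()
    if context.strip():
        # Code snippets or error messages travel with the problem statement.
        body += "\n\nContext:\n" + context.strip()
    return ANALYSIS_ONLY_TEMPLATE.format(problem=body)
```

Building the prompt once and reusing the string is the simplest way to keep the core problem identical across models.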
Step 3: Execute in Parallel
Use the API council script to query multiple models:
    python3 ${CLAUDE_PLUGIN_ROOT}/skills/model-council/scripts/api_council.py \
      --prompt "Analyze this problem and recommend solutions (do not implement): [problem]" \
      --models "claude-sonnet,gpt-4o,gemini-flash"
Available models:
- `claude-sonnet`, `claude-opus` - Anthropic
- `gpt-4o`, `gpt-4-turbo`, `o1` - OpenAI
- `gemini-flash`, `gemini-pro` - Google
- `grok` - xAI
List all models:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/model-council/scripts/api_council.py --list-models
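The parallel fan-out itself is not shown in this document. The sketch below illustrates the general idea with the standard library, assuming each model is wrapped in a plain callable; `run_council` and the `backends` mapping are hypothetical and do not describe how `api_council.py` is actually written. It needs Python 3.9 or later for `cancel_futures`.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_council(prompt, backends, timeout=120.0):
    """Send one prompt to every backend at once.

    backends: dict mapping a model name to a callable(prompt) -> str.
    Returns (responses, errors, timed_out).
    """
    if not backends:
        raise ValueError("no backends to query")
    # One worker per model, so all calls start together and a single
    # deadline acts as a per-model timeout.
    pool = ThreadPoolExecutor(max_workers=len(backends))
    futures = {pool.submit(fn, prompt): name for name, fn in backends.items()}
    done, pending = wait(futures, timeout=timeout)
    # Do not block on stragglers. Threads that are already running cannot
    # be killed; they finish in the background and their result is ignored.
    pool.shutdown(wait=False, cancel_futures=True)

    responses, errors = {}, {}
    for future in done:
        name = futures[future]
        exc = future.exception()
        if exc is not None:
            errors[name] = str(exc)
        else:
            responses[name] = future.result()
    timed_out = sorted(futures[f] for f in pending)
    return responses, errors, timed_out
```

Threads suit this workload because each call spends its time waiting on a network response or a subprocess rather than computing.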
Step 4: Collect Responses
Gather all model responses with metadata:
- Model name and version
- Response time
- Token usage (if available)
- Full response
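A possible record type for this metadata is sketched below. The `ModelResponse` dataclass and the `timed_call` wrapper are hypothetical names; token usage is left as `None` because how each provider reports it is not covered here.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelResponse:
    model: str                 # model name
    version: Optional[str]     # version string, when the backend reports one
    seconds: float             # wall-clock response time
    tokens: Optional[int]      # token usage, if available
    text: str                  # full response


def timed_call(model, fn, prompt, version=None):
    """Call fn(prompt) and wrap the result with timing metadata."""
    start = time.perf_counter()
    text = fn(prompt)
    elapsed = time.perf_counter() - start
    return ModelResponse(model=model, version=version, seconds=elapsed, tokens=None, text=text)
```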
Step 5: Analyze Consensus
Compare responses looking for:
- Agreement: Do models produce the same answer/approach?
- Unique insights: Does one model catch something others missed?
- Disagreements: Where do models differ and why?
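Comparing free-form analyses is a judgment call, but once each response has been reduced to a short approach label, the agreement check is simple counting. The sketch below assumes that labeling has already been done (for example by Claude Code while reading the responses); the `consensus` function and its HIGH/MEDIUM/LOW thresholds are illustrative assumptions.

```python
from collections import Counter


def consensus(labels):
    """Summarize agreement.

    labels: dict mapping a model name to a normalized approach label.
    """
    if not labels:
        raise ValueError("no responses to compare")
    counts = Counter(labels.values())
    approach, votes = counts.most_common(1)[0]
    total = len(labels)
    ratio = votes / total
    # Assumed thresholds: unanimous is HIGH, a strict majority is MEDIUM.
    level = "HIGH" if ratio == 1 else "MEDIUM" if ratio > 0.5 else "LOW"
    dissenters = sorted(model for model, label in labels.items() if label != approach)
    return {
        "approach": approach,
        "votes": votes,
        "total": total,
        "level": level,
        "dissenters": dissenters,
    }
```

The dissenters list points at the responses worth rereading, since a lone disagreement may be the unique insight the other models missed.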
Step 6: Claude Code Synthesizes and Decides
Claude Code (this session) uses ultrathink to:
- Evaluate each model's analysis
- Identify the strongest reasoning and recommendations
- Note where models agree (high confidence) vs disagree (investigate further)
- Make the final decision on approach
- Implement the solution - only Claude Code makes code changes
This is the key difference from just asking one model:
- Multiple perspectives inform the decision
- Claude Code remains the single source of truth for implementation
- No conflicting changes from different models
Step 7: Deliver Results
Provide:
- Final synthesized answer (best combined solution)
- Consensus score (how many models agreed)
- Individual responses (for transparency)
- Insights (what each model contributed)
CLI Detection
To check available CLIs:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/model-council/scripts/detect_clis.py
This checks for:
- `claude` - Claude Code CLI
- `codex` - OpenAI Codex CLI
- `gemini` - Gemini CLI
- `aider` - Aider (multi-model)
- `cursor` - Cursor AI (if applicable)
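For illustration, a check like this can be done with `shutil.which`, which looks a command up on `PATH`. This is a sketch of the idea only, not the contents of `detect_clis.py`; the `detect_clis` function below is a hypothetical stand-in.

```python
import shutil

# Command name -> description, from the list above.
CLI_COMMANDS = {
    "claude": "Claude Code CLI",
    "codex": "OpenAI Codex CLI",
    "gemini": "Gemini CLI",
    "aider": "Aider (multi-model)",
    "cursor": "Cursor AI (if applicable)",
}


def detect_clis(which=shutil.which):
    """Return {command: True/False} depending on whether it is found on PATH."""
    return {cmd: which(cmd) is not None for cmd in CLI_COMMANDS}


if __name__ == "__main__":
    for cmd, found in detect_clis().items():
        status = "found" if found else "missing"
        print(f"{cmd:8} {status:8} {CLI_COMMANDS[cmd]}")
```

Finding a command on `PATH` only shows that it is installed; it does not confirm that you are logged in.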
Comparison: code-council vs model-council
| Aspect | code-council | model-council |
|---|---|---|
| Models used | Claude only | Multiple (Claude, GPT, Gemini, etc.) |
| Diversity source | Different approaches | Different architectures |
| Cost | Free (uses current session) | Free (CLIs) or paid (APIs) |
| Speed | Fast (single model) | Slower (parallel calls) |
| Best for | Quick iterations | High-stakes decisions |
When to Use Each
Use code-council when:
- You want fast iterations
- The problem is well-defined
- You trust Claude's reasoning
Use model-council when:
- High-stakes code (production, security)
- You want architectural diversity
- Models might have different knowledge
- You want to verify Claude's answer
Error Handling
CLI not found: Skip that model, log warning, continue with others.
API key missing: Skip that provider, try CLI fallback if available.
Timeout: Return partial results, note which models timed out.
No models available: Error with setup instructions.
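The skip-and-continue rules above could be applied when planning which backend serves each model, as in the sketch below. The `plan_backends` helper and its tuple format are hypothetical, and trying the CLI before the API is an assumption based on the config file's `fallback_to_api` comment.

```python
import logging

log = logging.getLogger("model_council")


def plan_backends(requested, installed_clis, providers_with_keys):
    """Decide how each requested model will be reached.

    requested: list of (model, cli_command, api_provider) tuples; either of
    the last two may be None when the model has no such route.
    Returns (plan, skipped), where plan maps model -> ("cli" | "api", target).
    """
    plan, skipped = {}, []
    for model, cli, provider in requested:
        if cli is not None and cli in installed_clis:
            plan[model] = ("cli", cli)
        elif provider is not None and provider in providers_with_keys:
            plan[model] = ("api", provider)
        else:
            # Skip this model, log a warning, continue with the others.
            log.warning("skipping %s: no CLI installed and no API key set", model)
            skipped.append(model)
    if not plan:
        raise RuntimeError(
            "No models available. Install a CLI or set an API key "
            "(see the setup sections above)."
        )
    return plan, skipped
```

Timeouts are a run-time concern rather than a planning one, so they would be handled where the calls are made, by returning partial results and naming the models that timed out.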
Example Output
    ## Model Council Analysis Results

    ### Consensus: HIGH (3/3 models agree on approach)

    ### Summary of Recommendations:
    All models trace the bug to an off-by-one error in the loop condition.

    Key considerations raised:
    - Handle empty and single-element arrays (Claude)
    - Add a bounds check; forEach avoids index errors (GPT-4o)
    - Matches common off-by-one patterns (Gemini)

    ### Individual Analyses:

    #### Claude Sonnet (API)
    Analysis: The bug is caused by an off-by-one error in the loop boundary.
    Recommendation: Change `i <= len` to `i < len`
    Edge cases noted: Empty array, single element
    Confidence: High

    #### GPT-4o (API)
    Analysis: Loop iterates one element past array bounds.
    Recommendation: Fix loop condition, add bounds check
    Additional insight: Could also use forEach to avoid index errors
    Confidence: High

    #### Gemini Flash (API)
    Analysis: Array index out of bounds on final iteration.
    Recommendation: Adjust loop termination condition
    Reference: Similar to common off-by-one patterns
    Confidence: High

    ### Claude Code Decision:
    Based on consensus, implementing fix with:
    - Loop condition change (i < len)
    - Added null check for robustness
    - Unit test for edge cases

    [Claude Code now implements the solution]