showdown-claude-skill
Showdown — pit Claude, ChatGPT, and Gemini against each other via CLIProxyAPI
git clone https://github.com/vanderheijden86/showdown-claude-skill
git clone --depth=1 https://github.com/vanderheijden86/showdown-claude-skill ~/.claude/skills/vanderheijden86-showdown-claude-skill-showdown-claude-skill
SKILL.md
You are executing the /showdown skill. The user wants to pit different LLMs against each other with the same prompt.
Prerequisites Check
First, verify CLIProxyAPI is running:
curl -s -o /dev/null -w "%{http_code}" http://localhost:8317/v1/models 2>/dev/null || echo "NOT_RUNNING"
If NOT running, tell the user:
CLIProxyAPI is not running. Start it with:
brew services start cliproxyapi
or
cliproxyapi
Then re-run /showdown.
If running, proceed.
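A minimal sketch of how the check result could be interpreted. The function names `classify_status` and `check_proxy` are hypothetical, and treating only 2xx codes as "running" is an assumption rather than something the skill specifies. When nothing is listening on the port, curl typically still prints `000` before the fallback `echo` fires, so the raw output can read `000NOT_RUNNING`; the sketch treats anything that is not a three-digit 2xx code as not running.

```shell
# Hypothetical helper: map the status string to a running / not_running verdict.
classify_status() {
  case "$1" in
    2??) echo "running" ;;       # exactly three characters starting with 2 (assumed to mean "up")
    *)   echo "not_running" ;;   # 000, 000NOT_RUNNING, other codes, or empty output
  esac
}

# Hypothetical wrapper around the curl check; the URL is the one used by the skill.
check_proxy() {
  local code
  code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 5 \
    http://localhost:8317/v1/models 2>/dev/null)
  classify_status "$code"
}
```

Calling `check_proxy` prints `running` or `not_running`, which can then decide whether to show the start-up instructions or proceed to the comparison.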
Execute Comparison
Run the compare script with the user's prompt:
bash ~/.claude/skills/showdown/scripts/showdown.sh "$ARGUMENTS"
The script returns a JSON array with results from each enabled model. Each entry contains:
- provider: Human-readable provider name
- model: Model ID used
- response: The model's response text
- duration_seconds: How long the request took
- error: Error message if the request failed (null otherwise)
- status_code: HTTP status code
Format the Output
Present the results as a structured comparison:
For each model response:
Use this format:
## <Provider Name> (<model-id>) - <duration>s

<response content>
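As an illustration of the per-model format, a small sketch that prints one successful result. The function name `render_result` and the sample values are hypothetical; the arguments correspond to the `provider`, `model`, `duration_seconds`, and `response` fields of one entry.

```shell
# Hypothetical helper: print one successful result in the heading format above.
render_result() {
  # $1 = provider, $2 = model, $3 = duration_seconds, $4 = response
  # "%ss" is the duration followed by a literal "s".
  printf '## %s (%s) - %ss\n\n%s\n' "$1" "$2" "$3" "$4"
}

# Illustrative values only; real values come from the script's JSON output.
render_result "Example Provider" "example-model-id" "3.2" "Example response text."
```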
After all responses — Comparison Analysis (FIXED TEMPLATE):
You MUST use this exact template structure for the analysis. Do not deviate.
## Comparison Analysis

### Agreement
- <bullet points where models broadly agree>

### Disagreements
| Topic | Claude | GPT | Gemini |
|-------|--------|-----|--------|
| <point of divergence> | <stance> | <stance> | <stance> |
| ... | ... | ... | ... |

### Style & Approach
| Dimension | Claude | GPT | Gemini |
|-----------|--------|-----|--------|
| Tone | <description> | <description> | <description> |
| Length | <description> | <description> | <description> |
| Structure | <description> | <description> | <description> |
| Use of examples | <description> | <description> | <description> |

### Best Response
**Winner:** <model name>
**Reasoning:** <2-3 sentences explaining why>

### Additional Observations
<optional: topic-specific insights that don't fit the template above>
Error handling:
- If a model failed, show:
**<Provider>**: Failed - <error message>
- If all models failed, suggest checking CLIProxyAPI status and authentication
- If only some failed, show successful responses and note failures
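The three error cases can be sketched as follows. Both function names are hypothetical, and the wording of the follow-up notes is only one possible phrasing; the failure line itself follows the format given in the list.

```shell
# Hypothetical helper: format the failure line for a single model.
render_failure() {
  # $1 = provider, $2 = error message
  printf '**%s**: Failed - %s\n' "$1" "$2"
}

# Hypothetical helper: choose the follow-up note from the failure count.
failure_note() {
  # $1 = total number of models, $2 = number that failed
  if [ "$2" -eq 0 ]; then
    :  # nothing failed, so there is nothing to report
  elif [ "$2" -eq "$1" ]; then
    echo "All models failed. Check CLIProxyAPI status and authentication."
  else
    echo "$2 of $1 models failed; successful responses are shown above."
  fi
}
```

For instance, `failure_note 3 1` would produce the partial-failure note, while `failure_note 3 3` would produce the suggestion to check CLIProxyAPI.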
Save Output (Prompt User)
After presenting the comparison, always ask the user whether they want to save the full output as a markdown file using AskUserQuestion. Include a second question asking if they want to run judge mode.
If the user says yes to saving, save to ./showdown-output/ in the current working directory:
- Create the ./showdown-output/ directory if it doesn't exist (mkdir -p)
- Generate filename: showdown-YYYY-MM-DD-HHMMSS.md using the current timestamp
- Write a markdown file with this structure:
# Showdown: <short summary of the prompt topic>

**Date:** <YYYY-MM-DD HH:MM TZ>
**Models:** <list of models used>
**Prompt:**
> <the original user prompt, blockquoted>

---

## <Provider 1> (<model-id>) - <duration>s

<full response exactly as returned, preserving all formatting>

---

## <Provider 2> (<model-id>) - <duration>s

<full response exactly as returned, preserving all formatting>

---

## <Provider 3> (<model-id>) - <duration>s

<full response exactly as returned, preserving all formatting>

---

## Comparison Analysis

### Agreement
<bullet points>

### Disagreements
| Topic | Claude | GPT | Gemini |
|-------|--------|-----|--------|
| ... | ... | ... | ... |

### Style & Approach
| Dimension | Claude | GPT | Gemini |
|-----------|--------|-----|--------|
| ... | ... | ... | ... |

### Best Response
**Winner:** <model>
**Reasoning:** <explanation>

### Additional Observations
<if any>
- Tell the user the file path after saving.
Important: The saved markdown must contain the complete, unabridged responses from each model AND the full comparison analysis, formatted exactly as presented in the conversation. Do not summarize or truncate.
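A minimal sketch of the directory and filename steps, assuming a POSIX `date` that supports the format specifiers shown. The function name `showdown_output_path` is hypothetical; writing the file contents is left to the caller.

```shell
# Hypothetical helper: create the output directory and print a timestamped file path.
showdown_output_path() {
  local dir="./showdown-output"
  mkdir -p "$dir" || return 1   # no-op if the directory already exists
  # The date format mirrors the YYYY-MM-DD-HHMMSS pattern from the filename rule.
  printf '%s/showdown-%s.md\n' "$dir" "$(date +%Y-%m-%d-%H%M%S)"
}
```

The printed path can be captured with `path=$(showdown_output_path)` and reported to the user once the file has been written.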
If the user also wants to run judge mode, tell them to run /showdown judge.