Learn-skills.dev update-llm-model-list
Audit and update the supported LLM model list in assets.py against litellm's registry (models.litellm.ai). Use when adding new models, pruning outdated ones, or verifying the list is correct.
install

source · Clone the upstream repo

```shell
git clone https://github.com/NeverSight/learn-skills.dev
```

Claude Code · Install into ~/.claude/skills/

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/agenta-ai/agenta/update-llm-model-list" ~/.claude/skills/neversight-learn-skills-dev-update-llm-model-list && rm -rf "$T"
```

manifest: data/skills-md/agenta-ai/agenta/update-llm-model-list/SKILL.md
Update LLM Model List
Overview
The canonical model list lives in `sdk/agenta/sdk/assets.py` → `supported_llm_models`.
It drives the model dropdown in the playground, cost metadata, and the `model_to_provider_mapping`.

The authoritative external source is `litellm.model_cost` (2,600+ entries), which mirrors https://models.litellm.ai/.

A pytest guard lives at `sdk/oss/tests/pytest/unit/test_supported_llm_models.py`.
Key rules
- Every model must exist in `litellm.model_cost` (direct key, or with provider prefix stripped).
  - Anthropic: Agenta lists bare `claude-*`; litellm stores them as `anthropic/claude-*` (the prefix is intentional for routing, stripped for cost lookup).
  - Cohere: Agenta lists bare `command-*`; litellm stores them as `cohere/command-*`.
  - All other providers keep their full prefix (e.g. `gemini/`, `groq/`, `together_ai/`).
- Provider keys (`"anthropic"`, `"gemini"`, …) must match the Secrets API enum `StandardProviderKind` in `api/oss/src/core/secrets/enums.py`.
- No duplicates within a provider list.
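The "no duplicates" rule is easy to check mechanically. A minimal sketch, using a hypothetical excerpt of the provider dict (one duplicate planted deliberately):

```python
from collections import Counter

# Hypothetical excerpt of supported_llm_models (provider key -> model names).
supported_llm_models = {
    "openai": ["gpt-4o", "gpt-4o-mini"],
    "groq": ["groq/llama-3.1-8b-instant", "groq/llama-3.1-8b-instant"],  # dupe
}

def find_duplicates(models):
    """Return model names that appear more than once in a provider list."""
    return sorted(m for m, n in Counter(models).items() if n > 1)

for provider, models in supported_llm_models.items():
    dupes = find_duplicates(models)
    if dupes:
        print(f"DUPLICATE [{provider}] {dupes}")
```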
Step 1 — Check which current models are outdated / wrong
Run this with `uvx` (no local install needed):

```shell
cat > /tmp/check_agenta_models.py << 'SCRIPT'
# /// script
# requires-python = ">=3.11"
# dependencies = ["litellm"]
# ///
import litellm, sys

# paste supported_llm_models here or import it
from agenta.sdk.assets import supported_llm_models

mc = set(litellm.model_cost.keys())

def exists(m):
    if m in mc:
        return True
    if "/" in m and m.split("/", 1)[1] in mc:
        return True
    return False

fails = []
for provider, models in supported_llm_models.items():
    for model in models:
        if not exists(model):
            fails.append((provider, model))

total = sum(len(v) for v in supported_llm_models.values())
print(f"Total models checked: {total}")
if fails:
    for p, m in fails:
        print(f"  MISSING [{p}] {m}")
    sys.exit(1)
else:
    print("All models valid ✓")
SCRIPT
uvx --with litellm python /tmp/check_agenta_models.py 2>/dev/null
```
Alternatively, run the pytest unit test directly (requires agenta installed):

```shell
pytest sdk/oss/tests/pytest/unit/test_supported_llm_models.py -v
```
Step 2 — Find models missing from Agenta (big-3 audit)
This script finds models in litellm that Agenta doesn't list yet, filtered to remove noise (audio, video, embeddings, codex, snapshots):

```shell
cat > /tmp/find_missing.py << 'SCRIPT'
# /// script
# requires-python = ">=3.11"
# dependencies = ["litellm"]
# ///
import litellm, re

AGENTA_ANTHROPIC = set()  # fill from assets.py (bare names, no prefix)
AGENTA_OPENAI = set()     # fill from assets.py
AGENTA_GEMINI = set()     # fill from assets.py (with gemini/ prefix)

mc = set(litellm.model_cost.keys())

NOISE = [
    "audio", "tts", "speech", "whisper", "transcri", "realtime", "diarize",
    "dall-e", "image", "video", "veo", "embed", "moderat", "search",
    "babbage", "davinci", "ada", "instruct", "codex", "computer-use",
    "robotics", "learnlm", "gemma", "live", "v1:0",
]
KEEP = {"gpt-4o", "gpt-4o-mini"}
DATED = re.compile(r"-\d{4}-\d{2}-\d{2}$")
EXP = re.compile(r"exp-\d{4}|\d{2}-\d{2}$")

def noise(m):
    if m in KEEP:
        return False
    return any(kw in m.lower() for kw in NOISE)

def dated(m):
    return bool(DATED.search(m)) or bool(EXP.search(m))

def report(label, candidates, known, prefix=""):
    print(f"\n=== {label} ===")
    for m in sorted(candidates):
        bare = m[len(prefix):] if prefix else m
        if bare in known or m in known:
            continue
        tag = "[dated/exp]" if dated(m) else "[alias]" if m.endswith("-latest") else "*** MISSING ***"
        print(f"  {m} {tag}")

# Anthropic
report("ANTHROPIC",
       [m for m in mc if m.startswith("claude-") and not noise(m)],
       AGENTA_ANTHROPIC)

# OpenAI (no slash, starts with gpt- / o1 / o3 / o4)
OAI = [m for m in mc
       if any(m.startswith(p) for p in ("gpt-", "o1", "o3", "o4", "chatgpt"))
       and "/" not in m and not noise(m)]
report("OPENAI", OAI, AGENTA_OPENAI)

# Gemini
report("GEMINI",
       [m for m in mc if m.startswith("gemini/") and not noise(m)],
       AGENTA_GEMINI, prefix="gemini/")
SCRIPT
uvx --with litellm python /tmp/find_missing.py 2>/dev/null
```
Fill in the `AGENTA_*` sets from the current `assets.py` before running.
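The sets can also be built from a dict with the same shape as `supported_llm_models` instead of pasting names by hand. The excerpt below is hypothetical; in practice you would use `from agenta.sdk.assets import supported_llm_models`:

```python
# Hypothetical excerpt of supported_llm_models (provider key -> model names).
supported_llm_models = {
    "anthropic": ["claude-3-5-sonnet-20241022", "claude-3-opus-20240229"],
    "openai": ["gpt-4o", "gpt-4o-mini"],
    "gemini": ["gemini/gemini-1.5-pro", "gemini/gemini-1.5-flash"],
}

# Bare names for anthropic, gemini/-prefixed names for gemini,
# matching the conventions described in "Key rules".
AGENTA_ANTHROPIC = set(supported_llm_models["anthropic"])
AGENTA_OPENAI = set(supported_llm_models["openai"])
AGENTA_GEMINI = set(supported_llm_models["gemini"])

print(len(AGENTA_ANTHROPIC | AGENTA_OPENAI | AGENTA_GEMINI))  # 6
```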
Step 3 — Edit assets.py

File: `sdk/agenta/sdk/assets.py`
- Add models inside the correct provider list, newest first.
- For Gemini 1.5 models (still widely used): add under `"gemini"`.
- For OpenAI o-series pro tiers (`o1-pro`, `o3-pro`): add after their base model.
- For Groq: always cross-check `litellm.groq_models` — Groq rotates its model catalogue frequently.
- For DeepInfra / Together AI: check `litellm.deepinfra_models` / `litellm.together_ai_models` for current names.
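The "newest first" and "no duplicates" rules above can be sketched as a small helper. The dict excerpt and the `add_model` name are hypothetical, not part of Agenta's API:

```python
def add_model(supported: dict[str, list[str]], provider: str, model: str) -> None:
    """Insert a model at the top of its provider list (newest first)."""
    models = supported.setdefault(provider, [])
    if model in models:
        raise ValueError(f"{model!r} already listed under {provider!r}")
    models.insert(0, model)  # newest first, per the convention above

supported = {"openai": ["gpt-4o", "gpt-4o-mini"]}
add_model(supported, "openai", "gpt-4.1")
print(supported["openai"])  # ['gpt-4.1', 'gpt-4o', 'gpt-4o-mini']
```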
Provider prefix conventions
| Provider key | Agenta prefix | litellm cost key prefix |
|---|---|---|
| `anthropic` | (no prefix) | `anthropic/` |
| `cohere` | (no prefix) | `cohere/` |
| `openai` | (none) | (none) |
| `gemini` | `gemini/` | `gemini/` |
| `groq` | `groq/` | `groq/` |
| `together_ai` | `together_ai/` | `together_ai/` |
| `deepinfra` | `deepinfra/` | `deepinfra/` |
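The lookup implied by these conventions can be sketched as follows: for bare anthropic/cohere names, both the bare key and the provider-prefixed key are candidate `litellm.model_cost` keys, while prefixed providers match as-is. This is an illustrative sketch, not code from the repo:

```python
def cost_key_candidates(provider: str, model: str) -> list[str]:
    """Candidate litellm.model_cost keys for an Agenta model name, in order."""
    if provider in ("anthropic", "cohere") and "/" not in model:
        # Agenta lists the bare name; litellm may store it provider-prefixed.
        return [model, f"{provider}/{model}"]
    # All other providers keep their full prefix, so the name matches as-is.
    return [model]

print(cost_key_candidates("anthropic", "claude-3-opus-20240229"))
# ['claude-3-opus-20240229', 'anthropic/claude-3-opus-20240229']
print(cost_key_candidates("gemini", "gemini/gemini-1.5-pro"))
# ['gemini/gemini-1.5-pro']
```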
Step 4 — Run ruff then the test
```shell
# Format + lint
uvx --from ruff==0.14.0 ruff format sdk/agenta/sdk/assets.py
uvx --from ruff==0.14.0 ruff check --fix sdk/agenta/sdk/assets.py

# Validate all models against litellm (no agenta install needed)
uvx --with litellm python /tmp/check_agenta_models.py 2>/dev/null
```
All checks must pass before committing.
Related files
| File | Purpose |
|---|---|
| `sdk/agenta/sdk/assets.py` | Canonical model list + cost metadata builder |
| `sdk/oss/tests/pytest/unit/test_supported_llm_models.py` | Pytest guard (parametrized per model) |
| `api/oss/src/core/secrets/enums.py` | Provider keys — must stay in sync |
| | Separate (shorter) model list for evaluator dropdown |