# dust-llm

Step-by-step guide for adding support for a new LLM in Dust. Use when adding a new model or updating an existing one.

## Install

Clone the upstream repo:

```shell
git clone https://github.com/dust-tt/dust
```

Claude Code: install into `~/.claude/skills/`:

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/dust-tt/dust "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/dust-llm" ~/.claude/skills/dust-tt-dust-dust-llm && rm -rf "$T"
```

Manifest: `.claude/skills/dust-llm/SKILL.md`
# Adding Support for a New LLM Model

This skill guides you through adding support for a newly released LLM.
## Quick Reference

### Files to Modify

| File | Purpose |
|---|---|
| `front/types/assistant/models/openai.ts` | Model ID + configuration |
| `front/lib/api/assistant/token_pricing.ts` | Pricing per million tokens |
| `front/types/assistant/models/models.ts` | Central registry |
| `front/lib/api/llm/clients/openai/types.ts` | Router whitelist |
| `sdks/js/src/types.ts` | SDK types |
| `front/components/providers/types.ts` | UI availability (optional) |
| `front/lib/api/llm/tests/llm.test.ts` | Integration tests |
## Prerequisites

Before adding a model, gather:

- **Model ID**: the provider's exact identifier (e.g., `gpt-4-turbo-2024-04-09`)
- **Context size**: total context window in tokens
- **Pricing**: input/output cost per million tokens
- **Capabilities**: vision, structured output, reasoning effort levels
- **Tokenizer**: compatible tokenizer for token counting
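The items above map directly onto fields used in the steps below. Collecting them first can be sketched as a simple record; `NewModelChecklist` is a hypothetical shape for illustration, not a type from the Dust codebase (the values shown come from the GPT 4 Turbo example used throughout this guide):

```typescript
// Hypothetical pre-flight record: one field per prerequisite above.
type NewModelChecklist = {
  modelId: string; // exact provider identifier
  contextSize: number; // total context window, in tokens
  inputPricePerMTok: number; // USD per million input tokens
  outputPricePerMTok: number; // USD per million output tokens
  supportsVision: boolean;
  supportsResponseFormat: boolean;
  tokenizer: string; // e.g. a tiktoken encoding name
};

const gpt4Turbo: NewModelChecklist = {
  modelId: "gpt-4-turbo-2024-04-09",
  contextSize: 128_000,
  inputPricePerMTok: 10.0,
  outputPricePerMTok: 30.0,
  supportsVision: true,
  supportsResponseFormat: false,
  tokenizer: "cl100k_base",
};
```

Having every value in hand before editing avoids half-finished registry entries.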
## Step-by-Step: Adding an OpenAI Model

### Step 1: Add Model Configuration

Edit `front/types/assistant/models/openai.ts`:

```typescript
export const GPT_4_TURBO_2024_04_09_MODEL_ID =
  "gpt-4-turbo-2024-04-09" as const;

export const GPT_4_TURBO_2024_04_09_MODEL_CONFIG: ModelConfigurationType = {
  providerId: "openai",
  modelId: GPT_4_TURBO_2024_04_09_MODEL_ID,
  displayName: "GPT 4 turbo",
  contextSize: 128_000,
  recommendedTopK: 32,
  recommendedExhaustiveTopK: 64,
  largeModel: true,
  description: "OpenAI's GPT 4 Turbo model for complex tasks (128k context).",
  shortDescription: "OpenAI's second best model.",
  isLegacy: false,
  isLatest: false,
  generationTokensCount: 2048,
  supportsVision: true,
  minimumReasoningEffort: "none",
  maximumReasoningEffort: "none",
  defaultReasoningEffort: "none",
  supportsResponseFormat: false,
  tokenizer: { type: "tiktoken", base: "cl100k_base" },
};
```
### Step 2: Add Pricing

Edit `front/lib/api/assistant/token_pricing.ts`:

```typescript
const CURRENT_MODEL_PRICING: Record<BaseModelIdType, PricingEntry> = {
  // ... existing entries
  "gpt-4-turbo-2024-04-09": {
    input: 10.0, // USD per million input tokens
    output: 30.0, // USD per million output tokens
    cache_read_input_tokens: 1.0, // Optional: cached reads
    cache_creation_input_tokens: 12.5, // Optional: cache creation
  },
};
```
### Step 3: Register in Central Registry

Edit `front/types/assistant/models/models.ts`:

```typescript
export const MODEL_IDS = [
  // ... existing
  GPT_4_TURBO_2024_04_09_MODEL_ID,
] as const;

export const SUPPORTED_MODEL_CONFIGS: ModelConfigurationType[] = [
  // ... existing
  GPT_4_TURBO_2024_04_09_MODEL_CONFIG,
];
```
### Step 4: Update Router Whitelist

Edit `front/lib/api/llm/clients/openai/types.ts`:

```typescript
export const OPENAI_WHITELISTED_MODEL_IDS = [
  // ... existing
  GPT_4_TURBO_2024_04_09_MODEL_ID,
] as const;
```
### Step 5: Update SDK Types

Edit `sdks/js/src/types.ts`:

```typescript
const ModelLLMIdSchema = FlexibleEnumSchema<
  // ... existing
  | "gpt-4-turbo-2024-04-09"
>();
```
### Step 6: Add to UI (Optional)

Edit `front/components/providers/types.ts`:

```typescript
export const USED_MODEL_CONFIGS: readonly ModelConfig[] = [
  // ... existing
  GPT_4_TURBO_2024_04_09_MODEL_CONFIG,
] as const;
```
### Step 7: Test (Mandatory)

Edit `front/lib/api/llm/tests/llm.test.ts`:

```typescript
const MODELS = {
  // ... existing
  [GPT_4_TURBO_2024_04_09_MODEL_ID]: {
    runTest: true, // Enable for testing
    providerId: "openai",
  },
};
```

Run the test:

```shell
RUN_LLM_TEST=true npx vitest --config lib/api/llm/tests/vite.config.js lib/api/llm/tests/llm.test.ts --run
```

After the test passes, set `runTest: false` to avoid expensive CI runs.
## Adding Anthropic Models

Same pattern with Anthropic-specific files:

- Add `CLAUDE_X_MODEL_ID` and its config in `front/types/assistant/models/anthropic.ts`
- Add to `ANTHROPIC_WHITELISTED_MODEL_IDS` in `front/lib/api/llm/clients/anthropic/types.ts`
- Register in the central registry, `front/types/assistant/models/models.ts`
- Add pricing in `front/lib/api/assistant/token_pricing.ts`
- Update SDK types in `sdks/js/src/types.ts`
- Test and validate
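As a concrete illustration of the first bullet, an Anthropic entry mirrors the OpenAI one. In this sketch, `CLAUDE_X` and every value are hypothetical placeholders, and `ModelConfigurationType` is a pared-down stand-in for Dust's real type, defined locally so the snippet is self-contained:

```typescript
// Pared-down stand-in for Dust's ModelConfigurationType (assumption, for
// illustration only; the real type has more fields, shown in Step 1 above).
type ModelConfigurationType = {
  providerId: "anthropic" | "openai";
  modelId: string;
  displayName: string;
  contextSize: number;
  largeModel: boolean;
  supportsVision: boolean;
};

// Hypothetical model: "claude-x-2025-01-01" is a placeholder ID, not a real model.
export const CLAUDE_X_MODEL_ID = "claude-x-2025-01-01" as const;

export const CLAUDE_X_MODEL_CONFIG: ModelConfigurationType = {
  providerId: "anthropic",
  modelId: CLAUDE_X_MODEL_ID,
  displayName: "Claude X",
  contextSize: 200_000, // placeholder value
  largeModel: true,
  supportsVision: true,
};
```

The only structural differences from the OpenAI flow are the provider file, the `providerId`, and the whitelist constant name.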
## Model Configuration Properties

| Property | Description |
|---|---|
| `supportsVision` | Can process images |
| `supportsResponseFormat` | Supports structured output (JSON) |
| `minimumReasoningEffort` | Min reasoning level (`"none"`, `"low"`, `"medium"`, `"high"`) |
| `maximumReasoningEffort` | Max reasoning level |
| `defaultReasoningEffort` | Default reasoning level |
| `tokenizer` | Tokenizer config for token counting |
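The three reasoning-effort properties only make sense when the default sits between the minimum and the maximum. A quick consistency check can be sketched as below; the `EFFORT_ORDER` ranking and the helper name are assumptions for illustration, not Dust APIs:

```typescript
// Assumed ranking of reasoning effort levels, lowest to highest.
const EFFORT_ORDER = ["none", "low", "medium", "high"] as const;
type ReasoningEffort = (typeof EFFORT_ORDER)[number];

// Hypothetical helper: checks min <= default <= max for a model config.
function reasoningEffortsAreConsistent(cfg: {
  minimumReasoningEffort: ReasoningEffort;
  maximumReasoningEffort: ReasoningEffort;
  defaultReasoningEffort: ReasoningEffort;
}): boolean {
  const rank = (e: ReasoningEffort) => EFFORT_ORDER.indexOf(e);
  return (
    rank(cfg.minimumReasoningEffort) <= rank(cfg.defaultReasoningEffort) &&
    rank(cfg.defaultReasoningEffort) <= rank(cfg.maximumReasoningEffort)
  );
}
```

A non-reasoning model like the GPT 4 Turbo example simply sets all three to `"none"`, which trivially satisfies the ordering.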
## Validation Checklist

- Model config added to provider file
- Pricing updated (input, output, cache if applicable)
- Registered in central registry (`MODEL_IDS` + `SUPPORTED_MODEL_CONFIGS`)
- Router whitelist updated
- SDK types updated
- UI config added (if needed)
- Integration test passes
- Test disabled after validation (`runTest: false`)
## Troubleshooting

- **Model not in UI**: check `USED_MODEL_CONFIGS` in `front/components/providers/types.ts`
- **API calls failing**: verify the model ID matches the provider's exact identifier, and check the router whitelist
- **Token counting errors**: validate the context size and tokenizer configuration
- **Pricing issues**: ensure prices are per million tokens, in USD
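Since pricing is expressed in USD per million tokens, a quick way to sanity-check a pricing entry is to compute an expected request cost by hand. `computeCostUSD` below is a hypothetical helper for illustration, not part of the Dust codebase; the prices are the Step 2 example values:

```typescript
// USD per million tokens, matching the shape of a pricing entry's
// required fields (cache prices omitted for brevity).
type Pricing = { input: number; output: number };

// Hypothetical helper: converts per-million-token prices into a request cost.
function computeCostUSD(
  p: Pricing,
  inputTokens: number,
  outputTokens: number
): number {
  return (inputTokens / 1_000_000) * p.input + (outputTokens / 1_000_000) * p.output;
}

// With $10 in / $30 out per million tokens, 500k input tokens and
// 100k output tokens cost $5 + $3 = $9... no: $5 + $3 = $8.
const cost = computeCostUSD({ input: 10.0, output: 30.0 }, 500_000, 100_000); // 8
```

If the computed cost is off by a factor of a thousand or a million, the entry was likely written per-token or per-thousand-tokens instead of per-million.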
## Reference

- See `front/types/assistant/models/openai.ts` and `anthropic.ts` for examples
- Provider docs: OpenAI, Anthropic, Google, Mistral