# together-local-dev-loop

From the **claude-code-plugins-plus** skills collection.

## Install

**Source** · Clone the upstream repo:

```bash
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
```

**Claude Code** · Install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/plugins/saas-packs/together-pack/skills/together-local-dev-loop" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-together-local-dev-loop \
  && rm -rf "$T"
```

Manifest: `plugins/saas-packs/together-pack/skills/together-local-dev-loop/SKILL.md`
# Together AI Local Dev Loop

## Overview
Local development workflow for Together AI inference API integration. Provides a fast feedback loop with mock chat completions, embeddings, and model listing endpoints so you can build AI-powered applications without consuming live API credits. Together AI is OpenAI-compatible, so the same client libraries work with both. Toggle between mock mode for rapid iteration and live mode for model evaluation.
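Because the API is OpenAI-compatible, a chat request is just a JSON body with `model`, `messages`, and (ideally) `max_tokens`. The sketch below builds such a payload; the helper name and its defaults are illustrative, not part of the skill:

```typescript
// Illustrative helper: builds an OpenAI-compatible chat completion
// request body. The same shape works against the mock server and
// the live Together API.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(
  messages: ChatMessage[],
  model = "meta-llama/Llama-3-70b-chat-hf",
  maxTokens = 256 // set explicitly to bound response size and cost
) {
  return { model, messages, max_tokens: maxTokens };
}

const body = buildChatRequest([{ role: "user", content: "Hello" }]);
console.log(JSON.stringify(body));
```

POST this body to `/v1/chat/completions` on either base URL; only the `base_url` and API key differ between mock and live mode.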
## Environment Setup
```bash
cp .env.example .env
# Set your credentials:
# TOGETHER_API_KEY=tog_xxxxxxxxxxxx
# TOGETHER_BASE_URL=https://api.together.xyz/v1
# MOCK_MODE=true

npm install express axios dotenv tsx typescript @types/node http-proxy-middleware
npm install -D vitest supertest @types/express

# Or for Python:
pip install together openai httpx pytest
```
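The testing workflow later in this skill assumes npm scripts named `dev:mock`, `test`, and `test:integration`. A hypothetical `package.json` fragment wiring them up (the script bodies are assumptions, not from the skill):

```json
{
  "scripts": {
    "dev:mock": "MOCK_MODE=true tsx src/dev/server.ts",
    "dev:live": "MOCK_MODE=false tsx src/dev/server.ts",
    "test": "vitest run",
    "test:integration": "vitest run --dir test/integration"
  }
}
```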
## Dev Server
```typescript
// src/dev/server.ts
import express from "express";
import { createProxyMiddleware } from "http-proxy-middleware";
import { mountMockRoutes } from "./mocks";

const app = express();

const MOCK = process.env.MOCK_MODE === "true";

if (!MOCK) {
  // Forward /v1/* to the live Together API, injecting the API key.
  // Note: no body parser in front of the proxy — it must see the raw stream.
  app.use(
    "/v1",
    createProxyMiddleware({
      target: process.env.TOGETHER_BASE_URL,
      changeOrigin: true,
      headers: { Authorization: `Bearer ${process.env.TOGETHER_API_KEY}` },
    })
  );
} else {
  // Parse JSON bodies only in mock mode, where we handle requests locally
  app.use(express.json());
  mountMockRoutes(app);
}

app.listen(3009, () =>
  console.log(`Together dev server on :3009 [mock=${MOCK}]`)
);
```
## Mock Mode
```typescript
// src/dev/mocks.ts — OpenAI-compatible mock responses for inference
import type { Express, Request, Response } from "express";

export function mountMockRoutes(app: Express) {
  app.post("/v1/chat/completions", (req: Request, res: Response) =>
    res.json({
      id: "chatcmpl-mock-001",
      object: "chat.completion",
      model: req.body.model || "meta-llama/Llama-3-70b-chat-hf",
      choices: [
        {
          index: 0,
          message: {
            role: "assistant",
            content: "This is a mock response from Together AI.",
          },
          finish_reason: "stop",
        },
      ],
      usage: { prompt_tokens: 25, completion_tokens: 12, total_tokens: 37 },
    })
  );

  app.post("/v1/embeddings", (req: Request, res: Response) =>
    res.json({
      object: "list",
      model: req.body.model || "togethercomputer/m2-bert-80M-8k-retrieval",
      data: [
        {
          object: "embedding",
          index: 0,
          embedding: Array(768)
            .fill(0)
            .map(() => Math.random() * 2 - 1),
        },
      ],
    })
  );

  app.get("/v1/models", (_req: Request, res: Response) =>
    res.json({
      data: [
        { id: "meta-llama/Llama-3-70b-chat-hf", type: "chat", context_length: 8192 },
        { id: "mistralai/Mixtral-8x22B-Instruct-v0.1", type: "chat", context_length: 65536 },
        { id: "togethercomputer/m2-bert-80M-8k-retrieval", type: "embedding", context_length: 8192 },
      ],
    })
  );
}
```
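Since mock and live mode share the OpenAI response shape, a small type guard can sanity-check either one in tests. This validator is an illustrative sketch, not part of the skill; the inline payload mirrors the mock route above:

```typescript
// Illustrative type guard for the OpenAI-compatible chat.completion shape
interface ChatCompletionLike {
  id: string;
  object: string;
  choices: { message: { role: string; content: string } }[];
  usage: { total_tokens: number };
}

function isChatCompletion(body: any): body is ChatCompletionLike {
  return (
    typeof body?.id === "string" &&
    body.object === "chat.completion" &&
    Array.isArray(body.choices) &&
    body.choices.length > 0 &&
    typeof body.choices[0]?.message?.content === "string" &&
    typeof body.usage?.total_tokens === "number"
  );
}

// The payload returned by the mock route passes the check:
const mock = {
  id: "chatcmpl-mock-001",
  object: "chat.completion",
  choices: [
    { index: 0, message: { role: "assistant", content: "hi" }, finish_reason: "stop" },
  ],
  usage: { prompt_tokens: 25, completion_tokens: 12, total_tokens: 37 },
};
console.log(isChatCompletion(mock)); // true
```

Running the same guard against live responses helps catch contract drift before it reaches production code.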
## Testing Workflow
```bash
npm run dev:mock &                         # Start mock server in background
npm run test                               # Unit tests with vitest
npm run test -- --watch                    # Watch mode for rapid iteration
MOCK_MODE=false npm run test:integration   # Integration test against real API
```
## Debug Tips
- Together AI is OpenAI-compatible: set `base_url` to `http://localhost:3009/v1` for local dev
- Use `/v1/models` to discover available model IDs instead of hardcoding them
- Monitor `usage.total_tokens` in responses to estimate costs before switching to live mode
- Batch inference (`/v1/batch`) runs at 50% cost but is async; poll for completion
- Set `max_tokens` explicitly to avoid unexpectedly large responses and costs
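One way to act on the `usage.total_tokens` tip is a tiny cost estimator. A sketch; the price constant is a placeholder, so look up the real per-model rate on Together's pricing page:

```typescript
// Placeholder rate in USD per 1M tokens — NOT a real Together price;
// substitute the rate for your model.
const PRICE_PER_M_TOKENS = 0.9;

interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Rough spend estimate from the usage block of one response
function estimateCostUSD(usage: Usage, pricePerMTokens = PRICE_PER_M_TOKENS): number {
  return (usage.total_tokens / 1_000_000) * pricePerMTokens;
}

const usage = { prompt_tokens: 25, completion_tokens: 12, total_tokens: 37 };
console.log(estimateCostUSD(usage)); // ~3.33e-5 USD at the placeholder rate
```

Summing these estimates across a mock-mode test run gives a ballpark for what the same run would cost live.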
## Error Handling
| Issue | Fix |
|---|---|
| Invalid API key | Regenerate at api.together.xyz dashboard |
| Wrong model ID string | Use `/v1/models` to verify |
| Too many requests per minute | Implement exponential backoff |
| Model overloaded or cold start | Retry with backoff after 5s |
| Dev server not running | Run `npm run dev:mock` first |
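The backoff fixes in the table can be sketched as a small retry wrapper. The retry count and delays below are illustrative defaults, not values from the skill:

```typescript
// Retries transient failures (rate limits, overloaded models) with
// exponentially growing delays: 500ms, 1s, 2s for the defaults below.
async function withBackoff<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of retries: surface the error
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((r) => setTimeout(r, delay));
    }
  }
}
```

Usage: `await withBackoff(() => client.chat.completions.create(body))`. A production version would also inspect the error and only retry retryable statuses such as rate limits.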
## Next Steps

See `together-debug-bundle`.