# claude-code-plugins-plus · together-hello-world

## Install

Clone the upstream repo:

```shell
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
```

Or install the skill into `~/.claude/skills/` for Claude Code:

```shell
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/plugins/saas-packs/together-pack/skills/together-hello-world" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-together-hello-world \
  && rm -rf "$T"
```

Manifest: `plugins/saas-packs/together-pack/skills/together-hello-world/SKILL.md`
# Together AI Hello World

## Overview

Run chat completions against open-source models through Together AI's OpenAI-compatible API. It supports Llama, Mixtral, Qwen, and 100+ other models. Key endpoints: `/v1/chat/completions`, `/v1/completions`, `/v1/embeddings`, `/v1/images/generations`.
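The Python examples below assume the official `together` SDK is installed and that your API key is exported as `TOGETHER_API_KEY` (the environment variable the client reads by default). A minimal setup sketch:

```shell
# Install the Together Python SDK
pip install together

# The Together() client picks the key up from the environment
export TOGETHER_API_KEY="your-api-key"
```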
## Instructions

### Step 1: Chat Completions
```python
from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to calculate fibonacci numbers"},
    ],
    max_tokens=500,
    temperature=0.7,
    top_p=0.9,
)
print(response.choices[0].message.content)
print(f"Tokens: {response.usage.prompt_tokens} in, {response.usage.completion_tokens} out")
```
### Step 2: Streaming

```python
stream = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,
    max_tokens=200,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
### Step 3: Image Generation

```python
response = client.images.generate(
    model="black-forest-labs/FLUX.1-schnell-Free",
    prompt="A sunset over mountains, digital art style",
    width=1024,
    height=768,
    n=1,
)
print(f"Image URL: {response.data[0].url}")
```
### Step 4: Embeddings

```python
response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input=["Hello world", "Together AI is great"],
)
print(f"Embedding dim: {len(response.data[0].embedding)}")
```
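Embedding vectors are usually compared with cosine similarity. A minimal sketch in plain Python; the vectors here are hypothetical stand-ins for `response.data[i].embedding`:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Stand-in vectors; in practice pass response.data[0].embedding, etc.
v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.1, 0.4]
print(f"similarity: {cosine_similarity(v1, v2):.3f}")
```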
### Step 5: Node.js with OpenAI Client

```javascript
import OpenAI from 'openai';

const together = new OpenAI({
  apiKey: process.env.TOGETHER_API_KEY,
  baseURL: 'https://api.together.xyz/v1',
});

const chat = await together.chat.completions.create({
  model: 'meta-llama/Llama-3.3-70B-Instruct-Turbo',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(chat.choices[0].message.content);
```
## Output

```text
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

Tokens: 28 in, 45 out
```
## Error Handling

| Error | Cause | Solution |
|---|---|---|
| Model not found | Wrong model ID | Check docs.together.ai/docs/inference-models |
| Empty response | `max_tokens` too low | Increase `max_tokens` |
| Rate limited (429) | Too many requests | Implement exponential backoff |
| Slow response | Large model | Try a Turbo variant or a smaller model |
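For rate-limit errors, the standard pattern is to retry with exponential backoff plus jitter. A generic sketch; the retry parameters are illustrative, and in practice you would catch only the SDK's rate-limit exception rather than bare `Exception`:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff and jitter on failure."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Delay doubles each attempt, with a little random jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)

# Usage (hypothetical):
# reply = with_backoff(lambda: client.chat.completions.create(...))
```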
## Next Steps

Proceed to `together-local-dev-loop` for the development workflow.