Skills ai-provider-openai-sdk
Official OpenAI SDK patterns for TypeScript/Node.js — client setup, Chat Completions, Responses API, streaming, structured outputs, function calling, embeddings, vision, audio, and production best practices
git clone https://github.com/agents-inc/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/agents-inc/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/src/skills/ai-provider-openai-sdk" ~/.claude/skills/agents-inc-skills-ai-provider-openai-sdk-f39554 && rm -rf "$T"
src/skills/ai-provider-openai-sdk/SKILL.md

OpenAI SDK Patterns
Quick Guide: Use the official `openai` npm package (v6+) to interact with OpenAI's API directly. Use `client.responses.create()` (Responses API) for new projects with built-in tools and server-side state, or `client.chat.completions.create()` (Chat Completions) for stateless chat flows. Use `zodResponseFormat` and `client.chat.completions.parse()` for structured outputs. Use `stream: true` or the `.stream()` helper for streaming. Supports the GPT-5.x family, GPT-4o, o4-mini, embeddings, vision, audio, and batch processing.
<critical_requirements>
CRITICAL: Before Using This Skill
All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering, `import type`, named constants)
(You MUST use the Responses API (`client.responses.create()`) for new projects -- it provides better performance, built-in tools, and server-side conversation state)
(You MUST use `zodResponseFormat()` from `openai/helpers/zod` for structured outputs -- do NOT manually construct JSON schemas)
(You MUST handle errors using `OpenAI.APIError` and its subclasses -- never use bare catch blocks without error type checking)
(You MUST configure appropriate retries and timeouts for production use -- the SDK retries 2 times by default on 429/5xx errors)
(You MUST never hardcode API keys -- always use environment variables via `process.env.OPENAI_API_KEY`)
</critical_requirements>
Auto-detection: OpenAI, openai, client.chat.completions, client.responses.create, client.responses.parse, client.embeddings, client.audio, zodResponseFormat, zodTextFormat, zodFunction, zodResponsesFunction, runTools, GPT-5, GPT-4o, o4-mini, gpt-5-mini, text-embedding-3, whisper, tts, OPENAI_API_KEY, toFile
When to use:
- Building applications that call OpenAI models directly (GPT-5.x, GPT-4o, o4-mini, etc.)
- Implementing chat completions with streaming responses
- Using the Responses API for agentic workflows with built-in tools (web search, file search, code interpreter)
- Extracting structured data from LLM responses with Zod schema validation
- Implementing function calling / tool use with the Chat Completions or Responses API
- Creating embeddings for RAG pipelines or semantic search
- Processing images with vision models or audio with Whisper/TTS
- Running batch jobs for high-volume, cost-efficient processing
Key patterns covered:
- Client initialization and configuration (retries, timeouts, proxies)
- Chat Completions API (messages, streaming, function calling)
- Responses API (input, instructions, built-in tools, server-side state)
- Structured outputs with `zodResponseFormat` and `client.chat.completions.parse()`
- Streaming with `for await...of`, the `.stream()` helper, and event handling
- Embeddings API (`text-embedding-3-small`, `text-embedding-3-large`)
- Vision (image URLs, base64), Audio (Whisper transcription, TTS), Batch API
- Error handling, retries, timeouts, and production best practices
When NOT to use:
- Multi-provider applications where you need to switch between OpenAI, Anthropic, Google, etc. -- use a unified provider SDK instead
- React-specific chat UI hooks (`useChat`, `useCompletion`) -- use a framework-integrated AI SDK
- When you need a higher-level abstraction over multiple LLM providers
Examples Index
- Core: Setup & Configuration -- Client init, production config, Azure, error handling, request overrides
- Chat Completions -- Basic chat, multi-turn, token tracking, output length control
- Streaming -- `stream: true`, `.stream()` helper, Responses API streaming, abort
- Tool/Function Calling -- Manual tools, `zodFunction`, `runTools` automation, Responses API tools
- Structured Output -- `zodResponseFormat`, `zodTextFormat`, refusal handling
- Embeddings, Vision & Audio -- Semantic search, image analysis, transcription, TTS, batch processing
- Quick API Reference -- Model IDs, method signatures, error types, streaming events
<philosophy>
Philosophy
The official OpenAI SDK provides direct, low-level access to OpenAI's full API surface. It is the thinnest possible wrapper over the REST API, auto-generated from OpenAI's OpenAPI specification using Stainless.
Core principles:
- Direct API access -- No abstractions or provider layers. You get the exact API that OpenAI documents, with full TypeScript types. Every API feature is available immediately when OpenAI releases it.
- Two API paradigms -- The Responses API (`client.responses.create()`) is the newer, recommended API with built-in tools and server-side state. The Chat Completions API (`client.chat.completions.create()`) remains fully supported for stateless chat flows.
- Built-in resilience -- The SDK handles retries (2 by default on 429/5xx), timeouts (10 min default), and auto-pagination out of the box.
- Streaming as a first-class pattern -- Use `stream: true` for SSE-based streaming, the `.stream()` helper for event-based consumption, or `for await...of` for simple iteration.
- Type-safe structured outputs -- `zodResponseFormat()` and `client.chat.completions.parse()` convert Zod schemas to JSON Schema and parse responses, giving you validated, typed objects.
When to use the OpenAI SDK directly:
- You only use OpenAI models and want the simplest, most direct integration
- You need access to OpenAI-specific features (Responses API, Batch, Realtime)
- You want minimal dependencies and zero abstraction overhead
- You need the latest API features on day one
When NOT to use:
- You need to switch between providers (OpenAI, Anthropic, Google) -- use a unified provider SDK
- You want React-specific chat UI hooks -- use a framework-integrated AI SDK
- You want a higher-level agent framework -- consider the OpenAI Agents SDK (`@openai/agents`)
</philosophy>
<patterns>
Core Patterns
Pattern 1: Client Setup
Initialize the OpenAI client. It auto-reads `OPENAI_API_KEY` from the environment.
```ts
// lib/openai.ts -- basic setup
import OpenAI from "openai";

const client = new OpenAI();

export { client };
```

```ts
// lib/openai.ts -- production configuration
const TIMEOUT_MS = 30_000;
const MAX_RETRIES = 3;

const client = new OpenAI({ timeout: TIMEOUT_MS, maxRetries: MAX_RETRIES });
```
Why good: Minimal setup, env var auto-detected, named constants for production settings
See: examples/core.md for Azure OpenAI, per-request overrides, error handling patterns
Pattern 2: Chat Completions API
Stateless text generation. You manage conversation history.
```ts
const completion = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "developer", content: "You are a helpful coding assistant." },
    { role: "user", content: "Explain TypeScript generics." },
  ],
});

console.log(completion.choices[0].message.content);
```
Why good: Clear message roles, `developer` message for system instructions, direct content access
```ts
// BAD: No developer message, no error handling
const res = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "do something" }],
});
```
Why bad: No system instruction means unpredictable behavior, vague prompt
See: examples/chat.md for multi-turn, token tracking, output length control
Pattern 3: Responses API (Recommended for New Projects)
Newer API with built-in tools, server-side state, and better performance with reasoning models.
```ts
const response = await client.responses.create({
  model: "gpt-4o",
  instructions: "You are a coding assistant.",
  input: "What are TypeScript generics?",
});

console.log(response.output_text);
```
Why good: Clean separation of instructions and input, `output_text` helper, simpler than a messages array
```ts
// BAD: Using Chat Completions parameters with the Responses API
const response = await client.responses.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }], // WRONG: use 'input'
});
```
Why bad: The Responses API uses `input` and `instructions`, not `messages`
Built-in Tools
Web search (`{ type: "web_search_preview" }`), file search (`{ type: "file_search" }`), and code interpreter (`{ type: "code_interpreter" }`). Chain conversations with `previous_response_id` and `store: true`.
See: examples/tools.md for Responses API function calling with tool outputs
Pattern 4: Streaming
Use streaming for user-facing responses.
```ts
// Chat Completions -- stream: true with for-await
const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain async/await." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
```
```ts
// Event-based with the .stream() helper
const stream = client.chat.completions.stream({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Tell me a story." }],
});

stream.on("content", (delta) => process.stdout.write(delta));

const finalContent = await stream.finalContent();
```
Why good: Progressive output for better UX, event-based API for granular control
```ts
// BAD: Not consuming the stream
const stream = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
});
// Stream never consumed -- tokens are lost
```
Why bad: Stream must be consumed via iteration or event handlers, otherwise tokens are lost
See: examples/streaming.md for Responses API streaming, abort, stream methods
Pattern 5: Structured Outputs with Zod
Use `zodResponseFormat()` and `.parse()` for type-safe structured responses.
```ts
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";

const CalendarEvent = z.object({
  name: z.string(),
  date: z.string(),
  participants: z.array(z.string()),
});

const completion = await client.chat.completions.parse({
  model: "gpt-4o",
  messages: [
    { role: "developer", content: "Extract event details." },
    { role: "user", content: "Alice and Bob meet next Tuesday for lunch." },
  ],
  response_format: zodResponseFormat(CalendarEvent, "calendar_event"),
});

const event = completion.choices[0].message.parsed; // Fully typed
```
Why good: Auto-converts schema, validates output, fully typed result, handles refusals
See: examples/structured-output.md for the Responses API (`zodTextFormat`), refusal handling, complex schemas
Pattern 6: Function Calling / Tool Use
Define functions the model can call. Use `zodFunction()` for type-safe definitions.
```ts
import { zodFunction } from "openai/helpers/zod";
import { z } from "zod";

const GetWeatherParams = z.object({
  location: z.string().describe("City name"),
  unit: z.enum(["celsius", "fahrenheit"]).default("celsius"),
});

const completion = await client.chat.completions.parse({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Weather in Paris?" }],
  tools: [zodFunction({ name: "get_weather", parameters: GetWeatherParams })],
});

const toolCall = completion.choices[0].message.tool_calls?.[0];
if (toolCall?.type === "function") {
  console.log(toolCall.function.parsed_arguments); // Typed from Zod
}
```
Why good: `zodFunction` provides type-safe argument parsing, `.describe()` guides the model
Use `runTools()` for automated tool-execution loops that handle the call-respond cycle automatically.
See: examples/tools.md for `runTools`, manual tool definitions, Responses API function calling
Pattern 7: Embeddings, Vision & Audio
- Embeddings: `client.embeddings.create({ model: "text-embedding-3-small", input: [...] })` -- batch multiple inputs in one call
- Vision: Multi-part content array with `{ type: "image_url", image_url: { url } }` for URL or base64 images
- Audio: `client.audio.transcriptions.create()` for speech-to-text, `client.audio.speech.create()` for TTS
- Files: `client.files.create()` with `ReadStream`, `Buffer` (via `toFile`), or `fetch()` `Response`
- Batch API: Upload JSONL, create a batch with `client.batches.create()`, poll for completion at 50% cost
See: examples/embeddings-vision-audio.md for full examples with cosine similarity, base64 images, timestamps, TTS voice instructions, batch processing
Pattern 8: Error Handling
Always catch `OpenAI.APIError` and its subclasses. Re-throw unexpected errors.
```ts
try {
  const completion = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello" }],
  });
} catch (error) {
  if (error instanceof OpenAI.APIError) {
    console.error(
      `API Error [${error.status}]: ${error.message} (${error.request_id})`,
    );
    // Check subclasses: RateLimitError, AuthenticationError, BadRequestError, etc.
  } else {
    throw error; // Re-throw non-API errors
  }
}
```
Why good: Specific error types with status codes, request ID for debugging, re-throws unexpected errors
See: examples/core.md for full production error handling, stream errors, error type hierarchy
</patterns>
<performance>
Performance Optimization
Model Selection for Cost/Speed
```
General purpose           -> gpt-5.4 (most capable) or gpt-4o (proven, lower cost)
Cost-sensitive / high-vol -> gpt-5-mini or gpt-5-nano (cheapest)
Complex reasoning         -> gpt-5.4 or o4-mini
Structured output         -> gpt-5.4 or gpt-4o (best schema adherence)
Embeddings                -> text-embedding-3-small (cheapest) or text-embedding-3-large (highest quality)
Transcription             -> whisper-1 or gpt-4o-transcribe (higher accuracy)
TTS                       -> tts-1 (fast) or tts-1-hd (quality) or gpt-4o-mini-tts (voice control)
Batch processing          -> gpt-5-mini at 50% batch discount
```
Key Optimization Patterns
- Track token usage via `completion.usage` for cost visibility
- Check `finish_reason === "length"` to detect truncated output
- Use `temperature: 0` for deterministic output (enables caching)
- Use `AbortController` to cancel long-running requests
- Use the Batch API for high-volume jobs at 50% cost reduction
</performance>
<decision_framework>
Decision Framework
Which API to Use
```
Building a new application?
+-- YES -> Need built-in tools (web search, file search, code interpreter)?
|   +-- YES -> Use Responses API (client.responses.create())
|   +-- NO -> Need server-side conversation state?
|       +-- YES -> Use Responses API with store: true
|       +-- NO -> Either API works; prefer Responses for new code
+-- Existing Chat Completions code?
    +-- Working fine? -> Keep using Chat Completions (fully supported)
    +-- Need new features? -> Consider migrating to Responses API
```
Which Model to Choose
```
What is your task?
+-- General text generation -> gpt-5.4 (most capable) or gpt-4o (lower cost)
+-- Fast + cheap simple tasks -> gpt-5-mini or gpt-5-nano
+-- Complex reasoning / math -> gpt-5.4 or o4-mini
+-- Structured output -> gpt-5.4 or gpt-4o (best schema adherence)
+-- Vision (images) -> gpt-5.4 or gpt-4o
+-- Embeddings -> text-embedding-3-small (default) or text-embedding-3-large
+-- Transcription -> whisper-1 or gpt-4o-transcribe
+-- Text-to-speech -> tts-1 (fast) or gpt-4o-mini-tts (voice instructions)
+-- Batch processing -> gpt-5-mini (cheapest at 50% batch discount)
```
Streaming vs Non-Streaming
```
Is the response user-facing?
+-- YES -> Use streaming (stream: true or .stream())
|   +-- Need event-level control? -> .stream() with event handlers
|   +-- Simple text output? -> stream: true with for await
+-- NO -> Use non-streaming
    +-- Background processing -> client.chat.completions.create()
    +-- Structured output -> client.chat.completions.parse()
    +-- High volume -> Batch API
```
When to Use This SDK vs a Provider-Agnostic SDK
```
Do you need multiple LLM providers (OpenAI + others)?
+-- YES -> Not this skill's scope -- use a unified provider SDK
+-- NO -> Do you need OpenAI-specific features?
    +-- YES -> Use OpenAI SDK directly
    |   Examples: Responses API, Batch API,
    |   Realtime API, built-in web search/file search
    +-- NO -> OpenAI SDK is simplest for OpenAI-only use
```
</decision_framework>
<red_flags>
RED FLAGS
High Priority Issues:
- Hardcoding API keys instead of using environment variables (security breach risk)
- Using bare `catch` blocks without checking `OpenAI.APIError` (hides API errors)
- Not consuming streams returned by `stream: true` (tokens are silently lost)
- Using `JSON.parse()` on completion content without `zodResponseFormat` (fragile, no validation)
- Sending full conversation history on every request when the Responses API's `previous_response_id` could manage state
Medium Priority Issues:
- Not setting `maxRetries` / `timeout` for production deployments (the 10 min default timeout may be too long)
- Missing a `developer` role message (no system instruction = unpredictable output style)
- Using the deprecated `system` role instead of the `developer` role in Chat Completions
- Not checking `finish_reason` for `'length'` truncation
- Ignoring `usage` data (no cost visibility)
Common Mistakes:
- Confusing Responses API (`client.responses.create()`) with Chat Completions (`client.chat.completions.create()`) parameters -- they use different shapes
- Using the `messages` parameter with the Responses API (it uses `input` and `instructions`)
- Using `response_format` with models that don't support structured outputs (need gpt-4o or later)
- Using `max_tokens` with reasoning models (o4-mini, gpt-5.x) -- use `max_completion_tokens` instead
- Not handling the case where `completion.choices[0].message.tool_calls` is undefined
- Forgetting that `runTools()` defaults to a maximum of 10 completions -- set `maxChatCompletions` explicitly
Gotchas & Edge Cases:
- The SDK auto-retries on 429 (rate limit) and 5xx errors -- 2 retries by default. Disable with `maxRetries: 0` if you handle retries yourself.
- `stream: true` returns raw SSE chunks. Use the `.stream()` helper for a nicer event-based API.
- `client.chat.completions.parse()` throws `LengthFinishReasonError` if `finish_reason` is `'length'` and `ContentFilterFinishReasonError` if it is `'content_filter'`.
- Embedding responses return `Array<number>` (the SDK requests base64 by default and decodes via Float32 internally for performance). No conversion needed -- you get a plain number array.
- File uploads support `ReadStream`, `File`, `fetch()` `Response`, and the `toFile()` helper -- use whichever matches your data source.
- The Responses API's `store: true` enables server-side state but also means OpenAI stores your conversations. Set `store: false` for sensitive data.
- The `developer` role replaces the `system` role in newer models (gpt-4o and later).
- The Batch API has a 24h completion window and a 50,000-request limit per batch.
- Audio transcription has a 25 MB file size limit.
- Zod schemas with `zodResponseFormat` must use `additionalProperties: false` -- the SDK handles this automatically.
- `zodTextFormat` and `zodResponseFormat` are NOT compatible with Zod v4 -- use Zod v3.x until the SDK adds v4 support.
- The Assistants API is deprecated (sunset August 2026) -- use the Responses API for new code.
</red_flags>
<critical_reminders>
CRITICAL REMINDERS
All code must follow project conventions in CLAUDE.md (kebab-case, named exports, import ordering, `import type`, named constants)
(You MUST use the Responses API (`client.responses.create()`) for new projects -- it provides better performance, built-in tools, and server-side conversation state)
(You MUST use `zodResponseFormat()` from `openai/helpers/zod` for structured outputs -- do NOT manually construct JSON schemas)
(You MUST handle errors using `OpenAI.APIError` and its subclasses -- never use bare catch blocks without error type checking)
(You MUST configure appropriate retries and timeouts for production use -- the SDK retries 2 times by default on 429/5xx errors)
(You MUST never hardcode API keys -- always use environment variables via `process.env.OPENAI_API_KEY`)
Failure to follow these rules will produce insecure, unreliable, or poorly-typed AI integrations.
</critical_reminders>