Claude-skill-registry gemini-api-guides
install

source · Clone the upstream repo:

```shell
git clone https://github.com/majiayu000/claude-skill-registry
```

Claude Code · Install into ~/.claude/skills/:

```shell
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/skills/data/gemini-api-guides" ~/.claude/skills/majiayu000-claude-skill-registry-gemini-api-guides \
  && rm -rf "$T"
```
manifest:
skills/data/gemini-api-guides/SKILL.md
Gemini API Skill
Build AI applications with Google's Gemini models and tools.
Quick Start
Installation
```shell
# Python
pip install google-genai

# JavaScript/Node.js
npm install @google/genai

# Go
go get google.golang.org/genai
```
Environment Setup
```shell
export GEMINI_API_KEY="your-api-key"
```
Basic Usage
Python:
```python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Your prompt here"
)
print(response.text)
```
JavaScript:
```javascript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({});
const response = await ai.models.generateContent({
  model: "gemini-2.5-flash",
  contents: "Your prompt here"
});
console.log(response.text);
```
REST:
```shell
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"contents": [{"parts": [{"text": "Your prompt here"}]}]}'
```
Model Selection
| Model | Best For | Context Window |
|---|---|---|
| Gemini 3 Pro | Most intelligent tasks, multimodal reasoning, agentic | See models-overview |
| Gemini 2.5 Pro | Complex reasoning, coding, extended thinking | 1M tokens |
| Gemini 2.5 Flash | Balanced performance, general tasks | 1M tokens |
| Gemini 2.5 Flash-Lite | High-volume, cost-sensitive, fastest | See models-overview |
| Imagen | High-fidelity image generation | N/A |
| Veo 3.1 | Video generation (8s, 720p/1080p with audio) | N/A |
| Nano Banana | Native image gen with Gemini 2.5 Flash | N/A |
| Nano Banana Pro | Native image gen with Gemini 3 Pro | N/A |
Reference Documentation Index
Getting Started
| Topic | File | Description |
|---|---|---|
| Setup & Libraries | getting-started.md | API keys, SDK installation, OpenAI compatibility |
Models & Pricing
| Topic | File | Description |
|---|---|---|
| Model Overview | models-overview.md | All models, capabilities, context windows |
| Pricing | api-pricing.md | Token costs, tool pricing |
| Rate Limits | rate-limits.md | RPM/TPM limits, quotas |
| Gemini 3 Guide | gemini-3.md | Gemini 3 specific features and best practices |
| Imagen | imagen.md | Image generation with Imagen model |
| Embeddings | embeddings.md | Text embeddings for search/RAG |
| Veo | veo.md | Video generation with Veo 3.1 (69K) |
| Lyria | lyria.md | Music generation with Lyria RealTime |
| Robotics | robotics.md | Gemini Robotics-ER 1.5 (42K) |
Core Capabilities
| Topic | File | Description |
|---|---|---|
| Text Generation | text-generation.md | Text generation, system instructions (38K) |
| Image Gen (Nano Banana) | image-generation-gemini.md | Native image generation with Gemini (LARGE: 174K) |
| Image Understanding | image-understanding.md | Vision, image analysis |
| Video Understanding | video-understanding.md | Video analysis, timestamps |
| Document Understanding | document-understanding.md | PDF and document processing |
| Speech Generation | speech-generation.md | Text-to-speech (TTS) |
| Audio Understanding | audio-understanding.md | Audio analysis, transcription |
Advanced Features
| Topic | File | Description |
|---|---|---|
| Thinking Mode | thinking.md | Extended reasoning capabilities |
| Thought Signatures | thought-signatures.md | EDGE CASE ONLY: Manual signature handling when NOT using official SDKs |
| Structured Outputs | structured-outputs.md | JSON schema responses |
| Function Calling | function-calling.md | Custom tool integration (54K) |
| Long Context | long-context.md | 1M+ token handling, context caching |
Tools
| Topic | File | Description |
|---|---|---|
| Tools Overview | tools-overview.md | Built-in tools summary, agent frameworks |
| Google Search | google-search.md | Web search grounding |
| Google Maps | google-maps.md | Location-aware grounding |
| Code Execution | code-execution.md | Python code execution tool |
| URL Context | url-context.md | URL content extraction |
| Computer Use | computer-use.md | Browser automation (preview) (44K) |
| File Search | file-search.md | RAG with document indexing |
Live API (Real-time Streaming)
| Topic | File | Description |
|---|---|---|
| Getting Started | live-api-getting-started.md | Low-latency voice/video interactions |
| Capabilities Guide | live-api-capabilities.md | Full capabilities and configurations (32K) |
| Tool Use | live-api-tools.md | Function calling & Search in Live API |
| Session Management | live-api-sessions.md | Session handling, time limits |
| Ephemeral Tokens | ephemeral-tokens.md | Short-lived auth for client-side WebSockets |
Guides
| Topic | File | Description |
|---|---|---|
| Batch API | batch-api.md | Async processing at 50% cost (47K) |
| Files API | files-api.md | Upload and manage media files (49K) |
| Context Caching | context-caching.md | Implicit & explicit caching for cost savings |
| Media Resolution | media-resolution.md | Control token allocation for media |
| Tokens | tokens.md | Understand and count tokens |
| Prompt Design | prompt-design.md | Prompt strategies and best practices (47K) |
| Logs & Datasets | logs-datasets.md | Enable logging, view in AI Studio |
| Data Logging & Sharing | data-logging-sharing.md | Storage and management of API logs |
| Safety Settings | safety-settings.md | Adjust safety filters |
| Safety Guidance | safety-guidance.md | Best practices for safe AI use |
Troubleshooting & Migration
| Topic | File | Description |
|---|---|---|
| Troubleshooting | troubleshooting.md | Diagnose and resolve common API issues (25K) |
| Vertex AI Comparison | vertex-ai-comparison.md | READ ONLY IF USER MENTIONS "VERTEX AI": Gemini Developer API vs Vertex AI differences |
Large Files - Search Patterns
For large reference files (>30K), use grep to find specific sections:
image-generation-gemini.md (174K):
```shell
grep -n "## " references/image-generation-gemini.md    # List sections
grep -n "edit" references/image-generation-gemini.md   # Find editing info
grep -n "style" references/image-generation-gemini.md  # Find style transfer
```
veo.md (69K):
```shell
grep -n "## " references/veo.md     # List sections
grep -n "audio" references/veo.md   # Find audio generation info
```
models-overview.md (67K):
```shell
grep -n "gemini-3" references/models-overview.md
grep -n "context" references/models-overview.md
```
function-calling.md (54K):
```shell
grep -n "## " references/function-calling.md
grep -n "parallel" references/function-calling.md   # Parallel function calls
```
Common Patterns
Multimodal Input (Image + Text)
```python
from google import genai
from google.genai import types

client = genai.Client()

# Read the image as raw bytes; the SDK has no Part.from_image helper.
with open("image.jpg", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        types.Part.from_text(text="Describe this image"),
    ],
)
```
Function Calling
```python
tools = [
    types.Tool(function_declarations=[{
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }])
]

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What's the weather in Paris?",
    config=types.GenerateContentConfig(tools=tools),
)
```
Google Search Grounding
```python
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="What are the latest AI developments?",
    config=types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    ),
)
```
Thinking Mode
```python
response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Solve this complex problem...",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=10000)
    ),
)
```
Streaming
```python
for chunk in client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Write a story"
):
    print(chunk.text, end="")
```
Key Concepts
Tool Execution Flow
Built-in tools (Google Search, Code Execution): Executed by Google
- Send prompt with tool config → Model executes tool → Response with grounded results
Custom tools (Function Calling): You execute
- Send prompt with function declarations → Model returns function call JSON
- You execute function, send result back → Model generates final response
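The custom-tool round trip can be sketched in plain Python. This is a hedged sketch: `get_weather` and the `TOOLS` dispatch table are hypothetical stand-ins, and the model's function call is mocked as a dict; with the real SDK the call arrives in the response's function-call parts, and the result goes back to the model as a function-response part on the next turn.

```python
# Hypothetical local tool matching the function declaration sent to the model.
def get_weather(location: str) -> dict:
    # Stub: a real implementation would query a weather service.
    return {"location": location, "forecast": "sunny", "temp_c": 21}

# Dispatch table mapping declared function names to implementations.
TOOLS = {"get_weather": get_weather}

def dispatch(function_call: dict) -> dict:
    """Execute the function the model asked for and package the result
    in the shape expected for the follow-up function-response turn."""
    fn = TOOLS[function_call["name"]]
    result = fn(**function_call["args"])
    return {"name": function_call["name"], "response": result}

# Simulated model output for "What's the weather in Paris?"
call = {"name": "get_weather", "args": {"location": "Paris"}}
print(dispatch(call))
```

The key design point: the model never runs your code; it only emits a structured request, and your dispatch layer decides what actually executes.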
Thought Signatures (Important)
- If using official SDKs with chat feature: Thought signatures are handled automatically. No action needed.
- If manually managing conversation history: Read thought-signatures.md for Gemini 3 Pro function calling requirements.
API Endpoints
| Endpoint | Purpose |
|---|---|
| `models/{model}:generateContent` | Standard generation |
| `models/{model}:streamGenerateContent` | Streaming |
| `models/{model}:embedContent` | Embeddings |
| `models/{model}:countTokens` | Token counting |
Base URL:
https://generativelanguage.googleapis.com
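Putting the base URL and a method name together gives the full request URL. A minimal sketch (the `endpoint` helper is illustrative, not part of any SDK; `v1beta` is the API version used in the curl example above):

```python
BASE_URL = "https://generativelanguage.googleapis.com"

def endpoint(model: str, method: str, version: str = "v1beta") -> str:
    """Build the full REST URL for a model method, e.g. generateContent."""
    return f"{BASE_URL}/{version}/models/{model}:{method}"

print(endpoint("gemini-2.5-flash", "generateContent"))
# https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent
```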