Skills vertex-ai-gemini
install
source · Clone the upstream repo
```bash
git clone https://github.com/TerminalSkills/skills
```
Claude Code · Install into ~/.claude/skills/
```bash
T=$(mktemp -d) &&
git clone --depth=1 https://github.com/TerminalSkills/skills "$T" &&
mkdir -p ~/.claude/skills &&
cp -r "$T/skills/vertex-ai-gemini" ~/.claude/skills/terminalskills-skills-vertex-ai-gemini &&
rm -rf "$T"
```
manifest:
skills/vertex-ai-gemini/SKILL.md
safety · automated scan (low risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
- pip install
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content
Vertex AI — Gemini on Google Cloud
Overview
Vertex AI is Google Cloud's enterprise ML platform. It provides access to the same Gemini models as Google AI Studio, but with enterprise-grade features: IAM-based auth (no API keys), VPC Service Controls for data isolation, audit logging, fine-tuning capabilities, batch prediction jobs, and integration with GCP data services like BigQuery and Cloud Storage.
Vertex AI vs Google AI Studio
| Feature | Google AI Studio | Vertex AI |
|---|---|---|
| Auth | API Key | Service Account / IAM |
| Data residency | Limited | GCP regions |
| VPC isolation | ❌ | ✅ |
| Audit logging | ❌ | ✅ Cloud Audit Logs |
| Fine-tuning | ❌ | ✅ |
| Batch prediction | ❌ | ✅ |
| Pricing | Per token | Per token (different rates) |
| Quotas | Shared | Project-level quotas |
Setup
```bash
pip install google-cloud-aiplatform

# Authenticate
gcloud auth application-default login
# Or use a service account
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Set project and location
export GOOGLE_CLOUD_PROJECT=my-project-id
export GOOGLE_CLOUD_LOCATION=us-central1
```
Instructions
Basic Gemini Inference
```python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project-id", location="us-central1")

model = GenerativeModel("gemini-2.0-flash-001")
response = model.generate_content("Explain containerization in simple terms.")
print(response.text)
```
Multi-Modal Inference
```python
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project-id", location="us-central1")
model = GenerativeModel("gemini-2.0-flash-001")

# Analyze an image stored in Cloud Storage
gcs_image = Part.from_uri(
    uri="gs://my-bucket/product-photo.jpg",
    mime_type="image/jpeg",
)
response = model.generate_content(["Describe this product:", gcs_image])
print(response.text)

# Analyze a local image
with open("chart.png", "rb") as f:
    image_data = f.read()
local_image = Part.from_data(data=image_data, mime_type="image/png")
response = model.generate_content(["What trends does this chart show?", local_image])
print(response.text)
```
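When loading local files for `Part.from_data`, the MIME type can be derived from the filename with the standard library instead of being hard-coded. `guess_part_mime_type` below is a hypothetical helper, not part of the Vertex SDK:

```python
import mimetypes

def guess_part_mime_type(path: str) -> str:
    """Return the MIME type for a local file, suitable for Part.from_data."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Cannot determine MIME type for {path}")
    return mime_type

print(guess_part_mime_type("chart.png"))   # image/png
print(guess_part_mime_type("photo.jpeg"))  # image/jpeg
```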
Streaming Responses
```python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project-id", location="us-central1")
model = GenerativeModel("gemini-2.0-flash-001")

for chunk in model.generate_content(
    "Write a product description for a smartwatch.", stream=True
):
    print(chunk.text, end="", flush=True)
print()
```
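If you need the complete text as well as incremental display, the streamed chunks can be accumulated as they arrive. The sketch below substitutes simple stub objects for real response chunks so the pattern is visible without an API call:

```python
from types import SimpleNamespace

def stream_and_collect(chunks):
    """Print each chunk's text as it arrives and return the full response text."""
    parts = []
    for chunk in chunks:
        print(chunk.text, end="", flush=True)
        parts.append(chunk.text)
    print()
    return "".join(parts)

# Stubs shaped like streaming response chunks (each exposes a .text attribute)
fake_chunks = [SimpleNamespace(text="A sleek "), SimpleNamespace(text="smartwatch.")]
full_text = stream_and_collect(fake_chunks)  # prints "A sleek smartwatch."
```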
Chat Session
```python
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project-id", location="us-central1")
model = GenerativeModel(
    model_name="gemini-2.0-flash-001",
    system_instruction="You are a GCP expert. Provide concise, actionable answers.",
)

chat = model.start_chat()
print(chat.send_message("How do I set up Cloud Run?").text)
print(chat.send_message("What about environment variables?").text)
```
Function Calling
```python
import vertexai
from vertexai.generative_models import (
    FunctionDeclaration,
    GenerativeModel,
    Tool,
)

vertexai.init(project="my-project-id", location="us-central1")

get_bq_query = FunctionDeclaration(
    name="run_bigquery_query",
    description="Run a SQL query on BigQuery and return results",
    parameters={
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "SQL query to execute"},
            "dataset": {"type": "string", "description": "BigQuery dataset name"},
        },
        "required": ["query"],
    },
)

tool = Tool(function_declarations=[get_bq_query])
model = GenerativeModel("gemini-2.0-flash-001", tools=[tool])

response = model.generate_content("How many users signed up last week?")
if response.candidates[0].function_calls:
    fc = response.candidates[0].function_calls[0]
    print(f"Function: {fc.name}, Args: {dict(fc.args)}")
```
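The function call the model returns still has to be executed by your own code. One common pattern is a small dispatch table mapping function names to handlers; everything here (`handlers`, the stub `run_bigquery_query`, the `SimpleNamespace` stand-in for a real function call) is illustrative, not part of the SDK:

```python
from types import SimpleNamespace

def run_bigquery_query(query, dataset=None):
    # Illustrative stand-in; a real handler would call the BigQuery client.
    return {"rows": 42, "query": query}

handlers = {"run_bigquery_query": run_bigquery_query}

def dispatch(function_call):
    """Look up the named handler and invoke it with the model-supplied args."""
    handler = handlers[function_call.name]
    return handler(**dict(function_call.args))

# Stub shaped like a Gemini function call (has .name and .args)
fc = SimpleNamespace(
    name="run_bigquery_query",
    args={"query": "SELECT COUNT(*) FROM users WHERE signup_date > '2025-01-01'"},
)
result = dispatch(fc)
print(result["rows"])  # 42
```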
Fine-Tuning Gemini
```python
import vertexai
from vertexai.tuning import sft

vertexai.init(project="my-project-id", location="us-central1")

# Prepare training data in JSONL format in GCS:
# {"messages": [{"role": "user", "content": "..."}, {"role": "model", "content": "..."}]}
tuning_job = sft.train(
    source_model="gemini-2.0-flash-001",
    train_dataset="gs://my-bucket/training-data.jsonl",
    validation_dataset="gs://my-bucket/validation-data.jsonl",
    tuned_model_display_name="my-fine-tuned-gemini",
    epochs=3,
    learning_rate_multiplier=1.0,
)
print(f"Tuning job: {tuning_job.resource_name}")
print(f"State: {tuning_job.state}")

# Wait for completion
tuning_job.wait()
print(f"Tuned model: {tuning_job.tuned_model_name}")
```
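The training file can be generated programmatically. This sketch writes (prompt, response) pairs in the `messages` JSONL format shown above; `write_training_jsonl` is a hypothetical helper, and the resulting file still has to be uploaded to GCS:

```python
import json

def write_training_jsonl(pairs, path):
    """Write (user_text, model_text) pairs as one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for user_text, model_text in pairs:
            record = {
                "messages": [
                    {"role": "user", "content": user_text},
                    {"role": "model", "content": model_text},
                ]
            }
            f.write(json.dumps(record) + "\n")

pairs = [
    ("What is Cloud Run?", "A managed serverless container platform on GCP."),
    ("What is BigQuery?", "GCP's serverless data warehouse."),
]
write_training_jsonl(pairs, "training-data.jsonl")
```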
Batch Prediction
```python
import vertexai
from vertexai.preview.batch_prediction import BatchPredictionJob

vertexai.init(project="my-project-id", location="us-central1")

# Input JSONL format in GCS:
# {"request": {"contents": [{"role": "user", "parts": [{"text": "Translate: Hello"}]}]}}
job = BatchPredictionJob.submit(
    source_model="gemini-2.0-flash-001",
    input_dataset="gs://my-bucket/batch-inputs.jsonl",
    output_uri_prefix="gs://my-bucket/batch-outputs/",
)
print(f"Batch job: {job.resource_name}")

job.wait()
print(f"Output: {job.output_location}")
```
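The batch input file can likewise be built from a plain list of prompts using the `request` format above. `write_batch_inputs` is a hypothetical helper; the file still has to be uploaded to the GCS bucket referenced in `input_dataset`:

```python
import json

def write_batch_inputs(prompts, path):
    """Write one Gemini batch request per line in the documented JSONL format."""
    with open(path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            request = {
                "request": {
                    "contents": [{"role": "user", "parts": [{"text": prompt}]}]
                }
            }
            f.write(json.dumps(request) + "\n")

write_batch_inputs(["Translate: Hello", "Translate: Goodbye"], "batch-inputs.jsonl")
```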
IAM Setup for Service Account
```bash
# Create a service account for your app
gcloud iam service-accounts create gemini-app-sa \
  --display-name="Gemini App Service Account"

# Grant the Vertex AI User role
gcloud projects add-iam-policy-binding my-project-id \
  --member="serviceAccount:gemini-app-sa@my-project-id.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# Download a key (only for non-GCP environments)
gcloud iam service-accounts keys create key.json \
  --iam-account=gemini-app-sa@my-project-id.iam.gserviceaccount.com
```
VPC Service Controls (Enterprise Isolation)
```python
# When VPC SC is enabled, all API calls must originate from within the perimeter.
# Configure the SDK to use the regional endpoint:
import vertexai

vertexai.init(
    project="my-project-id",
    location="us-central1",
    api_endpoint="us-central1-aiplatform.googleapis.com",  # regional endpoint
)
```
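Regional endpoints follow a `{location}-aiplatform.googleapis.com` pattern, so the endpoint can be derived from the location rather than hard-coded; `regional_endpoint` is an illustrative helper built on that pattern:

```python
def regional_endpoint(location: str) -> str:
    """Build the regional Vertex AI API endpoint for a given GCP location."""
    return f"{location}-aiplatform.googleapis.com"

print(regional_endpoint("us-central1"))   # us-central1-aiplatform.googleapis.com
print(regional_endpoint("europe-west4"))  # europe-west4-aiplatform.googleapis.com
```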
Available Gemini Models on Vertex AI
| Model ID | Notes |
|---|---|
| `gemini-2.0-flash-001` | Latest Flash, fast + capable |
| `gemini-1.5-pro-002` | 2M context, most capable |
| `gemini-1.5-flash-002` | 1M context, balanced |
| `text-embedding-005` | Latest embeddings (768 dims) |
Use `gemini-2.0-flash-001` (version pinned) in production to avoid unexpected model changes.
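A pinned model ID carries an explicit numeric version suffix, so pinning can be checked mechanically. The `-NNN` suffix convention below is an assumption generalized from IDs like `gemini-2.0-flash-001`, and `is_version_pinned` is an illustrative helper:

```python
import re

def is_version_pinned(model_id: str) -> bool:
    """True if the model ID ends with an explicit numeric version like -001."""
    return re.search(r"-\d{3}$", model_id) is not None

print(is_version_pinned("gemini-2.0-flash-001"))  # True
print(is_version_pinned("gemini-2.0-flash"))      # False
```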
Guidelines
- Always pin model versions (e.g., `gemini-2.0-flash-001`, not `gemini-2.0-flash`) in production for stability.
- Use Application Default Credentials (`gcloud auth application-default login`) during development.
- In GKE or Cloud Run, use Workload Identity; no service account keys are needed.
- Fine-tuning requires a training JSONL in the `messages` format with at least 100 examples.
- Batch prediction is cost-effective for offline bulk inference (no streaming).
- Enable Cloud Audit Logs on the `aiplatform.googleapis.com` service for compliance.
- Vertex AI supports regional endpoints; choose a region to ensure data residency compliance.