Asi vertex-ai

Google Vertex AI via gcloud OAuth2. Call Gemini, Imagen, embeddings, and Model Garden models from the CLI. Use when users need generative AI, image generation, embeddings, or model inference.

install
source · Clone the upstream repo
git clone https://github.com/plurigrid/asi
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/vertex-ai" ~/.claude/skills/plurigrid-asi-vertex-ai && rm -rf "$T"
manifest: skills/vertex-ai/SKILL.md

Vertex AI Skill

Call Google Vertex AI models from the CLI using gcloud OAuth2 authentication.

Prerequisites

  • gcloud CLI installed (via flox install google-cloud-sdk or standalone)
  • Authenticated: gcloud auth login
  • Project set: gcloud config set project PROJECT_ID
  • Vertex AI API enabled on the project

Authentication

Vertex AI requires OAuth2, not API keys. Get a bearer token:

ACCESS_TOKEN=$(gcloud auth print-access-token)

Tokens expire after about 60 minutes. Re-run the command to get a fresh one.
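
Because the token is short-lived, a long-running script may want to refresh it lazily rather than on every call. A minimal sketch, assuming a hypothetical helper named refresh_token and an assumed 50-minute safety margin (neither is part of gcloud):

```shell
# Hypothetical helper: re-fetch ACCESS_TOKEN only when it is missing or older
# than TOKEN_MAX_AGE seconds. The 3000 s (50 min) margin is an assumption.
TOKEN_MAX_AGE=3000
TOKEN_FETCHED_AT=0

refresh_token() {
  local now
  now=$(date +%s)
  if [ -z "${ACCESS_TOKEN:-}" ] || [ $(( now - TOKEN_FETCHED_AT )) -ge "$TOKEN_MAX_AGE" ]; then
    ACCESS_TOKEN=$(gcloud auth print-access-token) || return 1
    TOKEN_FETCHED_AT=$now
  fi
}
```

Call refresh_token directly (not inside $(...), which would discard the cached values) before each request, then pass "$ACCESS_TOKEN" in the Authorization header.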

Core Pattern

All Vertex AI calls follow this structure:

PROJECT=$(gcloud config get project)
REGION=us-central1
ACCESS_TOKEN=$(gcloud auth print-access-token)
# MODEL, METHOD, and PAYLOAD vary per call; the sections below give concrete values.

curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/${MODEL}:${METHOD}" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD"
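
Since only the model and method change between calls, the URL can be assembled in one place. A small sketch, where vertex_url is a hypothetical name introduced here for illustration:

```shell
# Hypothetical helper: build the publisher-model endpoint URL from its parts.
# Usage: vertex_url PROJECT REGION MODEL METHOD
vertex_url() {
  local project="$1" region="$2" model="$3" method="$4"
  printf 'https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:%s\n' \
    "$region" "$project" "$region" "$model" "$method"
}
```

For example, curl -s "$(vertex_url "$PROJECT" "$REGION" gemini-2.0-flash generateContent)" followed by the same headers and -d payload as in the pattern above.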

Gemini — Text Generation

MODEL=gemini-2.0-flash
curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/${MODEL}:generateContent" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"role": "user", "parts": [{"text": "Your prompt here"}]}]
  }' | jq -r '.candidates[0].content.parts[0].text'

CRITICAL: Always include "role": "user" in contents; omitting it causes a 400 error.

Available Gemini Models

Model              Use Case
gemini-2.0-flash   Fast, general-purpose (recommended default)
gemini-2.0-pro     Complex reasoning, longer context
gemini-1.5-pro     1M token context window
gemini-1.5-flash   Fast, cost-effective

Generation Config

{
  "contents": [{"role": "user", "parts": [{"text": "prompt"}]}],
  "generationConfig": {
    "temperature": 0.7,
    "topP": 0.95,
    "topK": 40,
    "maxOutputTokens": 2048,
    "candidateCount": 1
  }
}
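
Hand-writing this JSON inside shell quotes gets fragile once the prompt contains quotes or newlines. One option is to let jq assemble it. A sketch, where gen_payload is a hypothetical name and the defaults simply mirror the values shown above:

```shell
# Hypothetical helper: build a generateContent payload with a generationConfig.
# Usage: gen_payload PROMPT [TEMPERATURE] [MAX_OUTPUT_TOKENS]
gen_payload() {
  jq -n --arg prompt "$1" \
        --argjson temperature "${2:-0.7}" \
        --argjson max_tokens "${3:-2048}" \
    '{
      contents: [{role: "user", parts: [{text: $prompt}]}],
      generationConfig: {temperature: $temperature, maxOutputTokens: $max_tokens}
    }'
}
```

The result can be passed straight to curl, for example -d "$(gen_payload "Summarize this text" 0.2 256)".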

System Instructions

{
  "contents": [{"role": "user", "parts": [{"text": "prompt"}]}],
  "systemInstruction": {
    "parts": [{"text": "You are a helpful coding assistant."}]
  }
}
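
A system instruction can also be attached to an existing payload instead of being written out by hand. A minimal sketch using jq, with with_system as a hypothetical filter name:

```shell
# Hypothetical filter: read a payload on stdin and attach a systemInstruction.
# Usage: ... | with_system "INSTRUCTION TEXT"
with_system() {
  jq --arg sys "$1" '.systemInstruction = {parts: [{text: $sys}]}'
}
```

For example: echo '{"contents":[{"role":"user","parts":[{"text":"prompt"}]}]}' | with_system "You are a helpful coding assistant."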

Gemini — Multimodal (Image + Text)

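# Base64-encode an image onto a single line (GNU base64 wraps at 76 columns,
# and the embedded newlines would break the JSON string)
IMG_B64=$(base64 < image.png | tr -d '\n')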

curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/gemini-2.0-flash:generateContent" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"contents\": [{\"role\": \"user\", \"parts\": [
      {\"text\": \"Describe this image\"},
      {\"inlineData\": {\"mimeType\": \"image/png\", \"data\": \"$IMG_B64\"}}
    ]}]
  }" | jq -r '.candidates[0].content.parts[0].text'
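
Inlining a large base64 string in the -d argument can run into the shell's argument-length limit. One way around that is to stream the encoded image through jq into a payload file. This is a sketch, and image_payload is a hypothetical helper name:

```shell
# Hypothetical helper: print a multimodal payload for one image.
# The base64 data travels through stdin, so it never becomes a shell argument.
# Usage: image_payload FILE MIME_TYPE PROMPT > payload.json
image_payload() {
  base64 < "$1" | tr -d '\n' |
    jq -Rs --arg mime "$2" --arg prompt "$3" '{
      contents: [{role: "user", parts: [
        {text: $prompt},
        {inlineData: {mimeType: $mime, data: .}}
      ]}]
    }'
}
```

After image_payload image.png image/png "Describe this image" > payload.json, send the file with curl's --data-binary @payload.json in place of the inline -d string.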

Embeddings

curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/text-embedding-005:predict" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [{"content": "Text to embed"}]
  }' | jq '.predictions[0].embeddings.values[:5]'
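
The instances array can carry several texts in one request. A sketch that builds such a batch from lines on stdin, assuming the batch stays within the model's per-request input limit (embed_payload is a hypothetical name):

```shell
# Hypothetical helper: turn non-empty stdin lines into one batched request body.
# Usage: embed_payload < texts.txt > payload.json
embed_payload() {
  jq -Rn '{instances: [inputs | select(length > 0) | {content: .}]}'
}
```

Each input line then corresponds to one entry in the response, readable with jq '.predictions[].embeddings.values'.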

Imagen — Image Generation

curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/imagen-3.0-generate-002:predict" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [{"prompt": "A cat wearing a space helmet, digital art"}],
    "parameters": {"sampleCount": 1}
  }' | jq -r '.predictions[0].bytesBase64Encoded' | base64 -d > output.png
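
With sampleCount above 1 the response holds several predictions, and the one-liner above keeps only the first. A sketch that writes each image to its own file, with save_images as a hypothetical helper and the PREFIX-N.png naming as an assumption:

```shell
# Hypothetical helper: read an Imagen response on stdin and decode every
# prediction into PREFIX-1.png, PREFIX-2.png, ...
# Usage: curl ... | save_images PREFIX
save_images() {
  local prefix="${1:-output}"
  local i=0
  jq -r '.predictions[]?.bytesBase64Encoded // empty' |
    while IFS= read -r b64; do
      i=$(( i + 1 ))
      printf '%s' "$b64" | base64 -d > "${prefix}-${i}.png"
    done
}
```

Predictions without image bytes are skipped, so the number of files can be smaller than sampleCount.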

Streaming

For streaming responses, use streamGenerateContent:

curl -s "https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT}/locations/${REGION}/publishers/google/models/gemini-2.0-flash:streamGenerateContent" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"contents": [{"role": "user", "parts": [{"text": "Write a haiku"}]}]}'
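
The streamed chunks still need to be stitched into text. A sketch under one assumption: the request URL ends in :streamGenerateContent?alt=sse, so each chunk arrives as a "data: {json}" line. The name sse_text is hypothetical:

```shell
# Hypothetical filter: extract and join the text pieces from a server-sent-event
# stream. Assumes each chunk is a line of the form "data: {json}".
sse_text() {
  sed -n 's/^data: //p' |
    jq -rj '.candidates[0].content.parts[0].text // empty'
  echo   # finish the joined text with a newline
}
```

Typical use would be curl -sN "...:streamGenerateContent?alt=sse" with the same headers and payload, piped into sse_text. The -N flag turns off curl's output buffering.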

Multi-turn Conversation

{
  "contents": [
    {"role": "user", "parts": [{"text": "What is the capital of France?"}]},
    {"role": "model", "parts": [{"text": "Paris."}]},
    {"role": "user", "parts": [{"text": "What is its population?"}]}
  ]
}
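
The API is stateless, so the caller has to resend the full history on every turn. A sketch that keeps the conversation in a JSON file and appends to it with jq. HISTORY_FILE and add_turn are hypothetical names:

```shell
# Hypothetical helper: append one turn to a conversation file that has the
# same shape as the request body above.
# Usage: add_turn ROLE TEXT   (ROLE is "user" or "model")
HISTORY_FILE="${HISTORY_FILE:-history.json}"

add_turn() {
  local role="$1" text="$2" tmp
  [ -f "$HISTORY_FILE" ] || echo '{"contents":[]}' > "$HISTORY_FILE"
  tmp=$(mktemp)
  jq --arg r "$role" --arg t "$text" \
    '.contents += [{role: $r, parts: [{text: $t}]}]' "$HISTORY_FILE" > "$tmp" &&
    mv "$tmp" "$HISTORY_FILE"
}
```

After add_turn user "What is its population?", send the file with --data-binary @"$HISTORY_FILE", then record the reply with add_turn model "...".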

Helper Script

For quick single-prompt calls:

vertex() {
  local model="${2:-gemini-2.0-flash}"
  local token project
  # Assign separately from "local" so a gcloud failure is not masked
  token=$(gcloud auth print-access-token) || return 1
  project=$(gcloud config get project 2>/dev/null)
  curl -s "https://us-central1-aiplatform.googleapis.com/v1/projects/${project}/locations/us-central1/publishers/google/models/${model}:generateContent" \
    -H "Authorization: Bearer $token" \
    -H "Content-Type: application/json" \
    -d "$(jq -n --arg p "$1" '{contents: [{role: "user", parts: [{text: $p}]}]}')" \
    | jq -r '.candidates[0].content.parts[0].text // .error.message'   # show the API error instead of "null"
}

# Usage: vertex "explain quantum computing" gemini-2.0-pro

Error Reference

Error                          Cause                            Fix
401 UNAUTHENTICATED            Token expired or API key used    gcloud auth print-access-token
400 "valid role: user, model"  Missing role field in contents   Add "role": "user"
403 PERMISSION_DENIED          API not enabled or no access     gcloud services enable aiplatform.googleapis.com
429 RESOURCE_EXHAUSTED         Rate limit hit                   Back off and retry
404 NOT_FOUND                  Wrong model name or region       Check model ID and use us-central1
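
For the 429 case, "back off and retry" can be sketched as a small wrapper with exponential delays. The name retry_backoff and the BACKOFF_BASE variable are hypothetical. The wrapper retries on any non-zero exit, not only on rate limits:

```shell
# Hypothetical helper: run a command up to MAX times, doubling the delay
# between attempts. BACKOFF_BASE is the first delay in seconds (default 1).
# Usage: retry_backoff MAX COMMAND [ARGS...]
retry_backoff() {
  local max="$1" delay="${BACKOFF_BASE:-1}" attempt=1
  shift
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$max" ] && return 1
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

Plain curl -s exits 0 even on HTTP errors, so pair the wrapper with --fail, for example retry_backoff 5 curl -sf ... so that a 429 response actually triggers a retry.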

1Password Integration

Store and retrieve credentials securely:

# If using a base64-encoded service account key stored in 1Password:
# write it to a private temp file instead of a predictable path in /tmp
SA_KEY_FILE=$(mktemp) && chmod 600 "$SA_KEY_FILE"
op read "op://VaultName/GCP-SA-Key/credential" | base64 -d > "$SA_KEY_FILE"
export GOOGLE_APPLICATION_CREDENTIALS="$SA_KEY_FILE"
gcloud auth activate-service-account --key-file="$GOOGLE_APPLICATION_CREDENTIALS"