GCP Cloud Run Skill

Install

Clone the upstream repo:

git clone https://github.com/vibeforge1111/vibeship-spawner-skills

Manifest: integrations/gcp-cloud-run/skill.yaml

Source content

GCP Cloud Run Skill

Expert-level Google Cloud Run and Cloud Run Functions development

id: gcp-cloud-run
name: GCP Cloud Run
description: |
  Specialized skill for building production-ready serverless applications on GCP.
  Covers Cloud Run services (containerized), Cloud Run Functions (event-driven),
  cold start optimization, and event-driven architecture with Pub/Sub.

version: 1.0.0
category: integrations
tags:

  • gcp
  • cloud-run
  • serverless
  • containers
  • pubsub

principles:

  • Cloud Run for containers, Functions for simple event handlers
  • Optimize for cold starts with startup CPU boost and min instances
  • Set concurrency based on workload (start with the default of 80, adjust)
  • The /tmp filesystem is in-memory and counts against your memory limit - plan accordingly
  • Use VPC Connector only when needed (adds latency)
  • Containers should start fast and be stateless
  • Handle signals gracefully for clean shutdown

patterns:

  • name: Cloud Run Service Pattern
    description: Containerized web service on Cloud Run
    when_to_use:

    • Web applications and APIs
    • Need any runtime or library
    • Complex services with multiple endpoints
    • Stateless containerized workloads
    structure: |
      project/
      ├── Dockerfile
      ├── .dockerignore
      ├── src/
      │   ├── index.js
      │   └── routes/
      ├── package.json
      └── cloudbuild.yaml
    implementation: |
    # Dockerfile - Multi-stage build for smaller image
    FROM node:20-slim AS builder
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci --omit=dev
    
    FROM node:20-slim
    WORKDIR /app
    
    # Copy only production dependencies
    COPY --from=builder /app/node_modules ./node_modules
    COPY src ./src
    COPY package.json ./
    
    # Cloud Run uses PORT env variable
    ENV PORT=8080
    EXPOSE 8080
    
    # Run as non-root user
    USER node
    
    CMD ["node", "src/index.js"]
    
    // src/index.js
    const express = require('express');
    const app = express();
    
    app.use(express.json());
    
    // Health check endpoint
    app.get('/health', (req, res) => {
      res.status(200).send('OK');
    });
    
    // API routes
    app.get('/api/items/:id', async (req, res) => {
      try {
        const item = await getItem(req.params.id);
        res.json(item);
      } catch (error) {
        console.error('Error:', error);
        res.status(500).json({ error: 'Internal server error' });
      }
    });
    
    // Start the server (Cloud Run injects PORT)
    const PORT = process.env.PORT || 8080;
    const server = app.listen(PORT, () => {
      console.log(`Server listening on port ${PORT}`);
    });
    
    // Graceful shutdown: Cloud Run sends SIGTERM before stopping an instance
    process.on('SIGTERM', () => {
      console.log('SIGTERM received, shutting down gracefully');
      server.close(() => {
        console.log('Server closed');
        process.exit(0);
      });
    });
    
    # cloudbuild.yaml
    steps:
      # Build the container image
      - name: 'gcr.io/cloud-builders/docker'
        args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA', '.']
    
      # Push the container image
      - name: 'gcr.io/cloud-builders/docker'
        args: ['push', 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA']
    
      # Deploy to Cloud Run
      - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
        entrypoint: gcloud
        args:
          - 'run'
          - 'deploy'
          - 'my-service'
          - '--image=gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA'
          - '--region=us-central1'
          - '--platform=managed'
          - '--allow-unauthenticated'
          - '--memory=512Mi'
          - '--cpu=1'
          - '--min-instances=1'
          - '--max-instances=100'
          - '--concurrency=80'
          - '--cpu-boost'
    
    images:
      - 'gcr.io/$PROJECT_ID/my-service:$COMMIT_SHA'
    

    gcloud_deploy: |
      # Direct gcloud deployment
      gcloud run deploy my-service \
        --source . \
        --region us-central1 \
        --allow-unauthenticated \
        --memory 512Mi \
        --cpu 1 \
        --min-instances 1 \
        --max-instances 100 \
        --concurrency 80 \
        --cpu-boost
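
    The structure above includes a .dockerignore; a minimal sketch (entries illustrative) that keeps node_modules and local clutter out of the build context:

    # .dockerignore
    node_modules
    npm-debug.log
    .git
    .env
    Dockerfile
    cloudbuild.yaml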

  • name: Cloud Run Functions Pattern
    description: Event-driven functions (formerly Cloud Functions)
    when_to_use:

    • Simple event handlers
    • Pub/Sub message processing
    • Cloud Storage triggers
    • HTTP webhooks
    implementation: |
    // HTTP Function
    // index.js
    const functions = require('@google-cloud/functions-framework');
    
    functions.http('helloHttp', (req, res) => {
      const name = req.query.name || req.body.name || 'World';
      res.send(`Hello, ${name}!`);
    });
    
    // Pub/Sub Function (separate source file)
    const functions = require('@google-cloud/functions-framework');
    
    functions.cloudEvent('processPubSub', (cloudEvent) => {
      // Decode Pub/Sub message
      const message = cloudEvent.data.message;
      const data = message.data
        ? JSON.parse(Buffer.from(message.data, 'base64').toString())
        : {};
    
      console.log('Received message:', data);
    
      // Process message
      processMessage(data);
    });
    
    // Cloud Storage Function (separate source file)
    const functions = require('@google-cloud/functions-framework');
    
    functions.cloudEvent('processStorageEvent', async (cloudEvent) => {
      const file = cloudEvent.data;
    
      console.log(`Event: ${cloudEvent.type}`);
      console.log(`Bucket: ${file.bucket}`);
      console.log(`File: ${file.name}`);
    
      if (cloudEvent.type === 'google.cloud.storage.object.v1.finalized') {
        await processUploadedFile(file.bucket, file.name);
      }
    });
    
    # Deploy HTTP function
    gcloud functions deploy hello-http \
      --gen2 \
      --runtime nodejs20 \
      --entry-point helloHttp \
      --trigger-http \
      --allow-unauthenticated \
      --region us-central1
    
    # Deploy Pub/Sub function
    gcloud functions deploy process-messages \
      --gen2 \
      --runtime nodejs20 \
      --entry-point processPubSub \
      --trigger-topic my-topic \
      --region us-central1
    
    # Deploy Cloud Storage function
    gcloud functions deploy process-uploads \
      --gen2 \
      --runtime nodejs20 \
      --entry-point processStorageEvent \
      --trigger-event-filters="type=google.cloud.storage.object.v1.finalized" \
      --trigger-event-filters="bucket=my-bucket" \
      --region us-central1
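
    The deploy commands assume the entry point is registered with the Functions Framework and a package.json along these lines (a sketch; version illustrative):

    {
      "main": "index.js",
      "dependencies": {
        "@google-cloud/functions-framework": "^3.4.0"
      }
    }

    # Test locally before deploying
    npx @google-cloud/functions-framework --target=helloHttp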
    
  • name: Cold Start Optimization Pattern
    description: Minimize cold start latency for Cloud Run
    when_to_use:

    • Latency-sensitive applications
    • User-facing APIs
    • High-traffic services
    implementation: |

    1. Enable Startup CPU Boost

    gcloud run deploy my-service \
      --cpu-boost \
      --region us-central1
    

    2. Set Minimum Instances

    gcloud run deploy my-service \
      --min-instances 1 \
      --region us-central1
    

    3. Optimize Container Image

    # Use distroless for minimal image
    FROM node:20-slim AS builder
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci --omit=dev
    
    FROM gcr.io/distroless/nodejs20-debian12
    WORKDIR /app
    COPY --from=builder /app/node_modules ./node_modules
    COPY src ./src
    CMD ["src/index.js"]
    

    4. Lazy Initialize Heavy Dependencies

    // Lazy load heavy libraries
    let bigQueryClient = null;
    
    function getBigQueryClient() {
      if (!bigQueryClient) {
        const { BigQuery } = require('@google-cloud/bigquery');
        bigQueryClient = new BigQuery();
      }
      return bigQueryClient;
    }
    
    // Only initialize when needed
    app.get('/api/analytics', async (req, res) => {
      const client = getBigQueryClient();
      const results = await client.query({...});
      res.json(results);
    });
    

    5. Allocate More CPU (and Memory)

    # CPU and memory are configured independently on Cloud Run;
    # more CPU speeds up startup, and higher CPU tiers need more memory
    gcloud run deploy my-service \
      --memory 1Gi \
      --cpu 2 \
      --region us-central1
    

    optimization_impact:
      startup_cpu_boost: "50% faster cold starts"
      min_instances: "Eliminates cold starts for baseline traffic (spikes beyond the minimum still cold-start)"
      distroless_image: "Smaller attack surface, faster pull"
      lazy_init: "Defers heavy loading to first request"
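
    To check what these settings actually buy you, one low-tech approach (a sketch; the middleware and log format are ours, not a GCP API) is to log the time from process start to the first request each instance serves, then compare across revisions:

    // measure time from process start to first request served
    const express = require('express');
    const app = express();

    const bootTime = Date.now();
    let sawFirstRequest = false;

    app.use((req, res, next) => {
      if (!sawFirstRequest) {
        sawFirstRequest = true;
        // Logged once per instance - shows up in Cloud Logging
        console.log(`first-request-latency-ms=${Date.now() - bootTime}`);
      }
      next();
    });

    app.get('/health', (req, res) => res.send('OK'));
    app.listen(process.env.PORT || 8080);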

  • name: Concurrency Configuration Pattern
    description: Proper concurrency settings for Cloud Run
    when_to_use:

    • Need to optimize instance utilization
    • Handle traffic spikes efficiently
    • Reduce cold starts
    implementation: |

    Understanding Concurrency

    # Default concurrency is 80
    # Adjust based on your workload
    
    # For I/O-bound workloads (most web apps)
    gcloud run deploy my-service \
      --concurrency 80 \
      --cpu 1
    
    # For CPU-bound workloads
    gcloud run deploy my-service \
      --concurrency 1 \
      --cpu 1
    
    # For memory-intensive workloads
    gcloud run deploy my-service \
      --concurrency 10 \
      --memory 2Gi
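
    Rough capacity math worth keeping in mind (numbers illustrative):

    # in-flight capacity ≈ max_instances × concurrency
    # 100 instances × concurrency 80 = 8,000 concurrent requests
    # 100 instances × concurrency 1  = 100 concurrent requests (CPU-bound config)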
    

    Node.js Concurrency

    // Node.js is single-threaded but handles I/O concurrently
    // Use async/await for all I/O operations
    
    // GOOD - async I/O
    app.get('/api/data', async (req, res) => {
      const [users, products] = await Promise.all([
        fetchUsers(),
        fetchProducts()
      ]);
      res.json({ users, products });
    });
    
    // BAD - blocking operation
    app.get('/api/compute', (req, res) => {
      const result = heavyCpuOperation(); // Blocks other requests!
      res.json(result);
    });
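
    If the CPU work has to stay in this service, one option is to move it off the event loop with worker_threads (a sketch; heavy-task.js is an illustrative file name and heavyCpuOperation is the same stand-in used above):

    // heavy-task.js - runs on a worker thread
    const { parentPort, workerData } = require('node:worker_threads');
    parentPort.postMessage(heavyCpuOperation(workerData));

    // In the service: offload instead of blocking the event loop
    const { Worker } = require('node:worker_threads');

    function runHeavyTask(payload) {
      return new Promise((resolve, reject) => {
        const worker = new Worker('./heavy-task.js', { workerData: payload });
        worker.once('message', resolve);
        worker.once('error', reject);
      });
    }

    app.get('/api/compute', async (req, res) => {
      res.json(await runHeavyTask(req.query));
    });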
    

    Python Concurrency with Gunicorn

    FROM python:3.11-slim
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY . .
    
    # 4 workers for concurrency
    CMD exec gunicorn --bind :$PORT --workers 4 --threads 2 main:app
    
    # main.py
    from flask import Flask
    app = Flask(__name__)
    
    @app.route('/api/data')
    def get_data():
        return {'status': 'ok'}
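
    The Dockerfile above assumes a requirements.txt along these lines (a sketch; pin exact versions in practice):

    # requirements.txt
    flask
    gunicorn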
    

    concurrency_guidelines:
      "concurrency=1": "Only for CPU-bound or thread-unsafe code"
      "concurrency=8-20": "Memory-intensive workloads"
      "concurrency=80": "Default, good for I/O-bound"
      "concurrency=1000": "Maximum, for very lightweight handlers"

  • name: Pub/Sub Integration Pattern
    description: Event-driven processing with Cloud Pub/Sub
    when_to_use:

    • Asynchronous message processing
    • Decoupled microservices
    • Event-driven architecture
    implementation: |

    Push Subscription to Cloud Run

    # Create topic
    gcloud pubsub topics create orders
    
    # Create push subscription to Cloud Run
    gcloud pubsub subscriptions create orders-push \
      --topic orders \
      --push-endpoint https://my-service-xxx.run.app/pubsub \
      --ack-deadline 600
    
    // Handle Pub/Sub push messages
    const express = require('express');
    const app = express();
    app.use(express.json());
    
    app.post('/pubsub', async (req, res) => {
      // Verify the request is from Pub/Sub
      if (!req.body.message) {
        return res.status(400).send('Invalid Pub/Sub message');
      }
    
      try {
        // Decode message data
        const message = req.body.message;
        const data = message.data
          ? JSON.parse(Buffer.from(message.data, 'base64').toString())
          : {};
    
        console.log('Processing order:', data);
    
        await processOrder(data);
    
        // Return 200 to acknowledge
        res.status(200).send('OK');
      } catch (error) {
        console.error('Processing failed:', error);
        // Return 500 to trigger retry
        res.status(500).send('Processing failed');
      }
    });
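
    The message-shape check above is not authentication. If the endpoint is public, one approach (a sketch, assuming the subscription was created with --push-auth-service-account) is to verify the OIDC token Pub/Sub attaches to each push:

    // Verify the Pub/Sub push OIDC token
    const { OAuth2Client } = require('google-auth-library');
    const authClient = new OAuth2Client();

    async function verifyPubSubRequest(req) {
      const header = req.headers.authorization || '';
      if (!header.startsWith('Bearer ')) throw new Error('Missing bearer token');
      // verifyIdToken throws if the token is invalid or expired
      const ticket = await authClient.verifyIdToken({
        idToken: header.slice('Bearer '.length),
        audience: process.env.PUSH_ENDPOINT_URL, // the subscription's push endpoint
      });
      return ticket.getPayload();
    }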
    

    Publishing Messages

    const { PubSub } = require('@google-cloud/pubsub');
    const pubsub = new PubSub();
    
    async function publishOrder(order) {
      const topic = pubsub.topic('orders');
      const messageBuffer = Buffer.from(JSON.stringify(order));
    
      const messageId = await topic.publishMessage({
        data: messageBuffer,
        attributes: {
          type: 'order_created',
          priority: 'high'
        }
      });
    
      console.log(`Published message ${messageId}`);
      return messageId;
    }
    

    Dead Letter Queue

    # Create DLQ topic
    gcloud pubsub topics create orders-dlq
    
    # Update subscription with DLQ
    gcloud pubsub subscriptions update orders-push \
      --dead-letter-topic orders-dlq \
      --max-delivery-attempts 5
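
    Dead lettering only works once the Pub/Sub service agent has the right roles; a sketch (PROJECT_NUMBER is your numeric project number):

    # The service agent must be able to publish to the DLQ topic...
    gcloud pubsub topics add-iam-policy-binding orders-dlq \
      --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com" \
      --role="roles/pubsub.publisher"

    # ...and to subscribe to the source subscription
    gcloud pubsub subscriptions add-iam-policy-binding orders-push \
      --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com" \
      --role="roles/pubsub.subscriber"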
    
  • name: Cloud SQL Connection Pattern
    description: Connect Cloud Run to Cloud SQL securely
    when_to_use:

    • Need relational database
    • Migrating existing applications
    • Complex queries and transactions
    implementation: |
    # Deploy with Cloud SQL connection; DB_PASS is pulled from Secret Manager
    # (secret name below is illustrative - see the Secret Manager pattern)
    gcloud run deploy my-service \
      --add-cloudsql-instances PROJECT:REGION:INSTANCE \
      --set-env-vars INSTANCE_CONNECTION_NAME="PROJECT:REGION:INSTANCE" \
      --set-env-vars DB_NAME="mydb" \
      --set-env-vars DB_USER="myuser" \
      --update-secrets DB_PASS=db-pass:latest
    
    // Using Unix socket connection
    const { Pool } = require('pg');
    
    const pool = new Pool({
      user: process.env.DB_USER,
      password: process.env.DB_PASS,
      database: process.env.DB_NAME,
      // Cloud SQL connector uses Unix socket
      host: `/cloudsql/${process.env.INSTANCE_CONNECTION_NAME}`,
      max: 5,  // Connection pool size
      idleTimeoutMillis: 30000,
      connectionTimeoutMillis: 10000,
    });
    
    app.get('/api/users', async (req, res) => {
      const client = await pool.connect();
      try {
        const result = await client.query('SELECT * FROM users LIMIT 100');
        res.json(result.rows);
      } finally {
        client.release();
      }
    });
    
    # Python with SQLAlchemy
    import os
    from sqlalchemy import create_engine
    
    def get_engine():
        instance_connection_name = os.environ["INSTANCE_CONNECTION_NAME"]
        db_user = os.environ["DB_USER"]
        db_pass = os.environ["DB_PASS"]
        db_name = os.environ["DB_NAME"]
    
        engine = create_engine(
            f"postgresql+pg8000://{db_user}:{db_pass}@/{db_name}",
            connect_args={
                "unix_sock": f"/cloudsql/{instance_connection_name}/.s.PGSQL.5432"
            },
            pool_size=5,
            max_overflow=2,
            pool_timeout=30,
            pool_recycle=1800,
        )
        return engine
    

    best_practices:

    • Use connection pooling (max 5-10 per instance)
    • Set appropriate idle timeouts
    • Handle connection errors gracefully (see the sketch after this list)
    • Consider Cloud SQL Proxy for local development
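
    For the error-handling point, a minimal sketch with pg - without an 'error' listener, a dropped idle connection crashes the whole process:

    // pool is the pg Pool created above
    pool.on('error', (err) => {
      // Idle clients can be dropped by the server or the network;
      // an unhandled 'error' event would terminate the process
      console.error('Unexpected error on idle client', err);
    });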
  • name: Secret Manager Integration
    description: Securely manage secrets in Cloud Run
    when_to_use:

    • API keys, database passwords
    • Service account keys
    • Any sensitive configuration
    implementation: |
    # Create secret
    echo -n "my-secret-value" | gcloud secrets create my-secret --data-file=-
    
    # Mount as environment variable
    gcloud run deploy my-service \
      --update-secrets=API_KEY=my-secret:latest
    
    # Mount as file volume
    gcloud run deploy my-service \
      --update-secrets=/secrets/api-key=my-secret:latest
    
    // Access a secret mounted as an environment variable
    const apiKey = process.env.API_KEY;
    
    // Access a secret mounted as a file
    const fs = require('fs');
    const apiKeyFromFile = fs.readFileSync('/secrets/api-key', 'utf8');
    
    // Access via Secret Manager API (when not mounted)
    const { SecretManagerServiceClient } = require('@google-cloud/secret-manager');
    const client = new SecretManagerServiceClient();
    const projectId = process.env.GOOGLE_CLOUD_PROJECT; // set at deploy time
    
    async function getSecret(name) {
      const [version] = await client.accessSecretVersion({
        name: `projects/${projectId}/secrets/${name}/versions/latest`
      });
      return version.payload.data.toString('utf8');
    }
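
    API lookups add per-call latency, so a common approach (a sketch, not an official client feature) is to cache the value for the life of the instance:

    // Cache secrets per instance instead of per request
    let cachedApiKey;

    async function getApiKey() {
      if (!cachedApiKey) {
        cachedApiKey = await getSecret('my-secret');
      }
      return cachedApiKey;
    }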
    

anti_patterns:

  • name: CPU-Intensive Work Without Concurrency=1
    description: Running CPU-bound code with high concurrency
    why_bad: |
      CPU is shared across concurrent requests. CPU-bound work will starve other requests, causing timeouts.
    bad_example: |
      // High concurrency (default 80) with CPU-bound work
      app.get('/api/process', (req, res) => {
        const result = heavyCpuOperation(); // Blocks CPU
        res.json(result);
      });
    good_example: |
      # Set concurrency=1 for CPU-bound workloads
      gcloud run deploy my-service --concurrency 1 --cpu 2
      # Or move CPU work to Cloud Tasks/a separate service

  • name: Writing Large Files to /tmp
    description: Using /tmp without considering memory limits
    why_bad: |
      /tmp is an in-memory filesystem. Large files consume your memory allocation and can cause OOM errors.
    bad_example: |
      // Writing large files to /tmp
      const largePdf = generateLargePdf(); // 500MB
      fs.writeFileSync('/tmp/report.pdf', largePdf);
      // Uses 500MB of your memory allocation!
    good_example: |
      // Stream directly to Cloud Storage instead
      const { Storage } = require('@google-cloud/storage');
      const storage = new Storage();

      const file = storage.bucket('my-bucket').file('report.pdf');
      const writeStream = file.createWriteStream();
      generatePdfStream().pipe(writeStream);

  • name: Long-Running Background Tasks
    description: CPU is throttled between requests
    why_bad: |
      Cloud Run throttles CPU to near-zero when not handling requests. Background tasks will be extremely slow or stall, and work started after the response is sent may never finish.
    good_example: |
      // Use Cloud Tasks for background work
      const { CloudTasksClient } = require('@google-cloud/tasks');
      const client = new CloudTasksClient();
      const queuePath = client.queuePath('my-project', 'us-central1', 'my-queue');

      app.post('/api/order', async (req, res) => {
        // Queue the background task before responding - code that runs
        // after the response is sent gets CPU-throttled
        await client.createTask({
          parent: queuePath,
          task: {
            httpRequest: {
              httpMethod: 'POST',
              url: 'https://my-service/process-order',
              body: Buffer.from(JSON.stringify(req.body)).toString('base64')
            }
          }
        });

        res.json({ status: 'processing' });
      });

references: