Software_development_department cloud-run-puppeteer
install
source · Clone the upstream repo
git clone https://github.com/tranhieutt/software_development_department
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/tranhieutt/software_development_department "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/cloud-run-puppeteer" ~/.claude/skills/tranhieutt-software-development-department-cloud-run-puppeteer && rm -rf "$T"
manifest: .claude/skills/cloud-run-puppeteer/SKILL.md
Cloud Run + Puppeteer Deployment Guide
Hard-won lessons from deploying Puppeteer to Cloud Run. These are non-obvious gotchas that cost significant debug time.
MUST: Use gen2 Execution Environment
Cloud Run gen1 runs containers in a gVisor sandbox, which blocks Linux syscalls that Chrome needs to create processes. Puppeteer hangs or times out silently.
gcloud run deploy my-service \
  --execution-environment gen2  # <-- required for Chrome/Puppeteer
Never deploy a Puppeteer service on gen1.
MUST: Install Chrome System Dependencies
The node:18 base image does not include the shared libraries Chrome needs. If any one of them is missing, Chrome fails to launch.
FROM node:18
RUN apt-get update && apt-get install -y \
    ca-certificates fonts-liberation fonts-ipafont-gothic fonts-wqy-zenhei \
    libasound2 libatk-bridge2.0-0 libatk1.0-0 libcairo2 libcups2 \
    libdbus-1-3 libdrm2 libgbm1 libglib2.0-0 libgtk-3-0 libnspr4 libnss3 \
    libpango-1.0-0 libx11-xcb1 libxcb1 libxcomposite1 libxcursor1 \
    libxdamage1 libxext6 libxfixes3 libxi6 libxrandr2 libxrender1 \
    libxss1 libxtst6 xdg-utils --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*
MUST: Mount Secret Manager Secrets Outside WORKDIR
If you mount a secret at the same path as WORKDIR, the volume mount hides the entire directory: only the secret file remains and your code disappears.
# WRONG: mount path collides with WORKDIR /app
--set-secrets "/app/service-account.json=my-secret:latest"
# RIGHT: mount at a separate path
--set-secrets "/secrets/service-account.json=my-secret:latest"
ENV GOOGLE_CREDS_PATH=/secrets/service-account.json
Windows/Git Bash warning: --set-env-vars with a Unix path gets converted to a Windows path by Git Bash. Set ENV directly in the Dockerfile instead of passing the value through the CLI.
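A deploy script can catch the WORKDIR collision before calling gcloud. The following is a minimal sketch, not part of the original skill: `isInsideDir` and `validateSecretMount` are hypothetical helper names, `/app` is assumed to be the image's WORKDIR, and only the file-mount form of `--set-secrets` used in this guide is handled. The check is deliberately conservative and rejects any mount path under WORKDIR.

```javascript
const path = require('node:path');

// Hypothetical helper: true when `target` is `dir` itself or lies beneath it.
// POSIX rules are used because container paths are Unix paths even when the
// deploy script runs on Windows.
function isInsideDir(target, dir) {
  const rel = path.posix.relative(dir, target);
  return rel === '' || (rel !== '..' && !rel.startsWith('../'));
}

// Hypothetical helper: takes a --set-secrets value of the form
// "/mount/path/file=secret:version" and rejects mount paths inside WORKDIR.
function validateSecretMount(setSecretsValue, workdir) {
  const mountPath = setSecretsValue.split('=')[0];
  if (!mountPath.startsWith('/')) {
    throw new Error(`Expected an absolute mount path, got "${mountPath}"`);
  }
  if (isInsideDir(mountPath, workdir)) {
    throw new Error(`Secret mount ${mountPath} is inside WORKDIR ${workdir}`);
  }
  return mountPath;
}
```

For example, `validateSecretMount('/secrets/service-account.json=my-secret:latest', '/app')` returns the mount path, while the `/app/service-account.json` variant throws.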
Puppeteer Launch Config for Cloud Run
const browser = await puppeteer.launch({
  headless: 'new',
  timeout: 60000, // cold starts are slow; the 30s default is not enough
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage', // /dev/shm is small in containers
    '--single-process',
    '--no-zygote',
    '--disable-gpu',
  ],
});
Valid waitUntil Values in Puppeteer
Only 4 values are valid. 'commit' (from Playwright) does NOT exist:
| Value | Meaning |
|---|---|
| load | Wait for the load event (slowest) |
| domcontentloaded | Wait for DOM parsing to finish (recommended) |
| networkidle0 | No requests in flight for 500ms |
| networkidle2 | ≤ 2 requests in flight for 500ms |
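Since the four accepted values are `load`, `domcontentloaded`, `networkidle0`, and `networkidle2`, a small guard can flag a Playwright-style value before it reaches `page.goto`. This is a sketch only; `VALID_WAIT_UNTIL` and `checkWaitUntil` are hypothetical names, not part of Puppeteer.

```javascript
// The four lifecycle values Puppeteer accepts for waitUntil.
const VALID_WAIT_UNTIL = ['load', 'domcontentloaded', 'networkidle0', 'networkidle2'];

// Hypothetical helper: accepts a single value or an array (Puppeteer allows
// both) and throws on anything outside the list, such as Playwright's 'commit'.
function checkWaitUntil(waitUntil) {
  const values = Array.isArray(waitUntil) ? waitUntil : [waitUntil];
  const invalid = values.filter((v) => !VALID_WAIT_UNTIL.includes(v));
  if (invalid.length > 0) {
    throw new Error(
      `Invalid waitUntil: ${invalid.join(', ')}. Use one of: ${VALID_WAIT_UNTIL.join(', ')}`
    );
  }
  return values;
}
```

A call site might then look like `page.goto(url, { waitUntil: checkWaitUntil('domcontentloaded') })`.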
For heavy pages served from overseas, use domcontentloaded with a high timeout, then waitForFunction to wait for specific content:
await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 120000 });
await page.waitForFunction(
  () => document.body && document.body.innerText.includes('target-text'),
  { timeout: 60000 }
);
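If the target text varies per page, the same pattern can be wrapped in a function. The sketch below is an assumption layered on the snippet above: `gotoAndWaitForText` and its option names are hypothetical. The text is passed as an extra argument to `waitForFunction` because the predicate runs inside the browser and cannot see variables from the Node side.

```javascript
// Hypothetical wrapper around the goto + waitForFunction pattern.
// `page` is a Puppeteer Page; the timeouts default to the values used above.
async function gotoAndWaitForText(page, url, text, opts = {}) {
  const { navTimeout = 120000, textTimeout = 60000 } = opts;

  // Return as soon as the DOM is parsed instead of waiting for every asset.
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: navTimeout });

  // The predicate executes in the page context, so `text` is forwarded as an
  // argument (`needle`) rather than captured by closure.
  await page.waitForFunction(
    (needle) => Boolean(document.body) && document.body.innerText.includes(needle),
    { timeout: textTimeout },
    text
  );
}
```

Usage would be along the lines of `await gotoAndWaitForText(page, url, 'target-text')`.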
Block Resources to Speed Up Loads
Blocking images, CSS, and fonts cuts bandwidth significantly, which matters when crawling from an overseas server:
await page.setRequestInterception(true);
page.on('request', (req) => {
  if (['image', 'stylesheet', 'font', 'media'].includes(req.resourceType())) {
    req.abort();
  } else {
    req.continue();
  }
});
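The interception handler can also be pulled out into a named function, which makes the block list easy to adjust and lets rejected `abort()`/`continue()` promises be handled instead of surfacing as unhandled rejections. This is one possible arrangement, not the original author's code; `BLOCKED_TYPES` and `handleRequest` are hypothetical names.

```javascript
// Resource types to drop: same list as the inline handler.
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

// Hypothetical handler: aborts blocked resource types and continues the rest.
// abort() and continue() return promises; a rejection is logged rather than
// left to crash the crawl. Resolves to the action that was attempted.
async function handleRequest(req) {
  const action = BLOCKED_TYPES.has(req.resourceType()) ? 'abort' : 'continue';
  try {
    await req[action]();
  } catch (err) {
    console.warn(`request ${action} failed: ${err.message}`);
  }
  return action;
}
```

It would be registered with `page.on('request', handleRequest)` after `await page.setRequestInterception(true)`.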
Recommended Cloud Run Deploy Command
gcloud run deploy my-service \
  --image gcr.io/PROJECT_ID/my-image \
  --region asia-southeast1 \
  --platform managed \
  --execution-environment gen2 \
  --memory 2Gi \
  --cpu 2 \
  --timeout 300 \
  --set-secrets "/secrets/service-account.json=my-secret:latest" \
  --allow-unauthenticated
Puppeteer needs at least 2Gi of RAM because Chrome uses a lot of memory.