Agent-almanac build-ci-cd-pipeline
Build a CI/CD Pipeline
Design and implement production-grade continuous integration and deployment pipelines with GitHub Actions.
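When to use
- Setting up automated testing and deployment for a new project
- Migrating from Jenkins, Travis CI, or CircleCI to GitHub Actions
- Implementing matrix builds across multiple platforms or language versions
- Adding build caching to shorten CI/CD run times
- Creating multi-stage pipelines with environment-specific deployments
- Implementing security scanning and code quality gates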
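Inputs
- Required: a repository containing code to test, build, or deploy
- Required: a GitHub Actions workflow directory (`.github/workflows/`)
- Optional: secrets for deployment targets (AWS, Azure, Docker registries)
- Optional: self-hosted runner configuration for specialized builds
- Optional: branch protection rules and required status checks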
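The optional self-hosted runner input could be wired into a job roughly as follows. This is a minimal sketch, not part of the original skill: the job name `special-build` is hypothetical, and the labels assume a Linux x64 runner registered with GitHub's default labels.

```yaml
jobs:
  special-build:                          # hypothetical job name
    # Default labels for a self-hosted Linux x64 runner (assumed setup)
    runs-on: [self-hosted, linux, x64]
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - name: Build on the self-hosted runner
        run: npm ci && npm run build
```

Jobs that do not need special hardware or network access can stay on `ubuntu-latest`, which keeps the self-hosted runner free for the builds that need it.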
Steps
Step 1: Create the basic workflow structure
Create `.github/workflows/ci.yml` with the trigger configuration and a basic job structure.

    name: CI Pipeline

    on:
      push:
        branches: [main, develop]
      pull_request:
        branches: [main, develop]
      workflow_dispatch:  # Manual trigger

    env:
      NODE_VERSION: '18'
      REGISTRY: ghcr.io
      IMAGE_NAME: ${{ github.repository }}

    jobs:
      lint:
        name: Lint Code
        runs-on: ubuntu-latest
        steps:
          - name: Checkout code
            uses: actions/checkout@v4

          - name: Setup Node.js
            uses: actions/setup-node@v4
            with:
              node-version: ${{ env.NODE_VERSION }}
              cache: 'npm'

          - name: Install dependencies
            run: npm ci

          - name: Run ESLint
            run: npm run lint

          - name: Check formatting
            run: npm run format:check

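Expected result: The workflow file is created with valid YAML syntax, the triggers are configured, and a basic lint job is defined.
On failure: Validate the YAML syntax with `yamllint .github/workflows/ci.yml`. Check the indentation (spaces, not tabs). Check the GitHub Marketplace to confirm that the action versions are current.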
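Out of the box, yamllint tends to flag things that are normal in workflow files, such as the `on:` key and long expression lines. A project-level `.yamllint` file along these lines may reduce that noise. The thresholds are assumptions to adjust, not values from the original skill.

```yaml
# .yamllint (illustrative settings; tune for your repository)
extends: default

rules:
  # Expressions such as ${{ ... }} often produce long lines
  line-length:
    max: 160
    level: warning
  # Avoid flagging the `on:` key as a truthy value
  truthy:
    check-keys: false
  # Workflow files conventionally omit the leading `---`
  document-start: disable
  indentation:
    spaces: 2
```

With this file at the repository root, `yamllint .github/workflows/` picks it up automatically.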
Step 2: Implement a matrix build strategy
Add a matrix build to test across multiple platforms, language versions, or configurations.

      test:
        name: Test (${{ matrix.os }}, Node ${{ matrix.node }})
        runs-on: ${{ matrix.os }}
        needs: lint
        strategy:
          fail-fast: false  # Continue testing other matrix combinations on failure
          matrix:
            os: [ubuntu-latest, windows-latest, macos-latest]
            node: ['16', '18', '20']
            exclude:
              - os: macos-latest
                node: '16'  # Skip old Node on macOS
        steps:
          - uses: actions/checkout@v4

          - name: Setup Node.js ${{ matrix.node }}
            uses: actions/setup-node@v4
            with:
              node-version: ${{ matrix.node }}
              cache: 'npm'

          - name: Install dependencies
            run: npm ci

          - name: Run tests with coverage
            run: npm run test:coverage

          - name: Upload coverage to Codecov
            uses: codecov/codecov-action@v3
            if: matrix.os == 'ubuntu-latest' && matrix.node == '18'
            with:
              token: ${{ secrets.CODECOV_TOKEN }}
              files: ./coverage/lcov.info
              fail_ci_if_error: true

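Expected result: The matrix generates 8 parallel jobs (3 operating systems × 3 Node versions, minus 1 exclusion). All tests pass on every platform. The coverage report is uploaded from a single canonical job.
On failure: If the matrix syntax is rejected, verify the indentation and array notation. For flaky tests, add retry logic with `uses: nick-invision/retry@v2`. For platform-specific failures, add OS conditions or extend the exclusions.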
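The retry and OS-condition advice might look like this inside the `test` job. It is a sketch under assumptions: the attempt count and timeout are arbitrary, and the `test:windows` script is hypothetical.

```yaml
steps:
  - name: Run tests with retry
    uses: nick-invision/retry@v2
    with:
      timeout_minutes: 10       # per-attempt limit (assumed value)
      max_attempts: 3           # assumed value
      command: npm run test:coverage

  # Runs only on Windows runners in the matrix
  - name: Run Windows-only checks
    if: runner.os == 'Windows'
    run: npm run test:windows   # hypothetical npm script
```

Retries hide flakiness rather than fix it, so it is worth tracking which tests needed a second attempt.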
Step 3: Configure dependency caching and artifact management
Speed up builds with targeted caching, and retain the build artifacts.

      build:
        name: Build Application
        runs-on: ubuntu-latest
        needs: test
        steps:
          - uses: actions/checkout@v4

          - name: Setup Node.js
            uses: actions/setup-node@v4
            with:
              node-version: ${{ env.NODE_VERSION }}
              cache: 'npm'

          - name: Cache build output
            uses: actions/cache@v3
            with:
              path: |
                .next/cache
                dist/
                build/
              key: ${{ runner.os }}-build-${{ hashFiles('**/package-lock.json') }}-${{ hashFiles('**/*.ts', '**/*.tsx') }}
              restore-keys: |
                ${{ runner.os }}-build-${{ hashFiles('**/package-lock.json') }}-
                ${{ runner.os }}-build-

          - name: Install dependencies
            run: npm ci

          - name: Build application
            run: npm run build
            env:
              NODE_ENV: production

          - name: Upload build artifacts
            uses: actions/upload-artifact@v3
            with:
              name: dist-${{ github.sha }}
              path: |
                dist/
                build/
              retention-days: 7
              if-no-files-found: error

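Expected result: The first run downloads dependencies (slower); subsequent runs restore them from the cache (faster). The build artifacts upload successfully under a unique SHA-based name.
On failure: If the cache misses frequently, verify that the cache key hashes all relevant files. For upload failures, check that the paths exist and that the glob patterns match the actual build output. Verify that `retention-days` complies with organization policy.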
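To see whether the cache is hitting, one option is to give the cache step an `id` and print its `cache-hit` output. The sketch below uses a simplified key and a single `dist/` path as assumptions; the listing step belongs after the build step.

```yaml
steps:
  - name: Cache build output
    id: build-cache             # lets later steps read this step's outputs
    uses: actions/cache@v3
    with:
      path: dist/
      key: ${{ runner.os }}-build-${{ hashFiles('**/package-lock.json') }}
      restore-keys: |
        ${{ runner.os }}-build-

  # Prints 'true' only when the primary key matched exactly
  - name: Report cache result
    run: echo "Exact key match: ${{ steps.build-cache.outputs.cache-hit }}"

  # Place after the build step: shows what the upload step will see
  - name: List build output before upload
    run: ls -la dist/
```

A restore through `restore-keys` does not count as an exact hit, so a run can be fast while still reporting no exact match.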
Step 4: Implement security scanning and quality gates
Add vulnerability scanning and enforced code quality checks.

      security:
        name: Security Scan
        runs-on: ubuntu-latest
        needs: lint
        permissions:
          contents: read  # Needed by checkout once permissions are set explicitly
          security-events: write  # Required for uploading SARIF results
        steps:
          - uses: actions/checkout@v4
            with:
              fetch-depth: 0  # Full history so the secret scan can diff commits

          - name: Run Trivy vulnerability scanner
            uses: aquasecurity/trivy-action@master
            with:
              scan-type: 'fs'
              scan-ref: '.'
              format: 'sarif'
              output: 'trivy-results.sarif'
              severity: 'CRITICAL,HIGH'

          - name: Upload Trivy results to GitHub Security
            uses: github/codeql-action/upload-sarif@v2
            if: always()  # Upload even if scan finds vulnerabilities
            with:
              sarif_file: 'trivy-results.sarif'

          - name: Dependency audit
            run: npm audit --audit-level=high
            continue-on-error: true  # Don't fail build, but show warnings

          - name: Check for leaked secrets
            uses: trufflesecurity/trufflehog@main
            with:
              path: ./
              base: ${{ github.event.repository.default_branch }}
              head: HEAD

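Expected result: The security scan completes and the results are uploaded to the GitHub Security tab. If branch protection is configured, critical vulnerabilities block merging. No leaked secrets are detected in the commits.
On failure: For false positives, create a `.trivyignore` file listing each CVE ID with an explanation. For audit failures, review the suggestions from `npm audit fix`. For false positives in secret detection, add the patterns to the exclusion list in `.trufflehog.yml`.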
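For a scan to block merging, the job itself has to fail when findings match. A separate gating step could be sketched as below. The choice to gate on `CRITICAL` only and to skip unfixed vulnerabilities is an assumption, not a recommendation from the original skill.

```yaml
- name: Run Trivy as a blocking gate
  uses: aquasecurity/trivy-action@master
  with:
    scan-type: 'fs'
    scan-ref: '.'
    severity: 'CRITICAL'
    exit-code: '1'                 # non-zero exit fails the job on matching findings
    ignore-unfixed: true           # skip vulnerabilities that have no fix yet
    trivyignores: '.trivyignore'   # reviewed false positives, one CVE ID per line
```

Marking the `security` job as a required status check in branch protection then turns a failed scan into a blocked merge.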
Step 5: Configure environment-specific deployments
Set up deployment stages with environment protection rules and approval gates.

      deploy-staging:
        name: Deploy to Staging
        runs-on: ubuntu-latest
        needs: [build, security]
        if: github.ref == 'refs/heads/develop'
        permissions:
          id-token: write  # Required to request the OIDC token for role-to-assume
          contents: read
        environment:
          name: staging
          url: https://staging.example.com
        steps:
          - name: Download build artifacts
            uses: actions/download-artifact@v3
            with:
              name: dist-${{ github.sha }}
              path: ./dist

          - name: Configure AWS credentials
            uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: ${{ secrets.AWS_ROLE_STAGING }}
              aws-region: us-east-1

          - name: Deploy to S3
            run: |
              aws s3 sync ./dist s3://${{ secrets.S3_BUCKET_STAGING }} --delete
              aws cloudfront create-invalidation --distribution-id ${{ secrets.CF_DIST_STAGING }} --paths "/*"

      deploy-production:
        name: Deploy to Production
        runs-on: ubuntu-latest
        needs: [build, security]
        if: github.ref == 'refs/heads/main'
        permissions:
          id-token: write  # Required to request the OIDC token for role-to-assume
          contents: write  # Needed by the release step
        environment:
          name: production
          url: https://example.com
        steps:
          - name: Download build artifacts
            uses: actions/download-artifact@v3
            with:
              name: dist-${{ github.sha }}
              path: ./dist

          - name: Configure AWS credentials
            uses: aws-actions/configure-aws-credentials@v4
            with:
              role-to-assume: ${{ secrets.AWS_ROLE_PRODUCTION }}
              aws-region: us-east-1

          - name: Deploy to S3 with blue-green
            run: |
              # Deploy to new version
              aws s3 sync ./dist s3://${{ secrets.S3_BUCKET_PRODUCTION }}/releases/${{ github.sha }} --delete
              # Copy the new release over the live prefix (S3 has no symlinks)
              aws s3 cp s3://${{ secrets.S3_BUCKET_PRODUCTION }}/releases/${{ github.sha }} s3://${{ secrets.S3_BUCKET_PRODUCTION }}/current --recursive
              # Invalidate CloudFront
              aws cloudfront create-invalidation --distribution-id ${{ secrets.CF_DIST_PRODUCTION }} --paths "/*"

          # Fires only for tag refs; the job-level `if` and the workflow
          # triggers must also admit tag pushes for this step to run
          - name: Create GitHub Release
            uses: softprops/action-gh-release@v1
            if: startsWith(github.ref, 'refs/tags/')
            with:
              files: ./dist/**/*
              generate_release_notes: true

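Expected result: Staging deploys automatically from the develop branch. Production requires manual approval (configured in the GitHub environment settings). The CloudFront invalidation clears the CDN cache. Tagged commits create a release.
On failure: For AWS credential errors, verify that the OIDC trust relationship allows `role-to-assume`. For S3 sync failures, check the bucket policy and the IAM permissions. For environment approval problems, verify the protection rules under Settings > Environments.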
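When credential or bucket errors are hard to pin down, a throwaway diagnostic job can confirm which identity the workflow actually assumes. The sketch below reuses the staging secrets from this step. The job name `debug-aws` is hypothetical, and the job is meant to be removed once the trust relationship works.

```yaml
  debug-aws:                      # hypothetical diagnostic job
    runs-on: ubuntu-latest
    timeout-minutes: 5
    permissions:
      id-token: write             # needed to request the OIDC token
      contents: read
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_STAGING }}
          aws-region: us-east-1

      # Prints the account and role ARN that were actually assumed
      - name: Verify assumed identity
        run: aws sts get-caller-identity

      # Fails here if the bucket policy or IAM permissions deny listing
      - name: Verify bucket access
        run: aws s3 ls s3://${{ secrets.S3_BUCKET_STAGING }}
```

If the first step fails, the problem is usually in the OIDC trust policy or a missing `id-token: write`; if only the last step fails, look at the bucket policy and the role's IAM permissions.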
Step 6: Add notification and monitoring integrations
Integrate Slack notifications, deployment tracking, and performance monitoring.

      notify:
        name: Notify Results
        runs-on: ubuntu-latest
        needs: [deploy-staging, deploy-production]
        if: always()  # Run even if previous jobs fail
        steps:
          - name: Check job status
            id: status
            run: |
              if [ "${{ needs.deploy-production.result }}" == "success" ]; then
                echo "status=success" >> $GITHUB_OUTPUT
                echo "color=#00FF00" >> $GITHUB_OUTPUT
              else
                echo "status=failure" >> $GITHUB_OUTPUT
                echo "color=#FF0000" >> $GITHUB_OUTPUT
              fi

          - name: Send Slack notification
            uses: slackapi/slack-github-action@v1.24.0
            with:
              payload: |
                {
                  "text": "Deployment ${{ steps.status.outputs.status }}",
                  "blocks": [
                    {
                      "type": "header",
                      "text": {
                        "type": "plain_text",
                        "text": "Deployment Status: ${{ steps.status.outputs.status }}"
                      }
                    },
                    {
                      "type": "section",
                      "fields": [
                        {"type": "mrkdwn", "text": "*Repository:*\n${{ github.repository }}"},
                        {"type": "mrkdwn", "text": "*Branch:*\n${{ github.ref_name }}"},
                        {"type": "mrkdwn", "text": "*Commit:*\n${{ github.sha }}"},
                        {"type": "mrkdwn", "text": "*Actor:*\n${{ github.actor }}"}
                      ]
                    },
                    {
                      "type": "actions",
                      "elements": [
                        {
                          "type": "button",
                          "text": {"type": "plain_text", "text": "View Workflow"},
                          "url": "${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
                        }
                      ]
                    }
                  ]
                }
            env:
              SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
              SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK

          - name: Record deployment in Datadog
            if: steps.status.outputs.status == 'success'
            run: |
              curl -X POST "https://api.datadoghq.com/api/v1/events" \
                -H "Content-Type: application/json" \
                -H "DD-API-KEY: ${{ secrets.DD_API_KEY }}" \
                -d @- <<EOF
              {
                "title": "Deployment: ${{ github.repository }}",
                "text": "Deployed commit ${{ github.sha }} to production",
                "tags": ["env:production", "service:${{ github.event.repository.name }}"],
                "alert_type": "info"
              }
              EOF

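Expected result: Slack receives a formatted notification with the deployment status, repository details, and a clickable link to the workflow run. Datadog records an event, with appropriate tags, for each successful production deployment.
On failure: For Slack failures, verify that the webhook URL is valid and that the workspace allows incoming webhooks. Test with `curl -X POST $SLACK_WEBHOOK_URL -d '{"text":"test"}'`. For Datadog failures, verify that the API key has permission to submit events.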
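The webhook check can also be run from within GitHub Actions, which confirms that the stored secret itself is valid. This is a sketch of a manually triggered workflow; the workflow and job names are hypothetical.

```yaml
name: Test Slack Webhook          # hypothetical workflow name

on:
  workflow_dispatch:              # run by hand from the Actions tab

jobs:
  ping-slack:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - name: Send test message
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
        run: |
          # --fail makes curl exit non-zero on an HTTP error response
          curl --fail --silent --show-error -X POST \
            -H 'Content-Type: application/json' \
            -d '{"text":"test"}' \
            "$SLACK_WEBHOOK_URL"
```

Passing the secret through `env` keeps the URL out of the rendered command line in the logs.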
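Validation checklist
- Workflow syntax passes `yamllint` or the GitHub workflow editor validation
- All jobs have explicit dependencies (`needs:`) to control execution order
- Matrix builds cover all target platforms and versions
- Caching reduces build time on subsequent runs by more than 50%
- Secrets are stored in GitHub Secrets and never hardcoded in workflow files
- Security scan results are uploaded to the GitHub Security tab
- Environment protection rules require approval for production deployments
- Failed deployments do not leave the system in an inconsistent state
- Notifications reach the appropriate channels (Slack, email, monitoring tools)
- The workflow completes within 10 minutes for typical changes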
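Common pitfalls
- Overly broad cache keys: using `${{ runner.os }}-build-` as the cache key causes false hits when dependencies change. Include `hashFiles('**/package-lock.json')` in the key.
- Artifact name collisions: a static artifact name such as `dist` causes overwrites when builds run concurrently. Include `${{ github.sha }}` or `${{ matrix.os }}-${{ matrix.node }}` in the name.
- Secrets exposed in logs: avoid `echo $SECRET` and similar commands. GitHub masks registered secrets, but derived values can still leak. Use `::add-mask::` for secrets generated at runtime.
- Insufficient permissions: the default `GITHUB_TOKEN` has limited permissions. Add an explicit `permissions:` block for security events, packages, issues, and so on.
- Missing if conditions: jobs run on every trigger unless guarded with a condition such as `if: github.ref == 'refs/heads/main'`. The guard prevents pull requests from accidentally triggering production deployments.
- Missing rollback strategy: a failed deployment leaves the system in a broken state. Implement blue-green or canary deployments that roll back automatically when health checks fail.
- Hardcoded values: workflows contain environment-specific URLs, bucket names, or API endpoints. Use environment variables and GitHub Secrets instead.
- Missing timeouts: a job that hangs on a network problem or an infinite loop keeps running until the default 360-minute limit. Add `timeout-minutes: 15` to all jobs.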
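Several of these pitfalls can be addressed in one job definition. The following is an illustrative sketch, not the skill's own configuration: the job name is hypothetical, the `openssl` call stands in for any value computed at runtime, and the upload assumes an earlier step produced `dist/`.

```yaml
jobs:
  build-matrix:                     # hypothetical job name
    runs-on: ${{ matrix.os }}
    timeout-minutes: 15             # stop a hung job early
    permissions:
      contents: read                # grant only what the job needs
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest]
        node: ['18', '20']
    steps:
      - uses: actions/checkout@v4

      - name: Mask a value derived at runtime
        run: |
          # Stand-in for a token fetched or computed during the run
          DERIVED_TOKEN="$(openssl rand -hex 16)"
          # Register the value so later log output is redacted
          echo "::add-mask::$DERIVED_TOKEN"
          echo "DERIVED_TOKEN=$DERIVED_TOKEN" >> "$GITHUB_ENV"

      # Assumes an earlier build step produced dist/
      - name: Upload artifact under a unique name
        uses: actions/upload-artifact@v3
        with:
          name: dist-${{ matrix.os }}-${{ matrix.node }}-${{ github.sha }}
          path: dist/
```

The mask has to be registered before the value is first printed or exported, because log lines written earlier are not redacted retroactively.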
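Related skills
- `setup-github-actions-ci`: initial GitHub Actions configuration for R packages and basic projects
- `commit-changes`: proper Git workflow integration with CI/CD triggers
- `configure-git-repository`: repository settings and branch protection rules
- `setup-container-registry`: Docker image builds in CI/CD pipelines
- `implement-gitops-workflow`: integrating ArgoCD/Flux with CI/CD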