Gsd-skill-creator platform-engineering

Provides platform engineering best practices for Internal Developer Platforms (IDPs), golden paths, service catalogs, and developer experience. Use when building developer platforms, configuring Backstage, or designing self-service workflows, or when the user mentions 'platform engineering', 'backstage', 'golden path', 'IDP', 'developer portal', 'service catalog', 'DevEx', 'platform team', 'self-service'.

install
source · Clone the upstream repo
git clone https://github.com/Tibsfox/gsd-skill-creator
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Tibsfox/gsd-skill-creator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/examples/skills/patterns/platform-engineering" ~/.claude/skills/tibsfox-gsd-skill-creator-platform-engineering && rm -rf "$T"
manifest: examples/skills/patterns/platform-engineering/SKILL.md

Platform Engineering

Best practices for building Internal Developer Platforms (IDPs) that reduce cognitive load, accelerate delivery, and create golden paths for development teams.

IDP Architecture Layers

A well-designed IDP separates concerns into distinct layers. Each layer hides the complexity of the layers beneath it from the layer above.

Developer Interface (Portal / CLI / API)
        |
  Orchestration Layer (Workflows, Templates, Scaffolding)
        |
  Integration Layer (APIs, Plugins, Connectors)
        |
  Resource Layer (Infrastructure, Services, Tools)

| Layer | Purpose | Components | Owned By |
|-------|---------|------------|----------|
| Developer Interface | Self-service entry point | Backstage portal, CLI tools, API gateway | Platform team |
| Orchestration | Workflow automation, templating | Scaffolder, Terraform modules, Crossplane | Platform team |
| Integration | Connect tools and services | Backstage plugins, API adapters, webhooks | Platform + tool owners |
| Resource | Actual infrastructure and services | Kubernetes, databases, CI/CD, monitoring | Infrastructure team |
| Governance | Policy enforcement and compliance | OPA, Kyverno, cost policies, security scans | Security + platform team |

Platform Team Topology and Responsibilities

Team Structure

| Role | Responsibility | Focus Area |
|------|----------------|------------|
| Platform Product Manager | Roadmap, prioritization, user research | Developer needs, adoption metrics |
| Platform Engineer | IDP core, golden paths, automation | Infrastructure abstraction, tooling |
| Developer Advocate | Documentation, onboarding, feedback loops | DevEx, training, communication |
| SRE/Reliability Lead | Platform reliability, SLOs, incident response | Uptime, performance, observability |
| Security Engineer | Policy-as-code, compliance automation | Guardrails, scanning, access control |

Interaction Model

Stream-Aligned Teams (consumers)
        |
        | self-service requests
        v
Platform Team (enablers)
        |
        | golden paths, templates, APIs
        v
Infrastructure / Cloud (resources)

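In the Team Topologies model, the platform team is its own team type, distinct from an enabling team, and typically serves stream-aligned teams in X-as-a-Service mode. It reduces cognitive load on stream-aligned teams by providing curated, opinionated abstractions.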

Backstage: Service Catalog and Developer Portal

catalog-info.yaml -- Service Registration

Every service registers itself in the catalog via a catalog-info.yaml file at the repo root.

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  description: Handles payment processing and refunds
  annotations:
    github.com/project-slug: acme-corp/payment-service
    backstage.io/techdocs-ref: dir:.
    pagerduty.com/service-id: P1234ABC
    grafana/dashboard-selector: "payment-service"
  tags:
    - java
    - spring-boot
    - payments
  links:
    - url: https://grafana.internal/d/payments
      title: Dashboard
      icon: dashboard
    - url: https://runbooks.internal/payments
      title: Runbook
      icon: docs
spec:
  type: service
  lifecycle: production
  owner: team-payments
  system: checkout-system
  providesApis:
    - payment-api
  consumesApis:
    - inventory-api
    - notification-api
  dependsOn:
    - resource:payments-db
    - component:auth-service

---
apiVersion: backstage.io/v1alpha1
kind: API
metadata:
  name: payment-api
  description: Payment processing REST API
spec:
  type: openapi
  lifecycle: production
  owner: team-payments
  system: checkout-system
  definition:
    $text: ./openapi.yaml
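
A descriptor like the one above can be linted before it reaches the catalog. The following is a minimal Python sketch, assuming the YAML has already been parsed into a dict. validate_component is a hypothetical helper, not a Backstage API, and the name rule reflects Backstage's documented default (1 to 63 characters, alphanumeric runs separated by '-', '_' or '.') as best understood here.

```python
import re

# Assumed default Backstage rule for metadata.name: alphanumeric runs,
# optionally separated by a single '-', '_' or '.', at most 63 characters.
NAME_RE = re.compile(r"^[A-Za-z0-9]+(?:[-_.][A-Za-z0-9]+)*$")

# Spec fields this sketch insists on for a Component entity.
REQUIRED_SPEC_FIELDS = ("type", "lifecycle", "owner")


def validate_component(entity: dict) -> list[str]:
    """Return human-readable problems; an empty list means the entity passes."""
    problems = []
    if entity.get("apiVersion") != "backstage.io/v1alpha1":
        problems.append("apiVersion must be backstage.io/v1alpha1")
    if entity.get("kind") != "Component":
        problems.append("kind must be Component")
    name = entity.get("metadata", {}).get("name", "")
    if not (1 <= len(name) <= 63 and NAME_RE.match(name)):
        problems.append(f"invalid metadata.name: {name!r}")
    spec = entity.get("spec", {})
    for field in REQUIRED_SPEC_FIELDS:
        if not spec.get(field):
            problems.append(f"spec.{field} is required")
    return problems
```

Running a check like this in CI keeps malformed descriptors out of the catalog without waiting for an ingestion error.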

Service Catalog API -- Querying Components

# List all components owned by a team
curl -s "https://backstage.internal/api/catalog/entities?filter=kind=component,spec.owner=team-payments" \
  -H "Authorization: Bearer $BACKSTAGE_TOKEN" | jq '.[] | {name: .metadata.name, lifecycle: .spec.lifecycle}'

# Find all services consuming a specific API
curl -s "https://backstage.internal/api/catalog/entities?filter=kind=component,spec.consumesApis=payment-api" \
  -H "Authorization: Bearer $BACKSTAGE_TOKEN" | jq '.[] | .metadata.name'

# Get component details with relations
curl -s "https://backstage.internal/api/catalog/entities/by-name/component/default/payment-service" \
  -H "Authorization: Bearer $BACKSTAGE_TOKEN" | jq '{
    name: .metadata.name,
    owner: .spec.owner,
    apis: .spec.providesApis,
    dependencies: .spec.dependsOn
  }'
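
The same queries can be issued from a script. This is a rough Python sketch of how the filter parameter is assembled and how the jq projection could be mirrored. build_entities_url and summarize are hypothetical helper names, and the sketch only builds the URL and reshapes a response; it does not perform the HTTP call.

```python
from urllib.parse import urlencode


def build_entities_url(base_url: str, conditions: dict[str, str]) -> str:
    """Build a catalog /entities URL whose conditions are ANDed together."""
    # The curl examples pass one 'filter' parameter made of comma-separated
    # key=value pairs; urlencode percent-escapes the '=' and ',' inside it.
    filter_value = ",".join(f"{key}={value}" for key, value in conditions.items())
    query = urlencode({"filter": filter_value})
    return f"{base_url.rstrip('/')}/api/catalog/entities?{query}"


def summarize(entities: list[dict]) -> list[dict]:
    """Mirror the first jq projection: keep only name and lifecycle."""
    return [
        {"name": e["metadata"]["name"], "lifecycle": e.get("spec", {}).get("lifecycle")}
        for e in entities
    ]
```

Building the query with urlencode avoids quoting mistakes when a filter value contains characters such as '/' or ':'.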

Golden Path Templates

Golden paths are opinionated, pre-configured templates that encode best practices. They give teams a paved road to production.

Backstage Software Template

apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: spring-boot-service
  title: Spring Boot Microservice
  description: Creates a production-ready Spring Boot service with CI/CD, monitoring, and database
  tags:
    - java
    - spring-boot
    - recommended
spec:
  owner: platform-team
  type: service

  parameters:
    - title: Service Details
      required:
        - name
        - owner
        - description
      properties:
        name:
          title: Service Name
          type: string
          pattern: '^[a-z][a-z0-9-]*$'
          ui:autofocus: true
        owner:
          title: Owner Team
          type: string
          ui:field: OwnerPicker
          ui:options:
            catalogFilter:
              kind: Group
        description:
          title: Description
          type: string
        javaVersion:
          title: Java Version
          type: string
          default: '21'
          enum: ['17', '21']

    - title: Infrastructure
      properties:
        database:
          title: Database
          type: string
          default: postgresql
          enum: [postgresql, mysql, none]
        cacheLayer:
          title: Cache Layer
          type: string
          default: none
          enum: [redis, none]
        messageBroker:
          title: Message Broker
          type: string
          default: none
          enum: [kafka, rabbitmq, none]

  steps:
    - id: fetch-template
      name: Fetch Skeleton
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
          description: ${{ parameters.description }}
          javaVersion: ${{ parameters.javaVersion }}
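          database: ${{ parameters.database }}
          # Also pass the remaining Infrastructure choices so the skeleton can use them
          cacheLayer: ${{ parameters.cacheLayer }}
          messageBroker: ${{ parameters.messageBroker }}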

    - id: create-repo
      name: Create Repository
      action: publish:github
      input:
        repoUrl: github.com?owner=acme-corp&repo=${{ parameters.name }}
        description: ${{ parameters.description }}
        defaultBranch: main
        protectDefaultBranch: true
        requireCodeOwnerReviews: true

    - id: register-catalog
      name: Register in Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps['create-repo'].output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml

    - id: create-argocd-app
      name: Create ArgoCD Application
      action: argocd:create-resources
      input:
        appName: ${{ parameters.name }}
        repoUrl: ${{ steps['create-repo'].output.remoteUrl }}

  output:
    links:
      - title: Repository
        url: ${{ steps['create-repo'].output.remoteUrl }}
      - title: Service in Catalog
        # entityRef (rather than url) lets the portal resolve the catalog page
        entityRef: ${{ steps['register-catalog'].output.entityRef }}
      - title: CI/CD Pipeline
        url: ${{ steps['create-repo'].output.remoteUrl }}/actions
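
The parameters section of the template declares required fields, a name pattern, enums, and defaults. As an illustration of those rules outside the portal (for example in a CLI front end), here is a minimal Python sketch. resolve_parameters is a hypothetical helper and is not how Backstage itself validates form input.

```python
import re

# Rules transcribed from the template's parameters section.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9-]*$")
REQUIRED = ("name", "owner", "description")
DEFAULTS = {
    "javaVersion": "21",
    "database": "postgresql",
    "cacheLayer": "none",
    "messageBroker": "none",
}
ENUMS = {
    "javaVersion": {"17", "21"},
    "database": {"postgresql", "mysql", "none"},
    "cacheLayer": {"redis", "none"},
    "messageBroker": {"kafka", "rabbitmq", "none"},
}


def resolve_parameters(submitted: dict) -> dict:
    """Apply defaults, then raise ValueError on the first rule violation."""
    params = {**DEFAULTS, **submitted}
    for field in REQUIRED:
        if not params.get(field):
            raise ValueError(f"missing required parameter: {field}")
    if not NAME_PATTERN.fullmatch(params["name"]):
        raise ValueError(f"name does not match {NAME_PATTERN.pattern}")
    for field, allowed in ENUMS.items():
        if params[field] not in allowed:
            raise ValueError(f"{field} must be one of {sorted(allowed)}")
    return params
```

Keeping the defaults and enums in one place makes it harder for the template and any secondary entry point to drift apart.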

Golden Path Coverage Matrix

| Category | What the Golden Path Provides | Without Golden Path |
|----------|-------------------------------|---------------------|
| Repository | Pre-configured with CI/CD, linting, CODEOWNERS | Manual setup, inconsistent configs |
| CI/CD | Working pipeline from day one | Copy-paste from other repos, broken configs |
| Observability | Dashboards, alerts, SLOs pre-configured | No monitoring until first incident |
| Security | Dependency scanning, SAST, secrets detection | Added retroactively (if at all) |
| Documentation | ADR template, README structure, API docs | Empty README, no docs |
| Infrastructure | Terraform modules, Kubernetes manifests | Hand-crafted YAML, drift between envs |
| Testing | Test framework, coverage gates, fixtures | Ad-hoc test setup, no coverage requirements |

Developer Experience Metrics (SPACE Framework)

Measure platform effectiveness using the SPACE framework. Never rely on a single dimension.

| Dimension | What It Measures | Example Metrics | Collection Method |
|-----------|------------------|-----------------|-------------------|
| Satisfaction | How developers feel about the platform | NPS score, satisfaction survey (1-5) | Quarterly survey |
| Performance | Outcome of developer work | Deployment frequency, change failure rate | DORA metrics pipeline |
| Activity | Volume of actions | Scaffolding requests, API calls, portal visits | Platform telemetry |
| Communication | Quality of collaboration | Time to first response on platform support | Ticketing system |
| Efficiency | Flow and minimal friction | Time from commit to deploy, onboarding time | Pipeline metrics |

Key DevEx Metrics Dashboard

# Platform DevEx Metrics -- collected via platform telemetry
metrics:
  onboarding:
    time_to_first_deploy:
      target: "< 2 hours"
      description: "Time from new hire to first successful deployment"
      source: "scaffolder + pipeline timestamps"

    time_to_first_commit:
      target: "< 4 hours"
      description: "Time from repo creation to first merged commit"
      source: "github events"

  self_service:
    template_adoption_rate:
      target: "> 80%"
      description: "Percentage of new services using golden path templates"
      source: "backstage scaffolder logs"

    self_service_resolution_rate:
      target: "> 70%"
      description: "Percentage of requests resolved without platform team intervention"
      source: "support tickets vs portal actions"

  reliability:
    platform_availability:
      target: "99.9%"
      description: "Uptime of developer portal, CI/CD, and artifact registry"
      source: "synthetic monitoring"

    mean_time_to_recovery:
      target: "< 30 minutes"
      description: "Time to restore platform services after incident"
      source: "incident management system"

  delivery:
    deployment_frequency:
      target: "multiple per day per team"
      description: "How often teams deploy to production"
      source: "deployment pipeline events"

    lead_time_for_changes:
      target: "< 1 day"
      description: "Time from commit to production"
      source: "git + pipeline timestamps"
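
Several of the targets above reduce to simple arithmetic over event timestamps. The following Python sketch assumes the events have already been exported as (commit time, production deploy time) pairs and plain counts. The function names are hypothetical and not tied to any telemetry product.

```python
from datetime import datetime
from statistics import median


def median_lead_time_hours(changes: list[tuple[datetime, datetime]]) -> float:
    """Median time from commit to production deploy, in hours."""
    seconds = [(deployed - committed).total_seconds() for committed, deployed in changes]
    return median(seconds) / 3600


def deployments_per_day(deploy_times: list[datetime], window_days: int) -> float:
    """Average number of production deploys per day over the window."""
    return len(deploy_times) / window_days


def template_adoption_rate(new_services: int, from_templates: int) -> float:
    """Share of new services created through golden path templates."""
    return from_templates / new_services if new_services else 0.0
```

The median is used here rather than the mean so that one unusually slow change does not dominate the lead-time figure; that choice is an assumption, not something the metric definitions prescribe.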

Self-Service Portal Workflow

Request Flow Architecture

Developer submits request via Portal UI
        |
        v
Request Validation (schema check, policy check)
        |
        v
Approval Gate (if required by policy)
        |           |
        | auto      | manual
        v           v
Orchestration Engine (executes workflow)
        |
        +---> Provision Infrastructure (Terraform/Crossplane)
        +---> Configure CI/CD (GitHub Actions / ArgoCD)
        +---> Register in Catalog (Backstage)
        +---> Set Up Monitoring (Grafana / PagerDuty)
        +---> Notify Team (Slack / Email)
        |
        v
Verification (health checks, smoke tests)
        |
        v
Developer notified -- ready to use
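
The flow above can be read as a small pipeline: validate, gate on approval, run the orchestration steps, then verify. Below is a minimal Python sketch under that reading. Request, run_request, and the status strings are hypothetical names chosen for illustration, and the step callables stand in for real provisioning calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Request:
    kind: str
    environment: str
    approved_by: Optional[str] = None  # set once a human has signed off
    log: list = field(default_factory=list)  # names of completed steps


def run_request(
    request: Request,
    validate: Callable[[Request], bool],
    needs_approval: Callable[[Request], bool],
    steps: list,
    verify: Callable[[Request], bool],
) -> str:
    """Walk one request through the stages of the flow and return its status."""
    if not validate(request):
        return "rejected"
    # Manual branch of the approval gate: stop until someone approves.
    if needs_approval(request) and request.approved_by is None:
        return "pending-approval"
    # Orchestration: run each (name, action) step in order and record it.
    for name, action in steps:
        action(request)
        request.log.append(name)
    return "ready" if verify(request) else "failed-verification"
```

Passing the stages in as callables keeps the policy check and the provisioning back ends swappable, which matches the layered design described earlier.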

Self-Service Capability Matrix

| Capability | Automation Level | Approval Required | Typical Time |
|------------|------------------|-------------------|--------------|
| Create new service | Fully automated | No | 5 minutes |
| Provision database | Fully automated | No (dev/staging), Yes (prod) | 10 minutes |
| Add CI/CD pipeline | Fully automated | No | 2 minutes |
| Request cloud credentials | Semi-automated | Yes (security review) | 1 hour |
| Create new environment | Fully automated | No (non-prod), Yes (prod) | 15 minutes |
| Add monitoring/alerts | Fully automated | No | 5 minutes |
| Resize infrastructure | Semi-automated | Yes (cost review > threshold) | 30 minutes |
| Decommission service | Automated with safeguards | Yes (owner confirmation) | 10 minutes |
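
The Approval Required column can be expressed as a table-driven policy check. This Python sketch transcribes the matrix under some assumptions: the capability slugs are hypothetical, and COST_REVIEW_THRESHOLD is an invented placeholder because the matrix does not state a figure.

```python
# Approval rule per capability: "never", "always", "prod" (production only),
# or "cost" (only when the cost increase exceeds a threshold).
APPROVAL_RULES = {
    "create-service": "never",
    "provision-database": "prod",
    "add-pipeline": "never",
    "request-cloud-credentials": "always",
    "create-environment": "prod",
    "add-monitoring": "never",
    "resize-infrastructure": "cost",
    "decommission-service": "always",
}

# Hypothetical monthly cost increase that triggers a review.
COST_REVIEW_THRESHOLD = 500.0


def requires_approval(capability: str, environment: str = "dev", cost_delta: float = 0.0) -> bool:
    """Decide whether a request must wait at the approval gate."""
    rule = APPROVAL_RULES[capability]
    if rule == "prod":
        return environment == "prod"
    if rule == "cost":
        return cost_delta > COST_REVIEW_THRESHOLD
    return rule == "always"
```

In a real platform these rules would more likely live in a policy engine such as OPA; the dictionary here only illustrates the decision logic.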

Platform Engineering Maturity Model

| Level | Name | Characteristics | Capabilities |
|-------|------|-----------------|--------------|
| 0 | Ad Hoc | No platform, tribal knowledge | Teams manage their own infra |
| 1 | Reactive | Shared scripts, wiki docs | Basic CI/CD, manual provisioning |
| 2 | Standardized | Golden paths, basic portal | Service templates, catalog, basic self-service |
| 3 | Optimized | Full IDP, metrics-driven | Self-service everything, DevEx metrics, policy-as-code |
| 4 | Strategic | Platform as product, innovation | API-first platform, marketplace, continuous feedback |

Maturity Assessment Checklist

Level 1 --> Level 2:
  [x] Service catalog exists and is maintained
  [x] At least 3 golden path templates available
  [x] Basic developer portal deployed
  [x] CI/CD standardized across teams

Level 2 --> Level 3:
  [x] Self-service for >80% of common requests
  [x] SPACE metrics collected and reviewed monthly
  [x] Policy-as-code enforced (not advisory)
  [x] Platform team has dedicated product manager
  [x] Internal SLOs defined for platform services

Level 3 --> Level 4:
  [x] API-first platform (all capabilities programmable)
  [x] Internal developer marketplace for plugins/extensions
  [x] Continuous developer experience research program
  [x] Platform economics model (cost attribution per team)
  [x] Platform contributes to organizational strategy
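
One way to score the checklist is to advance a level only when every item for that transition is satisfied, stopping at the first incomplete transition. The Python sketch below assumes that interpretation; maturity_level is a hypothetical helper and the all-or-nothing rule is an assumption, not something the model mandates.

```python
def maturity_level(checklists: dict[int, list[bool]], current: int = 1) -> int:
    """Return the highest level reached.

    checklists maps a target level (2, 3, 4) to the pass/fail state of the
    items gating the move into that level.
    """
    level = current
    # Climb one level at a time; a later level never counts if an earlier
    # transition is incomplete.
    while checklists.get(level + 1) and all(checklists[level + 1]):
        level += 1
    return level
```

For example, an organization that has completed every Level 4 item but still lacks a dedicated product manager (a Level 3 item) would be scored at Level 2 under this rule.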

Anti-Patterns

| Anti-Pattern | Problem | Fix |
|--------------|---------|-----|
| Build it and they will come | No adoption without developer input | Treat platform as product; user research before building |
| Ticket-ops disguised as platform | Self-service portal that just creates tickets | Automate end-to-end; tickets are a smell, not a solution |
| Mandating platform use | Forced adoption breeds resentment and workarounds | Make the golden path the easiest path, not the only path |
| One-size-fits-all templates | Overly rigid templates that don't fit team needs | Composable templates with sensible defaults and escape hatches |
| No feedback loops | Platform team builds in isolation | Regular surveys, office hours, embedded rotations with teams |
| Ignoring developer experience | Technically correct but painful to use | Measure DevEx metrics, optimize for developer happiness |
| Platform team as bottleneck | All changes go through platform team | Self-service with guardrails; teams should not wait on platform |
| Over-abstracting too early | Complex abstraction layers before understanding needs | Start with concrete solutions, abstract when patterns emerge |
| Neglecting documentation | Powerful platform nobody knows how to use | Docs-as-code, TechDocs in Backstage, examples for everything |
| No platform SLOs | Platform reliability treated as best-effort | Define and publish SLOs; platform is a product with SLAs |
| Shadow platforms | Teams build their own tooling around the platform | Understand why and address gaps; shadow platforms reveal unmet needs |
| Gold plating the portal | Spending months on portal UI before delivering value | Ship incrementally; a working CLI beats a beautiful but empty portal |

Platform Engineering Checklist

  • Platform team established with clear product ownership
  • Developer portal deployed (Backstage or equivalent)
  • Service catalog populated with all production services
  • At least 3 golden path templates available and documented
  • Self-service provisioning for common infrastructure (databases, queues, caches)
  • CI/CD pipelines standardized and available via templates
  • Observability stack integrated (dashboards auto-created with new services)
  • Security scanning built into golden paths (not bolted on after)
  • DevEx metrics defined and collected (SPACE framework dimensions)
  • Feedback mechanism active (surveys, office hours, Slack channel)
  • Platform SLOs defined and monitored
  • Documentation maintained in developer portal (TechDocs)
  • Onboarding time measured and optimized (target: first deploy < 2 hours)
  • Cost visibility per team/service available through platform
  • Platform roadmap published and informed by developer feedback
  • Escape hatches documented for when golden paths don't fit
  • Platform reliability meets or exceeds published SLOs