Claude-Skills aws-solution-architect

install
source · Clone the upstream repo
git clone https://github.com/borghei/Claude-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/borghei/Claude-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/engineering/aws-solution-architect" ~/.claude/skills/borghei-claude-skills-aws-solution-architect && rm -rf "$T"
manifest: engineering/aws-solution-architect/SKILL.md

AWS Solution Architect

Design scalable, cost-effective AWS architectures for startups with infrastructure-as-code templates.



Trigger Terms

Use this skill when you encounter:

| Category | Terms |
|---|---|
| Architecture Design | serverless architecture, AWS architecture, cloud design, microservices, three-tier |
| IaC Generation | CloudFormation, CDK, Terraform, infrastructure as code, deploy template |
| Serverless | Lambda, API Gateway, DynamoDB, Step Functions, EventBridge, AppSync |
| Containers | ECS, Fargate, EKS, container orchestration, Docker on AWS |
| Cost Optimization | reduce AWS costs, optimize spending, right-sizing, Savings Plans |
| Database | Aurora, RDS, DynamoDB design, database migration, data modeling |
| Security | IAM policies, VPC design, encryption, Cognito, WAF |
| CI/CD | CodePipeline, CodeBuild, CodeDeploy, GitHub Actions AWS |
| Monitoring | CloudWatch, X-Ray, observability, alarms, dashboards |
| Migration | migrate to AWS, lift and shift, replatform, DMS |

Workflow

Step 1: Gather Requirements

Collect application specifications:

- Application type (web app, mobile backend, data pipeline, SaaS)
- Expected users and requests per second
- Budget constraints (monthly spend limit)
- Team size and AWS experience level
- Compliance requirements (GDPR, HIPAA, SOC 2)
- Availability requirements (SLA, RPO/RTO)

Step 2: Design Architecture

Run the architecture designer to get pattern recommendations:

python scripts/architecture_designer.py --input requirements.json

Select from recommended patterns:

  • Serverless Web: S3 + CloudFront + API Gateway + Lambda + DynamoDB
  • Event-Driven Microservices: EventBridge + Lambda + SQS + Step Functions
  • Three-Tier: ALB + ECS Fargate + Aurora + ElastiCache
  • GraphQL Backend: AppSync + Lambda + DynamoDB + Cognito

See references/architecture_patterns.md for detailed pattern specifications.

Step 3: Generate IaC Templates

Create infrastructure-as-code for the selected pattern:

# Serverless stack (CloudFormation)
python scripts/serverless_stack.py --app-name my-app --region us-east-1

# Output: CloudFormation YAML template ready to deploy

Step 4: Review Costs

Analyze estimated costs and optimization opportunities:

python scripts/cost_optimizer.py --resources current_setup.json --monthly-spend 2000

Output includes:

  • Monthly cost breakdown by service
  • Right-sizing recommendations
  • Savings Plans opportunities
  • Potential monthly savings

Step 5: Deploy

Deploy the generated infrastructure:

# CloudFormation
aws cloudformation create-stack \
  --stack-name my-app-stack \
  --template-body file://template.yaml \
  --capabilities CAPABILITY_IAM

# CDK
cdk deploy

# Terraform
terraform init && terraform apply

Step 6: Validate

Verify deployment and set up monitoring:

# Check stack status
aws cloudformation describe-stacks --stack-name my-app-stack

# Set up CloudWatch alarms
aws cloudwatch put-metric-alarm --alarm-name high-errors ...
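
The stack status check above can be scripted by parsing the JSON that describe-stacks prints. The following is a minimal sketch, assuming the CLI output has been captured as a string; the helper name stack_is_healthy and the sample payload are hypothetical and only illustrate the Stacks[0].StackStatus field.

```python
import json

# Statuses that indicate a finished, successful create or update.
HEALTHY_STATUSES = {"CREATE_COMPLETE", "UPDATE_COMPLETE"}


def stack_is_healthy(describe_stacks_json: str) -> bool:
    """Return True if the first stack in the payload reports a healthy status."""
    payload = json.loads(describe_stacks_json)
    stacks = payload.get("Stacks", [])
    if not stacks:
        return False
    return stacks[0].get("StackStatus") in HEALTHY_STATUSES


# Illustrative payload, trimmed to the fields used here.
sample = '{"Stacks": [{"StackName": "my-app-stack", "StackStatus": "CREATE_COMPLETE"}]}'
print(stack_is_healthy(sample))
```

Any status outside the healthy set (for example a rollback state) returns False, which is a reasonable signal to inspect the stack events before continuing.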

Tools

architecture_designer.py

Generates architecture patterns based on requirements.

python scripts/architecture_designer.py --input requirements.json --output design.json

Input: JSON with app type, scale, budget, compliance needs
Output: Recommended pattern, service stack, cost estimate, pros/cons

serverless_stack.py

Creates serverless CloudFormation templates.

python scripts/serverless_stack.py --app-name my-app --region us-east-1

Output: Production-ready CloudFormation YAML with:

  • API Gateway + Lambda
  • DynamoDB table
  • Cognito user pool
  • IAM roles with least privilege
  • CloudWatch logging

cost_optimizer.py

Analyzes costs and recommends optimizations.

python scripts/cost_optimizer.py --resources inventory.json --monthly-spend 5000

Output: Recommendations for:

  • Idle resource removal
  • Instance right-sizing
  • Reserved capacity purchases
  • Storage tier transitions
  • NAT Gateway alternatives

Quick Start

MVP Architecture (< $100/month)

Ask: "Design a serverless MVP backend for a mobile app with 1000 users"

Result:
- Lambda + API Gateway for API
- DynamoDB pay-per-request for data
- Cognito for authentication
- S3 + CloudFront for static assets
- Estimated: $20-50/month

Scaling Architecture ($500-2000/month)

Ask: "Design a scalable architecture for a SaaS platform with 50k users"

Result:
- ECS Fargate for containerized API
- Aurora Serverless for relational data
- ElastiCache for session caching
- CloudFront for CDN
- CodePipeline for CI/CD
- Multi-AZ deployment

Cost Optimization

Ask: "Optimize my AWS setup to reduce costs by 30%. Current spend: $3000/month"

Provide: Current resource inventory (EC2, RDS, S3, etc.)

Result:
- Idle resource identification
- Right-sizing recommendations
- Savings Plans analysis
- Storage lifecycle policies
- Target savings: $900/month
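
The target figure follows directly from the request: a 30% reduction on $3000/month. A small sketch of that arithmetic, with a hypothetical helper name, is:

```python
def savings_target(monthly_spend: float, reduction_pct: float) -> float:
    """Monthly savings needed to hit a percentage reduction goal."""
    return round(monthly_spend * reduction_pct / 100, 2)


target = savings_target(3000, 30)
print(target)          # 900.0
print(3000 - target)   # 2100.0, the spend after optimization
```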

IaC Generation

Ask: "Generate CloudFormation for a three-tier web app with auto-scaling"

Result:
- VPC with public/private subnets
- ALB with HTTPS
- ECS Fargate with auto-scaling
- Aurora with read replicas
- Security groups and IAM roles

Input Requirements

Provide these details for architecture design:

| Requirement | Description | Example |
|---|---|---|
| Application type | What you're building | SaaS platform, mobile backend |
| Expected scale | Users, requests/sec | 10k users, 100 RPS |
| Budget | Monthly AWS limit | $500/month max |
| Team context | Size, AWS experience | 3 devs, intermediate |
| Compliance | Regulatory needs | HIPAA, GDPR, SOC 2 |
| Availability | Uptime requirements | 99.9% SLA, 1hr RPO |

JSON Format:

{
  "application_type": "saas_platform",
  "expected_users": 10000,
  "requests_per_second": 100,
  "budget_monthly_usd": 500,
  "team_size": 3,
  "aws_experience": "intermediate",
  "compliance": ["SOC2"],
  "availability_sla": "99.9%"
}
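
Before passing this file to the designer, it may help to sanity-check it. The sketch below is not part of the skill's scripts; validate_requirements is a hypothetical helper, and the expected types and allowed application types are taken from the field descriptions in this document.

```python
import json

# Application types listed in the Tool Reference section of this document.
ALLOWED_APP_TYPES = {
    "web_application", "mobile_backend", "data_pipeline",
    "microservices", "saas_platform", "iot_platform",
}

# Expected Python types per field (assumption: budget may be int or float).
EXPECTED_TYPES = {
    "application_type": str,
    "expected_users": int,
    "requests_per_second": int,
    "budget_monthly_usd": (int, float),
    "team_size": int,
    "aws_experience": str,
    "compliance": list,
    "availability_sla": str,
}


def validate_requirements(req: dict) -> list:
    """Return a list of problems; an empty list means the input looks fine."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field in req and not isinstance(req[field], expected):
            problems.append(f"{field}: unexpected type {type(req[field]).__name__}")
    app_type = req.get("application_type")
    if isinstance(app_type, str) and app_type not in ALLOWED_APP_TYPES:
        problems.append(f"application_type: unknown value {app_type!r}")
    return problems


sample = json.loads("""{
  "application_type": "saas_platform",
  "expected_users": 10000,
  "requests_per_second": 100,
  "budget_monthly_usd": 500,
  "team_size": 3,
  "aws_experience": "intermediate",
  "compliance": ["SOC2"],
  "availability_sla": "99.9%"
}""")
print(validate_requirements(sample))
```

Missing fields are not reported as problems here, since the tool reference documents defaults for them.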

Output Formats

Architecture Design

  • Pattern recommendation with rationale
  • Service stack diagram (ASCII)
  • Configuration specifications
  • Monthly cost estimate
  • Scaling characteristics
  • Trade-offs and limitations

IaC Templates

  • CloudFormation YAML: Production-ready SAM/CFN templates
  • CDK TypeScript: Type-safe infrastructure code
  • Terraform HCL: Multi-cloud compatible configs

Cost Analysis

  • Current spend breakdown
  • Optimization recommendations with savings
  • Priority action list (high/medium/low)
  • Implementation checklist

Reference Documentation

| Document | Contents |
|---|---|
| references/architecture_patterns.md | 6 patterns: serverless, microservices, three-tier, data processing, GraphQL, multi-region |
| references/service_selection.md | Decision matrices for compute, database, storage, messaging |
| references/best_practices.md | Serverless design, cost optimization, security hardening, scalability |

Limitations

  • Lambda: 15-minute execution, 10GB memory max
  • API Gateway: 29-second timeout, 10MB payload
  • DynamoDB: 400KB item size limit; reads are eventually consistent by default
  • Regional availability varies by service
  • Some services have AWS-specific lock-in
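
These limits can be checked before a design is committed. The following is a rough sketch, not part of the skill's scripts: check_limits and its parameters are hypothetical, the constants restate the limits listed above, and KB/MB are interpreted as multiples of 1024 bytes.

```python
# Limits restated from the list above.
LAMBDA_MAX_TIMEOUT_S = 15 * 60
LAMBDA_MAX_MEMORY_MB = 10 * 1024
API_GATEWAY_MAX_TIMEOUT_S = 29
API_GATEWAY_MAX_PAYLOAD_BYTES = 10 * 1024 * 1024
DYNAMODB_MAX_ITEM_BYTES = 400 * 1024


def check_limits(lambda_timeout_s: int, lambda_memory_mb: int,
                 behind_api_gateway: bool, payload_bytes: int,
                 item_bytes: int) -> list:
    """Return a list of limit violations for a planned workload."""
    violations = []
    if lambda_timeout_s > LAMBDA_MAX_TIMEOUT_S:
        violations.append("Lambda timeout exceeds 15 minutes")
    if lambda_memory_mb > LAMBDA_MAX_MEMORY_MB:
        violations.append("Lambda memory exceeds 10GB")
    if behind_api_gateway:
        # A synchronous call through API Gateway is cut off at 29 seconds.
        if lambda_timeout_s > API_GATEWAY_MAX_TIMEOUT_S:
            violations.append("Lambda timeout exceeds API Gateway 29-second limit")
        if payload_bytes > API_GATEWAY_MAX_PAYLOAD_BYTES:
            violations.append("Payload exceeds API Gateway 10MB limit")
    if item_bytes > DYNAMODB_MAX_ITEM_BYTES:
        violations.append("DynamoDB item exceeds 400KB")
    return violations


print(check_limits(60, 512, True, 2048, 1024))
```

A Lambda function with a 60-second timeout is valid on its own, but the sketch flags it when it sits behind API Gateway, which is the case the troubleshooting table describes as a 504.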

Troubleshooting

| Problem | Cause | Solution |
|---|---|---|
| Lambda cold starts exceed 500ms | Function package too large or VPC-attached Lambda without provisioned concurrency | Reduce deployment package size, use Lambda layers, enable provisioned concurrency for latency-sensitive endpoints, or move to Fargate for consistent performance |
| CloudFormation stack stuck in ROLLBACK_IN_PROGRESS | Resource creation failed mid-deploy and rollback also failed (e.g., non-empty S3 bucket) | Check CloudFormation events for the root cause, manually delete the blocking resource, then delete the stack; use DeletionPolicy: Retain for stateful resources |
| Monthly AWS bill significantly exceeds estimate | Untagged resources, forgotten dev/staging environments, or NAT Gateway data transfer costs | Enable Cost Explorer, set up AWS Budgets with 50%/80%/100% alerts, run cost_optimizer.py against current inventory, and audit resources with missing tags |
| DynamoDB throttling errors (ProvisionedThroughputExceededException) | Read/write capacity insufficient for traffic spikes, or hot partition key | Switch to on-demand billing mode, redesign partition key for even distribution, or enable DynamoDB Auto Scaling with appropriate min/max settings |
| API Gateway returns 504 Gateway Timeout | Backend Lambda or integration exceeds the 29-second API Gateway limit | Optimize Lambda execution time, offload long tasks to Step Functions or SQS, increase Lambda memory (which also increases CPU), or use asynchronous invocation patterns |
| Cross-region replication lag causes stale reads | DynamoDB Global Tables or Aurora Global Database replication latency under heavy write load | Design for eventual consistency, route reads to the write-primary region for strong consistency, or use conflict resolution strategies documented in references/architecture_patterns.md |
| IAM permission denied errors after deployment | Least-privilege policies missing required actions, or trust policy not updated for new services | Review CloudTrail logs for denied API calls, add the specific missing actions to the IAM policy, and validate with IAM Policy Simulator before deploying |
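
The 50%/80%/100% budget alerts mentioned for the billing problem amount to a simple threshold check. The sketch below only illustrates that logic locally; crossed_thresholds is a hypothetical helper and does not call AWS Budgets.

```python
ALERT_THRESHOLDS_PCT = (50, 80, 100)


def crossed_thresholds(budget_usd: float, spend_usd: float) -> list:
    """Return the alert thresholds (in percent) that current spend has reached."""
    if budget_usd <= 0:
        raise ValueError("budget_usd must be positive")
    used_pct = spend_usd / budget_usd * 100
    return [t for t in ALERT_THRESHOLDS_PCT if used_pct >= t]


print(crossed_thresholds(2000, 1700))  # [50, 80]
```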

Success Criteria

  • Cost accuracy: Monthly AWS bill stays within 10% of the architecture estimate produced by cost_optimizer.py.
  • Availability: Production workloads meet or exceed the target SLA (99.9% uptime for three-tier, 99.95% for multi-region).
  • Recovery time: RTO under 4 hours and RPO under 1 hour for all production architectures with disaster recovery configured.
  • Deployment speed: Infrastructure provisioned from generated IaC templates in under 30 minutes for serverless stacks and under 60 minutes for three-tier stacks.
  • Security posture: Zero critical findings in AWS Security Hub within 30 days of deployment; all resources encrypted at rest and in transit.
  • Scaling response: Auto-scaling responds to traffic spikes within 2 minutes, handling 10x baseline load without manual intervention.
  • Operational overhead: Team spends less than 4 hours per week on infrastructure operations after initial deployment.
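
Two of these criteria are easy to express numerically. The sketch below uses hypothetical helper names and assumes a 30-day month and a symmetric 10% tolerance around the estimate.

```python
def within_cost_tolerance(estimate_usd: float, actual_usd: float,
                          tolerance_pct: float = 10.0) -> bool:
    """True if the actual bill is within tolerance_pct of the estimate."""
    return abs(actual_usd - estimate_usd) <= estimate_usd * tolerance_pct / 100


def allowed_downtime_minutes(sla_pct: float, days: int = 30) -> float:
    """Downtime budget implied by an uptime SLA over the given number of days."""
    return (1 - sla_pct / 100) * days * 24 * 60


print(within_cost_tolerance(500, 540))            # True
print(round(allowed_downtime_minutes(99.9), 1))   # 43.2
print(round(allowed_downtime_minutes(99.95), 1))  # 21.6
```

A 99.9% target leaves roughly 43 minutes of downtime per 30 days, and 99.95% leaves about half of that, which is useful context when choosing between the three-tier and multi-region patterns.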

Scope & Limitations

This skill covers:

  • AWS architecture design for startups and growth-stage companies (serverless, three-tier, microservices, data pipelines, IoT, multi-region patterns)
  • Infrastructure-as-code generation for CloudFormation (SAM), CDK (TypeScript), and Terraform (HCL)
  • Cost analysis, right-sizing recommendations, and Savings Plans evaluation
  • Service selection guidance for compute, database, storage, networking, and security

This skill does NOT cover:

  • Multi-cloud or hybrid-cloud architectures (Azure, GCP) -- see engineering/cloud-migration-specialist/ for cross-cloud strategies
  • Application-level code, business logic, or framework-specific implementation -- see engineering/senior-fullstack/ for fullstack development
  • Compliance audit execution or regulatory evidence collection -- see ra-qm-team/ for SOC 2, HIPAA, GDPR, and ISO compliance skills
  • AWS account management, organization policies, or billing disputes -- see AWS Support or engineering/ms365-tenant-manager/ for tenant administration patterns

Integration Points

| Skill | Integration | Data Flow |
|---|---|---|
| engineering/senior-devops | CI/CD pipeline configuration for deploying generated IaC templates | Architecture templates flow into DevOps deployment pipelines and monitoring setup |
| engineering/senior-secops | Security hardening of generated architectures (IAM policies, WAF rules, GuardDuty) | Architecture design feeds into security review; SecOps findings feed back as architecture constraints |
| ra-qm-team/soc2-compliance | Compliance validation of AWS architectures against SOC 2 Trust Services Criteria | Architecture resource inventory feeds into compliance audit; audit findings drive architecture changes |
| engineering/senior-backend | Backend service implementation that runs on the designed AWS infrastructure | Architecture patterns define the runtime environment; backend requirements inform service selection |
| engineering/tech-stack-evaluator | Technology selection decisions that influence architecture pattern choice | Stack evaluation outputs (database, compute, messaging choices) feed into architecture requirements JSON |
| c-level-advisor/cto-advisor | Strategic infrastructure decisions, build-vs-buy, and cloud budget planning | Cost analysis from cost_optimizer.py informs CTO budget decisions; CTO constraints flow back as architecture requirements |

Tool Reference

architecture_designer.py

Purpose: Generates architecture pattern recommendations based on application requirements. Analyzes app type, expected scale, budget, team experience, and compliance needs to recommend the optimal AWS architecture pattern with full service configurations and cost estimates.

Usage:

from scripts.architecture_designer import ArchitectureDesigner

designer = ArchitectureDesigner(requirements)
pattern = designer.recommend_architecture_pattern()
checklist = designer.generate_service_checklist()

Constructor Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| requirements | dict | Yes | -- | Dictionary containing all application requirements (see fields below) |

Requirements Dictionary Fields:

| Field | Type | Default | Description |
|---|---|---|---|
| application_type | str | "web_application" | One of: web_application, mobile_backend, data_pipeline, microservices, saas_platform, iot_platform |
| expected_users | int | 1000 | Expected number of users (or devices for IoT) |
| requests_per_second | int | 10 | Expected peak requests per second |
| budget_monthly_usd | float | 500 | Maximum monthly AWS budget in USD |
| team_size | int | 3 | Number of engineers on the team |
| aws_experience | str | "beginner" | Team AWS experience level |
| compliance | list | [] | List of compliance frameworks (e.g., ["SOC2", "HIPAA"]) |
| data_size_gb | int | 10 | Expected data volume in GB |

Methods:

| Method | Returns | Description |
|---|---|---|
| recommend_architecture_pattern() | dict | Returns recommended pattern with services, cost estimate, pros/cons, and scaling characteristics |
| generate_service_checklist() | list[dict] | Returns phased implementation checklist (Planning, Foundation, Core Services, Security, Monitoring, CI/CD) |

Example:

from scripts.architecture_designer import ArchitectureDesigner

requirements = {
    "application_type": "saas_platform",
    "expected_users": 10000,
    "requests_per_second": 100,
    "budget_monthly_usd": 500,
    "team_size": 3,
    "aws_experience": "intermediate",
    "compliance": ["SOC2"],
    "data_size_gb": 50
}

designer = ArchitectureDesigner(requirements)
result = designer.recommend_architecture_pattern()
print(result['pattern_name'])       # e.g. "Serverless Web Application"
print(result['estimated_cost'])     # {"monthly_usd": ..., "breakdown": {...}}
print(result['services'])           # Full service stack with configurations

Output Format: Returns a dictionary with keys: pattern_name, description, use_case, services (nested service configurations), estimated_cost (with monthly_usd and breakdown), pros, cons, and scaling_characteristics.

Supported Patterns:

  • Serverless Web Application (< 10k users)
  • Modern Three-Tier Application (10k-100k users)
  • Multi-Region High Availability (100k+ users)
  • Serverless Mobile Backend (mobile app type)
  • Event-Driven Microservices (microservices type)
  • Real-Time Data Pipeline (data pipeline type)
  • IoT Platform (IoT type)
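
Read as a decision rule, the list above suggests that some application types map straight to a pattern while the rest are split by user count. The sketch below is one possible reading, not the actual logic of architecture_designer.py, which may also weigh budget, compliance, and team experience. The function pick_pattern is hypothetical, and treating exactly 100k users as multi-region is an assumption.

```python
# Application types that map directly to a dedicated pattern.
TYPE_PATTERNS = {
    "mobile_backend": "Serverless Mobile Backend",
    "microservices": "Event-Driven Microservices",
    "data_pipeline": "Real-Time Data Pipeline",
    "iot_platform": "IoT Platform",
}


def pick_pattern(application_type: str, expected_users: int) -> str:
    """Pick a pattern name from app type first, then from user count."""
    if application_type in TYPE_PATTERNS:
        return TYPE_PATTERNS[application_type]
    if expected_users < 10_000:
        return "Serverless Web Application"
    if expected_users < 100_000:
        return "Modern Three-Tier Application"
    return "Multi-Region High Availability"


print(pick_pattern("web_application", 2_000))
print(pick_pattern("iot_platform", 500_000))
```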

serverless_stack.py

Purpose: Generates production-ready infrastructure-as-code templates for serverless applications. Produces CloudFormation (SAM), CDK (TypeScript), and Terraform (HCL) configurations with API Gateway, Lambda, DynamoDB, Cognito, IAM roles, and CloudWatch logging preconfigured.

Usage:

from scripts.serverless_stack import ServerlessStackGenerator

generator = ServerlessStackGenerator(app_name, requirements)
cfn_template = generator.generate_cloudformation_template()
cdk_stack = generator.generate_cdk_stack()
terraform_config = generator.generate_terraform_configuration()

Constructor Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| app_name | str | Yes | -- | Application name (used for resource naming; auto-lowercased, spaces replaced with hyphens) |
| requirements | dict | Yes | -- | Dictionary with deployment requirements (see fields below) |
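
The app_name normalization described above (lowercasing and replacing spaces with hyphens) can be pictured as follows. This is a sketch of the stated behavior with a hypothetical function name; the generator's own implementation may handle additional characters.

```python
def normalize_app_name(app_name: str) -> str:
    """Lowercase the name and replace spaces with hyphens."""
    return app_name.lower().replace(" ", "-")


print(normalize_app_name("My SaaS App"))  # my-saas-app
```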

Requirements Dictionary Fields:

| Field | Type | Default | Description |
|---|---|---|---|
| region | str | "us-east-1" | AWS region for deployment |

Methods:

| Method | Returns | Description |
|---|---|---|
| generate_cloudformation_template() | str | YAML CloudFormation/SAM template with DynamoDB, Lambda, API Gateway, Cognito, IAM, and CloudWatch |
| generate_cdk_stack() | str | TypeScript CDK stack with equivalent resources |
| generate_terraform_configuration() | str | Terraform HCL configuration with equivalent resources |

Example:

from scripts.serverless_stack import ServerlessStackGenerator

generator = ServerlessStackGenerator("my-saas-app", {"region": "us-west-2"})

# Generate CloudFormation template
cfn = generator.generate_cloudformation_template()
with open("template.yaml", "w") as f:
    f.write(cfn)

# Generate CDK stack
cdk = generator.generate_cdk_stack()
with open("lib/stack.ts", "w") as f:
    f.write(cdk)

# Generate Terraform config
tf = generator.generate_terraform_configuration()
with open("main.tf", "w") as f:
    f.write(tf)

Output Format: Each method returns a string containing the full IaC template. Templates include: DynamoDB table (single-table design with PK/SK), Lambda function (Node.js 18.x, 512 MB, 10s timeout), API Gateway (REST, Cognito auth, CORS, throttling), Cognito User Pool (email sign-in, optional MFA), IAM roles (least privilege), and CloudWatch log group (7-day retention). All templates output: API URL, User Pool ID, User Pool Client ID, and Table Name.


cost_optimizer.py

Purpose: Analyzes current AWS resource inventory and spending to generate prioritized cost optimization recommendations. Evaluates compute (EC2, Lambda), storage (S3), databases (RDS, DynamoDB), networking (NAT Gateway, VPC endpoints), and general optimizations (CloudWatch Logs, Elastic IPs, budget alerts).

Usage:

from scripts.cost_optimizer import CostOptimizer

optimizer = CostOptimizer(current_resources, monthly_spend)
analysis = optimizer.analyze_and_optimize()
checklist = optimizer.generate_optimization_checklist()

Constructor Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| current_resources | dict | Yes | -- | Dictionary describing current AWS resources (see fields below) |
| monthly_spend | float | Yes | -- | Current monthly AWS spend in USD |

Resources Dictionary Fields:

| Field | Type | Description |
|---|---|---|
| ec2_instances | list[dict] | EC2 instances with cpu_utilization (%), pricing ("on-demand" or "reserved") |
| lambda_functions | list[dict] | Lambda functions with memory_mb, avg_memory_used_mb |
| s3_buckets | list[dict] | S3 buckets with name, size_gb, storage_class, has_lifecycle_policy (bool) |
| rds_instances | list[dict] | RDS instances with name, connections_per_day, monthly_cost, engine, utilization (%) |
| dynamodb_tables | list[dict] | DynamoDB tables with name, billing_mode, read_capacity_units, write_capacity_units, utilization_percentage |
| nat_gateways | list[dict] | NAT Gateway resources |
| multi_az_required | bool | Whether multi-AZ NAT is required |
| vpc_endpoints | list | Existing VPC endpoints |
| s3_data_transfer_gb | float | Monthly S3 data transfer volume in GB |
| cloudwatch_log_groups | list[dict] | Log groups with name, retention_days (-1 for never expire), size_gb |
| elastic_ips | list[dict] | Elastic IPs with attached (bool) |
| has_budget_alerts | bool | Whether AWS Budgets are configured |
| has_cost_explorer | bool | Whether Cost Explorer is enabled |
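
To show how these fields might be read, the sketch below scans an inventory for a few obvious findings. It is not the logic of cost_optimizer.py: find_quick_wins is a hypothetical helper, and the 10% CPU cutoff for "idle" is an assumption chosen for illustration.

```python
IDLE_CPU_THRESHOLD_PCT = 10  # assumption: below this, an instance is treated as idle


def find_quick_wins(resources: dict) -> list:
    """Return short descriptions of likely waste in a resource inventory."""
    findings = []
    for i, inst in enumerate(resources.get("ec2_instances", [])):
        if inst.get("cpu_utilization", 100) < IDLE_CPU_THRESHOLD_PCT:
            findings.append(f"ec2_instances[{i}]: low CPU utilization")
    for i, eip in enumerate(resources.get("elastic_ips", [])):
        if not eip.get("attached", True):
            findings.append(f"elastic_ips[{i}]: not attached")
    for group in resources.get("cloudwatch_log_groups", []):
        # retention_days of -1 means the logs never expire.
        if group.get("retention_days") == -1:
            findings.append(f"log group {group.get('name')}: no retention limit")
    for bucket in resources.get("s3_buckets", []):
        if not bucket.get("has_lifecycle_policy", False):
            findings.append(f"bucket {bucket.get('name')}: no lifecycle policy")
    return findings


inventory = {
    "ec2_instances": [{"cpu_utilization": 5, "pricing": "on-demand"}],
    "elastic_ips": [{"attached": False}],
    "cloudwatch_log_groups": [{"name": "/app/api", "retention_days": -1, "size_gb": 12}],
    "s3_buckets": [{"name": "app-assets", "size_gb": 200, "has_lifecycle_policy": False}],
}
for finding in find_quick_wins(inventory):
    print(finding)
```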

Methods:

| Method | Returns | Description |
|---|---|---|
| analyze_and_optimize() | dict | Full cost analysis with current spend, potential savings, optimized spend, savings percentage, recommendations list, and top 5 priority actions |
| generate_optimization_checklist() | list[dict] | Phased action checklist: Immediate (today), This Week, This Month, Ongoing |

Example:

from scripts.cost_optimizer import CostOptimizer

resources = {
    "ec2_instances": [
        {"cpu_utilization": 5, "pricing": "on-demand"},
        {"cpu_utilization": 65, "pricing": "on-demand"}
    ],
    "s3_buckets": [
        {"name": "app-assets", "size_gb": 200, "storage_class": "STANDARD", "has_lifecycle_policy": False}
    ],
    "nat_gateways": [{"id": "nat-1"}, {"id": "nat-2"}],
    "multi_az_required": False,
    "has_budget_alerts": False,
    "has_cost_explorer": False
}

optimizer = CostOptimizer(resources, monthly_spend=3000)
result = optimizer.analyze_and_optimize()

print(f"Current spend: ${result['current_monthly_spend']}")
print(f"Potential savings: ${result['potential_monthly_savings']}")
print(f"Savings: {result['savings_percentage']}%")
for rec in result['priority_actions']:
    print(f"  [{rec['priority']}] {rec['service']}: {rec['recommendation']}")

Output Format: analyze_and_optimize() returns a dictionary with keys: current_monthly_spend (float), potential_monthly_savings (float), optimized_monthly_spend (float), savings_percentage (float), recommendations (list of dicts with service, type, issue, recommendation, potential_savings, priority), and priority_actions (top 5 high-priority recommendations sorted by savings).
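
The relationship between recommendations and priority_actions can be sketched as a filter followed by a sort. This is an illustration of the description above, not the script's code: top_priority_actions is a hypothetical helper, and it assumes priority is the lowercase string "high" and that sorting is by descending potential_savings.

```python
def top_priority_actions(recommendations: list, limit: int = 5) -> list:
    """Keep high-priority recommendations, largest potential savings first."""
    high = [r for r in recommendations if r.get("priority") == "high"]
    ranked = sorted(high, key=lambda r: r.get("potential_savings", 0), reverse=True)
    return ranked[:limit]


recs = [
    {"service": "EC2", "potential_savings": 120.0, "priority": "high"},
    {"service": "S3", "potential_savings": 40.0, "priority": "medium"},
    {"service": "NAT Gateway", "potential_savings": 300.0, "priority": "high"},
]
for rec in top_priority_actions(recs):
    print(rec["service"], rec["potential_savings"])
```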