Claude-skill-registry log-analysis
Analyze application and system logs to identify errors, patterns, and root causes. Use log aggregation tools and structured logging for effective debugging.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/log-analysis" ~/.claude/skills/majiayu000-claude-skill-registry-log-analysis && rm -rf "$T"
manifest: skills/data/log-analysis/SKILL.md
Log Analysis
Overview
Logs are critical for debugging and monitoring. Effective log analysis quickly identifies issues and enables root cause analysis.
When to Use
- Troubleshooting errors
- Performance investigation
- Security incident analysis
- Auditing user actions
- Monitoring application health
Instructions
1. Structured Logging
// Good: Structured logs (machine-readable)
logger.info({
  level: 'INFO',
  timestamp: '2024-01-15T10:30:00Z',
  service: 'auth-service',
  user_id: '12345',
  action: 'user_login',
  status: 'success',
  duration_ms: 150,
  ip_address: '192.168.1.1'
});

// Bad: Unstructured logs (hard to parse)
console.log('User 12345 logged in successfully in 150ms from 192.168.1.1');

// JSON Format (Elasticsearch friendly)
{
  "@timestamp": "2024-01-15T10:30:00Z",
  "level": "ERROR",
  "service": "api-gateway",
  "trace_id": "abc123",
  "message": "Database connection failed",
  "error": {
    "type": "ConnectionError",
    "code": "ECONNREFUSED"
  },
  "context": {
    "database": "users",
    "operation": "SELECT"
  }
}
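The structured format above can be produced by a very small wrapper. The following is a minimal sketch, not a recommendation of a specific library: `createLogger` and its `write` parameter are hypothetical names introduced here for illustration, and the field names simply mirror the JSON example.

```javascript
// Minimal structured logger sketch: emits one JSON object per line.
// `createLogger`, `service`, and `write` are hypothetical names, not a real library API.
function createLogger(service, write = (line) => process.stdout.write(line + '\n')) {
  const emit = (level, message, fields = {}) => {
    // Caller-supplied fields go first so they cannot overwrite the core fields.
    const entry = {
      ...fields,
      '@timestamp': new Date().toISOString(),
      level,
      service,
      message,
    };
    write(JSON.stringify(entry));
    return entry;
  };
  return {
    info: (message, fields) => emit('INFO', message, fields),
    warn: (message, fields) => emit('WARN', message, fields),
    error: (message, fields) => emit('ERROR', message, fields),
  };
}

const logger = createLogger('auth-service');
logger.info('user_login', { user_id: '12345', status: 'success', duration_ms: 150 });
```

Passing `write` as a parameter keeps the sketch testable and lets the same logger target stdout, a file, or a log shipper.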
2. Log Levels & Patterns
Log Levels:

DEBUG: Detailed diagnostic info
  - Variable values
  - Function entry/exit
  - Intermediate calculations
  - Use: Development only

INFO: General informational messages
  - Startup/shutdown
  - User actions
  - Configuration changes
  - Use: Production (normal operations)

WARN: Warning messages (potential issues)
  - Deprecated API usage
  - Performance degradation
  - Resource limits approaching
  - Use: Production (investigate soon)

ERROR: Error conditions
  - Failed operations
  - Exceptions
  - Failed requests
  - Use: Production (action required)

FATAL/CRITICAL: System unusable
  - Critical failures
  - Out of memory
  - Data corruption
  - Use: Production (immediate action)

---

Log Patterns:

Request Logging:
  - Request ID (trace_id)
  - Method + Path
  - Status code
  - Duration
  - Request size / response size

Error Logging:
  - Error type/code
  - Error message
  - Stack trace
  - Context (user_id, session_id)
  - Timestamp

Business Events:
  - Event type
  - User involved
  - Impact/importance
  - Timestamp
  - Relevant context
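One common way to apply the levels above is a severity threshold: each level gets a numeric rank, and anything below the configured minimum is dropped. The sketch below assumes that approach; the numeric values, the `shouldLog` helper, and the use of `NODE_ENV` to pick the threshold are illustrative assumptions rather than part of any particular logging library.

```javascript
// Severity ranks for the levels listed above (the numbers themselves are arbitrary).
const LEVELS = new Map([
  ['DEBUG', 10],
  ['INFO', 20],
  ['WARN', 30],
  ['ERROR', 40],
  ['FATAL', 50],
]);

// Returns true when `level` is at or above the configured minimum level.
// `shouldLog` is a hypothetical helper name.
function shouldLog(level, minLevel) {
  if (!LEVELS.has(level) || !LEVELS.has(minLevel)) {
    throw new Error(`Unknown log level: ${level} / ${minLevel}`);
  }
  return LEVELS.get(level) >= LEVELS.get(minLevel);
}

// Assumed convention: DEBUG output only outside production.
const minLevel = process.env.NODE_ENV === 'production' ? 'INFO' : 'DEBUG';
console.log(`DEBUG enabled: ${shouldLog('DEBUG', minLevel)}`);
```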
3. Log Analysis Tools
Log Aggregation:

ELK Stack (Elasticsearch, Logstash, Kibana):
  - Logstash: Parse and process logs
  - Elasticsearch: Search and analyze
  - Kibana: Visualization and dashboards
  - Use: Large scale, complex queries

Splunk:
  - Comprehensive log management
  - Real-time search and analysis
  - Dashboards and alerts
  - Use: Enterprise (expensive)

CloudWatch (AWS):
  - Integrated with AWS services
  - Log Insights for querying
  - Dashboards
  - Use: AWS-based systems

Datadog:
  - Application performance monitoring
  - Log management
  - Real-time alerts
  - Use: SaaS monitoring

---

Log Analysis Techniques:

Grep/Awk:
  grep "ERROR" app.log
  awk '{print $1, $4}' app.log

Filtering:
  - Filter by timestamp
  - Filter by service
  - Filter by error type
  - Filter by user

Searching:
  - Search for error patterns
  - Search for user actions
  - Search trace IDs
  - Search IP addresses

Aggregation:
  - Count occurrences
  - Group by error type
  - Calculate duration percentiles
  - Rate of errors over time
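For small files, the filtering and aggregation techniques above can be tried without any aggregation platform. The following sketch assumes logs are stored as one JSON object per line with the field names used earlier (`level`, `error.type`, `duration_ms`); `parseLines`, `countBy`, and `percentile` are hypothetical helpers, and the percentile uses the nearest-rank method, which is only one of several definitions.

```javascript
// Parse newline-delimited JSON, silently skipping blank and non-JSON lines.
function parseLines(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .flatMap((line) => {
      try {
        return [JSON.parse(line)];
      } catch {
        return []; // not valid JSON: ignore the line
      }
    });
}

// Count entries grouped by the key that `keyFn` returns.
function countBy(entries, keyFn) {
  const counts = {};
  for (const entry of entries) {
    const key = keyFn(entry);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Nearest-rank percentile of a numeric array (p between 0 and 100).
function percentile(values, p) {
  if (values.length === 0) return undefined;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Illustrative sample data, not real log output.
const sample = [
  '{"level":"ERROR","error":{"type":"ConnectionError"},"duration_ms":1200}',
  '{"level":"INFO","duration_ms":150}',
  'this line is not JSON',
  '{"level":"ERROR","error":{"type":"ConnectionError"},"duration_ms":900}',
].join('\n');

const entries = parseLines(sample);
const errors = entries.filter((e) => e.level === 'ERROR');
console.log(countBy(errors, (e) => e.error.type));
console.log(percentile(entries.map((e) => e.duration_ms), 95));
```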
4. Common Log Analysis Queries
Find errors in past hour:
  timestamp: last_1h AND level: ERROR

Track user activity:
  user_id: 12345 AND action: *

Find slow requests:
  duration_ms: >1000 AND level: INFO

Analyze error rate by service:
  level: ERROR | stats count by service

Find failed database operations:
  error.type: "DatabaseError" | stats count

Trace request flow:
  trace_id: "abc123" | sort by timestamp

---

Checklist:

[ ] Structured logging implemented
[ ] All errors logged with context
[ ] Request IDs/trace IDs used
[ ] Sensitive data not logged (passwords, tokens)
[ ] Log levels used appropriately
[ ] Log retention policy set
[ ] Log sampling for high-volume events
[ ] Alerts configured for errors
[ ] Dashboards created
[ ] Regular log review scheduled
[ ] Log analysis tools accessible
[ ] Team trained on querying logs
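To make the intent of the queries above concrete without tying them to one tool's query language, here is a sketch of four of them as plain functions over an in-memory array of parsed log entries. The function names and the sample entries are assumptions made for illustration; real deployments would run the equivalent query in their aggregation tool.

```javascript
// Illustrative entries using the field names from the JSON example.
const logs = [
  { '@timestamp': '2024-01-15T10:30:02Z', level: 'ERROR', service: 'api-gateway', trace_id: 'abc123', duration_ms: 1500 },
  { '@timestamp': '2024-01-15T10:30:00Z', level: 'INFO', service: 'auth-service', trace_id: 'abc123', duration_ms: 150 },
  { '@timestamp': '2024-01-15T10:30:01Z', level: 'INFO', service: 'api-gateway', trace_id: 'abc123', duration_ms: 20 },
  { '@timestamp': '2024-01-15T09:00:00Z', level: 'ERROR', service: 'auth-service', trace_id: 'def456', duration_ms: 30 },
  { '@timestamp': '2024-01-15T10:45:00Z', level: 'ERROR', service: 'api-gateway', trace_id: 'ghi789', duration_ms: 2200 },
];

// timestamp: last_1h AND level: ERROR
function errorsInLastHour(entries, now = Date.now()) {
  const cutoff = now - 60 * 60 * 1000;
  return entries.filter((e) => {
    const ts = Date.parse(e['@timestamp']);
    return e.level === 'ERROR' && ts >= cutoff && ts <= now;
  });
}

// duration_ms: >1000
function slowRequests(entries, thresholdMs = 1000) {
  return entries.filter((e) => e.duration_ms > thresholdMs);
}

// level: ERROR | stats count by service
function errorCountByService(entries) {
  const counts = new Map();
  for (const e of entries) {
    if (e.level !== 'ERROR') continue;
    counts.set(e.service, (counts.get(e.service) || 0) + 1);
  }
  return Object.fromEntries(counts);
}

// trace_id: "abc123" | sort by timestamp
function traceFlow(entries, traceId) {
  return entries
    .filter((e) => e.trace_id === traceId)
    .sort((a, b) => Date.parse(a['@timestamp']) - Date.parse(b['@timestamp']));
}

console.log(errorCountByService(logs));
console.log(traceFlow(logs, 'abc123').map((e) => e.service));
```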
Key Points
- Use structured JSON logging
- Include trace IDs for request tracking
- Use appropriate log levels (DEBUG/INFO/WARN/ERROR/FATAL)
- Never log sensitive data (passwords, tokens)
- Aggregate logs centrally
- Create dashboards for key metrics
- Alert on error rates and critical issues
- Set and follow a log retention policy
- Search logs by trace ID for troubleshooting
- Review logs regularly for patterns
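The point about never logging sensitive data can be enforced in code instead of relying on reviewers to catch it. The sketch below assumes log fields are plain JSON-like objects and masks values by key name before they are written; the key list and the `redact` helper are illustrative assumptions and would need to be adapted to the fields an application actually uses.

```javascript
// Keys whose values should never reach the logs (assumed list, compared case-insensitively).
const SENSITIVE_KEYS = new Set(['password', 'token', 'authorization', 'secret']);

// Returns a deep copy of `value` with sensitive fields masked.
// Handles plain objects, arrays, and primitives only.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : redact(inner);
    }
    return out;
  }
  return value;
}

console.log(JSON.stringify(redact({ user_id: '12345', password: 'hunter2' })));
```

Because `redact` returns a copy, the original object stays usable by the application after the log call.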