Claude-skill-registry Debugging Issues

Systematically debug issues with reproduction steps, error analysis, hypothesis testing, and root cause fixes. Use when investigating bugs, analyzing production incidents, or troubleshooting unexpected behavior.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/debugging-issues" ~/.claude/skills/majiayu000-claude-skill-registry-debugging-issues && rm -rf "$T"
manifest: skills/data/debugging-issues/SKILL.md
source content

Debugging Issues

Purpose

Provides systematic approaches to debugging, troubleshooting techniques, and error analysis strategies.

When to Use

  • Investigating bugs or unexpected behavior
  • Analyzing error messages and stack traces
  • Troubleshooting system issues
  • Performance debugging
  • Root cause analysis
  • Production incident response

Systematic Debugging Process

1. Reproduce the Issue

Goal: Create a consistent way to trigger the bug

Steps:

  • Document exact steps to reproduce
  • Identify required preconditions
  • Note the environment (OS, browser, versions)
  • Create minimal reproduction case
  • Verify it reproduces consistently

Example:

reproduction_steps:
  - action: "Login as admin user"
  - action: "Navigate to /dashboard"
  - action: "Click 'Export Data' button"
  - expected: "CSV file downloads"
  - actual: "Error 500 appears"
  - frequency: "Occurs every time"

2. Isolate the Problem

Goal: Narrow down where the issue occurs

Techniques:

isolation_methods:
  Divide and Conquer:
    description: "Split system in half, test which half has issue"
    example: "Comment out half the code, see if error persists"

  Binary Search:
    description: "Use git bisect or similar to find breaking commit"
    command: "git bisect start && git bisect bad && git bisect good v1.0"

  Component Isolation:
    description: "Test each component individually"
    example: "Test database, API, frontend separately"

  Environment Comparison:
    description: "Compare working vs broken environments"
    checklist:
      - Different OS?
      - Different versions?
      - Different configurations?
      - Different data?

3. Analyze Logs and Errors

Goal: Gather evidence about what's going wrong

Log Analysis:

log_analysis:
  error_messages:
    - Read the full error message
    - Note the error type/code
    - Identify the failing component

  stack_traces:
    - Start from the bottom (root cause)
    - Identify the first non-library code
    - Check function arguments at that point

  correlation:
    - Check logs before the error
    - Look for patterns
    - Correlate with user actions
    - Check timestamps

Common Error Patterns:

# NullPointerException / AttributeError
# Usually: Accessing property of None/null object
# Fix: Add null checks or ensure object is initialized

# IndexError / ArrayIndexOutOfBoundsException
# Usually: Accessing array index that doesn't exist
# Fix: Check array length before accessing

# KeyError / Property not found
# Usually: Accessing dict/object key that doesn't exist
# Fix: Use .get() with default or check if key exists

# TypeError / Type mismatch
# Usually: Wrong type passed to function
# Fix: Validate types, add type hints

# ConnectionError / Timeout
# Usually: Network issues or service down
# Fix: Add retry logic, check service health

4. Form Hypothesis

Goal: Develop theory about what's causing the issue

Hypothesis Framework:

hypothesis_template:
  observation: "What did you observe?"
  theory: "What do you think is causing it?"
  prediction: "If theory is correct, what else would be true?"
  test: "How can you test this?"

example:
  observation: "API returns 500 error on POST /users"
  theory: "Input validation is rejecting valid email format"
  prediction: "If true, different email format should work"
  test: "Try with various email formats"

5. Test the Hypothesis

Goal: Verify or disprove your theory

Testing Approaches:

testing_methods:
  Add Logging:
    description: "Add detailed logs around suspected area"
    example: |
      logger.debug(f"Input data: {data}")
      logger.debug(f"Validation result: {is_valid}")

  Add Breakpoints:
    description: "Pause execution to inspect state"
    tools:
      - "pdb for Python"
      - "debugger for JavaScript"
      - "gdb for C/C++"

  Change One Thing:
    description: "Modify one variable at a time"
    example: "Change input value, run again, observe result"

  Write Failing Test:
    description: "Create test that reproduces the bug"
    benefit: "Ensures fix works and prevents regression"

6. Implement Fix

Goal: Resolve the root cause

Fix Strategies:

fix_approaches:
  Quick Fix:
    when: "Production is down"
    approach: "Minimal change to restore service"
    followup: "Proper fix later"

  Root Cause Fix:
    when: "Have time to do it right"
    approach: "Fix underlying cause"
    benefit: "Prevents similar bugs"

  Workaround:
    when: "Fix is complex, need temporary solution"
    approach: "Add special handling"
    document: "Explain why workaround exists"

7. Verify the Fix

Goal: Ensure the issue is resolved

Verification Checklist:

  • Original bug is fixed
  • No new bugs introduced
  • All tests pass
  • Edge cases handled
  • Code reviewed
  • Deployed to test environment
  • Tested in production-like environment

Debugging Techniques

Print Debugging

# Simple but effective
def calculate_total(items):
    print(f"DEBUG: items = {items}")
    total = sum(item.price for item in items)
    print(f"DEBUG: total = {total}")
    return total

Interactive Debugging

# Python pdb
import pdb; pdb.set_trace()

# Common commands:
# n (next) - Execute next line
# s (step) - Step into function
# c (continue) - Continue execution
# p variable - Print variable
# l (list) - Show code context
# q (quit) - Exit debugger

Rubber Duck Debugging

rubber_duck_method:
  step_1: "Get a rubber duck (or patient colleague)"
  step_2: "Explain your code line by line"
  step_3: "Explain what you expect to happen"
  step_4: "Explain what actually happens"
  step_5: "Often you'll realize the issue while explaining"

Binary Search Debugging

# Find which commit introduced a bug
git bisect start
git bisect bad  # Current commit is bad
git bisect good v1.0  # v1.0 was working

# Git will checkout commits for you to test
# After each test, mark as good or bad:
git bisect good  # if works
git bisect bad   # if broken

# Git will find the problematic commit

Adding Instrumentation

# Add metrics to understand behavior
import time
from functools import wraps

def timing_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        duration = time.time() - start
        print(f"{func.__name__} took {duration:.2f}s")
        return result
    return wrapper

@timing_decorator
def slow_function():
    # Your code here
    pass

Common Debugging Scenarios

Performance Issues

performance_debugging:
  profile_the_code:
    python: "python -m cProfile script.py"
    node: "node --prof script.js"

  identify_bottlenecks:
    - Look for functions called many times
    - Check for slow database queries
    - Identify memory allocations

  optimize:
    - Cache repeated calculations
    - Use more efficient algorithms
    - Add database indexes
    - Implement pagination

Memory Leaks

memory_leak_debugging:
  detect:
    - Monitor memory usage over time
    - Look for steadily increasing memory
    - Check for unclosed resources

  common_causes:
    - Unclosed file handles
    - Unclosed database connections
    - Event listeners not removed
    - Circular references
    - Large objects not garbage collected

  fix:
    - Use context managers (with statement)
    - Explicitly close connections
    - Remove event listeners
    - Break circular references

Race Conditions

race_condition_debugging:
  symptoms:
    - Intermittent failures
    - Harder to reproduce
    - Timing-dependent

  detection:
    - Add logging with timestamps
    - Use thread/process IDs in logs
    - Add artificial delays to expose timing issues

  solutions:
    - Add proper locking (mutex, semaphore)
    - Use atomic operations
    - Redesign to avoid shared state
    - Use message queues

Database Issues

database_debugging:
  slow_queries:
    identify: "EXPLAIN ANALYZE query"
    solutions:
      - Add indexes
      - Optimize joins
      - Reduce data fetched
      - Use connection pooling

  deadlocks:
    detect: "Check database logs for deadlock errors"
    prevent:
      - Acquire locks in consistent order
      - Keep transactions short
      - Use appropriate isolation levels

  connection_issues:
    symptoms: "Connection refused, timeout errors"
    check:
      - Database is running
      - Connection string correct
      - Firewall/network allows connection
      - Connection pool not exhausted

Error Analysis Patterns

Stack Trace Reading

# Example stack trace
Traceback (most recent call last):
  File "app.py", line 45, in main
    process_user(user_data)
  File "services.py", line 23, in process_user
    validate_email(user_data['email'])
  File "validators.py", line 12, in validate_email
    if '@' not in email:
TypeError: argument of type 'NoneType' is not iterable

# Analysis:
# 1. Error: TypeError at line 12 in validators.py
# 2. Cause: 'email' variable is None
# 3. Origin: Likely user_data['email'] is None from services.py line 23
# 4. Fix: Add None check before validation

Error Messages Interpretation

error_interpretation:
  "Connection refused":
    likely_causes:
      - Service not running
      - Wrong port
      - Firewall blocking

  "Permission denied":
    likely_causes:
      - Insufficient file permissions
      - User lacks required role
      - Protected resource

  "Resource not found":
    likely_causes:
      - Typo in path/URL
      - Resource deleted
      - Wrong environment

  "Timeout":
    likely_causes:
      - Service too slow
      - Network issues
      - Infinite loop
      - Deadlock

Debugging Checklist

Before Starting

  • Can you reproduce the issue?
  • Do you have access to logs?
  • Do you have a test environment?
  • Is there a recent change that might have caused it?

During Debugging

  • Have you isolated the problem area?
  • Have you checked the logs?
  • Have you formed a hypothesis?
  • Have you tested your hypothesis?
  • Are you changing one thing at a time?

Before Closing

  • Is the original issue fixed?
  • Have you written a test for this bug?
  • Have you checked for similar bugs?
  • Have you documented the root cause?
  • Have you shared knowledge with the team?

Production Debugging

Safe Debugging in Production

production_debugging:
  do:
    - Add detailed logging
    - Monitor metrics
    - Use feature flags to isolate issues
    - Take snapshots/backups before changes
    - Have rollback plan ready

  dont:
    - Don't use debugger breakpoints (freezes service)
    - Don't make changes without review
    - Don't restart services unnecessarily
    - Don't expose sensitive data in logs

Incident Response

incident_response:
  immediate:
    - Assess severity
    - Notify stakeholders
    - Start incident log
    - Begin mitigation

  mitigation:
    - Restore service (rollback if needed)
    - Implement workaround
    - Monitor closely

  resolution:
    - Identify root cause
    - Implement proper fix
    - Test thoroughly
    - Deploy fix

  followup:
    - Write postmortem
    - Update runbooks
    - Add monitoring/alerts
    - Share learnings

Tools and Resources

Debugging Tools

tools_by_language:
  python:
    - "pdb - Interactive debugger"
    - "ipdb - Enhanced pdb"
    - "memory_profiler - Memory profiling"
    - "cProfile - Performance profiling"

  javascript:
    - "Chrome DevTools"
    - "Node.js debugger"
    - "VS Code debugger"

  general:
    - "Git bisect - Find breaking commit"
    - "curl - Test APIs"
    - "tcpdump - Network debugging"
    - "strace/dtrace - System call tracing"

Use this skill when debugging issues or conducting root cause analysis