Skills-4-SE code-instrumentation-generator

Automatically instruments source code to collect runtime information such as function calls, branch decisions, variable values, and execution traces while preserving original program semantics. Use when users need to: (1) Add logging or tracing to code for debugging, (2) Collect runtime execution data for analysis, (3) Monitor function calls and control flow, (4) Track variable values during execution, (5) Generate execution traces for testing or profiling. Supports Python, Java, JavaScript, and C/C++ with configurable instrumentation levels.

install

source · Clone the upstream repo

git clone https://github.com/ArabelaTso/Skills-4-SE

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/ArabelaTso/Skills-4-SE "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/code-instrumentation-generator" ~/.claude/skills/arabelatso-skills-4-se-code-instrumentation-generator && rm -rf "$T"

manifest: skills/code-instrumentation-generator/SKILL.md

source content

Code Instrumentation Generator

Automatically instrument source code to collect runtime information while preserving program semantics.

Workflow

Follow these steps to instrument code:

1. Analyze the Source Code

Understand the code structure and identify instrumentation points:

Language detection: Identify the programming language
Code structure: Parse functions, classes, branches, loops
Entry/exit points: Locate function boundaries
Control flow: Identify branches (if/else, switch, loops)
Variable scope: Understand variable declarations and usage
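For Python input, the analysis step above can be sketched with the standard `ast` module. This is a minimal illustration, not the skill's actual analyzer: the helper name `find_instrumentation_points` and the tuple layout are assumptions made for this example, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

SOURCE = '''
def check_value(x):
    if x > 0:
        return "positive"
    else:
        return "non-positive"

def calculate_sum(a, b):
    result = a + b
    return result
'''

def find_instrumentation_points(source):
    """Return (kind, detail, line) tuples for candidate probe sites (hypothetical helper)."""
    points = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            points.append(("function", node.name, node.lineno))      # entry points
        elif isinstance(node, (ast.If, ast.While, ast.For)):
            points.append(("branch", type(node).__name__, node.lineno))  # control flow
        elif isinstance(node, ast.Return):
            points.append(("return", "", node.lineno))               # exit points
        elif isinstance(node, ast.Assign):
            targets = ", ".join(ast.unparse(t) for t in node.targets)
            points.append(("assign", targets, node.lineno))          # variable writes
    return sorted(points, key=lambda p: p[2])

points = find_instrumentation_points(SOURCE)
for kind, detail, line in points:
    print(f"line {line}: {kind} {detail}")
```

Other languages would need their own parser; the same idea of walking a syntax tree and collecting function, branch, return, and assignment nodes should carry over.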

2. Determine Instrumentation Strategy

Choose appropriate instrumentation based on requirements:

Instrumentation levels:

Function-level: Entry/exit of functions with parameters and return values
Branch-level: Execution of conditional branches (if/else, switch cases)
Statement-level: Individual statement execution
Variable-level: Variable assignments and value changes

Configuration options:

Enable/disable specific instrumentation types
Filter by function names or file patterns
Set verbosity level
Choose output format (logs, JSON, CSV)
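One way to represent the options listed above is a small configuration object with glob-style filters. The class `InstrumentationConfig` and its field names are hypothetical, chosen only for this sketch; the filters use `fnmatch` from the standard library.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch

@dataclass
class InstrumentationConfig:
    # Enable/disable specific instrumentation types
    functions: bool = True
    branches: bool = True
    variables: bool = False
    # Verbosity level and output format
    verbosity: int = 1
    output_format: str = "text"  # "text", "json", or "csv"
    # Glob patterns that select which functions and files get probes
    include_functions: list = field(default_factory=lambda: ["*"])
    include_files: list = field(default_factory=lambda: ["*.py"])

    def should_instrument(self, file_path, func_name):
        """True when function probes are on and both filters match."""
        file_ok = any(fnmatch(file_path, p) for p in self.include_files)
        func_ok = any(fnmatch(func_name, p) for p in self.include_functions)
        return self.functions and file_ok and func_ok

config = InstrumentationConfig(
    include_functions=["calculate_*"],
    include_files=["src/*.py"],
)
print(config.should_instrument("src/calculator.py", "calculate_sum"))
print(config.should_instrument("src/calculator.py", "check_value"))
print(config.should_instrument("tests/test_calc.py", "calculate_sum"))
```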

3. Insert Instrumentation Code

Add instrumentation hooks at identified points:

Function instrumentation:

Insert entry hook at function start
Capture function name, parameters, timestamp
Insert exit hook before every return and on exceptional exits
Capture return value, execution time
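In Python, the entry and exit hooks described above can be attached without editing the function body by using a decorator. The sketch below is one possible approach under that assumption; the name `trace_calls` and the logger name are invented for the example. It also logs exceptional exits, then re-raises so the caller sees the original exception.

```python
import functools
import logging
import time

logger = logging.getLogger("instrumentation")

def trace_calls(func):
    """Log entry, exit, and elapsed time for func (hypothetical helper)."""
    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        logger.info("ENTER %s(args=%r, kwargs=%r)", func.__name__, args, kwargs)
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception as exc:
            elapsed = time.perf_counter() - start
            logger.info("EXIT %s raised %r [time=%.6fs]", func.__name__, exc, elapsed)
            raise  # propagate unchanged so behavior is preserved
        elapsed = time.perf_counter() - start
        logger.info("EXIT %s -> %r [time=%.6fs]", func.__name__, result, elapsed)
        return result
    return wrapper

@trace_calls
def calculate_sum(a, b):
    return a + b

logging.basicConfig(level=logging.INFO)
print(calculate_sum(2, 3))
```

Note that formatting arguments with `%r` calls their `__repr__`, which can be slow or have side effects for some user-defined types; a stricter probe might log only types or sizes.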

Branch instrumentation:

Insert hooks at branch conditions
Record which branch was taken
Track branch coverage
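A branch probe can wrap the condition itself and return its value unchanged, which leaves the original control flow and return statements intact. The following is a small sketch of that idea; `record_branch`, `branch_hits`, and the branch ID string are hypothetical names for this example.

```python
from collections import Counter

# (branch_id, outcome) -> number of times that outcome was observed
branch_hits = Counter()

def record_branch(branch_id, outcome):
    """Record the outcome and return it unchanged so it can wrap a condition."""
    branch_hits[(branch_id, bool(outcome))] += 1
    return outcome

def check_value(x):
    if record_branch("check_value:if(x > 0)", x > 0):
        return "positive"
    else:
        return "non-positive"

def branch_coverage(branch_ids):
    """Fraction of (branch, outcome) pairs seen at least once."""
    taken = sum(
        1 for b in branch_ids for o in (True, False) if branch_hits[(b, o)] > 0
    )
    return taken / (2 * len(branch_ids))

check_value(5)
print(branch_coverage(["check_value:if(x > 0)"]))
check_value(-1)
print(branch_coverage(["check_value:if(x > 0)"]))
```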

Variable instrumentation:

Insert hooks after variable assignments
Capture variable name and value
Track value changes over time
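For Python, hooks after assignments can be inserted mechanically by rewriting the syntax tree. The sketch below uses `ast.NodeTransformer` to add a call to a probe function after each simple `name = value` assignment. The names `AssignLogger` and `_probe_var` are assumptions for this example, and augmented assignments (`+=`), tuple targets, and attribute targets are deliberately left uninstrumented.

```python
import ast

class AssignLogger(ast.NodeTransformer):
    """Insert a probe call after each simple `name = value` assignment."""

    def visit_Assign(self, node):
        self.generic_visit(node)
        probes = []
        for target in node.targets:
            if isinstance(target, ast.Name):
                call = ast.Expr(value=ast.Call(
                    func=ast.Name(id="_probe_var", ctx=ast.Load()),
                    args=[ast.Constant(value=target.id),
                          ast.Name(id=target.id, ctx=ast.Load())],
                    keywords=[],
                ))
                probes.append(ast.copy_location(call, node))
        # Returning a list replaces the statement with itself plus its probes
        return [node] + probes

trace = []

def _probe_var(name, value):
    """Probe body: record the variable name and its current value."""
    trace.append((name, value))

SOURCE = '''
def running_total(values):
    total = 0
    for v in values:
        total = total + v
    return total
'''

tree = AssignLogger().visit(ast.parse(SOURCE))
ast.fix_missing_locations(tree)
namespace = {"_probe_var": _probe_var}
exec(compile(tree, "<instrumented>", "exec"), namespace)
print(namespace["running_total"]([1, 2, 3]))
print(trace)
```

The trace stores references, not copies, so a mutable value that changes later will appear changed in earlier trace entries too; copying or formatting the value at probe time avoids that at some cost.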

4. Ensure Semantic Preservation

Verify that instrumentation doesn't change program behavior:

No side effects: Instrumentation code doesn't modify program state
Exception safety: Instrumentation handles exceptions properly
Performance: Minimal overhead added
Thread safety: Instrumentation is safe in concurrent code
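One practical check for the properties above is differential testing: run the original and the instrumented version on the same inputs and compare what they return or raise. The sketch below pairs that with a probe wrapper that swallows its own failures. All function names here (`safe_probe`, `outcome`, `behaves_identically`) are hypothetical, and passing this check on a few inputs is evidence of preserved behavior, not proof.

```python
import logging

logger = logging.getLogger("instrumentation")

def safe_probe(message, *args):
    """Emit a probe, but never let a logging failure reach the program."""
    try:
        logger.info(message, *args)
    except Exception:
        pass  # a failing probe is dropped rather than propagated

def original(a, b):
    return a // b

def instrumented(a, b):
    safe_probe("ENTER divide(a=%r, b=%r)", a, b)
    try:
        result = a // b
    except Exception as exc:
        safe_probe("EXIT divide raised %r", exc)
        raise  # the caller still sees the original exception
    safe_probe("EXIT divide -> %r", result)
    return result

def outcome(func, *args):
    """Normalize a call into ('returned', value) or ('raised', exception type name)."""
    try:
        return ("returned", func(*args))
    except Exception as exc:
        return ("raised", type(exc).__name__)

def behaves_identically(f, g, inputs):
    return all(outcome(f, *args) == outcome(g, *args) for args in inputs)

print(behaves_identically(original, instrumented, [(7, 2), (-7, 2), (1, 0)]))
```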

5. Generate Output

Provide instrumented code and documentation:

Instrumented source code: Modified code with instrumentation
Probe description: Documentation of inserted instrumentation points
Configuration file: Settings to enable/disable instrumentation
Usage instructions: How to run and collect data

Language-Specific Patterns

Python

# Original code
def calculate_sum(a, b):
    result = a + b
    return result

# Instrumented code
import logging
logging.basicConfig(level=logging.INFO)

def calculate_sum(a, b):
    # Function entry instrumentation
    logging.info(f"ENTER calculate_sum(a={a}, b={b})")

    result = a + b
    # Variable instrumentation
    logging.info(f"VAR result={result}")

    # Function exit instrumentation
    logging.info(f"EXIT calculate_sum() -> {result}")
    return result

Java

// Original code
public int calculateSum(int a, int b) {
    int result = a + b;
    return result;
}

// Instrumented code
public int calculateSum(int a, int b) {
    // Function entry instrumentation
    System.out.println("ENTER calculateSum(a=" + a + ", b=" + b + ")");

    int result = a + b;
    // Variable instrumentation
    System.out.println("VAR result=" + result);

    // Function exit instrumentation
    System.out.println("EXIT calculateSum() -> " + result);
    return result;
}

JavaScript

// Original code
function calculateSum(a, b) {
    const result = a + b;
    return result;
}

// Instrumented code
function calculateSum(a, b) {
    // Function entry instrumentation
    console.log(`ENTER calculateSum(a=${a}, b=${b})`);

    const result = a + b;
    // Variable instrumentation
    console.log(`VAR result=${result}`);

    // Function exit instrumentation
    console.log(`EXIT calculateSum() -> ${result}`);
    return result;
}

C/C++

// Original code
int calculate_sum(int a, int b) {
    int result = a + b;
    return result;
}

// Instrumented code
#include <stdio.h>

int calculate_sum(int a, int b) {
    // Function entry instrumentation
    printf("ENTER calculate_sum(a=%d, b=%d)\n", a, b);

    int result = a + b;
    // Variable instrumentation
    printf("VAR result=%d\n", result);

    // Function exit instrumentation
    printf("EXIT calculate_sum() -> %d\n", result);
    return result;
}

Branch Instrumentation Example

# Original code
def check_value(x):
    if x > 0:
        return "positive"
    else:
        return "non-positive"

# Instrumented code
import logging
logging.basicConfig(level=logging.INFO)

def check_value(x):
    logging.info(f"ENTER check_value(x={x})")

    # Branch instrumentation
    if x > 0:
        logging.info("BRANCH if(x > 0) -> TRUE")
        result = "positive"
    else:
        logging.info("BRANCH if(x > 0) -> FALSE")
        result = "non-positive"

    logging.info(f"EXIT check_value() -> {result}")
    return result

Configuration-Based Instrumentation

Generate a configuration file to control instrumentation:

# instrumentation_config.py
INSTRUMENTATION_ENABLED = True
INSTRUMENT_FUNCTIONS = True
INSTRUMENT_BRANCHES = True
INSTRUMENT_VARIABLES = False
LOG_LEVEL = "INFO"
OUTPUT_FORMAT = "text"  # or "json", "csv"

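# Instrumented code with configuration
import logging
import instrumentation_config as config

# Apply LOG_LEVEL from the configuration file
logging.basicConfig(level=config.LOG_LEVEL)

def calculate_sum(a, b):
    # INSTRUMENTATION_ENABLED acts as the master switch for every probe
    if config.INSTRUMENTATION_ENABLED and config.INSTRUMENT_FUNCTIONS:
        logging.info(f"ENTER calculate_sum(a={a}, b={b})")

    result = a + b

    if config.INSTRUMENTATION_ENABLED and config.INSTRUMENT_VARIABLES:
        logging.info(f"VAR result={result}")

    if config.INSTRUMENTATION_ENABLED and config.INSTRUMENT_FUNCTIONS:
        logging.info(f"EXIT calculate_sum() -> {result}")

    return result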

Output Format

Probe Description Document

## Instrumentation Report

**File**: calculator.py
**Instrumentation Date**: 2024-02-17
**Configuration**: Function-level + Branch-level

### Instrumented Functions

1. **calculate_sum(a, b)**
   - Entry probe: Line 3
   - Exit probe: Line 8
   - Captures: Parameters (a, b), return value

2. **check_value(x)**
   - Entry probe: Line 11
   - Branch probe: Line 14 (if x > 0)
   - Exit probe: Line 19
   - Captures: Parameter (x), branch decision, return value

### Instrumentation Statistics
- Total functions instrumented: 2
- Total branches instrumented: 1
- Total variables instrumented: 0
- Estimated overhead: <5%

### Usage
Run the instrumented code normally. Instrumentation output will be written to:
- Console (stderr, the default stream for `logging.basicConfig`)
- Log file: instrumentation.log (if configured)

Best Practices

Minimize overhead: Only instrument what's necessary
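Use conditional compilation or runtime flags: Allow disabling instrumentation in production (for example, preprocessor guards in C/C++ or a configuration flag in Python, Java, and JavaScript)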
Handle exceptions: Ensure instrumentation doesn't crash the program
Preserve semantics: Never modify program logic
Thread-safe logging: Use thread-safe logging mechanisms
Structured output: Use consistent format for easy parsing
Timestamp everything: Include timestamps for temporal analysis
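Several of these practices (thread-safe logging, structured output, timestamps) can be combined in Python with a custom `logging.Formatter`. The standard `logging` handlers serialize access with a lock, so concurrent probes do not interleave within a line. The sketch below is illustrative only; `JsonFormatter` and the logger name are made up for this example, and output goes to an in-memory stream so the result can be inspected.

```python
import io
import json
import logging
import threading

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""
    def format(self, record):
        return json.dumps({
            "timestamp": record.created,      # seconds since the epoch
            "thread": record.threadName,
            "level": record.levelname,
            "event": record.getMessage(),
        })

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("instrumentation.structured")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep probe output out of the root logger

def worker(n):
    logger.info("ENTER worker(n=%d)", n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

events = [json.loads(line) for line in stream.getvalue().splitlines()]
print(len(events))
```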

Advanced Features

Selective Instrumentation

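# Only instrument specific functions
import functools
import logging

INSTRUMENTED_FUNCTIONS = {"calculate_sum", "process_data"}

def should_instrument(func_name):
    return func_name in INSTRUMENTED_FUNCTIONS

# Apply instrumentation conditionally
def instrument(func):
    if not should_instrument(func.__name__):
        return func  # Leave unlisted functions untouched

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logging.info(f"ENTER {func.__name__}")
        return func(*args, **kwargs)
    return wrapper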

Performance Monitoring

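import logging
import time

logging.basicConfig(level=logging.INFO)

def calculate_sum(a, b):
    # perf_counter() is monotonic and high-resolution, so it suits interval timing
    start_time = time.perf_counter()
    logging.info(f"ENTER calculate_sum(a={a}, b={b})")

    result = a + b

    elapsed = time.perf_counter() - start_time
    logging.info(f"EXIT calculate_sum() -> {result} [time={elapsed:.6f}s]")
    return result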

JSON Output Format

import json
import time

def calculate_sum(a, b):
    entry_event = {
        "type": "function_entry",
        "function": "calculate_sum",
        "params": {"a": a, "b": b},
        "timestamp": time.time()
    }
    # default=repr keeps non-JSON-serializable arguments from raising TypeError
    print(json.dumps(entry_event, default=repr))

    result = a + b

    exit_event = {
        "type": "function_exit",
        "function": "calculate_sum",
        "return_value": result,
        "timestamp": time.time()
    }
    # default=repr keeps a non-JSON-serializable return value from raising TypeError
    print(json.dumps(exit_event, default=repr))

    return result
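Events emitted one JSON object per line, as above, are straightforward to analyze afterwards. The sketch below pairs entry and exit events with a stack to compute per-call durations. The trace text and its timestamps are made-up sample values, and `call_durations` is a hypothetical helper; it assumes a single-threaded trace in which calls nest properly.

```python
import json

# Made-up sample trace in the event format shown above
TRACE = """\
{"type": "function_entry", "function": "calculate_sum", "params": {"a": 1, "b": 2}, "timestamp": 10.0}
{"type": "function_exit", "function": "calculate_sum", "return_value": 3, "timestamp": 10.5}
"""

def call_durations(lines):
    """Match each exit event with the most recent unmatched entry event."""
    stack, durations = [], []
    for line in lines:
        event = json.loads(line)
        if event["type"] == "function_entry":
            stack.append(event)
        elif event["type"] == "function_exit" and stack:
            entry = stack.pop()
            durations.append(
                (entry["function"], event["timestamp"] - entry["timestamp"])
            )
    return durations

print(call_durations(TRACE.splitlines()))
```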

Constraints

Preserve semantics: Never change program behavior
Minimal overhead: Keep instrumentation lightweight
No side effects: Instrumentation shouldn't modify program state
Exception safety: Handle errors gracefully
Configurable: Allow enabling/disabling instrumentation