Claude-skill-registry Binary Analysis and Reverse Engineering
Systematic approach to analyzing compiled binaries, understanding program behavior, and identifying vulnerabilities without source code access
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/binary-analysis" ~/.claude/skills/majiayu000-claude-skill-registry-binary-analysis-and-reverse-engineering && rm -rf "$T"
skills/data/binary-analysis/SKILL.mdBinary Analysis and Reverse Engineering
Overview
Binary analysis examines compiled executables without access to source code. This skill combines static analysis (disassembly, decompilation) with dynamic analysis (debugging, tracing) to understand program behavior, identify vulnerabilities, and reverse engineer functionality.
Core principle: Combine static and dynamic analysis. Static reveals structure; dynamic reveals behavior.
When to Use
- Analyzing closed-source software for vulnerabilities
- Malware analysis and understanding
- Reverse engineering proprietary protocols
- Understanding third-party libraries or dependencies
- CTF challenges and security research
- Verifying security claims of binary-only software
Analysis Workflow
Phase 1: Initial Assessment
Goal: Understand what you're analyzing and gather basic information.
# File type and architecture file binary # ELF 64-bit LSB executable, x86-64, dynamically linked # Strings (quick insight into functionality) strings binary | less # Dependencies ldd binary # Check for linked libraries # Security features checksec binary # RELRO, Stack Canary, NX, PIE, FORTIFY # Basic metadata readelf -h binary # ELF headers objdump -p binary # Program headers
Phase 2: Static Analysis
Goal: Understand program structure without execution.
Disassembly
# Linear disassembly objdump -d binary > disassembly.txt # Intelligent disassembly with Ghidra # 1. Import binary into Ghidra # 2. Analyze with default options # 3. Review function list, strings, cross-references # IDA Pro (commercial but powerful) # - Advanced decompilation # - Cross-references # - Graph view
Function Analysis
# Using radare2 """ $ r2 binary [0x00001000]> aa # Analyze all [0x00001000]> afl # List functions [0x00001000]> pdf @ main # Disassemble main [0x00001000]> VV @ main # Visual graph mode """ # Common patterns to look for: # - Entry point and main function # - Vulnerable functions (strcpy, gets, sprintf) # - Cryptographic operations # - Network operations (socket, connect, send) # - File operations (fopen, read, write) # - Privilege operations (setuid, system)
Control Flow Analysis
""" Analyze program flow: 1. Identify entry point 2. Follow execution paths 3. Identify decision points (if/else, switch) 4. Map loops and recursion 5. Identify error handling 6. Find return points Key questions: - What are the main code paths? - Where does user input enter? - What validation occurs? - Where are dangerous operations? """
Phase 3: Dynamic Analysis
Goal: Observe actual program behavior during execution.
Debugging with GDB
# Start GDB gdb ./binary # Set breakpoints (gdb) break main (gdb) break *0x401234 # Specific address # Run with arguments (gdb) run arg1 arg2 # Examine registers (gdb) info registers # Examine memory (gdb) x/10x $rsp # 10 hex words at stack pointer (gdb) x/s 0x404000 # String at address # Step through code (gdb) stepi # Step one instruction (gdb) nexti # Step over function call (gdb) continue # Continue to next breakpoint # Display on each step (gdb) display/i $pc # Show current instruction (gdb) display/x $rax # Show RAX register
Enhanced GDB with PEDA/GEF/pwndbg
# Install PEDA git clone https://github.com/longld/peda.git ~/peda echo "source ~/peda/peda.py" >> ~/.gdbinit # Or GEF (recommended) bash -c "$(curl -fsSL https://gef.blah.cat/sh)" # Enhanced features: # - Color coding # - Automatic display of registers, stack, code # - Pattern creation/offset calculation # - ROP gadget search # - Heap analysis
System Call Tracing
# Trace system calls strace ./binary # Trace with details strace -v -s 1024 ./binary # Trace specific syscalls strace -e trace=open,read,write ./binary # Trace library calls ltrace ./binary # Follow child processes strace -f ./binary
Dynamic Instrumentation
# Using Frida for runtime instrumentation import frida import sys def on_message(message, data): print(f"[*] {message}") # Attach to process session = frida.attach("target_process") # JavaScript to inject script_code = """ Interceptor.attach(Module.findExportByName(null, 'strcmp'), { onEnter: function(args) { console.log('[*] strcmp called'); console.log(' arg1: ' + Memory.readUtf8String(args[0])); console.log(' arg2: ' + Memory.readUtf8String(args[1])); }, onLeave: function(retval) { console.log(' return: ' + retval); } }); """ script = session.create_script(script_code) script.on('message', on_message) script.load() sys.stdin.read()
Phase 4: Vulnerability Identification
Common Vulnerability Patterns:
Buffer Overflow
; Look for unsafe string operations call strcpy ; No bounds checking call gets ; Always unsafe call sprintf ; No bounds checking ; Check for: ; - Fixed-size buffers ; - User-controlled input ; - No size validation
Format String
; User input directly to printf mov rdi, [user_input] call printf ; Dangerous if user_input has format specifiers
Use After Free
; Pattern: call free ; Free memory ; ... later ... mov rax, [ptr] ; Use freed pointer
Integer Overflow
; Look for: ; - Size calculations ; - Loop counters ; - Memory allocation sizes imul eax, [count], 8 ; Can overflow call malloc ; Allocates wrong size
Phase 5: Exploit Development
See skills/exploitation/exploit-dev-workflow for detailed exploitation process.
Tool Ecosystem
Disassemblers/Decompilers
Ghidra (Free)
# NSA's reverse engineering suite # Features: # - Decompiler (C-like output) # - Cross-references # - Scripting (Python/Java) # - Collaborative analysis # Download from: https://ghidra-sre.org/
IDA Pro (Commercial)
- Industry standard
- Best-in-class decompiler (Hex-Rays)
- Extensive plugin ecosystem
- IDA Free available with limitations
Binary Ninja (Commercial)
- Modern UI
- Medium-level IL
- Good API for automation
- Active development
radare2 (Free)
# Command-line focused # Steep learning curve but powerful # Basic workflow r2 binary [0x00001000]> aa # Analyze [0x00001000]> afl # Functions [0x00001000]> s main # Seek to main [0x00001000]> pdf # Disassemble [0x00001000]> VV # Visual graph
Debuggers
GDB with Extensions
- PEDA - Python Exploit Development Assistance
- GEF - GDB Enhanced Features
- pwndbg - Exploit development focused
WinDbg (Windows)
- Microsoft's debugger
- Kernel and user-mode debugging
- Essential for Windows analysis
x64dbg (Windows)
- Modern UI
- Plugin support
- Good for malware analysis
Dynamic Analysis
Frida
- Dynamic instrumentation
- JavaScript API
- Cross-platform
- Runtime modification
PIN/DynamoRIO
- Dynamic binary instrumentation frameworks
- Research-grade tools
- Performance analysis
Valgrind
# Memory debugging valgrind --leak-check=full ./binary # Memory profiling valgrind --tool=massif ./binary
Common Analysis Scenarios
Scenario 1: Finding Hardcoded Credentials
# Search strings strings binary | grep -i "password\|secret\|key" # In Ghidra: # 1. Window -> Defined Strings # 2. Search for interesting patterns # 3. Check cross-references to see usage
Scenario 2: Understanding Network Protocol
# 1. Trace network calls strace -e trace=network ./binary # 2. Analyze send/recv calls in disassembly # 3. Set breakpoints on socket operations gdb ./binary (gdb) break send (gdb) break recv # 4. Capture actual packets tcpdump -i lo -w capture.pcap # 5. Analyze protocol structure wireshark capture.pcap
Scenario 3: Bypassing License Check
# 1. Search for error messages strings binary | grep -i "license\|trial\|expired" # 2. Find string references in Ghidra # 3. Understand check logic # 4. Identify bypass point # Options: # - Patch binary (change jump condition) # - Hook function at runtime (Frida) # - Modify return value in debugger
Scenario 4: Extracting Encryption Keys
# Dynamic approach - hook crypto functions """ Interceptor.attach(Module.findExportByName('libcrypto.so', 'AES_set_encrypt_key'), { onEnter: function(args) { console.log('[*] AES key:'); console.log(hexdump(args[0], { length: 32 })); } }); """ # Static approach - look for key initialization # - Search for crypto constants (S-boxes, magic numbers) # - Analyze key derivation functions # - Check for embedded keys
Practical Tips
Naming Conventions
# In disassemblers, rename variables/functions for clarity # Bad: FUN_00401234(local_10, DAT_00404000) # Good: validate_input(user_buffer, key_string) # Document as you analyze # Add comments explaining complex logic # Create structure definitions for data
Cross-Reference Analysis
# Follow data flow: # 1. Find interesting data (strings, constants) # 2. Find references (where it's used) # 3. Understand context of use # 4. Trace back to source # Example: Password validation # "Invalid password" string # -> Used in check_password() # -> Called from login() # -> Gets input from get_user_input()
Identifying Compiler Artifacts
; Stack canary check (GCC) mov rax, fs:0x28 mov [rbp-0x8], rax ; ... function body ... mov rdx, [rbp-0x8] xor rdx, fs:0x28 je .no_corruption call __stack_chk_fail ; C++ name mangling _ZN6Class14memberFunctionEi ; Class1::memberFunction(int)
Anti-Analysis Techniques
Detection:
- Debugger detection (ptrace, IsDebuggerPresent)
- Timing checks
- Code obfuscation
- Anti-disassembly tricks
Countermeasures:
# Patch debugger checks # In GDB: (gdb) break ptrace (gdb) return 0 # Use stealthy debuggers # - ScyllaHide (plugin) # - Custom tools # Deobfuscation # - Symbolic execution (angr) # - Dynamic unpacking # - Pattern matching
Common Pitfalls
| Mistake | Impact | Solution |
|---|---|---|
| Only static analysis | Miss runtime behavior | Combine static + dynamic |
| Not documenting findings | Lose context | Take detailed notes |
| Analyzing without goal | Waste time | Define specific objectives |
| Ignoring cross-references | Miss important connections | Follow all references |
| Not checking compiler version | Misinterpret artifacts | Identify compiler/flags used |
Integration with Other Skills
- skills/analysis/zero-day-hunting - Finding vulnerabilities in binaries
- skills/exploitation/exploit-dev-workflow - Exploiting discovered flaws
- skills/analysis/static-vuln-analysis - Source code analysis if available
Legal and Ethical Considerations
Authorization Required:
- Only analyze authorized software
- Respect license agreements
- Don't distribute cracked software
- Follow responsible disclosure
Legitimate Use Cases:
- Security research with permission
- Malware analysis for defense
- Own software assessment
- Educational purposes with legal samples
Success Metrics
- Understanding program functionality
- Identifying security vulnerabilities
- Extracting useful intelligence
- Creating working exploits (if authorized)
- Comprehensive documentation
References and Further Reading
- "Practical Reverse Engineering" by Dang et al.
- "The IDA Pro Book" by Chris Eagle
- "Practical Binary Analysis" by Dennis Andriesse
- Ghidra documentation and training materials
- Malware analysis books (for dynamic analysis techniques)
- Assembly language references (Intel manuals, x86-64 ABI)
- CTF write-ups for practical examples