Performs firmware image extraction and analysis using binwalk to identify embedded filesystems, compressed archives, ...
git clone https://github.com/mukul975/Anthropic-Cybersecurity-Skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/mukul975/Anthropic-Cybersecurity-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/performing-firmware-extraction-with-binwalk" ~/.claude/skills/mukul975-anthropic-cybersecurity-skills-performing-firmware-extraction-with-binw && rm -rf "$T"
skills/performing-firmware-extraction-with-binwalk/SKILL.md
Performing Firmware Extraction with Binwalk
When to Use
- Analyzing IoT device firmware downloaded from vendor sites or extracted from flash chips
- Reverse engineering router, camera, or embedded device firmware for vulnerability research
- Identifying embedded filesystems (SquashFS, CramFS, JFFS2, UBIFS) within firmware blobs
- Detecting encrypted or compressed regions using entropy analysis
- Extracting hardcoded credentials, API keys, certificates, or configuration files from firmware
- Performing security assessments of embedded devices in authorized penetration tests
Do not use for analyzing standard desktop application binaries or malware samples that are not firmware images; use dedicated malware analysis tools instead.
Prerequisites
- binwalk v3.x installed (`pip install binwalk3` or from system package manager)
- Python 3.8+ with standard libraries (struct, math, hashlib, subprocess)
- SquashFS tools (`unsquashfs`) for extracting SquashFS filesystems
- Jefferson for JFFS2 filesystem extraction (`pip install jefferson`)
- Sasquatch for non-standard SquashFS variants used by vendors like TP-Link and D-Link
- `strings` utility (GNU binutils) for string extraction
- Optional: firmware-mod-kit for repacking modified firmware images
Workflow
Step 1: Initial Firmware Reconnaissance
Perform a signature scan to identify embedded file types and their offsets:
    # Basic signature scan - identify all recognized file types
    binwalk firmware.bin

    # Scan with verbose output showing confidence levels
    binwalk -v firmware.bin

    # Scan for specific file types only (repeat -y once per signature)
    binwalk -y squashfs firmware.bin
    binwalk -y gzip -y lzma -y xz firmware.bin

    # Opcode scan to identify CPU architecture
    # (-A and -R below are binwalk 2.x options; 3.x releases may not accept them)
    binwalk -A firmware.bin

    # Scan for raw byte sequences to find version info, URLs, credentials
    binwalk -R "password" firmware.bin
    binwalk -R "http://" firmware.bin
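To make the idea behind a signature scan concrete, the following is a minimal Python sketch that searches a blob for a handful of well-known magic values. It is not how binwalk is implemented; the `MAGIC_SIGNATURES` table and the `scan_signatures` helper are illustrative names, and a real scanner validates the structure that follows each match to weed out false positives.

```python
from typing import Iterator, Tuple

# A few well-known magic values. This table is illustrative only; binwalk
# recognizes far more formats and validates each candidate match.
MAGIC_SIGNATURES = {
    "squashfs (little endian)": b"hsqs",
    "squashfs (big endian)": b"sqsh",
    "gzip": b"\x1f\x8b\x08",
    "xz": b"\xfd7zXZ\x00",
    "uImage header": b"\x27\x05\x19\x56",
}


def scan_signatures(blob: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (offset, name) for every magic match, ordered by offset."""
    hits = []
    for name, magic in MAGIC_SIGNATURES.items():
        start = blob.find(magic)
        while start != -1:
            hits.append((start, name))
            start = blob.find(magic, start + 1)
    yield from sorted(hits)


# Synthetic "firmware": padding, a gzip header, more padding, a SquashFS magic.
sample = b"\xff" * 16 + b"\x1f\x8b\x08\x00" + b"\x00" * 12 + b"hsqs" + b"\x00" * 8
for offset, name in scan_signatures(sample):
    print(f"0x{offset:08X}  {name}")
```

Short magic values such as the 3-byte gzip prefix match by chance in large images, which is one reason reported offsets should be confirmed before carving.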
Step 2: Entropy Analysis
Analyze entropy to identify encrypted, compressed, and plaintext regions:
    # Generate entropy plot
    binwalk -E firmware.bin

    # Entropy with a specific block size for higher resolution (-K is a binwalk 2.x option)
    binwalk -E -K 256 firmware.bin

    # Combined entropy and signature scan (-B is a binwalk 2.x option)
    binwalk -BE firmware.bin
Interpreting entropy values (bits per byte on a 0 to 8 scale; the ranges are rules of thumb, not hard thresholds):
- 0.0 - 1.0: Empty or padding regions (null bytes, 0xFF fill)
- 1.0 - 5.0: Plaintext data, code, ASCII strings, configuration
- 5.0 - 7.0: Compressed data (gzip, LZMA, zlib)
- 7.0 - 7.99: Strongly compressed or encrypted data
- ~8.0: Maximum entropy, likely encrypted or random data
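The values above are Shannon entropy in bits per byte. As a rough sketch of how such numbers can be computed with only the standard library, the helpers below (`shannon_entropy`, `block_entropies`, and `classify` are hypothetical names, not binwalk APIs) measure fixed-size blocks and map each result onto the ranges listed above.

```python
import math
from collections import Counter
from typing import Iterator, Tuple


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a byte block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    entropy = -sum((n / total) * math.log2(n / total) for n in Counter(block).values())
    return max(0.0, entropy)  # normalizes -0.0 for single-valued blocks


def block_entropies(blob: bytes, block_size: int = 256) -> Iterator[Tuple[int, float]]:
    """Yield (offset, entropy) for each consecutive block of the blob."""
    for offset in range(0, len(blob), block_size):
        yield offset, shannon_entropy(blob[offset:offset + block_size])


def classify(entropy: float) -> str:
    """Map an entropy value onto the rule-of-thumb ranges listed above."""
    if entropy < 1.0:
        return "empty or padding"
    if entropy < 5.0:
        return "plaintext, code, or configuration"
    if entropy < 7.0:
        return "compressed"
    if entropy < 7.99:
        return "strongly compressed or encrypted"
    return "likely encrypted or random"


# Synthetic sample: one block of 0xFF fill followed by one block of all byte values.
sample = b"\xff" * 256 + bytes(range(256))
for offset, value in block_entropies(sample):
    print(f"0x{offset:06X}  {value:.2f}  {classify(value)}")
```

Smaller blocks give finer resolution but noisier values, since a block of N bytes can never exceed log2(N) bits per byte.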
Step 3: Extract Embedded Files
Extract all identified components from the firmware image:
    # Automatic extraction of known file types
    binwalk -e firmware.bin

    # Recursive extraction (matryoshka mode) for nested archives
    binwalk -Me firmware.bin

    # Recursive extraction with depth limit (-d is a binwalk 2.x option)
    binwalk -Me -d 5 firmware.bin

    # Extract specific file type with custom handler (-D is a binwalk 2.x option)
    binwalk -D "squashfs filesystem:squashfs:unsquashfs %e" firmware.bin

    # Manual extraction of data at a known offset
    # (skip and count are in bytes because bs=1; 327680 = 0x50000)
    dd if=firmware.bin of=extracted.squashfs bs=1 skip=327680 count=4194304
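The manual `dd` extraction can also be done from Python when the offset and length come out of a script. The sketch below assumes a hypothetical `carve` helper; it seeks to the offset and copies the requested number of bytes in chunks, which avoids the one-byte-at-a-time reads that `bs=1` implies. The file names and payload in the demo are synthetic.

```python
import os
import tempfile


def carve(src_path: str, dst_path: str, offset: int, length: int,
          chunk_size: int = 1 << 20) -> int:
    """Copy `length` bytes starting at `offset` from src_path into dst_path.

    Returns the number of bytes written, which is smaller than `length`
    when the source file ends early.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)
        while written < length:
            chunk = src.read(min(chunk_size, length - written))
            if not chunk:  # reached end of file
                break
            dst.write(chunk)
            written += len(chunk)
    return written


# Demo on a synthetic image: 32 bytes of padding, a 12-byte payload, a trailer.
with tempfile.TemporaryDirectory() as workdir:
    image = os.path.join(workdir, "firmware.bin")
    output = os.path.join(workdir, "extracted.bin")
    with open(image, "wb") as fh:
        fh.write(b"\xff" * 32 + b"hsqs-payload" + b"\x00" * 8)
    carved_size = carve(image, output, offset=32, length=12)
    with open(output, "rb") as fh:
        carved = fh.read()

print(carved_size, carved)
```

For the command shown above, the equivalent call would be `carve("firmware.bin", "extracted.squashfs", 327680, 4194304)`.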
Step 4: Mount and Inspect Extracted Filesystems
Mount extracted filesystems for deep inspection:
    # Unpack SquashFS filesystem (unsquashfs creates the destination itself;
    # some releases refuse to unpack into a directory that already exists)
    unsquashfs -d /tmp/squashfs_root extracted.squashfs

    # Mount CramFS filesystem (loop mounts require root)
    mkdir -p /tmp/cramfs_root
    sudo mount -t cramfs -o loop extracted.cramfs /tmp/cramfs_root

    # Extract JFFS2 filesystem
    jefferson extracted.jffs2 -d /tmp/jffs2_root

    # Inspect the extracted filesystem
    ls -la /tmp/squashfs_root/
    find /tmp/squashfs_root -name "*.conf" -o -name "*.cfg" -o -name "*.key"
    find /tmp/squashfs_root -name "passwd" -o -name "shadow"
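When the inspection needs to feed later steps, the `find` searches can be mirrored in Python. This is a small sketch, assuming a hypothetical `find_interesting` helper and the same name patterns as the commands above; the demo builds a throwaway directory tree in place of a real extracted root filesystem.

```python
import os
import tempfile
from typing import Iterator

INTERESTING_SUFFIXES = (".conf", ".cfg", ".key")
INTERESTING_NAMES = {"passwd", "shadow"}


def find_interesting(root: str) -> Iterator[str]:
    """Yield paths under root whose file name matches the lists above."""
    # os.walk does not descend into symlinked directories by default, which
    # keeps the walk inside the extracted tree.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name in INTERESTING_NAMES or name.endswith(INTERESTING_SUFFIXES):
                yield os.path.join(dirpath, name)


# Demo on a temporary tree standing in for an extracted root filesystem.
with tempfile.TemporaryDirectory() as root:
    for rel in ("etc/passwd", "etc/httpd.conf", "etc/ssl/server.key", "bin/busybox"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    found = sorted(
        os.path.relpath(p, root).replace(os.sep, "/") for p in find_interesting(root)
    )

print(found)
```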
Step 5: String Analysis and Credential Discovery
Search extracted filesystem and raw firmware for sensitive data:
    # Extract all printable strings
    strings -a firmware.bin > all_strings.txt
    strings -n 12 firmware.bin | sort -u > long_strings.txt

    # Search for credentials and secrets
    grep -rni "password\|passwd\|secret\|api_key\|token" /tmp/squashfs_root/etc/
    grep -rni "BEGIN.*PRIVATE KEY" /tmp/squashfs_root/

    # Find hardcoded URLs and endpoints
    grep -rnoE "https?://[a-zA-Z0-9./?=_-]+" /tmp/squashfs_root/

    # Search for certificate files
    find /tmp/squashfs_root -name "*.pem" -o -name "*.crt" -o -name "*.key" -o -name "*.p12"

    # Identify busybox and service versions
    strings /tmp/squashfs_root/bin/busybox | grep "BusyBox v"
    cat /tmp/squashfs_root/etc/banner 2>/dev/null
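The same search can be sketched in Python for cases where results need to be post-processed. The helpers `extract_strings` and `find_secrets` are hypothetical names; the first roughly approximates `strings` (runs of printable ASCII of a minimum length), and the patterns reuse the expressions from the grep commands above. The sample bytes, including the credential and URL, are made up for the demo.

```python
import re
from typing import List, Tuple


def extract_strings(blob: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(blob)]


# Patterns taken from the grep commands above.
SECRET_PATTERNS = {
    "credential keyword": re.compile(r"password|passwd|secret|api_key|token", re.IGNORECASE),
    "private key": re.compile(r"BEGIN.*PRIVATE KEY"),
    "url": re.compile(r"https?://[a-zA-Z0-9./?=_-]+"),
}


def find_secrets(strings: List[str]) -> List[Tuple[str, str]]:
    """Return (label, string) for every string matching a secret pattern."""
    hits = []
    for text in strings:
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                hits.append((label, text))
    return hits


# Synthetic blob with made-up values separated by non-printable bytes.
sample = (
    b"\x00\x01admin_password=hunter2\x00\xff\xfe"
    b"http://update.example.com/fw\x00ab\x00"
    b"-----BEGIN RSA PRIVATE KEY-----\x00"
)
for label, text in find_secrets(extract_strings(sample)):
    print(f"{label}: {text}")
```

Keyword matching of this kind produces many false positives on real firmware (for example, UI strings that merely mention "password"), so hits still need manual review.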
Step 6: Generate Firmware Analysis Report
Compile comprehensive extraction and analysis findings:
Report should include:
- Firmware metadata (vendor, model, version, build date)
- Identified components with offsets and sizes (bootloader, kernel, filesystem, config)
- Entropy analysis summary with regions of interest
- Extracted filesystem structure and key contents
- Discovered credentials, keys, certificates
- Identified services, daemons, and their versions
- Known CVEs applicable to identified component versions
- Recommendations for hardening or vulnerability remediation
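Vendor, model, and version have to come from the firmware itself, but the file-level metadata for the report (name, size, SHA-256) can be computed directly. A minimal sketch using `hashlib`, with `file_metadata` as a hypothetical helper name and a synthetic 64-byte file standing in for a real image:

```python
import hashlib
import os
import tempfile


def file_metadata(path: str, chunk_size: int = 1 << 20) -> dict:
    """Return file name, size in bytes, and SHA-256 digest, reading in chunks."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return {"file": os.path.basename(path), "size": size, "sha256": digest.hexdigest()}


# Synthetic stand-in for a firmware image: a uImage magic plus padding.
payload = b"\x27\x05\x19\x56" + b"\x00" * 60
with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, "sample_firmware.bin")
    with open(sample_path, "wb") as fh:
        fh.write(payload)
    meta = file_metadata(sample_path)

print(f"File:    {meta['file']}")
print(f"Size:    {meta['size']:,} bytes")
print(f"SHA-256: {meta['sha256']}")
```

Recording the digest before any extraction makes it possible to show later that the analyzed image matches the file originally obtained.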
Key Concepts
| Term | Definition |
|---|---|
| Firmware | Software embedded in hardware devices providing low-level control; typically contains a bootloader, kernel, root filesystem, and configuration data |
| Entropy Analysis | Statistical measurement of randomness in binary data; high entropy indicates encryption or compression, low entropy indicates plaintext or structured data |
| SquashFS | Read-only compressed filesystem commonly used in embedded Linux devices; supports gzip, LZMA, LZO, XZ, LZ4, and zstd compression |
| Magic Bytes | Known byte sequences at fixed offsets that identify file types; binwalk uses a database of magic signatures to detect embedded files |
| Matryoshka Extraction | Recursive extraction mode where binwalk re-scans extracted files for additional embedded content, handling deeply nested archives |
| CramFS | Compressed ROM filesystem designed for embedded systems with limited flash storage; supports only zlib compression |
| JFFS2 | Journalling Flash File System version 2, designed for NOR and NAND flash memory in embedded devices |
Tools & Systems
- binwalk: Primary firmware analysis tool for signature scanning, entropy analysis, and automated extraction of embedded files
- unsquashfs: SquashFS extraction utility for unpacking the read-only compressed filesystems found in router and IoT firmware
- jefferson: Python tool for extracting JFFS2 flash filesystem images commonly found in embedded devices
- sasquatch: Patched SquashFS utility supporting non-standard vendor-modified SquashFS variants
- firmware-mod-kit: Toolkit for extracting, modifying, and repacking firmware images for security testing
Common Scenarios
Scenario: Extracting and Auditing Router Firmware for Hardcoded Credentials
Context: A security researcher is performing an authorized assessment of a consumer router. The firmware update file was downloaded from the vendor's support page. The goal is to identify hardcoded credentials, insecure default configurations, and known vulnerable components.
Approach:
- Run `binwalk -e firmware.bin` to perform initial extraction
- Use `binwalk -E firmware.bin` to check entropy and identify encrypted regions
- Locate the SquashFS root filesystem in the extracted output
- Unpack with `unsquashfs` and inspect `/etc/passwd`, `/etc/shadow`, and web server configs
- Search for hardcoded credentials with `grep -rni "password" /tmp/root/etc/`
- Identify service versions and cross-reference with CVE databases
- Check for debug interfaces (telnet, UART, JTAG references) in startup scripts
- Examine web application code for authentication bypass or command injection
Pitfalls:
- Some vendors use non-standard SquashFS with custom compression; use sasquatch instead of unsquashfs
- Encrypted firmware requires decryption keys often found in bootloader or previous unencrypted versions
- Firmware headers may need to be stripped before binwalk can identify the embedded filesystem
- Obfuscated strings may evade simple grep searches; use entropy analysis to locate data blobs
Output Format
    FIRMWARE EXTRACTION REPORT
    ====================================
    Firmware:        TP-Link TL-WR841N v14
    File:            wr841nv14_en_3_16_9_up.bin
    Size:            3,932,160 bytes (3.75 MB)
    SHA-256:         a1b2c3d4e5f6...

    SIGNATURE SCAN RESULTS
    Offset       Type                        Size
    ------       ----                        ----
    0x00000000   U-Boot bootloader header    64 bytes
    0x00020000   LZMA compressed data        1,048,576 bytes
    0x00120000   SquashFS filesystem v4.0    2,752,512 bytes
    0x003B0000   Configuration partition     131,072 bytes

    ENTROPY ANALYSIS
    Region 0x000000-0x020000: 4.21 (bootloader - plaintext code)
    Region 0x020000-0x120000: 7.89 (kernel - LZMA compressed)
    Region 0x120000-0x3B0000: 7.45 (filesystem - SquashFS compressed)
    Region 0x3B0000-0x3C0000: 1.12 (config - mostly empty)

    EXTRACTED FILESYSTEM
    Root filesystem: SquashFS v4.0, LZMA compression
    Total files:     847
    Total dirs:      112
    BusyBox version: 1.19.4

    SECURITY FINDINGS
    [CRITICAL] Hardcoded root password in /etc/shadow (hash: $1$...)
    [HIGH]     Telnet daemon enabled by default in /etc/init.d/rcS
    [HIGH]     Private RSA key at /etc/ssl/private/server.key
    [MEDIUM]   BusyBox 1.19.4 (CVE-2021-42373, CVE-2021-42374)
    [MEDIUM]   Dropbear SSH 2014.63 (CVE-2016-3116)
    [LOW]      UPnP service enabled by default
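The findings section of such a report is easiest to keep consistent when it is generated from structured data. A minimal sketch, assuming a hypothetical `render_findings` helper and the four severity labels used in the report format above:

```python
from typing import Iterable, Tuple

SEVERITY_ORDER = ("CRITICAL", "HIGH", "MEDIUM", "LOW")


def render_findings(findings: Iterable[Tuple[str, str]]) -> str:
    """Render (severity, text) pairs as report lines, most severe first.

    Raises ValueError for a severity label outside SEVERITY_ORDER.
    """
    # sorted() is stable, so findings of equal severity keep their input order.
    ranked = sorted(findings, key=lambda item: SEVERITY_ORDER.index(item[0]))
    return "\n".join(f"[{severity}] {text}" for severity, text in ranked)


findings = [
    ("LOW", "UPnP service enabled by default"),
    ("HIGH", "Telnet daemon enabled by default in /etc/init.d/rcS"),
    ("CRITICAL", "Hardcoded root password in /etc/shadow"),
    ("HIGH", "Private RSA key at /etc/ssl/private/server.key"),
]
section = render_findings(findings)
print("SECURITY FINDINGS")
print(section)
```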