Hacktricks-skills forensic-file-analysis
Forensic analysis techniques for specific file types and software. Use this skill whenever the user needs to analyze files for forensic purposes, investigate suspicious files, extract hidden data, deobfuscate content, or examine file structures. Trigger on mentions of: file forensics, artifact analysis, deobfuscation, hidden data extraction, file format analysis, .pyc files, browser artifacts, Office documents, PDFs, images (PNG/SVG), archives (ZIP), video/audio files, Mach-O binaries, or any file type investigation.
Forensic File Analysis
A comprehensive guide to forensic analysis techniques for specific file types and software.
Quick Reference
| File Type | Key Techniques |
|---|---|
| .pyc | Bytecode decompilation, string extraction |
| Browser | History, cookies, cache, downloads |
| VBS/CScript | Deobfuscation, string extraction |
| Office | Hidden metadata, embedded objects |
| PDF | Hidden layers, embedded files |
| PNG | Metadata, hidden channels |
| ZIP | Password recovery, hidden entries |
| Video/Audio | Metadata, hidden streams |
| Mach-O | Entitlements, code signing |
Python Compiled Files (.pyc)
Decompile .pyc Files
    # Using uncompyle6 (handles bytecode up to Python 3.8)
    pip install uncompyle6
    uncompyle6 -o output_dir file.pyc

    # Using decompyle3 (Python 3.7 and 3.8 bytecode)
    pip install decompyle3
    decompyle3 file.pyc

    # Using pycdc (Decompyle++); built from source at https://github.com/zrax/pycdc, not installed with pip
    pycdc file.pyc
Extract Strings from .pyc
    # Search printable strings for likely secrets
    strings file.pyc | grep -i "password\|api\|key\|token"

    # Use pycdc for structured extraction (built from source, see Decompyle++)
    pycdc file.pyc
Analyze .pyc Metadata
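    # Dump the header
    xxd file.pyc | head -20

    # Bytes 0-3: magic number (identifies the Python version), ending in 0d 0a
    # Python 3.7+: bytes 4-7 are flags, bytes 8-11 the source mtime, bytes 12-15 the source size
    # Python 3.3-3.6: bytes 4-7 are the source mtime, bytes 8-11 the source size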
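The header fields can also be read programmatically. The following is a minimal sketch that assumes a CPython 3.7 or later file (the PEP 552 layout: magic, flags, then either mtime and size or a source hash); the function name `parse_pyc_header` is illustrative, and older layouts would need different offsets.

```python
import struct
from datetime import datetime, timezone


def parse_pyc_header(header: bytes) -> dict:
    """Parse the 16-byte header of a CPython 3.7+ .pyc file (assumed layout)."""
    if len(header) < 16:
        raise ValueError("need at least 16 bytes")
    # The magic is a 2-byte little-endian number followed by CR LF.
    magic, = struct.unpack_from("<H", header, 0)
    if header[2:4] != b"\r\n":
        raise ValueError("missing CR LF after magic number")
    flags, = struct.unpack_from("<I", header, 4)
    info = {"magic": magic, "flags": flags, "hash_based": bool(flags & 0x1)}
    if info["hash_based"]:
        # Hash-based pycs store an 8-byte source hash instead of mtime/size.
        info["source_hash"] = header[8:16].hex()
    else:
        mtime, size = struct.unpack_from("<II", header, 8)
        info["source_mtime"] = datetime.fromtimestamp(mtime, tz=timezone.utc)
        info["source_size"] = size
    return info
```

A typical call would be `parse_pyc_header(open("file.pyc", "rb").read(16))`; the recorded source mtime is often useful when building a timeline.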
Browser Artifacts
Chrome/Chromium Artifacts
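    # Profile location (Linux): ~/.config/google-chrome/Default/
    # Profile location (macOS): ~/Library/Application Support/Google/Chrome/Default/
    # Chrome locks these databases while it runs, so query copies

    # Analyze history (SQLite); last_visit_time is microseconds since 1601-01-01 UTC
    sqlite3 ~/.config/google-chrome/Default/History \
      "SELECT * FROM urls ORDER BY last_visit_time DESC LIMIT 50;"

    # Extract cookies (values are encrypted); recent versions keep the file under Default/Network/Cookies
    sqlite3 ~/.config/google-chrome/Default/Cookies \
      "SELECT * FROM cookies;"

    # Download history
    sqlite3 ~/.config/google-chrome/Default/History \
      "SELECT * FROM downloads;"

    # Cache analysis (on Linux the cache usually lives under ~/.cache/google-chrome/)
    find ~/.cache/google-chrome/Default/Cache -type f -exec file {} \;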
Firefox Artifacts
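    # Location: ~/.mozilla/firefox/*.default-release/

    # History; visit_date is microseconds since the Unix epoch
    sqlite3 ~/.mozilla/firefox/*.default-release/places.sqlite \
      "SELECT p.url, v.visit_date FROM moz_historyvisits v JOIN moz_places p ON p.id = v.place_id ORDER BY v.visit_date DESC LIMIT 50;"

    # Cookies
    sqlite3 ~/.mozilla/firefox/*.default-release/cookies.sqlite \
      "SELECT * FROM moz_cookies;"

    # Downloads; current Firefox versions store these as annotations rather than in a moz_downloads table
    sqlite3 ~/.mozilla/firefox/*.default-release/places.sqlite \
      "SELECT p.url, a.content FROM moz_annos a JOIN moz_places p ON p.id = a.place_id JOIN moz_anno_attributes n ON n.id = a.anno_attribute_id WHERE n.name LIKE 'downloads/%';"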
Safari Artifacts (macOS)
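    # Location: ~/Library/Safari/

    # History; visit_time is seconds since 2001-01-01 UTC
    sqlite3 ~/Library/Safari/History.db \
      "SELECT i.url, v.visit_time FROM history_visits v JOIN history_items i ON i.id = v.history_item ORDER BY v.visit_time DESC LIMIT 50;"

    # Cookies are kept in a binary cookie file, not an SQLite database; the path depends on the macOS version
    ls -l ~/Library/Cookies/Cookies.binarycookies \
      ~/Library/Containers/com.apple.Safari/Data/Library/Cookies/Cookies.binarycookies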
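The timestamps stored in these browser databases use three different epochs: Chrome's `last_visit_time` counts microseconds since 1601-01-01 UTC, Firefox visit timestamps count microseconds since the Unix epoch, and Safari visit timestamps count seconds since 2001-01-01 UTC. A small conversion sketch follows; the helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone

WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)  # Chrome/Chromium
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)    # Firefox (PRTime)
COCOA_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)   # Safari


def from_webkit(microseconds: int) -> datetime:
    """Convert a Chromium timestamp (microseconds since 1601) to UTC."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


def from_prtime(microseconds: int) -> datetime:
    """Convert a Firefox PRTime value (microseconds since 1970) to UTC."""
    return UNIX_EPOCH + timedelta(microseconds=microseconds)


def from_cocoa(seconds: float) -> datetime:
    """Convert a Safari timestamp (seconds since 2001) to UTC."""
    return COCOA_EPOCH + timedelta(seconds=seconds)
```

Converting every value to UTC before comparing artifacts from different browsers avoids off-by-decades errors in a combined timeline.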
VBS/CScript Deobfuscation
Basic Deobfuscation
    # Extract printable strings
    strings script.vbs

    # Remove comment lines and empty lines
    grep -v "^\s*'" script.vbs | grep -v '^\s*$' > cleaned.vbs

    # Decode Base64 strings
    # Look for: str = Base64Decode("...")
    # Then decode:
    echo "BASE64_STRING" | base64 -d
Advanced Deobfuscation
    # Use VBDeobfuscator, if it is available in your environment
    pip install vbdeobfuscator
    vbdeobfuscator script.vbs

    # Pull out long strings that contain URLs or command names
    strings -n 10 script.vbs | grep -E "(http|https|ftp|cmd|powershell)"

    # Look for long lines that may hold encoded commands
    strings script.vbs | grep -E "^.{20,}$" | head -20
Common Obfuscation Patterns
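    ' Eval-based (Execute and ExecuteGlobal are used the same way)
    Eval(Replace(...))

    ' Character-code encoding
    ChrW(&H41)   ' evaluates to "A"

    ' Base64 (Base64Decode is a user-defined helper, not a VBScript built-in)
    Base64Decode("...")

    ' XOR encoding: build the decoded text in a separate variable
    For i = 1 To Len(str)
        out = out & Chr(Asc(Mid(str, i, 1)) Xor key)
    Next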
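Two of these patterns are easy to undo statically. The sketch below folds `Chr(...)`/`ChrW(...)` calls with constant arguments into string literals and reverses a single-byte XOR; the function names are illustrative, and the regexes assume simple constant arguments rather than arbitrary expressions.

```python
import re

# Matches Chr(65), ChrW(65), and ChrW(&H41) with a constant argument.
CHR_CALL = re.compile(r'ChrW?\(\s*(&H[0-9A-Fa-f]+|\d+)\s*\)', re.IGNORECASE)
# Matches the seam between two adjacent literals: "abc" & "def"
CONCAT = re.compile(r'"\s*&\s*"')


def _chr_to_literal(match):
    token = match.group(1)
    code = int(token[2:], 16) if token.upper().startswith("&H") else int(token)
    ch = chr(code)
    # VBScript escapes a double quote inside a string by doubling it.
    return '"' + ('""' if ch == '"' else ch) + '"'


def fold_chr_calls(source: str) -> str:
    """Replace constant Chr/ChrW calls with literals and merge neighbours."""
    literals = CHR_CALL.sub(_chr_to_literal, source)
    return CONCAT.sub("", literals)


def xor_decode(data: bytes, key: int) -> bytes:
    """Reverse a single-byte XOR encoding."""
    return bytes(b ^ key for b in data)
```

For example, `fold_chr_calls('Execute(Chr(87) & ChrW(&H53) & "cript")')` collapses the argument into one readable literal, which makes the remaining control flow much easier to follow.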
Office File Analysis
Document Structure
    # OOXML files (.docx, .xlsx, .pptx) are ZIP archives; legacy .doc/.xls/.ppt are OLE2 containers
    unzip -l document.docx
    unzip -p document.docx word/document.xml | head -100

    # Extract all content
    unzip -o document.docx -d extracted/
Hidden Metadata
    # Using exiftool
    exiftool document.docx

    # Read the core and extended properties stored inside the archive
    unzip -p document.docx docProps/core.xml docProps/app.xml

    # Check for hidden text (count w:vanish run properties)
    unzip -p document.docx word/document.xml | grep -o "<w:vanish[^>]*>" | wc -l
Embedded Objects
    # Find embedded files
    unzip -l document.docx | grep -i "embed"

    # Extract an embedded object (use the member name shown in the listing)
    unzip -p document.docx word/embeddings/embed1.bin > embedded.bin
    file embedded.bin
Macros Analysis
    # Extract the raw VBA project
    unzip -p document.docm word/vbaProject.bin > vba.bin

    # Use oletools
    pip install oletools
    olevba document.docm     # dump and analyze macro source
    oleid document.docm      # summarize risk indicators
    mraptor document.docm    # flag macros that look malicious
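The same triage can be scripted across many documents. This sketch lists macro projects and embedded objects in an OOXML container using only the standard library; `flag_ooxml_members` is an illustrative name, and the in-memory archive at the end is a stand-in for a real `.docm` file.

```python
import io
import zipfile


def flag_ooxml_members(doc):
    """Return (macro_parts, embedded_parts) found in an OOXML container."""
    with zipfile.ZipFile(doc) as zf:
        names = zf.namelist()
    macros = [n for n in names if n.lower().endswith("vbaproject.bin")]
    embedded = [n for n in names if "/embeddings/" in n.lower()]
    return macros, embedded


# Demo: an in-memory archive stands in for a real document.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("word/document.xml", "<w:document/>")
    zf.writestr("word/vbaProject.bin", b"\x00")
    zf.writestr("word/embeddings/oleObject1.bin", b"\x00")
macros, embedded = flag_ooxml_members(sample)
print(macros, embedded)
```

Passing a file path instead of a `BytesIO` object works the same way, since `zipfile.ZipFile` accepts both.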
PDF File Analysis
Basic Analysis
    # Check PDF structure
    pdfinfo document.pdf

    # Extract text
    pdftotext document.pdf - | head -100

    # Extract images
    pdfimages -list document.pdf
    pdfimages -all document.pdf output_
Hidden Content
    # Normalize the file so objects and streams are readable
    qpdf --qdf --object-streams=disable document.pdf cleaned.pdf

    # Look for hidden layers (optional content groups)
    grep -a -c "/OCG" cleaned.pdf

    # Search the normalized file for keywords
    strings cleaned.pdf | grep -i "password\|secret\|key"

    # Check for embedded files
    pdfdetach -list document.pdf
    pdfdetach -saveall document.pdf
JavaScript Extraction
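    # Count risky keywords (/JS, /JavaScript, /OpenAction, ...)
    pdfid.py document.pdf

    # Use pdf-parser.py (by the same author as pdfid.py)
    pdf-parser.py --search javascript document.pdf
    pdf-parser.py --object <ID> --filter --raw document.pdf

    # Look for JS in uncompressed objects
    strings document.pdf | grep -A 20 "/JS"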
Metadata Analysis
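    # Full metadata (creation and modification dates, producer, author)
    exiftool document.pdf

    # Page boxes (MediaBox, CropBox, ...)
    pdfinfo -box document.pdf

    # Check for modification history: more than one %%EOF marker suggests incremental updates
    grep -a -c "%%EOF" document.pdf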
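A rough keyword count in the spirit of pdfid can be done in a few lines. This is a simplified sketch, not a reimplementation: it works on raw bytes, so names inside compressed object streams or written with `#xx` hex escapes are missed, and the keyword list is an illustrative selection.

```python
import re

KEYWORDS = (b"/JS", b"/JavaScript", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile")


def pdf_triage(data: bytes) -> dict:
    """Count risky name tokens and %%EOF markers in raw PDF bytes."""
    counts = {}
    for kw in KEYWORDS:
        # Reject matches that continue as a longer name (e.g. /AAPL).
        pattern = re.escape(kw) + rb"(?![A-Za-z0-9])"
        counts[kw.decode()] = len(re.findall(pattern, data))
    # Several %%EOF markers can indicate incremental updates;
    # linearized PDFs legitimately contain two.
    counts["%%EOF"] = data.count(b"%%EOF")
    return counts
```

Running it on the output of a normalizing tool rather than the original file gives better coverage, because streams are then uncompressed.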
PNG Tricks
Metadata Extraction
    # Using exiftool
    exiftool image.png

    # Print the contents of text chunks
    pngcheck -t image.png

    # List every chunk
    pngcheck -v image.png
Hidden Data in Chunks
    # List text and ancillary chunks
    pngcheck -v image.png | grep -E "(tEXt|iTXt|zTXt|tIME|pHYs)"

    # Print the raw contents of text chunks
    python3 -c "
    import struct
    with open('image.png', 'rb') as f:
        f.read(8)  # PNG signature
        while True:
            header = f.read(8)  # 4-byte length + 4-byte chunk type
            if len(header) < 8:
                break  # truncated file
            length, chunk_type = struct.unpack('>I4s', header)
            data = f.read(length)
            f.read(4)  # CRC
            if chunk_type in (b'tEXt', b'iTXt', b'zTXt'):
                # zTXt payloads are zlib-compressed after the keyword
                print(chunk_type.decode(), data)
            if chunk_type == b'IEND':
                break
    "
Steganography Detection
    # Check for hidden data in LSBs
    zsteg image.png

    # steghide does not support PNG; it applies to JPEG, BMP, WAV, and AU carriers
    steghide extract -sf image.jpg

    # Using stegsolve
    java -jar stegsolve.jar
PNG Structure Analysis
    # View raw structure
    xxd image.png | head -50

    # Check for malformed chunks
    pngcheck -v image.png
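Malformed chunks and data appended after `IEND` can also be detected directly. The sketch below walks the chunk list, recomputes each CRC-32 over the chunk type and data, and returns whatever bytes follow the final chunk; `scan_png` is an illustrative name.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def scan_png(data: bytes):
    """Return (chunks, trailing); chunks holds (type, length, crc_ok) tuples."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks, pos = [], len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        length, = struct.unpack_from(">I", data, pos)
        ctype = data[pos + 4:pos + 8]
        body = data[pos + 8:pos + 8 + length]
        stored = data[pos + 8 + length:pos + 12 + length]
        if len(body) < length or len(stored) < 4:
            break  # truncated chunk; stop instead of guessing
        # The CRC covers the chunk type and data, not the length field.
        crc_ok = zlib.crc32(ctype + body) == struct.unpack(">I", stored)[0]
        chunks.append((ctype.decode("latin-1"), length, crc_ok))
        pos += 12 + length
        if ctype == b"IEND":
            break
    return chunks, data[pos:]
```

Non-empty trailing bytes are worth carving: appended ZIP archives and second images are common hiding places.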
ZIP Tricks
Basic Analysis
    # List contents
    unzip -l archive.zip

    # Check for password protection (per-entry security status)
    zipinfo -v archive.zip | grep -i "security status"

    # Extract with password
    unzip -P password archive.zip
Hidden Entries
    # Look for dotfiles and suspicious names
    unzip -Z1 archive.zip | grep -E -i "(^|/)\.|hidden|secret"

    # Inspect extra fields and comments
    zipinfo -v archive.zip

    # Extract everything, including dotfiles
    unzip -o archive.zip -d extracted/
Password Recovery
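    # Using fcrackzip
    fcrackzip -u -D -p /usr/share/wordlists/rockyou.txt archive.zip

    # Using john
    zip2john archive.zip > hash.txt
    john --wordlist=/usr/share/wordlists/rockyou.txt hash.txt

    # Using hashcat: -m 13600 for WinZip AES, -m 17200 to 17230 for PKZIP variants
    # (strip the file-name fields that zip2john adds around the hash before loading it)
    zip2john archive.zip > hash.txt
    hashcat -m 13600 hash.txt /usr/share/wordlists/rockyou.txt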
Malformed ZIP Detection
    # Check ZIP structure
    zipinfo -v archive.zip

    # Look for split archives
    file archive.zip*

    # Check for nested archives
    unzip -l archive.zip | grep -i "\.zip\|\.rar\|\.7z"
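The checks in this section (encryption flag, comments, dotfiles, nested archives) can be collected in one pass with the standard `zipfile` module. This is a sketch with an illustrative function name, and the in-memory archive at the end is a made-up sample rather than real evidence.

```python
import io
import zipfile


def zip_report(archive) -> dict:
    """Summarize forensic points of interest in a ZIP archive."""
    with zipfile.ZipFile(archive) as zf:
        infos = zf.infolist()
        return {
            "comment": zf.comment,
            # Bit 0 of the general purpose flags marks an encrypted entry.
            "encrypted": [i.filename for i in infos if i.flag_bits & 0x1],
            "dotfiles": [i.filename for i in infos
                         if i.filename.rstrip("/").rsplit("/", 1)[-1].startswith(".")],
            "nested": [i.filename for i in infos
                       if i.filename.lower().endswith((".zip", ".rar", ".7z"))],
            "entry_comments": {i.filename: i.comment for i in infos if i.comment},
        }


# Demo on a small in-memory archive.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("readme.txt", "hello")
    zf.writestr(".secret", "hidden entry")
    zf.writestr("inner.zip", b"PK\x05\x06" + b"\x00" * 18)
    zf.comment = b"note"
report = zip_report(sample)
print(report)
```

Archive and entry comments are included because they survive extraction tools that only write file contents to disk.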
Video and Audio File Analysis
Metadata Extraction
    # Using exiftool
    exiftool video.mp4
    exiftool audio.mp3

    # Using ffprobe
    ffprobe -v quiet -print_format json -show_format -show_streams video.mp4

    # Using mediainfo
    mediainfo video.mp4
Hidden Streams
    # List all streams (ffprobe writes its report to stderr)
    ffprobe video.mp4 2>&1 | grep -i "stream"

    # Extract a second (possibly hidden) audio stream
    ffmpeg -i video.mp4 -map 0:a:1 hidden_audio.mp3

    # Extract a second (possibly hidden) video stream
    ffmpeg -i video.mp4 -map 0:v:1 hidden_video.mp4
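When many files need checking, the JSON produced by `ffprobe -print_format json -show_streams` is easier to filter than the text report. The sketch below flags every stream beyond the first video and first audio track; the function name and the sample data are illustrative, and the sample is hand-written rather than real ffprobe output.

```python
import json


def unexpected_streams(probe: dict) -> list:
    """Return (index, type, codec) for streams beyond the first video/audio."""
    seen, extras = set(), []
    for stream in probe.get("streams", []):
        kind = stream.get("codec_type", "unknown")
        if kind in ("video", "audio") and kind not in seen:
            seen.add(kind)  # first track of this kind is expected
            continue
        extras.append((stream.get("index"), kind, stream.get("codec_name")))
    return extras


# Hypothetical probe result with an extra audio track and a data stream.
SAMPLE = json.loads("""
{"streams": [
  {"index": 0, "codec_type": "video", "codec_name": "h264"},
  {"index": 1, "codec_type": "audio", "codec_name": "aac"},
  {"index": 2, "codec_type": "audio", "codec_name": "mp3"},
  {"index": 3, "codec_type": "data", "codec_name": "bin_data"}
]}
""")
print(unexpected_streams(SAMPLE))
```

Extra tracks are not suspicious by themselves (alternate languages and cover art are common), so the result is a list of candidates to extract and inspect.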
Steganography Detection
    # Check for hidden data in audio: open the file and inspect the waveform and spectrogram
    audacity audio.mp3

    # Using steghide (supports WAV and AU audio, not MP3)
    steghide extract -sf audio.wav

    # Export keyframes (I-frames) to check the video for hidden frames
    ffmpeg -i video.mp4 -vf "select='eq(pict_type,I)'" -vsync vfr -frames:v 100 frame_%03d.png
Carving Hidden Files
    # Using foremost (mp3 is not a built-in foremost type, so run every built-in extractor)
    foremost -i video.mp4 -t all -o carved/

    # Using photorec
    photorec video.mp4
Mach-O Entitlements and IPSW Indexing
Entitlements Extraction
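    # Using codesign (newer macOS releases prefer: codesign -d --entitlements - --xml binary.app)
    codesign -d --entitlements :- binary.app

    # Using plutil to pretty-print the extracted entitlements
    codesign -d --entitlements :- binary.app > entitlements.plist
    plutil -p entitlements.plist

    # Using otool to locate the code signature load command
    otool -l binary.app/Contents/MacOS/binary | grep -A 4 LC_CODE_SIGNATURE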
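`codesign` is only available on macOS. On other platforms, one workable approach is to scan the binary for the embedded entitlements blob, which begins with the magic `0xfade7171` followed by a big-endian length that includes the 8-byte header. The sketch below assumes XML entitlements and uses an illustrative function name; a fat binary may yield one result per architecture slice.

```python
import plistlib
import struct

ENTITLEMENTS_MAGIC = b"\xfa\xde\x71\x71"  # CSMAGIC_EMBEDDED_ENTITLEMENTS


def find_entitlements(binary: bytes) -> list:
    """Return every entitlements plist found in a Mach-O image, as dicts."""
    found, pos = [], binary.find(ENTITLEMENTS_MAGIC)
    while pos != -1:
        if pos + 8 <= len(binary):
            # Blob length is big-endian and counts the 8-byte header too.
            length, = struct.unpack_from(">I", binary, pos + 4)
            payload = binary[pos + 8:pos + length]
            try:
                found.append(plistlib.loads(payload))
            except Exception:
                pass  # magic bytes occurred by chance, or the blob is damaged
        pos = binary.find(ENTITLEMENTS_MAGIC, pos + 4)
    return found
```

Entitlements such as `com.apple.security.get-task-allow` stand out immediately in the resulting dictionaries.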
Code Signing Analysis
    # Display signing information
    codesign -dv binary.app

    # Display detailed information (certificate chain, CDHash, flags)
    codesign -d --verbose=4 binary.app

    # Verify signature
    codesign --verify --deep --strict --verbose=2 binary.app
IPSW Analysis
    # Extract IPSW (an IPSW is a ZIP archive)
    bsdtar -xf firmware.ipsw

    # Analyze manifest
    plutil -p BuildManifest.plist

    # Show the product version, build, and supported devices
    plutil -p BuildManifest.plist | grep -E -A 2 "ProductVersion|ProductBuildVersion|SupportedProductTypes"
Mobile Device Forensics
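    # Extract from backup (substitute whichever backup extraction tool is installed)
    ibackupextract -b backup_path -o output/

    # Manifest.db sits at the root of the backup and maps file IDs to domains and paths
    sqlite3 backup_path/Manifest.db "SELECT fileID, domain, relativePath FROM Files LIMIT 50;"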
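In an unencrypted iTunes-style backup (iOS 10 and later), each row of the `Files` table in `Manifest.db` points at a payload stored under a directory named after the first two characters of its `fileID`. The sketch below resolves those paths; it assumes that layout, opens the database read-only, and uses an illustrative function name.

```python
import sqlite3
from pathlib import Path


def list_backup_files(backup_dir, domain_like="%"):
    """Return (domain, relativePath, on-disk path) for Manifest.db entries."""
    backup_dir = Path(backup_dir)
    # Open read-only so the evidence database is never modified.
    uri = (backup_dir / "Manifest.db").resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT fileID, domain, relativePath FROM Files "
            "WHERE domain LIKE ? ORDER BY domain, relativePath",
            (domain_like,),
        ).fetchall()
    finally:
        conn.close()
    # Payloads live at <backup>/<first two chars of fileID>/<fileID>.
    return [(domain, rel, backup_dir / fid[:2] / fid) for fid, domain, rel in rows]
```

Calling `list_backup_files("backup_path", "HomeDomain")` narrows the listing to one domain, which is usually where message and call databases are found.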
General Forensic Techniques
File Carving
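    # Using foremost (.docx files are carved as zip)
    foremost -i disk_image -t png,jpg,pdf,zip -o output_foremost/

    # Using scalpel (enable the wanted file types in scalpel.conf first)
    scalpel disk_image -o output/

    # Using photorec
    photorec disk_image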
String Extraction
    # Basic strings
    strings file | grep -i "password\|api\|key\|token"

    # With context
    strings -n 10 file | grep -B 5 -A 5 "sensitive"

    # Unicode strings (16-bit little-endian)
    strings -e l file
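Where the `strings` utility is unavailable, the same extraction can be approximated in Python. This sketch finds runs of printable ASCII and the UTF-16LE equivalent (printable byte followed by a zero byte); the function name and the default minimum length are illustrative choices.

```python
import re


def extract_strings(data: bytes, min_len: int = 6) -> list:
    """Return printable ASCII and UTF-16LE strings of at least min_len chars."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # UTF-16LE text in the ASCII range is each byte followed by 0x00.
    utf16_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_re.finditer(data)]
    return found
```

ASCII results come first and UTF-16LE results second, so offsets are not preserved; keeping `m.start()` alongside each string would be a natural extension.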
Hash Analysis
    # Calculate hashes
    md5sum file
    sha256sum file

    # Check against known databases: look the hash up in a malware or known-file database such as VirusTotal or the NSRL
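For large evidence files it helps to compute several digests in a single read. The following sketch streams the file in chunks with `hashlib`; the function name, algorithm list, and chunk size are illustrative defaults.

```python
import hashlib


def hash_file(path, algorithms=("md5", "sha1", "sha256"), chunk_size=1 << 20):
    """Compute several digests of a file in one pass without loading it whole."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Recording the digests before and after analysis is a simple way to show that the working copy was not altered.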
Timeline Analysis
    # Extract timestamps
    stat file
    ls -la --time-style=full-iso file

    # Using plaso (log2timeline)
    log2timeline.py --storage-file timeline.plaso file
    psort.py -o l2tcsv -w timeline.csv timeline.plaso
Best Practices
- Always work on copies: never analyze original evidence files.
- Document everything: keep detailed notes of all commands and findings.
- Verify findings: cross-reference results with multiple tools.
- Maintain chain of custody: track all file movements and modifications.
- Use write-blockers: when acquiring from physical media, use a hardware or software write-blocker, and mount disk images read-only.
Tool Installation
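    # Core tools (photorec ships in testdisk, exiftool in libimage-exiftool-perl)
    sudo apt install foremost scalpel testdisk libimage-exiftool-perl

    # Python tools (pycdc and pdf-parser.py are obtained from their project pages rather than pip)
    pip install oletools uncompyle6

    # Forensic frameworks
    pip install volatility3 yara-python

    # Hash tools
    sudo apt install hashcat john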
Quick Commands Reference
    # File type identification
    file filename

    # Quick string search
    strings filename | grep -i "keyword"

    # Check for hidden data
    exiftool filename

    # Extract from archives
    unzip -l archive.zip

    # Analyze PDF
    pdfinfo document.pdf

    # Check image metadata
    exiftool image.png
When to Use This Skill
Use this skill when:
- Investigating suspicious files
- Extracting hidden data from documents
- Analyzing file structures for forensics
- Deobfuscating scripts or binaries
- Recovering data from corrupted files
- Examining browser or application artifacts
- Performing digital forensics investigations
- Analyzing malware samples
- Extracting metadata from any file type