Hacktricks-skills structural-exploit-detection
Use this skill whenever analyzing suspicious files for 0-click exploit detection, forensic investigation of mobile malware, or validating file format structural integrity. Trigger on any request about PDF exploits, WebP vulnerabilities, font bytecode analysis, DNG/TIFF forensics, HEIF/AVIF parsing issues, or general file format exploit detection. This skill helps detect exploit chains by validating structural invariants rather than relying on byte signatures.
git clone https://github.com/abelrguezr/hacktricks-skills
skills/generic-methodologies-and-resources/basic-forensic-methodology/specific-software-file-type-tricks/structural-file-format-exploit-detection/SKILL.MD
Structural File-Format Exploit Detection
This skill provides practical techniques to detect 0-click mobile exploit files by validating structural invariants of their formats instead of relying on byte signatures. The approach generalizes across samples, polymorphic variants, and future exploits that abuse the same parser logic.
Key principle: Encode structural impossibilities and cross-field inconsistencies that only appear when a vulnerable decoder/parser state is reached.
When to Use This Skill
Use this skill when:
- Investigating suspicious files that may contain 0-click exploits
- Building forensic detection rules for mobile malware
- Validating file format integrity in security gateways
- Analyzing exploit chains (FORCEDENTRY, BLASTPASS, TRIANGULATION, LANDFALL, etc.)
- Creating structural detection rules that work without payload signatures
Why Structure, Not Signatures
When weaponized samples are unavailable and payload bytes mutate, traditional IOC/YARA patterns fail. Structural detection inspects the container's declared layout versus what is mathematically or semantically possible for the format implementation.
Typical checks:
- Validate table sizes and bounds derived from the spec and safe implementations
- Flag illegal/undocumented opcodes or state transitions in embedded bytecode
- Cross-check metadata vs actual encoded stream components
- Detect contradictory fields that indicate parser confusion or integer overflow set-ups
Detection Patterns by Format
PDF/JBIG2 – FORCEDENTRY (CVE-2021-30860)
Target: JBIG2 symbol dictionaries embedded inside PDFs (often used in mobile MMS parsing).
Structural signals:
- Contradictory dictionary state that cannot occur in benign content but is required to trigger the overflow in arithmetic decoding
- Suspicious use of global segments combined with abnormal symbol counts during refinement coding
Detection logic:
    if input_symbols_count == 0 and (ex_syms > 0 and ex_syms < 4):
        mark_malicious("JBIG2 impossible symbol dictionary state")
Practical triage:
- Identify and extract JBIG2 streams from the PDF using pdfid, pdf-parser, or peepdf
- Verify arithmetic coding flags and symbol dictionary parameters against the JBIG2 spec
Notes: Works without embedded payload signatures. Low false positive rate because the flagged state is mathematically inconsistent.
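The rule above can be sketched in Python as a small predicate plus a coarse pre-filter. This is a minimal sketch, not a JBIG2 parser: SymbolDictState is a hypothetical container for values that a real segment parser would have to supply, and the byte search only finds the filter name when the PDF object dictionary is stored uncompressed and without name escaping.

```python
import re
from dataclasses import dataclass

# Coarse pre-filter: literal filter name in an uncompressed object dictionary.
JBIG2_FILTER = re.compile(rb"/JBIG2Decode\b")


@dataclass
class SymbolDictState:
    """Hypothetical holder for fields reported by a JBIG2 segment parser."""
    input_symbols_count: int  # named as in the detection rule
    ex_syms: int              # named as in the detection rule


def has_jbig2_stream(pdf_bytes: bytes) -> bool:
    """Return True if the raw PDF bytes mention the JBIG2Decode filter."""
    return JBIG2_FILTER.search(pdf_bytes) is not None


def is_impossible_state(state: SymbolDictState) -> bool:
    """Apply the rule: no input symbols, yet between 1 and 3 exported symbols."""
    return state.input_symbols_count == 0 and 0 < state.ex_syms < 4
```

A gateway would typically call has_jbig2_stream first and run the more expensive segment parsing only on files that pass it.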
WebP/VP8L – BLASTPASS (CVE-2023-4863)
Target: WebP lossless (VP8L) Huffman prefix-code tables.
Structural signals:
- Total size of constructed Huffman tables exceeds the safe upper bound expected by reference/patched implementations, implying the overflow precondition
Detection logic:
    total_size = sum(table_sizes)
    if total_size > 2954:  # FIXED_TABLE_SIZE + MAX_TABLE_SIZE
        mark_malicious("VP8L oversized Huffman tables")
Practical triage:
- Check WebP container chunks: VP8X + VP8L
- Parse VP8L prefix codes and compute actual allocated table sizes
Notes: Robust against byte-level polymorphism of the payload. Bound is derived from upstream limits/patch analysis.
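The container step of the triage can be sketched as follows. This is a minimal sketch that walks the RIFF chunks of a WebP file and applies the size bound to table sizes computed elsewhere; it does not decode VP8L prefix codes. The function names are hypothetical, and table_sizes is assumed to come from a separate VP8L parser.

```python
import struct

MAX_TOTAL_TABLE_SIZE = 2954  # bound quoted in the detection rule


def iter_riff_chunks(data: bytes):
    """Yield (fourcc, payload_offset, payload_size) for each top-level chunk."""
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WEBP file")
    pos = 12
    while pos + 8 <= len(data):
        fourcc = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        yield fourcc, pos + 8, size
        pos += 8 + size + (size & 1)  # payloads are padded to an even length


def has_vp8l_chunk(data: bytes) -> bool:
    """Return True if the container carries a lossless (VP8L) bitstream."""
    return any(fourcc == b"VP8L" for fourcc, _, _ in iter_riff_chunks(data))


def tables_exceed_bound(table_sizes) -> bool:
    """Apply the rule: summed Huffman table sizes above the safe bound."""
    return sum(table_sizes) > MAX_TOTAL_TABLE_SIZE
```

Files without a VP8L chunk can be skipped early, which keeps the expensive prefix-code parsing off the common path.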
TrueType – TRIANGULATION (CVE-2023-41990)
Target: TrueType bytecode inside fpgm/prep/glyf programs.
Structural signals:
- Presence of undocumented/forbidden opcodes in Apple's interpreter used by the exploit chain
Detection logic:
    switch opcode:
        case 0x8F, 0x90:
            mark_malicious("Undocumented TrueType bytecode")
        default:
            continue
Practical triage:
- Dump font tables using fontTools/ttx and scan fpgm/prep/glyf programs
- No need to fully emulate the interpreter to get value from presence checks
Notes: May produce rare false positives if nonstandard fonts include unknown opcodes; validate with secondary tooling.
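A presence check of this kind can be sketched as a linear walk over raw instruction bytes. The sketch below assumes the program bytes have already been extracted from an fpgm, prep, or glyph program; the function name is hypothetical. It skips the inline operands of the push instructions so that data bytes equal to 0x8F or 0x90 are not reported as opcodes.

```python
SUSPECT_OPCODES = {0x8F, 0x90}  # opcodes flagged by the detection rule


def find_suspect_opcodes(program: bytes):
    """Return (offset, opcode) pairs for flagged opcodes, skipping push data."""
    hits = []
    i, n = 0, len(program)
    while i < n:
        op = program[i]
        i += 1
        if op == 0x40:              # NPUSHB: count byte, then that many bytes
            if i >= n:
                break
            i += 1 + program[i]
        elif op == 0x41:            # NPUSHW: count byte, then that many words
            if i >= n:
                break
            i += 1 + 2 * program[i]
        elif 0xB0 <= op <= 0xB7:    # PUSHB[n]: n + 1 inline bytes
            i += op - 0xB0 + 1
        elif 0xB8 <= op <= 0xBF:    # PUSHW[n]: n + 1 inline 16-bit words
            i += 2 * (op - 0xB8 + 1)
        elif op in SUSPECT_OPCODES:
            hits.append((i - 1, op))
    return hits
```

With fontTools, the input bytes can likely be obtained from the table's program object (for example via its getBytecode method); treat that call as an assumption to verify against the installed fontTools version.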
DNG/TIFF – CVE-2025-43300
Target: DNG/TIFF image metadata vs actual component count in encoded stream (e.g., JPEG-Lossless SOF3).
Structural signals:
- Inconsistency between EXIF/IFD fields (SamplesPerPixel, PhotometricInterpretation) and the component count parsed from the image stream header used by the pipeline
Detection logic:
    if samples_per_pixel == 2 and sof3_components == 1:
        mark_malicious("DNG/TIFF metadata vs. stream mismatch")
Practical triage:
- Parse primary IFD and EXIF tags
- Locate and parse the embedded JPEG-Lossless header (SOF3) and compare component counts
Notes: Reported exploited in the wild; excellent candidate for structural consistency checks.
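The stream side of the comparison can be sketched by reading the component count from the SOF3 frame header. This is a minimal sketch with hypothetical function names: it searches for the first 0xFFC3 byte pair instead of walking JPEG segments properly, so it assumes the input is the already-isolated JPEG-Lossless stream and not the whole DNG. The samples_per_pixel value is assumed to come from a separate TIFF tag reader.

```python
import struct


def sof3_component_count(stream: bytes):
    """Return the component count (Nf) of the first SOF3 segment, or None."""
    pos = stream.find(b"\xff\xc3")
    if pos < 0 or pos + 10 > len(stream):
        return None
    # After the marker: Lf(2) P(1) Y(2) X(2) Nf(1), all big-endian.
    _length, _precision, _height, _width, nf = struct.unpack_from(
        ">HBHHB", stream, pos + 2
    )
    return nf


def matches_mismatch_rule(samples_per_pixel: int, stream: bytes) -> bool:
    """Apply the rule: metadata declares 2 samples, stream declares 1 component."""
    return samples_per_pixel == 2 and sof3_component_count(stream) == 1
```

A broader variant could flag any disagreement between the two counts, but that goes beyond the rule stated above and would need its own false-positive testing.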
DNG/TIFF – LANDFALL (CVE-2025-21042)
Target: DNG (TIFF-derived) images carrying an embedded ZIP archive appended at EOF to stage native payloads after parser RCE.
Structural signals:
- File magic indicates TIFF/DNG (II*\x00 or MM\x00*) but filename mimics JPEG (e.g., .jpg/.jpeg WhatsApp naming)
- Presence of a ZIP Local File Header or EOCD magic near EOF (PK\x03\x04 or PK\x05\x06) that is not referenced by any TIFF IFD data region
- Unusually large trailing data beyond the last referenced IFD data block (hundreds of KB to MB), consistent with a bundled archive of .so modules
Detection logic:
    if is_tiff_dng(magic):
        ext = file_extension()
        if ext in {".jpg", ".jpeg"}:
            mark_suspicious("Extension/magic mismatch: DNG vs JPEG")
        zip_off = rfind_any(["PK\x05\x06", "PK\x03\x04"], search_window_last_n_bytes=8*1024*1024)
        if zip_off >= 0:
            end_dng = approx_end_of_tiff_data()
            if zip_off > end_dng + 0x200:
                mark_malicious("DNG with appended ZIP payload (LANDFALL-style)")
Practical triage:
- Identify format vs name:
    file sample; exiftool -s -FileType -MIMEType sample
- Locate ZIP footer/header near EOF and carve:
    off=$(grep -aboa -E $'PK\x05\x06|PK\x03\x04' sample.dng | tail -n1 | cut -d: -f1)
    dd if=sample.dng of=payload.zip bs=1 skip="$off"
    zipdetails -v payload.zip; unzip -l payload.zip
- Sanity-check TIFF data regions don't overlap the carved ZIP region:
    tiffdump -D sample.dng | egrep 'StripOffsets|TileOffsets|JPEGInterchangeFormat'
- One-shot carving (coarse):
    binwalk -eM sample.dng
Notes: Exploited in the wild against Samsung's libimagecodec.quram.so. The appended ZIP contained native modules (e.g., loader + SELinux policy editor) extracted/executed post-RCE.
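The detection logic for this pattern can be sketched end to end in Python. This is a minimal sketch under stated assumptions: the end of TIFF data is estimated only from strip and tile offsets found by following the IFD chain and SubIFDs, so other data regions (for example EXIF or maker-note blobs) are ignored and may make the estimate too low. The function names mirror the pseudocode, and the window and slack defaults are the values used there.

```python
import struct

TIFF_MAGICS = (b"II*\x00", b"MM\x00*")
ZIP_MAGICS = (b"PK\x05\x06", b"PK\x03\x04")     # EOCD, local file header
OFFSET_LENGTH_TAGS = ((273, 279), (324, 325))   # strip and tile offsets/byte counts
SUB_IFDS = 330
TYPES = {3: ("H", 2), 4: ("I", 4), 13: ("I", 4)}  # SHORT, LONG, IFD


def _values(data, endian, typ, count, field_off):
    """Decode SHORT/LONG/IFD tag values; other types return an empty list."""
    if typ not in TYPES:
        return []
    code, size = TYPES[typ]
    total = size * count
    # Values that fit in 4 bytes are stored inline, otherwise at an offset.
    off = field_off if total <= 4 else struct.unpack_from(endian + "I", data, field_off)[0]
    if off + total > len(data):
        return []
    return list(struct.unpack_from(f"{endian}{count}{code}", data, off))


def approx_end_of_tiff_data(data: bytes) -> int:
    """Estimate the last byte referenced by IFDs, strips, and tiles."""
    if len(data) < 8:
        return len(data)
    endian = "<" if data[:2] == b"II" else ">"
    pending, seen, end = [struct.unpack_from(endian + "I", data, 4)[0]], set(), 8
    while pending:
        ifd = pending.pop()
        if ifd == 0 or ifd in seen or ifd + 2 > len(data):
            continue
        seen.add(ifd)
        (n,) = struct.unpack_from(endian + "H", data, ifd)
        ifd_end = ifd + 2 + 12 * n + 4
        if ifd_end > len(data):
            continue
        tags = {}
        for k in range(n):
            entry = ifd + 2 + 12 * k
            tag, typ, count = struct.unpack_from(endian + "HHI", data, entry)
            tags[tag] = _values(data, endian, typ, count, entry + 8)
        (next_ifd,) = struct.unpack_from(endian + "I", data, ifd_end - 4)
        pending += [next_ifd] + tags.get(SUB_IFDS, [])
        end = max(end, ifd_end)
        for off_tag, len_tag in OFFSET_LENGTH_TAGS:
            for off, size in zip(tags.get(off_tag, []), tags.get(len_tag, [])):
                end = max(end, off + size)
    return end


def check_landfall(data: bytes, filename: str, window=8 * 1024 * 1024, slack=0x200):
    """Return a list of findings for the appended-ZIP pattern."""
    findings = []
    if data[:4] not in TIFF_MAGICS:
        return findings
    if filename.lower().endswith((".jpg", ".jpeg")):
        findings.append("suspicious: extension/magic mismatch (DNG vs JPEG)")
    start = max(0, len(data) - window)
    zip_off = max(data.rfind(magic, start) for magic in ZIP_MAGICS)
    if zip_off >= 0 and zip_off > approx_end_of_tiff_data(data) + slack:
        findings.append("malicious: DNG with appended ZIP payload (LANDFALL-style)")
    return findings
```

Because the end-of-data estimate is deliberately coarse, a hit is best confirmed by carving the trailing region and listing it as a ZIP archive, as in the triage steps above.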
HEIF/AVIF – libheif & libde265 (CVE-2024-41311, CVE-2025-43967, CVE-2025-29482, CVE-2025-65586)
Target: HEIF/AVIF containers parsed by libheif (and ImageIO/OpenImageIO builds that bundle it).
Structural signals:
- Overlay items (iloc/iref) whose source rectangles exceed the base image dimensions or whose offsets are negative/overflowing → triggers ImageOverlay::parse out-of-bounds (CVE-2024-41311)
- Grid items referencing non-existent item IDs (ImageItem_Grid::get_decoder NULL deref, CVE-2025-43967)
- SAO/loop-filter parameters or tile counts that force table allocations larger than the max allowed by libde265 (CVE-2025-29482)
- Box length/extent sizes that point past EOF (typical in CVE-2025-65586 PoCs)
Detection logic:
    # HEIF overlay bounds check
    for overlay in heif_overlays:
        if overlay.x < 0 or overlay.y < 0:
            mark_malicious("HEIF overlay negative offset")
        if overlay.x + overlay.w > base.w or overlay.y + overlay.h > base.h:
            mark_malicious("HEIF overlay exceeds base image (CVE-2024-41311 pattern)")

    # Grid item reference validation
    for grid in heif_grids:
        if any(ref_id not in item_ids for ref_id in grid.ref_ids):
            mark_malicious("HEIF grid references missing item (CVE-2025-43967 pattern)")

    # SAO / slice allocation guard
    if sao_band_count > 32 or (tile_cols * tile_rows) > MAX_TILES or sao_eo_class not in {0, 1, 2, 3}:
        mark_malicious("HEIF SAO/tiling exceeds safe bounds (CVE-2025-29482 pattern)")
Practical triage:
- Quick metadata sanity without full decode:
    heif-info sample.heic
    oiiotool --info --stats sample.heic
- Validate extents versus file size:
    heif-convert --verbose sample.heic /dev/null | grep -i extent
- Carve suspicious boxes for manual inspection:
    dd if=sample.heic bs=1 skip=$((box_off)) count=$((box_len)) of=box.bin
Notes: These checks catch malformed structure before heavy decode; useful for mail/MMS gateways that only need allow/deny decisions. libheif limits shift across versions; re-baseline constants when upstream changes.
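The "box length points past EOF" signal can be checked without libheif by walking the ISO BMFF box headers directly. This is a minimal sketch with hypothetical function names that covers only top-level boxes; it does not descend into meta, iloc, or iref, so the overlay and grid checks still need a real HEIF parser.

```python
import struct


def iter_boxes(data: bytes, start: int = 0, end=None):
    """Yield (type, offset, size, problem) for boxes in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:                 # 64-bit largesize follows the type field
            if pos + 16 > end:
                yield box_type, pos, size, "truncated largesize"
                return
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:               # box runs to the end of its container
            size = end - pos
        if size < header:
            yield box_type, pos, size, "size smaller than header"
            return
        if pos + size > end:
            yield box_type, pos, size, "box extends past EOF"
            return
        yield box_type, pos, size, None
        pos += size


def box_problems(data: bytes):
    """Return only the boxes whose declared size is structurally invalid."""
    return [box for box in iter_boxes(data) if box[3] is not None]
```

The walk stops at the first invalid box because later offsets can no longer be trusted, which also gives a gateway an early exit for allow/deny decisions.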
Implementation Patterns
A practical scanner should:
- Auto-detect file type and dispatch only relevant analyzers (PDF/JBIG2, WebP/VP8L, TTF, DNG/TIFF, HEIF/AVIF)
- Stream/partial-parse to minimize allocations and enable early termination
- Run analyses in parallel (thread-pool) for bulk triage
Example workflow with ElegantBouncer (open-source Rust implementation):
    # Scan a path recursively with structural detectors
    elegant-bouncer --scan /path/to/directory

    # Optional TUI for parallel scanning and real-time alerts
    elegant-bouncer --tui --scan /path/to/samples
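The scanner design described above (type detection, dispatch, thread-pool triage) can be sketched in a few lines of Python. This is not how ElegantBouncer is implemented; it is a minimal sketch in which the analyzers mapping is a hypothetical registry of per-format callables, and the HEIF brand list is a small, non-exhaustive assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

HEIF_BRANDS = {b"heic", b"heix", b"mif1", b"msf1", b"avif"}  # not exhaustive


def detect_format(head: bytes):
    """Map the first bytes of a file to a coarse format label, or None."""
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    if head[4:8] == b"ftyp" and head[8:12] in HEIF_BRANDS:
        return "heif"
    if head[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff"
    if head[:4] in (b"\x00\x01\x00\x00", b"true"):
        return "ttf"
    return None


def scan_file(path, analyzers):
    """Read only the header, then run the matching analyzer if one exists."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    fmt = detect_format(head)
    analyzer = analyzers.get(fmt)
    findings = analyzer(path) if analyzer else []
    return path, fmt, findings


def scan_tree(root, analyzers, workers: int = 8):
    """Scan every regular file under root using a thread pool."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: scan_file(p, analyzers), files))
```

Threads suit this workload while it is dominated by file I/O; if the analyzers become CPU-heavy in pure Python, a process pool may be the better fit.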
DFIR Tips and Edge Cases
- Embedded objects: PDFs may embed images (JBIG2) and fonts (TrueType); extract and recursively scan
- Decompression safety: Use libraries that hard-limit tables/buffers before allocation
- False positives: Keep rules conservative and favor contradictions that are impossible under the spec
- Version drift: Re-baseline bounds (e.g., VP8L table sizes) when upstream parsers change limits
Related Tools
| Tool | Purpose |
|---|---|
| ElegantBouncer | Structural scanner for the detections above |
| pdfid/pdf-parser/peepdf | PDF object extraction and static analysis |
| pdfcpu | PDF linter/sanitizer |
| fontTools/ttx | Dump TrueType tables and bytecode |
| exiftool | Read TIFF/DNG/EXIF metadata |
| dwebp/webpmux | Parse WebP metadata and chunks |
| heif-info/heif-convert (libheif) | HEIF/AVIF structure inspection |
| oiiotool | Validate HEIF/AVIF via OpenImageIO |
| binwalk | Carve embedded files from containers |
Quick Reference: Detection Scripts
Use the bundled scripts for common detection tasks:
- scripts/check_dng_appended_zip.sh: Detects appended ZIP payloads in DNG/TIFF files (LANDFALL pattern)
- scripts/check_truetype_opcodes.py: Scans TrueType bytecode for undocumented opcodes (TRIANGULATION pattern)
- scripts/validate_heif_structure.py: Validates HEIF overlay bounds and references
References
- ELEGANTBOUNCER: When You Can't Get the Samples but Still Need to Catch the Threat
- ElegantBouncer project (GitHub)
- Researching FORCEDENTRY: Detecting the exploit with no samples
- Researching BLASTPASS – Detecting the exploit inside a WebP file
- Researching TRIANGULATION – Detecting CVE-2023-41990
- CVE-2025-43300: Critical vulnerability found in Apple's DNG image processing
- LANDFALL: New Commercial-Grade Android Spyware