Hacktricks-skills structural-exploit-detection

Use this skill whenever analyzing suspicious files for 0-click exploit detection, forensic investigation of mobile malware, or validating file format structural integrity. Trigger on any request about PDF exploits, WebP vulnerabilities, font bytecode analysis, DNG/TIFF forensics, HEIF/AVIF parsing issues, or general file format exploit detection. This skill helps detect exploit chains by validating structural invariants rather than relying on byte signatures.

install

source · Clone the upstream repo

git clone https://github.com/abelrguezr/hacktricks-skills

manifest:

skills/generic-methodologies-and-resources/basic-forensic-methodology/specific-software-file-type-tricks/structural-file-format-exploit-detection/SKILL.MD

source content

Structural File-Format Exploit Detection

This skill provides practical techniques to detect 0-click mobile exploit files by validating structural invariants of their formats instead of relying on byte signatures. The approach generalizes across samples, polymorphic variants, and future exploits that abuse the same parser logic.

Key principle: Encode structural impossibilities and cross-field inconsistencies that only appear when a vulnerable decoder/parser state is reached.

When to Use This Skill

Use this skill when:

Investigating suspicious files that may contain 0-click exploits
Building forensic detection rules for mobile malware
Validating file format integrity in security gateways
Analyzing exploit chains (FORCEDENTRY, BLASTPASS, TRIANGULATION, LANDFALL, etc.)
Creating structural detection rules that work without payload signatures

Why Structure, Not Signatures

When weaponized samples are unavailable and payload bytes mutate, traditional IOC/YARA patterns fail. Structural detection inspects the container's declared layout versus what is mathematically or semantically possible for the format implementation.

Typical checks:

Validate table sizes and bounds derived from the spec and safe implementations
Flag illegal/undocumented opcodes or state transitions in embedded bytecode
Cross-check metadata vs actual encoded stream components
Detect contradictory fields that indicate parser confusion or integer overflow set-ups

Detection Patterns by Format

PDF/JBIG2 – FORCEDENTRY (CVE-2021-30860)

Target: JBIG2 symbol dictionaries embedded inside PDFs (often used in mobile MMS parsing).

Structural signals:

Contradictory dictionary state that cannot occur in benign content but is required to trigger the overflow in arithmetic decoding
Suspicious use of global segments combined with abnormal symbol counts during refinement coding

Detection logic:

if input_symbols_count == 0 and (ex_syms > 0 and ex_syms < 4):
    mark_malicious("JBIG2 impossible symbol dictionary state")

Practical triage:

Identify and extract JBIG2 streams from the PDF using
```
pdfid
```
,
```
pdf-parser
```
, or
```
peepdf
```
Verify arithmetic coding flags and symbol dictionary parameters against the JBIG2 spec

Notes: Works without embedded payload signatures. Low false positive rate because the flagged state is mathematically inconsistent.

WebP/VP8L – BLASTPASS (CVE-2023-4863)

Target: WebP lossless (VP8L) Huffman prefix-code tables.

Structural signals:

Total size of constructed Huffman tables exceeds the safe upper bound expected by reference/patched implementations, implying the overflow precondition

Detection logic:

let total_size = sum(table_sizes)
if total_size > 2954:   # FIXED_TABLE_SIZE + MAX_TABLE_SIZE
    mark_malicious("VP8L oversized Huffman tables")

Practical triage:

Check WebP container chunks: VP8X + VP8L
Parse VP8L prefix codes and compute actual allocated table sizes

Notes: Robust against byte-level polymorphism of the payload. Bound is derived from upstream limits/patch analysis.

TrueType – TRIANGULATION (CVE-2023-41990)

Target: TrueType bytecode inside fpgm/prep/glyf programs.

Structural signals:

Presence of undocumented/forbidden opcodes in Apple's interpreter used by the exploit chain

Detection logic:

switch opcode:
  case 0x8F, 0x90:
    mark_malicious("Undocumented TrueType bytecode")
  default:
    continue

Practical triage:

Dump font tables using
```
fontTools
```
/
```
ttx
```
and scan fpgm/prep/glyf programs
No need to fully emulate the interpreter to get value from presence checks

Notes: May produce rare false positives if nonstandard fonts include unknown opcodes; validate with secondary tooling.

DNG/TIFF – CVE-2025-43300

Target: DNG/TIFF image metadata vs actual component count in encoded stream (e.g., JPEG-Lossless SOF3).

Structural signals:

Inconsistency between EXIF/IFD fields (SamplesPerPixel, PhotometricInterpretation) and the component count parsed from the image stream header used by the pipeline

Detection logic:

if samples_per_pixel == 2 and sof3_components == 1:
    mark_malicious("DNG/TIFF metadata vs. stream mismatch")

Practical triage:

Parse primary IFD and EXIF tags
Locate and parse the embedded JPEG-Lossless header (SOF3) and compare component counts

Notes: Reported exploited in the wild; excellent candidate for structural consistency checks.

DNG/TIFF – LANDFALL (CVE-2025-21042)

Target: DNG (TIFF-derived) images carrying an embedded ZIP archive appended at EOF to stage native payloads after parser RCE.

Structural signals:

File magic indicates TIFF/DNG (
```
II*\x00
```
or
```
MM\x00*
```
) but filename mimics JPEG (e.g.,
```
.jpg
```
/
```
.jpeg
```
WhatsApp naming)
Presence of a ZIP Local File Header or EOCD magic near EOF (
```
PK\x03\x04
```
or
```
PK\x05\x06
```
) that is not referenced by any TIFF IFD data region
Unusually large trailing data beyond the last referenced IFD data block (hundreds of KB to MB), consistent with a bundled archive of .so modules

Detection logic:

if is_tiff_dng(magic):
    ext = file_extension()
    if ext in {".jpg", ".jpeg"}: mark_suspicious("Extension/magic mismatch: DNG vs JPEG")

    zip_off = rfind_any(["PK\x05\x06", "PK\x03\x04"], search_window_last_n_bytes=8*1024*1024)
    if zip_off >= 0:
        end_dng = approx_end_of_tiff_data()
        if zip_off > end_dng + 0x200:
            mark_malicious("DNG with appended ZIP payload (LANDFALL-style)")

Practical triage:

Identify format vs name:

file sample; exiftool -s -FileType -MIMEType sample

Locate ZIP footer/header near EOF and carve:

off=$(grep -aboa -E $'PK\x05\x06|PK\x03\x04' sample.dng | tail -n1 | cut -d: -f1)
dd if=sample.dng of=payload.zip bs=1 skip="$off"
zipdetails -v payload.zip; unzip -l payload.zip

Sanity-check TIFF data regions don't overlap the carved ZIP region:

tiffdump -D sample.dng | egrep 'StripOffsets|TileOffsets|JPEGInterchangeFormat'

One-shot carving (coarse):
```
binwalk -eM sample.dng
```

Notes: Exploited in the wild against Samsung's libimagecodec.quram.so. The appended ZIP contained native modules (e.g., loader + SELinux policy editor) extracted/executed post-RCE.

HEIF/AVIF – libheif & libde265 (CVE-2024-41311, CVE-2025-29482, CVE-2025-65586)

Target: HEIF/AVIF containers parsed by libheif (and ImageIO/OpenImageIO builds that bundle it).

Structural signals:

Overlay items (iloc/iref) whose source rectangles exceed the base image dimensions or whose offsets are negative/overflowing → triggers ImageOverlay::parse out-of-bounds (CVE-2024-41311)
Grid items referencing non-existent item IDs (ImageItem_Grid::get_decoder NULL deref, CVE-2025-43967)
SAO/loop-filter parameters or tile counts that force table allocations larger than the max allowed by libde265 (CVE-2025-29482)
Box length/extent sizes that point past EOF (typical in CVE-2025-65586 PoCs)

Detection logic:

# HEIF overlay bounds check
for overlay in heif_overlays:
    if overlay.x < 0 or overlay.y < 0: mark_malicious("HEIF overlay negative offset")
    if overlay.x + overlay.w > base.w or overlay.y + overlay.h > base.h:
        mark_malicious("HEIF overlay exceeds base image (CVE-2024-41311 pattern)")

# Grid item reference validation
for grid in heif_grids:
    if any(ref_id not in item_ids):
        mark_malicious("HEIF grid references missing item (CVE-2025-43967 pattern)")

# SAO / slice allocation guard
if sao_band_count > 32 or (tile_cols * tile_rows) > MAX_TILES or sao_eo_class not in {0..3}:
    mark_malicious("HEIF SAO/tiling exceeds safe bounds (CVE-2025-29482 pattern)")

Practical triage:

Quick metadata sanity without full decode:

heif-info sample.heic
oiiotool --info --stats sample.heic

Validate extents versus file size:

heif-convert --verbose sample.heic /dev/null | grep -i extent

Carve suspicious boxes for manual inspection:

dd if=sample.heic bs=1 skip=$((box_off)) count=$((box_len)) of=box.bin

Notes: These checks catch malformed structure before heavy decode; useful for mail/MMS gateways that only need allow/deny decisions. libheif limits shift across versions; re-baseline constants when upstream changes.

Implementation Patterns

A practical scanner should:

Auto-detect file type and dispatch only relevant analyzers (PDF/JBIG2, WebP/VP8L, TTF, DNG/TIFF, HEIF/AVIF)
Stream/partial-parse to minimize allocations and enable early termination
Run analyses in parallel (thread-pool) for bulk triage

Example workflow with ElegantBouncer (open-source Rust implementation):

# Scan a path recursively with structural detectors
elegant-bouncer --scan /path/to/directory

# Optional TUI for parallel scanning and real-time alerts
elegant-bouncer --tui --scan /path/to/samples

DFIR Tips and Edge Cases

Embedded objects: PDFs may embed images (JBIG2) and fonts (TrueType); extract and recursively scan
Decompression safety: Use libraries that hard-limit tables/buffers before allocation
False positives: Keep rules conservative, favor contradictions that are impossible under the spec
Version drift: Re-baseline bounds (e.g., VP8L table sizes) when upstream parsers change limits

Related Tools

Tool	Purpose
ElegantBouncer	Structural scanner for the detections above
pdfid/pdf-parser/peepdf	PDF object extraction and static analysis
pdfcpu	PDF linter/sanitizer
fontTools/ttx	Dump TrueType tables and bytecode
exiftool	Read TIFF/DNG/EXIF metadata
dwebp/webpmux	Parse WebP metadata and chunks
heif-info/heif-convert (libheif)	HEIF/AVIF structure inspection
oiiotool	Validate HEIF/AVIF via OpenImageIO
binwalk	Carve embedded files from containers

Quick Reference: Detection Scripts

Use the bundled scripts for common detection tasks:

```
scripts/check_dng_appended_zip.sh
```
- Detects appended ZIP payloads in DNG/TIFF files (LANDFALL pattern)
```
scripts/check_truetype_opcodes.py
```
- Scans TrueType bytecode for undocumented opcodes (TRIANGULATION pattern)
```
scripts/validate_heif_structure.py
```
- Validates HEIF overlay bounds and references

Hacktricks-skills structural-exploit-detection

Structural File-Format Exploit Detection

When to Use This Skill

Why Structure, Not Signatures

Detection Patterns by Format

PDF/JBIG2 – FORCEDENTRY (CVE-2021-30860)

WebP/VP8L – BLASTPASS (CVE-2023-4863)

TrueType – TRIANGULATION (CVE-2023-41990)

DNG/TIFF – CVE-2025-43300

DNG/TIFF – LANDFALL (CVE-2025-21042)

HEIF/AVIF – libheif & libde265 (CVE-2024-41311, CVE-2025-29482, CVE-2025-65586)

Implementation Patterns

DFIR Tips and Edge Cases

Related Tools

Quick Reference: Detection Scripts

References