Hacktricks-skills stego-detection

Detect and analyze steganographic payloads in files, especially malware hiding data in images, metadata, and trailing bytes. Use this skill whenever the user needs to investigate suspicious files, analyze potential steganography, hunt for hidden payloads, or triage files that might contain concealed data. Trigger on mentions of steganography, hidden payloads, suspicious images, file forensics, malware analysis, or when examining files that seem to contain more than they should.

Install

Clone the upstream repo:

git clone https://github.com/abelrguezr/hacktricks-skills

Manifest: skills/stego/malware-and-network/malware-and-network/SKILL.MD

Steganography Detection & Analysis

A skill for detecting and analyzing steganographic payloads in files, with a focus on practical malware patterns rather than academic pixel-level techniques.

When to use this skill

Use this skill when:

  • Investigating suspicious files that might contain hidden data
  • Analyzing images or documents that could be carrying payloads
  • Hunting for steganographic techniques in malware samples
  • Triaging files downloaded from suspicious sources
  • Performing forensic analysis of potentially compromised files
  • Looking for marker-delimited payloads in images
  • Checking for data hidden in metadata or trailing bytes

Detection patterns

1. Marker-delimited payloads in images

Commodity malware often hides Base64 payloads as plain text inside valid images using unique marker strings.

Detection approach:

  • Scan downloaded images for delimiter strings
  • Look for scripts that fetch images and immediately call Base64 decoding
  • Check for HTTP content-type mismatches (image/* response with ASCII/Base64 body)

Common marker patterns:

  • Custom delimiters like <<marker>>, ###START###, BEGIN_PAYLOAD
  • Base64 blocks surrounded by unique strings
  • Text embedded in image metadata or trailing sections
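The marker scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the contents of scripts/detect_markers.py: the function name find_marked_payloads, the DEFAULT_MARKERS pairs, and the ###END### closing marker are assumptions made for the example.

```python
import base64
import re

# Hypothetical marker pairs for illustration; real campaigns pick their own strings.
# "###END###" is an assumed closing marker, not taken from any specific sample.
DEFAULT_MARKERS = (
    (b"<<marker>>", b"<<marker>>"),
    (b"###START###", b"###END###"),
)

# A Base64 run: alphabet characters followed by at most two padding characters.
B64_RUN = re.compile(rb"[A-Za-z0-9+/]+={0,2}")


def find_marked_payloads(data, markers=DEFAULT_MARKERS):
    """Return (start_marker, decoded_bytes) for each Base64 block found between markers."""
    hits = []
    for start, end in markers:
        pattern = re.compile(re.escape(start) + rb"(.*?)" + re.escape(end), re.DOTALL)
        for match in pattern.finditer(data):
            # Drop line breaks and spaces that some encoders insert.
            candidate = re.sub(rb"\s+", b"", match.group(1))
            if not candidate or not B64_RUN.fullmatch(candidate):
                continue
            try:
                hits.append((start, base64.b64decode(candidate, validate=True)))
            except ValueError:
                continue  # looked like Base64 but did not decode cleanly
    return hits
```

Calling find_marked_payloads(open(path, "rb").read()) on a downloaded image returns an empty list when no marker pair encloses valid Base64, which keeps false positives on ordinary binary image data low.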

2. Metadata hiding

Payloads hidden in file metadata are usually faster to detect than pixel-level steganography, because standard metadata tools print them as readable fields.

Check these locations:

  • EXIF/XMP/IPTC in JPEG/TIFF
  • PNG text chunks: tEXt, iTXt, zTXt
  • JPEG segments: COM, APPn
  • Document metadata: Office files, PDF metadata
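For the PNG case, the text chunks can be listed without third-party libraries by walking the chunk structure. The sketch below is an assumption-laden illustration rather than the logic of scripts/extract_metadata.py; the name png_text_chunks is hypothetical, CRCs are not verified, and zTXt/iTXt values are returned raw rather than decompressed.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNKS = {b"tEXt", b"zTXt", b"iTXt"}


def png_text_chunks(data):
    """Yield (chunk_type, keyword, raw_value) for each textual chunk in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[offset:offset + 8])
        body = data[offset + 8:offset + 8 + length]
        if ctype in TEXT_CHUNKS:
            # The keyword is terminated by a NUL byte; the rest is the value
            # (still compressed for zTXt, still carrying flag fields for iTXt).
            keyword, _, rest = body.partition(b"\x00")
            yield ctype.decode("ascii"), keyword.decode("latin-1"), rest
        if ctype == b"IEND":
            break
        offset += 12 + length
```

Unusually long values, or values that consist entirely of Base64 characters, are the ones worth a closer look.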

3. Trailing bytes

Data appended after the format's end-of-file marker. Most parsers and viewers stop reading at the marker, so the file still renders normally.

Examples:

  • Data after PNG IEND chunk
  • Data after JPEG FF D9 marker
  • Data after ZIP end-of-central-directory
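A rough sketch of the PNG and JPEG checks follows. It is not the implementation of scripts/check_trailing.py; the function names are hypothetical, and the JPEG branch is a heuristic that uses the last FF D9 in the file, so it under-reports when the appended data itself contains those two bytes.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_end_offset(data):
    """Return the offset just past the IEND chunk, or None if IEND is not found."""
    offset = len(PNG_SIGNATURE)
    while offset + 12 <= len(data):
        length, ctype = struct.unpack(">I4s", data[offset:offset + 8])
        offset += 12 + length  # length field + type + data + CRC
        if ctype == b"IEND":
            return offset
    return None


def trailing_bytes(data):
    """Return bytes found after the format's end marker (empty if none or unknown format)."""
    if data.startswith(PNG_SIGNATURE):
        end = png_end_offset(data)
    elif data.startswith(b"\xff\xd8"):
        # Heuristic: treat the last FF D9 as the end-of-image marker.
        idx = data.rfind(b"\xff\xd9")
        end = idx + 2 if idx != -1 else None
    else:
        end = None
    return b"" if end is None else data[end:]
```

A non-empty result is only a lead: some legitimate tools append padding or their own trailers, so the trailing bytes still need to be inspected.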

4. Embedded archives

ZIP/7z archives embedded or appended to files.
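A simple way to spot such archives is to search for their magic bytes at any offset, which is roughly what a signature scanner does. The sketch below assumes only two signatures (ZIP local file header and 7z) and a hypothetical helper name; short signatures like these can also occur by chance in compressed image data, so hits need confirmation.

```python
# Well-known magic bytes: ZIP local file header and the 7z signature.
ARCHIVE_SIGNATURES = {
    "zip": b"PK\x03\x04",
    "7z": b"7z\xbc\xaf\x27\x1c",
}


def find_archive_offsets(data):
    """Return sorted (offset, kind) pairs for every archive signature found in data."""
    hits = []
    for kind, magic in ARCHIVE_SIGNATURES.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, kind))
            pos = data.find(magic, pos + 1)
    return sorted(hits)
```

An archive signature at a non-zero offset inside a file that identifies as an image is a reasonable trigger for carving the bytes from that offset and testing them with an archive tool.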

5. Polyglots

Files crafted to be valid under multiple parsers (image + script + archive).
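One common polyglot, an image that is also a readable ZIP, can be checked with the standard library alone. This is a sketch under stated assumptions: polyglot_report is a hypothetical name, only three image formats are recognized, and only the image-plus-ZIP combination is tested.

```python
import io
import zipfile

IMAGE_MAGICS = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
}


def polyglot_report(data):
    """Report whether data starts like an image and is also readable as a ZIP archive."""
    image_kind = next(
        (kind for kind, magic in IMAGE_MAGICS.items() if data.startswith(magic)),
        None,
    )
    # is_zipfile looks for the end-of-central-directory record near the end of
    # the data, so it still succeeds when image bytes are prepended to the archive.
    is_zip = zipfile.is_zipfile(io.BytesIO(data))
    return {
        "image": image_kind,
        "zip": is_zip,
        "polyglot": image_kind is not None and is_zip,
    }
```

The same idea extends to other parser pairs: ask each parser independently whether it accepts the file, and flag any file that more than one accepts.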

Triage workflow

Step 1: Basic file identification

file sample
exiftool -a -u -g1 sample
strings -n 8 sample | head
binwalk sample
binwalk -e sample

Step 2: Check for marker-delimited payloads

Use the detect_markers.py script to scan for common delimiter patterns.

Step 3: Extract and analyze metadata

Use extract_metadata.py to pull all metadata from the file.

Step 4: Check trailing bytes

Use check_trailing.py to identify data after file end markers.

Step 5: Look for embedded archives

binwalk -e sample
zipinfo sample 2>/dev/null
7z l sample 2>/dev/null

Detection scripts

scripts/detect_markers.py

Scans files for marker-delimited Base64 payloads. Usage:

python scripts/detect_markers.py <file> [--markers "start_marker,end_marker"]

scripts/extract_metadata.py

Extracts all metadata from images and documents. Usage:

python scripts/extract_metadata.py <file>

scripts/check_trailing.py

Checks for data after file end markers. Usage:

python scripts/check_trailing.py <file>

ATT&CK mapping

  • T1027.003: Obfuscated Files or Information: Steganography
  • T1566: Phishing (delivery via steganographic images)
  • T1059: Command and Scripting Interpreter (PowerShell stagers; T1059.001 is the PowerShell sub-technique)

Hunting queries

PowerShell detection

Look for scripts that:

  • Download images over HTTP(S)
  • Immediately call Base64 decoding (FromBase64String in PowerShell/.NET, atob in JavaScript stagers)
  • Load assemblies from decoded data
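The three behaviours above can be turned into a crude triage check over script text. The regexes and the function name matched_indicators below are illustrative assumptions, not a vetted detection rule; obfuscated stagers will evade simple string matching.

```python
import re

# Illustrative indicator patterns; tune or replace them for real hunting.
INDICATORS = {
    "web_download": re.compile(
        r"DownloadData|DownloadString|Invoke-WebRequest|Invoke-RestMethod",
        re.IGNORECASE,
    ),
    "image_url": re.compile(
        r"https?://[^\s'\"]+\.(?:png|jpe?g|gif|bmp)", re.IGNORECASE
    ),
    "base64_decode": re.compile(r"FromBase64String", re.IGNORECASE),
    "assembly_load": re.compile(r"Reflection\.Assembly\]?::Load", re.IGNORECASE),
}


def matched_indicators(script_text):
    """Return the sorted names of the indicators that appear in a script."""
    return sorted(name for name, rx in INDICATORS.items() if rx.search(script_text))
```

A script that matches all four indicators is a much stronger lead than one that matches a single indicator, since each behaviour on its own is common in benign automation.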

Network detection

  • HTTP responses with Content-Type: image/* but body contains long ASCII/Base64
  • Unusual image sizes (very large for the apparent content)
  • Images with high entropy in non-image sections
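The content-type mismatch and entropy checks can be approximated as follows. This is a sketch with assumed names and thresholds: inspect_image_response and shannon_entropy are hypothetical helpers, the 200-character Base64 run length is arbitrary, and only a handful of raster formats are recognized, so text-based types such as image/svg+xml would be flagged.

```python
import math
import re
from collections import Counter

# Magic bytes for a few raster formats (PNG, JPEG, GIF, BMP); not exhaustive.
IMAGE_MAGICS = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a", b"BM")

# 200 is an arbitrary threshold chosen for illustration.
LONG_BASE64 = re.compile(rb"[A-Za-z0-9+/]{200,}={0,2}")


def shannon_entropy(data):
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def inspect_image_response(content_type, body):
    """Return a list of findings for a response that claims to be an image."""
    findings = []
    if content_type.lower().startswith("image/"):
        if not body.startswith(IMAGE_MAGICS):
            findings.append("no_image_magic")
        if LONG_BASE64.search(body):
            findings.append("long_base64_run")
    return findings
```

shannon_entropy can be applied to individual regions, such as a metadata value or the bytes after the end marker, to compare them against the rest of the file. Base64 text tops out at 6 bits per byte, while compressed or encrypted data approaches 8.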


Example analysis

Scenario: Suspicious GIF downloaded from phishing email

  1. Run file suspicious.gif - confirms valid GIF
  2. Run strings -n 8 suspicious.gif - reveals the <<sudo_png>> marker
  3. Run python scripts/detect_markers.py suspicious.gif - extracts Base64 payload
  4. Decode the payload and analyze it with file or yara
  5. Check metadata with exiftool for additional indicators
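As a quick first look at a decoded payload before running file or yara, its leading bytes can be compared against a few well-known magic values. The helper name identify_decoded and the short signature table are assumptions for illustration; this is far less thorough than either tool.

```python
import base64

# A few well-known magic values; a real triage would rely on file or yara.
KNOWN_MAGICS = [
    (b"MZ", "Windows PE/DOS executable"),
    (b"\x7fELF", "ELF executable"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x1f\x8b", "gzip stream"),
]


def identify_decoded(b64_text):
    """Decode Base64 text and return a coarse label based on its leading bytes."""
    raw = base64.b64decode(b64_text)
    for magic, label in KNOWN_MAGICS:
        if raw.startswith(magic):
            return label
    return "unknown"
```

A result of "unknown" does not mean the payload is benign: it may be encrypted, compressed with another scheme, or a script in plain text.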

Best practices

  1. Always work with copies - never modify original evidence
  2. Document everything - save all outputs and timestamps
  3. Use multiple tools - different tools catch different patterns
  4. Check both content and metadata - payloads hide in both places
  5. Consider the context - why was this file downloaded? What's the threat model?
  6. Automate where possible - use scripts for repetitive checks
  7. Stay updated - new steganographic techniques emerge regularly