Extracting IOCs from Malware Samples
When to Use
- A malware analysis (static or dynamic) is complete and actionable indicators need to be extracted for defense teams
- Building blocklists for firewalls, proxies, and DNS sinkholes from analyzed samples
- Creating YARA rules, Snort/Suricata signatures, or SIEM detection content from malware artifacts
- Contributing to threat intelligence sharing platforms (MISP, OTX, ThreatConnect)
- Tracking malware campaigns by correlating IOCs across multiple samples
Do not use for IOCs from unverified sources without validation; false positives in blocklists can disrupt legitimate business operations.
Prerequisites
- Python 3.8+ with the iocextract, pefile, and yara-python libraries installed
- Completed malware analysis report (static analysis, dynamic analysis, or reverse engineering)
- Access to PCAP files, memory dumps, or sandbox reports from the analysis
- MISP instance or STIX/TAXII server for structured IOC sharing
- VirusTotal API key for IOC enrichment and validation
- CyberChef for decoding obfuscated indicators
Workflow
Step 1: Extract File-Based IOCs
Compute hashes and identify file metadata indicators:
```bash
# Generate all standard hashes
md5sum malware_sample.exe
sha1sum malware_sample.exe
sha256sum malware_sample.exe

# Generate ssdeep fuzzy hash for similarity matching
ssdeep malware_sample.exe

# Generate imphash (import hash) for PE files
python3 -c "
import pefile
pe = pefile.PE('malware_sample.exe')
print(f'Imphash: {pe.get_imphash()}')
"

# Generate TLSH (Trend Micro Locality Sensitive Hash)
python3 -c "
import tlsh
with open('malware_sample.exe', 'rb') as f:
    h = tlsh.hash(f.read())
print(f'TLSH: {h}')
"

# Compile file metadata IOCs
python3 << 'PYEOF'
import datetime
import hashlib

import pefile

pe = pefile.PE("malware_sample.exe")
print("FILE IOCs:")
with open("malware_sample.exe", "rb") as f:
    data = f.read()
print(f"  MD5:       {hashlib.md5(data).hexdigest()}")
print(f"  SHA-1:     {hashlib.sha1(data).hexdigest()}")
print(f"  SHA-256:   {hashlib.sha256(data).hexdigest()}")
print(f"  File Size: {len(data)} bytes")

# Timezone-aware conversion; utcfromtimestamp() is deprecated since Python 3.12
ts = pe.FILE_HEADER.TimeDateStamp
compile_time = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)
print(f"  Compile:   {compile_time:%Y-%m-%d %H:%M:%S} UTC")
print(f"  Imphash:   {pe.get_imphash()}")
PYEOF
```
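The script above reads the whole sample into memory, which is fine for typical executables but not for large memory dumps or disk images. A minimal sketch of chunked hashing follows; the function name `hash_file` and the 1 MiB chunk size are illustrative choices, not part of the original workflow.

```python
import hashlib


def hash_file(path, chunk_size=1 << 20):
    """Return MD5, SHA-1, and SHA-256 hex digests, reading the file in chunks."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

All three digests are updated in a single pass, so the file is read from disk only once.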
Step 2: Extract Network IOCs
Pull network indicators from strings, PCAP, and sandbox reports:
```python
# Extract network IOCs from strings
import ipaddress
import re

with open("malware_sample.exe", "rb") as f:
    data = f.read()

# Extract ASCII and UTF-16LE ("Unicode") strings
ascii_strings = re.findall(b'[ -~]{4,}', data)
unicode_strings = re.findall(b'(?:[ -~]\x00){4,}', data)
all_strings = [s.decode('ascii', errors='ignore') for s in ascii_strings]
all_strings += [s.decode('utf-16-le', errors='ignore') for s in unicode_strings]

# IP addresses (excluding private/reserved ranges for C2 indicators)
ip_pattern = re.compile(r'\b(?:(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\.){3}(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]?\d)\b')
ips = set()
for s in all_strings:
    for ip in ip_pattern.findall(s):
        # is_global is False for private, loopback, link-local, and other reserved ranges
        if ipaddress.ip_address(ip).is_global:
            ips.add(ip)

# Domain names
domain_pattern = re.compile(r'\b[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z]{2,})+\b')
domains = set()
for s in all_strings:
    for d in domain_pattern.findall(s):
        # Drop file names that look like domains (e.g. kernel32.dll)
        if not d.lower().endswith(('.dll', '.exe', '.sys')):
            domains.add(d)

# URLs
url_pattern = re.compile(r'https?://[^\s<>"{}|\\^`\[\]]+')
urls = set()
for s in all_strings:
    for u in url_pattern.findall(s):
        urls.add(u)

print("NETWORK IOCs:")
print(f"  IPs: {ips}")
print(f"  Domains: {domains}")
print(f"  URLs: {urls}")
```
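Strings pulled from a binary usually include benign domains (certificate authorities, OS vendors) alongside real C2 infrastructure. The sketch below separates the two before anything reaches a blocklist. The suffixes in `BENIGN_SUFFIXES` are hypothetical examples, not a vetted allowlist, and the function names are illustrative.

```python
# Hypothetical allowlist; build the real one from your own environment's baseline.
BENIGN_SUFFIXES = ("microsoft.com", "windowsupdate.com", "digicert.com")


def is_allowlisted(domain, suffixes=BENIGN_SUFFIXES):
    """True if the domain equals an allowlisted suffix or is a subdomain of one."""
    d = domain.lower().rstrip(".")
    return any(d == s or d.endswith("." + s) for s in suffixes)


def triage_domains(domains):
    """Split extracted domains into (candidates, suppressed) without discarding either."""
    candidates, suppressed = set(), set()
    for d in domains:
        (suppressed if is_allowlisted(d) else candidates).add(d)
    return candidates, suppressed
```

Suppressed domains are returned rather than dropped, since malware sometimes abuses legitimate services and an analyst may still want to review them.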
Step 3: Extract Host-Based IOCs
Identify file paths, registry keys, mutexes, and services:
```python
# Extract host-based IOCs from sandbox report
import json

with open("cuckoo_report.json") as fh:
    report = json.load(fh)

# Summary key names differ between Cuckoo versions and forks; adjust to your report
summary = report.get("behavior", {}).get("summary", {})

print("HOST IOCs:")

# File paths created or modified
print("\nFile Paths:")
for path in summary.get("files", []):
    if any(p in path.lower() for p in ["temp", "appdata", "system32", "programdata"]):
        print(f"  [DROPPED] {path}")

# Registry keys for persistence
print("\nRegistry Keys:")
for key in summary.get("write_keys", []):
    if any(p in key.lower() for p in ["run", "service", "startup", "shell"]):
        print(f"  [PERSIST] {key}")

# Mutexes (often unique to a malware family)
print("\nMutexes:")
known_benign = {"Local\\!IETld!Mutex", "RasPbFile"}  # Filter known Windows mutexes
for mutex in summary.get("mutexes", []):
    if mutex not in known_benign:
        print(f"  [MUTEX] {mutex}")

# Started services
print("\nServices:")
for svc in summary.get("started_services", []):
    print(f"  [SERVICE] {svc}")
```
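Sandbox reports record paths with the sandbox account's name in them, which will not match on real endpoints. A minimal sketch of normalizing user profile paths to a `%USER%` placeholder is shown here; the helper name `normalize_path` and the list of shared profiles are assumptions made for illustration.

```python
import re

# Matches the profile directory name in Windows user paths, e.g. C:\Users\alice\...
_USER_DIR = re.compile(r"(?i)^([a-z]:\\(?:Users|Documents and Settings)\\)([^\\]+)")
# Shared profiles that are not user-specific and should stay literal.
_SHARED = {"public", "default", "all users"}


def normalize_path(path):
    """Replace a user-specific profile name with the %USER% placeholder."""
    m = _USER_DIR.match(path)
    if not m or m.group(2).lower() in _SHARED:
        return path
    return m.group(1) + "%USER%" + path[m.end():]
```

For example, `normalize_path(r"C:\Users\alice\Desktop\README_DECRYPT.txt")` yields a path under `C:\Users\%USER%\`, while `C:\Users\Public\...` is left unchanged.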
Step 4: Extract Network IOCs from PCAP
Parse network captures for additional indicators:
```bash
# Extract DNS queries from PCAP
tshark -r capture.pcap -T fields -e dns.qry.name -Y "dns.flags.response == 0" | sort -u

# Extract HTTP hosts and URLs
tshark -r capture.pcap -T fields -e http.host -e http.request.uri -Y "http.request" | sort -u

# Extract TLS server names (SNI)
tshark -r capture.pcap -T fields -e tls.handshake.extensions_server_name -Y "tls.handshake.type == 1" | sort -u

# Extract JA3 hashes
tshark -r capture.pcap -T fields -e tls.handshake.ja3 -Y "tls.handshake.type == 1" | sort -u

# Extract unique destination IPs (replace 10.0.2.15 with the infected host's address)
tshark -r capture.pcap -T fields -e ip.dst -Y "ip.src == 10.0.2.15" | sort -u

# Extract User-Agent strings
tshark -r capture.pcap -T fields -e http.user_agent -Y "http.user_agent" | sort -u
```
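The HTTP command above prints the host and request URI as two tab-separated columns. The sketch below joins them into full URLs so they can be merged with the URLs found in Step 2. The function name `urls_from_tshark` is illustrative, and the `http` default scheme is an assumption (the capture itself does not say whether a proxy or TLS termination was involved).

```python
def urls_from_tshark(output, scheme="http"):
    """Build unique, sorted URLs from tab-separated 'host<TAB>uri' lines."""
    urls = set()
    for line in output.splitlines():
        host, sep, uri = line.partition("\t")
        host = host.strip()
        if not host or not sep:
            continue  # skip rows with no host or no second column
        urls.add(f"{scheme}://{host}{uri.strip() or '/'}")
    return sorted(urls)
```

The input can be captured with `subprocess.run([...], capture_output=True, text=True).stdout`, passing the same tshark arguments as the command above.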
Step 5: Defang and Validate IOCs
Defang indicators for safe sharing and validate against threat intelligence:
```python
# Defang IOCs for safe sharing
def defang_ip(ip):
    return ip.replace(".", "[.]")

def defang_url(url):
    # Replace only the leading scheme so "http" inside the path is left alone
    return url.replace("http", "hxxp", 1).replace(".", "[.]")

def defang_domain(domain):
    return domain.replace(".", "[.]")

# Validate IOCs against VirusTotal
import os
import requests

# Read the key from the environment (variable name is an example) instead of hardcoding it
VT_API_KEY = os.environ.get("VT_API_KEY", "your_api_key")

def _vt_malicious_count(endpoint, value):
    resp = requests.get(
        f"https://www.virustotal.com/api/v3/{endpoint}/{value}",
        headers={"x-apikey": VT_API_KEY},
        timeout=30,
    )
    resp.raise_for_status()  # Surface auth, quota, and not-found errors
    stats = resp.json()["data"]["attributes"]["last_analysis_stats"]
    return stats["malicious"]

def check_vt_ip(ip):
    return _vt_malicious_count("ip_addresses", ip)

def check_vt_domain(domain):
    return _vt_malicious_count("domains", domain)

# Validate each IOC (ips comes from Step 2)
for ip in ips:
    detections = check_vt_ip(ip)
    print(f"  {defang_ip(ip)} - VT: {detections} detections")
```
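Indicators received from reports or other teams are usually already defanged and have to be restored before they can be looked up or loaded into a blocklist. A minimal sketch of the reverse operation follows; the function name `refang` and the set of conventions it handles (`hxxp`, `[.]`, `(.)`, `[:]`) are assumptions, and other conventions exist.

```python
import re


def refang(value):
    """Reverse common defanging conventions so tools can match the indicator."""
    # Restore the scheme only at the start of the value, case-insensitively
    value = re.sub(r"^hxxp", "http", value, flags=re.IGNORECASE)
    return value.replace("[.]", ".").replace("(.)", ".").replace("[:]", ":")
```

Refang immediately before a lookup or export, and keep the defanged form in any text meant for humans.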
Step 6: Export IOCs in Standard Formats
Generate structured IOC outputs for sharing and ingestion:
```python
# Export as STIX 2.1 bundle
import csv
import datetime
import hashlib

from stix2 import Bundle, Indicator

# ips, domains, and urls come from Step 2; compute the sample hash here
with open("malware_sample.exe", "rb") as f:
    sha256_hash = hashlib.sha256(f.read()).hexdigest()

now = datetime.datetime.now(datetime.timezone.utc)
indicators = []

# File hash indicator (STIX 2.1 uses indicator_types for this vocabulary)
indicators.append(Indicator(
    name="Malware SHA-256 Hash",
    pattern=f"[file:hashes.'SHA-256' = '{sha256_hash}']",
    pattern_type="stix",
    valid_from=now,
    indicator_types=["malicious-activity"]
))

# IP indicators
for ip in ips:
    indicators.append(Indicator(
        name=f"C2 IP Address {ip}",
        pattern=f"[ipv4-addr:value = '{ip}']",
        pattern_type="stix",
        valid_from=now,
        indicator_types=["malicious-activity"]
    ))

# Domain indicators
for domain in domains:
    indicators.append(Indicator(
        name=f"C2 Domain {domain}",
        pattern=f"[domain-name:value = '{domain}']",
        pattern_type="stix",
        valid_from=now,
        indicator_types=["malicious-activity"]
    ))

bundle = Bundle(objects=indicators)
with open("iocs_stix.json", "w") as f:
    f.write(bundle.serialize(pretty=True))

# Export as CSV for SIEM ingestion
with open("iocs.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["type", "value", "context", "confidence"])
    writer.writerow(["sha256", sha256_hash, "malware_sample", "high"])
    for ip in ips:
        writer.writerow(["ipv4", ip, "c2_server", "high"])
    for domain in domains:
        writer.writerow(["domain", domain, "c2_domain", "high"])
    for url in urls:
        writer.writerow(["url", url, "c2_url", "high"])
```
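The export above covers hashes, IPs, and domains, whose values never contain quotes or backslashes. Host indicators such as registry keys and file paths do, and STIX pattern string literals require backslashes and single quotes to be escaped with a backslash. A minimal sketch of a pattern builder follows; the helper names `stix_literal` and `make_pattern` are hypothetical and not part of the stix2 library.

```python
def stix_literal(value):
    """Escape a value for use inside a single-quoted STIX pattern string literal."""
    # Escape backslashes first so the quote escapes are not doubled afterwards
    return value.replace("\\", "\\\\").replace("'", "\\'")


def make_pattern(object_path, value):
    """Build a simple equality pattern such as [domain-name:value = 'evil.net']."""
    return f"[{object_path} = '{stix_literal(value)}']"
```

For instance, `make_pattern("windows-registry-key:key", key)` can be passed as the `pattern` argument when building an indicator for a persistence key.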
Key Concepts
| Term | Definition |
|---|---|
| IOC (Indicator of Compromise) | Forensic artifact observed in a network or system that indicates a potential intrusion: hashes, IPs, domains, file paths, registry keys |
| Defanging | Modifying IOCs to prevent accidental activation (e.g., replacing dots with [.] in URLs and IPs for safe sharing in reports) |
| Imphash | MD5 hash of the imported library and function names in a PE file, taken in import-table order; samples from the same malware family often share the same imphash |
| STIX/TAXII | Structured Threat Information Expression / Trusted Automated Exchange of Intelligence Information; standards for encoding and transmitting threat intelligence |
| JA3/JA3S | TLS client/server fingerprint based on ClientHello/ServerHello parameters; can help identify malware families by their TLS implementation, though unrelated software built on the same TLS library may share a fingerprint |
| Fuzzy Hashing (ssdeep) | Context-triggered piecewise hashing that identifies similar files even with minor modifications; useful for malware variant detection |
| MISP | Malware Information Sharing Platform; open-source threat intelligence platform for collecting, storing, and sharing IOCs |
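
To make the JA3 entry concrete: the fingerprint is the MD5 of a comma-separated string of five ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats), each written as decimal values joined by hyphens, with GREASE values left out. The sketch below assumes the fields have already been parsed from the handshake; the function name `ja3_hash` is illustrative.

```python
import hashlib

# GREASE values (0x0A0A, 0x1A1A, ... 0xFAFA) are excluded from the fingerprint.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Return (ja3_string, md5_hex) for already-parsed ClientHello fields."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    ja3_string = ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )
    return ja3_string, hashlib.md5(ja3_string.encode("ascii")).hexdigest()
```

Because the hash depends only on these five fields, two unrelated programs built on the same TLS library and configuration produce the same JA3 value.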
Tools & Systems
- iocextract (Python): Automated IOC extraction library supporting IPs, URLs, email addresses, hashes, and YARA rules from text, including defanged indicators
- MISP: Open-source threat intelligence sharing platform for structured IOC management and distribution
- CyberChef: Web-based tool for decoding, decrypting, and transforming data; useful for deobfuscating encoded IOCs
- tshark: Command-line network protocol analyzer for extracting network IOCs from PCAP files
- VirusTotal: Online service for validating and enriching IOCs with community detection results and threat intelligence
Common Scenarios
Scenario: Building a Comprehensive IOC Package from a Ransomware Sample
Context: A ransomware incident requires rapid IOC extraction for blocking across the enterprise while the full investigation continues. Multiple data sources are available: the sample binary, PCAP from network monitoring, and a Cuckoo sandbox report.
Approach:
- Compute all file hashes (MD5, SHA-1, SHA-256, imphash, ssdeep) for the ransomware binary and any dropped files
- Extract network IOCs from strings in the binary (hardcoded C2 addresses)
- Parse the PCAP for DNS queries, HTTP requests, and TLS SNI fields
- Extract host IOCs from the sandbox report (file paths, registry keys, mutexes, ransom note filenames)
- Validate all network IOCs against VirusTotal to confirm malicious status and check for known associations
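- Defang all indicators in the written report, and compile the raw (non-defanged) values into STIX 2.1 format for sharing and CSV for SIEM ingestion, since detection tools cannot match defanged values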
- Submit to MISP event for organizational and community sharing
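Because the approach above draws on three sources (binary strings, PCAP, sandbox report), the same indicator often appears more than once. A minimal sketch of merging them while recording where each indicator was seen is given below; the function name `merge_iocs`, the type labels, and the choice of which types to lowercase are assumptions.

```python
from collections import defaultdict

# Types where case carries no meaning; URLs and mutexes are left as-is.
CASE_INSENSITIVE = {"domain", "ipv4", "md5", "sha1", "sha256"}


def merge_iocs(*sources):
    """Merge (source_name, {ioc_type: values}) pairs into one deduplicated map.

    Returns {(ioc_type, value): sorted list of source names}.
    """
    merged = defaultdict(set)
    for source_name, iocs in sources:
        for ioc_type, values in iocs.items():
            for value in values:
                value = value.strip()
                if ioc_type in CASE_INSENSITIVE:
                    value = value.lower()
                merged[(ioc_type, value)].add(source_name)
    return {key: sorted(names) for key, names in merged.items()}
```

An indicator confirmed by more than one source (for example, a domain hardcoded in the binary and also resolved in the PCAP) is a reasonable candidate for a higher confidence rating.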
Pitfalls:
- Including IP addresses of legitimate CDNs or cloud services without validating context (e.g., AWS IPs used for hosting, not inherently malicious)
- Not defanging URLs and IPs in reports, leading to accidental clicks or DNS resolution
- Extracting strings from packed binaries (IOCs from packed samples are unreliable; unpack first)
- Forgetting to include dropped file hashes (the initial dropper and the final payload are separate IOCs)
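One quick way to catch the packed-binary pitfall before trusting string-based IOCs is to measure byte entropy: compressed or encrypted data approaches 8 bits per byte. The sketch below computes Shannon entropy over a byte string. Treating values above roughly 7.0 as a sign of packing is a common rule of thumb and an assumption here, not a guarantee.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())
```

Applying this per PE section rather than to the whole file gives a clearer signal; pefile section objects also provide a `get_entropy()` method for the same purpose.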
Output Format
```
IOC EXTRACTION REPORT
======================
Sample:           ransomware.exe
Analysis Date:    2025-09-15
Analyst:          [Name]

FILE INDICATORS
SHA-256:          e3b0c44298fc1c149afbf4c8996fb924...
SHA-1:            da39a3ee5e6b4b0d3255bfef95601890afd80709
MD5:              d41d8cd98f00b204e9800998ecf8427e
Imphash:          a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6
ssdeep:           3072:kJh3bN7fY+aUkJh3bN7fY+aU:kJh3R7aUkJh3R7aU

NETWORK INDICATORS
C2 IPs:           185.220.101[.]42, 91.215.85[.]17
C2 Domains:       update.malicious[.]com, backup.evil[.]net
C2 URLs:          hxxps://update.malicious[.]com/gate.php
                  hxxps://backup.evil[.]net/gate.php
JA3 Hash:         a0e9f5d64349fb13191bc781f81f42e1
User-Agent:       Mozilla/5.0 (compatible; MSIE 10.0)

HOST INDICATORS
File Paths:       C:\Users\Public\svchost.exe
                  C:\Users\%USER%\AppData\Local\Temp\payload.dll
                  C:\Users\%USER%\Desktop\README_DECRYPT.txt
Registry Keys:    HKCU\Software\Microsoft\Windows\CurrentVersion\Run\WindowsUpdate
Mutexes:          Global\CryptLocker_2025_Q3
Services:         FakeWindowsUpdate

CONFIDENCE ASSESSMENT
High Confidence:    SHA-256, C2 IPs (validated via VT), Mutexes
Medium Confidence:  Domains (could be compromised legitimate sites)
Low Confidence:     User-Agent (common string, high false positive risk)

EXPORT FILES
stix_bundle.json  - STIX 2.1 format for TIP ingestion
iocs.csv          - Flat CSV for SIEM blocklist import
yara_rule.yar     - YARA detection rule
```
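The report lists a `yara_rule.yar` export, but none of the workflow steps produce one. A minimal sketch of rendering a string-based YARA rule from extracted host indicators follows. The function names, the `min_matches` default, and the choice of the mutex and ransom note name as strings are illustrative assumptions; a production rule would need tuning and testing against clean files.

```python
def yara_escape(text):
    """Escape backslashes and double quotes for a YARA text string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_yara_rule(name, sha256, strings, min_matches=2):
    """Render a simple YARA rule that matches PE files containing the given strings."""
    if not strings:
        raise ValueError("at least one string is required")
    lines = [f"rule {name}", "{", "    meta:", f'        sha256 = "{sha256}"', "    strings:"]
    for i, s in enumerate(strings):
        lines.append(f'        $s{i} = "{yara_escape(s)}" ascii wide')
    # Never require more matches than there are strings
    count = min(min_matches, len(strings))
    # uint16(0) == 0x5A4D restricts matches to files starting with the MZ header
    lines += ["    condition:", f"        uint16(0) == 0x5A4D and {count} of ($s*)", "}"]
    return "\n".join(lines) + "\n"
```

Writing the returned text to `yara_rule.yar` and compiling it with yara-python is a quick syntax check before distribution.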