Harness-engineering security-deserialization-attacks

Deserialization Attacks

install

source · Clone the upstream repo

git clone https://github.com/Intense-Visions/harness-engineering

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/Intense-Visions/harness-engineering "$T" && mkdir -p ~/.claude/skills && cp -r "$T/agents/skills/claude-code/security-deserialization-attacks" ~/.claude/skills/intense-visions-harness-engineering-security-deserialization-attacks-d4dc5d && rm -rf "$T"

manifest: agents/skills/claude-code/security-deserialization-attacks/SKILL.md

source content

Deserialization Attacks

Deserialization reconstructs objects from byte streams -- and in most languages, that reconstruction executes code, meaning an attacker who controls the serialized input controls what code runs on your server

When to Use

Reviewing code that deserializes data from untrusted sources (user input, network, files)
Evaluating the security of data interchange formats in an application architecture
Assessing whether Java
```
ObjectInputStream
```
, Python
```
pickle
```
, PHP
```
unserialize()
```
, Ruby
```
Marshal.load()
```
, or YAML
```
load()
```
are used with external data
Designing APIs or message formats and choosing between serialization approaches
Investigating suspicious RCE (remote code execution) vectors in a web application
Auditing session management that stores serialized objects in cookies or URL parameters

Threat Context

Insecure deserialization consistently ranks among the most severe vulnerability classes because it directly enables remote code execution -- the attacker does not need to find a separate code injection point; the deserialization mechanism itself executes arbitrary code:

Equifax breach (2017, CVE-2017-5638): An Apache Struts vulnerability allowed remote code execution through a crafted
```
Content-Type
```
header that triggered OGNL expression evaluation during deserialization. The attacker gained shell access to servers containing 145.5 million Americans' personal data. The vulnerability had a patch available two months before the breach; Equifax failed to apply it.
Apache Commons Collections (2015): Security researchers published a gadget chain in Apache Commons Collections that achieved arbitrary command execution through Java deserialization. Because Commons Collections was on the classpath of virtually every Java enterprise application (including WebLogic, JBoss, Jenkins, and WebSphere), this single disclosure made thousands of production systems immediately exploitable.
Jenkins (CVE-2016-0792): A deserialization vulnerability in Jenkins allowed unauthenticated remote code execution. The exploit used a Groovy-based gadget chain to execute operating system commands. Jenkins instances exposed to the internet were compromised en masse for cryptocurrency mining and as pivot points into internal networks.
PHP object injection: PHP's
```
unserialize()
```
reconstructs objects and invokes magic methods (
```
__wakeup()
```
,
```
__destruct()
```
,
```
__toString()
```
). Attackers chain these magic methods across classes in the application and its dependencies to achieve file writes, SQL injection, or remote code execution. WordPress plugins have been repeatedly vulnerable to this pattern.

Instructions

Understand why deserialization is inherently dangerous. Native serialization formats (Java
```
Serializable
```
, Python
```
pickle
```
, PHP
```
serialize
```
, Ruby
```
Marshal
```
, .NET
```
BinaryFormatter
```
) are not data formats -- they are code formats. Deserializing an object means: allocating memory for the object, setting its fields to attacker-controlled values, and executing lifecycle methods (constructors, finalizers, magic methods). The attacker does not need to inject new code -- they manipulate existing classes in the application's classpath to form a chain of method calls that achieves their objective. This is the gadget chain concept.
Never deserialize untrusted input with native serialization. This is the primary rule and it has no exceptions:
- Java: Do not use
```
ObjectInputStream.readObject()
```
  on data from users, APIs, message queues, cookies, or any source the attacker can influence. Replace with JSON (Jackson, Gson) or Protocol Buffers. If
```
ObjectInputStream
```
  cannot be removed, use
```
ObjectInputFilter
```
  (Java 9+) to allowlist specific classes before deserialization begins.
- Python: Do not use
```
pickle.loads()
```
  on untrusted data. The pickle protocol executes arbitrary Python bytecode during deserialization -- there is no safe way to use pickle with untrusted input. Use JSON (
```
json.loads()
```
  ), MessagePack, or Protocol Buffers.
- PHP: Do not use
```
unserialize()
```
  on user-controlled input. Use
```
json_decode()
```
  . If
```
unserialize()
```
  is required, PHP 7+ supports an
```
allowed_classes
```
  option that restricts which classes can be instantiated -- use it with an explicit allowlist, never
```
true
```
  .
- Ruby: Do not use
```
Marshal.load()
```
  on untrusted data. Use JSON (
```
JSON.parse()
```
  ) or MessagePack.
- YAML: In Python, use
```
yaml.safe_load()
```
  , never
```
yaml.load()
```
  . In Ruby, use
```
YAML.safe_load()
```
  , never
```
YAML.load()
```
  (Ruby 3.1+ changed the default, but explicit
```
safe_load
```
  is still preferred for clarity). The unsafe YAML loaders instantiate arbitrary classes from YAML tags (
```
!!python/object/apply:os.system
```
  ).
- .NET: Do not use
```
BinaryFormatter
```
  ,
```
SoapFormatter
```
  ,
```
NetDataContractSerializer
```
  , or
```
ObjectStateFormatter
```
  . Microsoft has deprecated
```
BinaryFormatter
```
  as of .NET 8 due to its inherent insecurity. Use
```
System.Text.Json
```
  or Protocol Buffers.
Use data-only interchange formats. JSON, Protocol Buffers, MessagePack, FlatBuffers, and similar formats serialize data (strings, numbers, arrays, maps) without encoding type information or executable behavior. Deserializing JSON cannot instantiate objects or execute methods -- it produces primitive data structures that the application then validates and maps to domain objects explicitly. This separation between data parsing and object construction is the fundamental defense against deserialization attacks.
When native serialization cannot be removed, apply defense in depth:
- Allowlist classes before deserialization. Java's
```
ObjectInputFilter
```
  (JEP 290) lets you specify exactly which classes may be deserialized. Reject everything not on the list. This must happen before
```
readObject()
```
  is called -- once deserialization begins, gadget chain code has already executed.
- Verify integrity before deserialization. Sign serialized payloads with HMAC. Before deserializing, verify the signature. If the attacker cannot forge the signature, they cannot inject a malicious payload. This requires the signing key to remain secret.
- Isolate deserialization in a sandbox. Run deserialization in a restricted process with minimal permissions, seccomp filters, or a container with no network access. If the gadget chain achieves code execution, the sandbox limits the damage.
- Monitor and alert on deserialization failures. Deserialization exceptions (unexpected class, invalid stream) often indicate an exploitation attempt. Log these failures with full context and alert on anomalous patterns.
Audit the codebase for deserialization points. Search for all uses of native deserialization APIs and categorize each by trust level of the input:
- ```
ObjectInputStream
```
  (Java),
```
pickle.loads
```
  /
```
pickle.load
```
  (Python),
```
unserialize
```
  (PHP),
```
Marshal.load
```
  (Ruby),
```
yaml.load
```
  (Python/Ruby),
```
BinaryFormatter.Deserialize
```
  (.NET),
```
readObject
```
  (Java),
```
XMLDecoder
```
  (Java)
- For each occurrence: is the input from a trusted source (internal-only, signed)? If not, replace with a safe alternative or add allowlisting and integrity verification.

Details

How gadget chains work -- the Java example in detail: The attacker's goal is to execute
```
Runtime.getRuntime().exec("malicious command")
```
. But
```
Runtime
```
is not
```
Serializable
```
, so it cannot appear directly in the stream. Instead, the attacker finds a chain of serializable classes where: Class A's
```
readObject()
```
calls a method on a field. That field is set to an instance of Class B, whose method calls another method on another field. This chain continues until a class in the chain invokes
```
Runtime.exec()
```
or equivalent through reflection,
```
ProcessBuilder
```
, or
```
ScriptEngine
```
. The Apache Commons Collections gadget chain uses
```
InvokerTransformer
```
(which calls any method via reflection) wired through
```
ChainedTransformer
```
and triggered by
```
LazyMap
```
during
```
readObject()
```
. The attacker crafts a serialized object graph that arranges these classes in the right configuration. Tools like ysoserial automate gadget chain construction for known library combinations.
Python pickle as a universal code execution primitive: The pickle protocol includes an opcode (
```
REDUCE
```
) that calls any callable with any arguments. A malicious pickle payload can contain
```
os.system("rm -rf /")
```
or equivalent. Unlike Java gadget chains, which require finding suitable classes on the classpath, pickle exploitation is trivial because the protocol itself supports arbitrary function calls. There is no safe subset of pickle.
```
pickle.loads(untrusted_data)
```
is equivalent to
```
eval(untrusted_data)
```
.
Deserialization in session management and view state: Some web frameworks store session state by serializing objects, encoding them (often Base64), and placing them in cookies or hidden form fields. If the serialized data is not signed or encrypted, the attacker modifies the cookie to inject a malicious serialized object. ASP.NET ViewState, Java JSF ViewState, and PHP session serialization have all been exploited this way. The mitigation is to sign and encrypt serialized session data (ASP.NET's
```
machineKey
```
or a modern equivalent) and to verify the signature before deserialization.
YAML deserialization as a code execution vector: YAML supports language-specific type tags that instruct the parser to instantiate objects. In Python,
```
!!python/object/apply:os.system ["id"]
```
executes the
```
id
```
command. In Ruby,
```
!!ruby/object:Gem::Installer
```
has been used to achieve code execution. The
```
safe_load
```
variants restrict the parser to basic data types (strings, numbers, lists, maps) and reject type tags. Any YAML parser configured to support arbitrary type tags is a deserialization vulnerability.

Anti-Patterns

Deserializing user input with native serialization formats. Accepting serialized Java objects in an HTTP request body, a message queue, or a cookie. The application may validate the deserialized object after construction, but by then the gadget chain has already executed. Deserialization attacks happen during deserialization, not after. Use JSON or Protocol Buffers for all external data interchange.
Allowlisting after deserialization. Checking the class of the deserialized object after
```
readObject()
```
returns. This is too late. The gadget chain executes inside
```
readObject()
```
before control returns to the caller. Allowlisting must happen before deserialization, using
```
ObjectInputFilter
```
in Java or equivalent mechanisms that reject disallowed classes before their
```
readObject()
```
methods execute.
Using
```
yaml.load()
```
instead of
yaml.safe_load()
. In Python's PyYAML library,
```
yaml.load()
```
without a
```
Loader
```
argument (or with
```
Loader=yaml.FullLoader
```
prior to PyYAML 6.0) can instantiate arbitrary Python objects from YAML tags. A YAML configuration file that an attacker can modify becomes a remote code execution vector.
```
yaml.safe_load()
```
restricts parsing to basic data types and is safe for untrusted input.
Trusting serialized objects in cookies or URL parameters. Storing a serialized user object in a cookie without signing it, then deserializing it on each request. The attacker modifies the cookie to contain a gadget chain payload. Even Base64 encoding provides no protection -- it is encoding, not encryption or signing. If serialized data must travel through the client, sign it with HMAC and verify the signature before deserialization.
Filtering known gadget chains instead of fixing the root cause. Maintaining a blocklist of known dangerous classes (e.g., blocking
```
InvokerTransformer
```
in Java deserialization filters). New gadget chains are discovered regularly as new libraries are added to the classpath. A blocklist is always incomplete. Use an allowlist of the specific classes the application legitimately deserializes, or replace native serialization entirely.
Assuming JSON is always safe. While JSON itself is a data-only format, JSON parsing libraries that support polymorphic type handling can reintroduce deserialization vulnerabilities. Jackson's
```
@JsonTypeInfo
```
with
```
JsonTypeInfo.Id.CLASS
```
allows the JSON payload to specify which Java class to instantiate, enabling gadget chain attacks through JSON. Use
```
JsonTypeInfo.Id.NAME
```
with an explicit type allowlist, or avoid polymorphic deserialization entirely.