Base · Medium

CWE-502: Deserialization of Untrusted Data

CWE-502 · Base Level · 13 CVEs · 7 Mitigations

Description

The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.

Potential Impact

Integrity

Modify Application Data, Unexpected State

Availability

DoS: Resource Consumption (CPU)

Other

Varies by Context

Demonstrative Examples

This code snippet deserializes an object from a file and uses it as a UI button:
Bad
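try {
    File file = new File("object.obj");
    ObjectInputStream in = new ObjectInputStream(new FileInputStream(file));
    // The file's origin and contents are never checked before deserializing
    javax.swing.JButton button = (javax.swing.JButton) in.readObject();
    in.close();
} catch (java.io.IOException | ClassNotFoundException e) {
    e.printStackTrace();
}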
This code does not attempt to verify the source or contents of the file before deserializing it. An attacker may be able to replace the intended file with a file that contains arbitrary malicious code which will be executed when the button is pressed.
To mitigate this, explicitly define a final readObject() method that always throws, so that instances of the class cannot be deserialized. An example of this is:
Good
private final void readObject(ObjectInputStream in) throws java.io.IOException {
    throw new java.io.IOException("Cannot be deserialized");
}
In Python, the pickle module handles serialization and deserialization. In this example, derived from [REF-467], the code receives and parses data, then tries to authenticate a user by validating a token.
Bad
class ExampleProtocol(protocol.Protocol):
    def dataReceived(self, data):
        # Code that would be here would parse the incoming data
        # After receiving headers, call confirmAuth() to authenticate
        pass

    def confirmAuth(self, headers):
        try:
            # The token is unpickled before its signature is checked
            token = cPickle.loads(base64.b64decode(headers['AuthToken']))
            if not check_hmac(token['signature'], token['data'], getSecretKey()):
                raise AuthFail
            self.secure_data = token['data']
        except:
            raise AuthFail
Unfortunately, the code calls cPickle.loads() on the incoming token before the HMAC check runs, so the data is deserialized before anything verifies that it is legitimate. An attacker can construct an illegitimate serialized "AuthToken" object that executes arbitrary commands when it is unpickled. For instance, the attacker could construct a pickle that leverages Python's subprocess module, which spawns new processes and accepts a number of arguments for various uses. Because pickle lets objects define how they should be unpickled, the attacker can direct the unpickling process to call Popen in the subprocess module and execute /bin/sh.
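
The mechanism behind this attack can be seen in a harmless sketch. An object's __reduce__ method tells pickle which callable to invoke at load time. The class name Demo is hypothetical, and the built-in int stands in for whatever callable an attacker would choose.

```python
import pickle


class Demo:
    """Hypothetical class whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # pickle stores "call int('42')" instead of the object's state.
        # An attacker would name a dangerous callable here instead of int.
        return (int, ("42",))


payload = pickle.dumps(Demo())

# Loading the bytes invokes the stored callable; no Demo instance is rebuilt.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The loader never needs the Demo class. It only follows the instructions embedded in the byte stream, which is why the bytes themselves must be trusted before they are loaded.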

Mitigations & Prevention

Architecture and Design, Implementation

If available, use the signing/sealing features of the programming language to assure that deserialized data has not been tainted. For example, a hash-based message authentication code (HMAC) could be used to ensure that data has not been modified.
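
A minimal sketch of the HMAC approach, assuming the payload can be expressed as JSON. The key value, the token layout (hex tag, a dot, then the body), and the function names sign and verify_and_load are all illustrative assumptions rather than part of any standard. The important property is that the tag is checked before the body is parsed.

```python
import hashlib
import hmac
import json

# Hypothetical placeholder; a real key would come from secure configuration.
SECRET_KEY = b"example-key-loaded-from-secure-config"


def sign(data: dict) -> bytes:
    """Serialize data as JSON and prepend a hex HMAC-SHA256 tag."""
    body = json.dumps(data, sort_keys=True).encode("utf-8")
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    return tag + b"." + body


def verify_and_load(token: bytes) -> dict:
    """Check the tag first; parse the body only if the tag matches."""
    tag, sep, body = token.partition(b".")
    if not sep:
        raise ValueError("malformed token")
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch")
    return json.loads(body)
```

A token produced by sign() round-trips through verify_and_load(), while any change to the body causes a ValueError before json.loads() is reached.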

Implementation

When deserializing data, populate a new object from the parsed fields rather than using the deserialized object directly. This forces the data through input validation before any application code relies on it.
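
As a sketch of this idea, the input can be parsed into plain data, validated field by field, and only then used to construct a fresh object. The UserProfile type, its fields, and the limits on them are hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class UserProfile:  # hypothetical type used only for illustration
    name: str
    age: int


def load_profile(raw: str) -> UserProfile:
    """Parse JSON into plain data, validate it, then build a fresh object."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("name")
    age = data.get("age")
    if not isinstance(name, str) or not 1 <= len(name) <= 64:
        raise ValueError("invalid name")
    # bool is a subclass of int, so reject it explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 <= age <= 150:
        raise ValueError("invalid age")
    return UserProfile(name=name, age=age)
```

Because the parser only ever yields dictionaries, lists, strings, and numbers, the input cannot choose which class gets instantiated; that decision stays in load_profile().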

Implementation

Explicitly define a final readObject() to prevent deserialization.

Architecture and Design, Implementation

Make fields transient to protect them from deserialization. An attempt to serialize and then deserialize a class containing transient fields will result in NULLs where the transient data should be. This is an excellent way to prevent time, environment-based, or sensitive variables from being carried over and used improperly.

Implementation

Avoid having unnecessary types or gadgets (a sequence of instances and method invocations that can self-execute during the deserialization process, often found in libraries) available that can be leveraged for malicious ends. This limits the potential for unintended or unauthorized types and gadgets to be leveraged by the attacker. Add only acceptable classes to an allowlist. Note: new gadgets are constantly being discovered, so this alone is not a sufficient mitigation.
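
In Python, one way to sketch an allowlist is to override find_class() on pickle.Unpickler, which the loader calls whenever the byte stream requests a global by name. The allowlist contents and the names RestrictedUnpickler and restricted_loads are illustrative assumptions. As noted above, an allowlist alone is not a sufficient mitigation.

```python
import builtins
import io
import pickle

# Hypothetical allowlist: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {("builtins", "range"), ("builtins", "complex")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global by name.
        if (module, name) in ALLOWED_GLOBALS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Analogue of pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers such as lists and dictionaries load normally because they do not require a global lookup, whereas a stream that references any other class or function is rejected with UnpicklingError.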

Architecture and Design, Implementation

Use cryptography to protect the serialized data or code. Note, however, that this is still client-side security: if the client is compromised, the cryptography implemented on the client can be bypassed.

Operation (Effectiveness: Moderate)

Use an application firewall that can detect attacks against this weakness. It can be beneficial in cases in which the code cannot be fixed (because it is controlled by a third party), as an emergency prevention measure while more comprehensive software assurance measures are applied, or to provide defense in depth [REF-1481].

Detection Methods

  • Automated Static Analysis High — Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then sea

Real-World CVE Examples

CVE ID          Description
CVE-2024-37052  Insecure deserialization in platform for managing AI/ML applications and models allows code execution via a crafted pickled object in a model file.
CVE-2024-37288  Deserialization of untrusted YAML data in dashboard for data query and visualization of Elasticsearch data.
CVE-2024-9314   PHP object injection in WordPress plugin for AI-based SEO.
CVE-2019-12799  Chain: bypass of untrusted deserialization issue (CWE-502) by using an assumed-trusted class (CWE-183).
CVE-2015-8103   Deserialization issue in commonly-used Java library allows remote execution.
CVE-2015-4852   Deserialization issue in commonly-used Java library allows remote execution.
CVE-2013-1465   Use of PHP unserialize function on untrusted input allows attacker to modify application configuration.
CVE-2012-3527   Use of PHP unserialize function on untrusted input in content management system might allow code execution.
CVE-2012-0911   Use of PHP unserialize function on untrusted input in content management system allows code execution using a crafted cookie value.
CVE-2012-0911   Content management system written in PHP allows unserialize of arbitrary objects, possibly allowing code execution.
CVE-2011-2520   Python script allows local users to execute code via pickled data.
CVE-2012-4406   Unsafe deserialization using pickle in a Python script.
CVE-2003-0791   Web browser allows execution of native methods via a crafted string to a JavaScript function that deserializes the string.

Taxonomy Mappings

  • CLASP — Deserialization of untrusted data
  • The CERT Oracle Secure Coding Standard for Java (2011): SER01-J — Do not deviate from the proper signatures of serialization methods
  • The CERT Oracle Secure Coding Standard for Java (2011): SER03-J — Do not serialize unencrypted, sensitive data
  • The CERT Oracle Secure Coding Standard for Java (2011): SER06-J — Make defensive copies of private mutable components during deserialization
  • The CERT Oracle Secure Coding Standard for Java (2011): SER08-J — Do not use the default serialized form for implementation defined invariants
  • Software Fault Patterns: SFP25 — Tainted input to variable

Frequently Asked Questions

What is CWE-502?

CWE-502 (Deserialization of Untrusted Data) is a software weakness identified by MITRE's Common Weakness Enumeration. It is classified as a Base-level weakness. The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.

How can CWE-502 be exploited?

Attackers can exploit CWE-502 (Deserialization of Untrusted Data) to modify application data, put the application into an unexpected state, or consume CPU resources to cause a denial of service. This weakness is typically introduced during the Architecture and Design or Implementation phases of software development.

How do I prevent CWE-502?

Key mitigations include: If available, use the signing/sealing features of the programming language to assure that deserialized data has not been tainted. For example, a hash-based message authentication code (HMAC) could be used to ensure that data has not been modified.

What is the severity of CWE-502?

CWE-502 is classified as a Base-level weakness (Medium abstraction). It has been observed in 13 real-world CVEs.