Variant · Low-Medium

CWE-180: Incorrect Behavior Order: Validate Before Canonicalize

CWE-180 · Variant Level · 5 CVEs · 1 Mitigation

Description

The product validates input before it is canonicalized, which prevents the product from detecting data that becomes invalid after the canonicalization step.

An attacker can exploit this ordering to bypass validation and launch attacks, such as injection, that the validation would otherwise have prevented.

Potential Impact

Access Control

Bypass Protection Mechanism

Demonstrative Examples

The following code attempts to validate a given input path by checking it against an allowlist and then returns the canonical path. In this specific case, the path is considered valid if it starts with the string "/safe_dir/".
Bad
String path = getInputPath();
if (path.startsWith("/safe_dir/")) {
    File f = new File(path);
    return f.getCanonicalPath();
}
The problem with the above code is that validation occurs before canonicalization. An attacker could provide the input path "/safe_dir/../", which passes the validation step. Canonicalization, however, treats the double dot as a traversal to the parent directory, so the canonicalized path is just "/".
To avoid this problem, validation should occur after canonicalization. In Java, constructing the File object does not resolve the path; canonicalization happens in the call to getCanonicalPath(), so the check must be applied to that method's result. The code below fixes the issue.
Good
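String path = getInputPath();
File f = new File(path);
// Canonicalize once, then validate and return that same canonical value.
String canonicalPath = f.getCanonicalPath();
if (canonicalPath.startsWith("/safe_dir/")) {
    return canonicalPath;
}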
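The canonicalize-then-validate order can also be sketched as a small self-contained class. This is an illustrative sketch, not part of the CWE entry: the class name SafeDirCheck, the method resolveSafely, and the root "/safe_dir" are hypothetical, and Unix-like paths are assumed. It uses Path.normalize(), which is purely lexical and does not resolve symbolic links the way File.getCanonicalPath() does; code that must account for symlinks on existing files would likely need Path.toRealPath() instead.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

final class SafeDirCheck {
    // Hypothetical allowlisted root, mirroring "/safe_dir/" from the example above.
    private static final Path SAFE_ROOT = Paths.get("/safe_dir").normalize();

    /** Canonicalizes first, validates second; empty if the path escapes SAFE_ROOT. */
    static Optional<Path> resolveSafely(String rawInput) {
        // Step 1: canonicalize lexically, collapsing "." and ".." segments.
        Path canonical = Paths.get(rawInput).normalize();
        // Step 2: validate the canonical form. Path.startsWith compares whole
        // name elements, so "/safe_dir_evil" does not match "/safe_dir".
        if (canonical.isAbsolute() && canonical.startsWith(SAFE_ROOT)) {
            return Optional.of(canonical);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A path that stays inside the root is accepted.
        System.out.println(resolveSafely("/safe_dir/reports/a.txt").isPresent());
        // A traversal that would pass a raw prefix check is rejected.
        System.out.println(resolveSafely("/safe_dir/../etc/passwd").isPresent());
    }
}
```

Comparing Path objects rather than raw strings is a design choice here: it sidesteps the case where a sibling directory merely shares the allowlisted prefix. Note that, unlike the string check above, this sketch also accepts the root directory itself.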

Mitigations & Prevention

Implementation

Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.
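As a rough illustration of this mitigation, the sketch below contrasts the two orderings for percent-encoded input. The class DecodeThenValidate, its methods, and the looksDangerous rule are hypothetical names chosen for this example, and Java 10 or later is assumed for URLDecoder.decode(String, Charset). Note that URLDecoder also converts "+" to a space and throws IllegalArgumentException on malformed escapes, which real code would need to handle.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class DecodeThenValidate {
    /** Hypothetical rule: reject markup characters and parent-directory sequences. */
    private static boolean looksDangerous(String s) {
        return s.contains("<") || s.contains(">") || s.contains("..");
    }

    /** Flawed order: the check sees only the encoded form, so "%3C" slips through. */
    static Optional<String> validateThenDecode(String raw) {
        if (looksDangerous(raw)) {
            return Optional.empty();
        }
        return Optional.of(URLDecoder.decode(raw, StandardCharsets.UTF_8));
    }

    /** Intended order: decode exactly once, then validate the decoded form. */
    static Optional<String> decodeThenValidate(String raw) {
        String decoded = URLDecoder.decode(raw, StandardCharsets.UTF_8);
        return looksDangerous(decoded) ? Optional.empty() : Optional.of(decoded);
    }

    public static void main(String[] args) {
        String attack = "%3Cscript%3E";
        System.out.println(validateThenDecode(attack)); // flawed path lets it through
        System.out.println(decodeThenValidate(attack)); // correct path rejects it
    }
}
```

The value returned by decodeThenValidate is the one that was checked, so callers should use it directly and not decode it again; a second decoding pass downstream would reintroduce the problem described under CWE-174.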

Real-World CVE Examples

CVE ID         Description
CVE-2002-0433  List files in web server using "*.ext"
CVE-2003-0332  Product modifies the first two letters of a filename extension after performing a security check, which allows remote attackers to bypass authentication via a filename with a .ats extension instead of…
CVE-2002-0802  Database consumes an extra character when processing a character that cannot be converted, which could remove an escape character from the query and make the application subject to SQL injection attac…
CVE-2000-0191  Overlaps "fakechild/../realchild"
CVE-2004-2363  Product checks URI for "<" and other literal characters, but does it before hex decoding the URI, so "%3E" and other sequences are allowed.

Taxonomy Mappings

  • PLOVER: Validate-Before-Canonicalize
  • OWASP Top Ten 2004: A1 — Unvalidated Input
  • The CERT Oracle Secure Coding Standard for Java (2011): IDS01-J — Normalize strings before validating them
  • SEI CERT Oracle Coding Standard for Java: IDS01-J — Normalize strings before validating them
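
The IDS01-J guidance listed above concerns Unicode normalization, which is another form of canonicalization. The following is a minimal sketch of that idea, assuming a hypothetical rule that rejects angle brackets; the class and method names are illustrative only. It relies on java.text.Normalizer with form NFKC, which maps compatibility characters such as U+FE64 (small less-than sign) and U+FE65 (small greater-than sign) to the ASCII "<" and ">".

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

final class NormalizeThenValidate {
    // Hypothetical validation rule: no angle brackets allowed.
    private static final Pattern ANGLE_BRACKETS = Pattern.compile("[<>]");

    /** Flawed order: inspects the raw string, so look-alike characters are missed. */
    static boolean isSafeValidateFirst(String input) {
        return !ANGLE_BRACKETS.matcher(input).find();
    }

    /** Intended order: normalize to NFKC first, then validate the normalized form. */
    static boolean isSafeNormalizeFirst(String input) {
        String normalized = Normalizer.normalize(input, Normalizer.Form.NFKC);
        return !ANGLE_BRACKETS.matcher(normalized).find();
    }

    public static void main(String[] args) {
        // U+FE64 and U+FE65 become "<" and ">" under NFKC normalization.
        String input = "\uFE64script\uFE65";
        System.out.println(isSafeValidateFirst(input));  // raw check does not see brackets
        System.out.println(isSafeNormalizeFirst(input)); // normalized check does
    }
}
```

Whichever form is validated should also be the form passed on to later processing; validating the normalized string but forwarding the raw one would leave the same gap.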

Frequently Asked Questions

What is CWE-180?

CWE-180 (Incorrect Behavior Order: Validate Before Canonicalize) is a software weakness identified by MITRE's Common Weakness Enumeration. It is classified as a Variant-level weakness. The product validates input before it is canonicalized, which prevents the product from detecting data that becomes invalid after the canonicalization step.

How can CWE-180 be exploited?

Attackers can exploit CWE-180 (Incorrect Behavior Order: Validate Before Canonicalize) to bypass protection mechanisms. This weakness is typically introduced during the Implementation phase of software development.

How do I prevent CWE-180?

Key mitigations include: Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174).

What is the severity of CWE-180?

CWE-180 is classified as a Variant-level weakness (Low-Medium abstraction). It has been observed in 5 real-world CVEs.