CWE-185: Incorrect Regular Expression

Description

The product specifies a regular expression in a way that causes data to be improperly matched or compared.

When the regular expression is used in protection mechanisms such as filtering or validation, this may allow an attacker to bypass the intended restrictions on the incoming data.

Potential Impact

Other: Unexpected State; Varies by Context

Access Control: Bypass Protection Mechanism

Demonstrative Examples

The following code takes phone numbers as input, and uses a regular expression to reject invalid phone numbers.

Bad

$phone = GetPhoneNumber();
if ($phone =~ /\d+-\d+/) {
    # looks like it only has hyphens and digits
    system("lookup-phone $phone");
}
else {
    error("malformed number!");
}

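An attacker could provide an argument such as "; ls -l ; echo 123-456". This passes the check because "123-456" is sufficient to match the "\d+-\d+" portion of the regular expression. The pattern is not anchored, so the shell metacharacters before the digits are never examined and reach the system() call intact.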
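A rough Python analogue of the check above illustrates the difference that anchoring makes. The names `loose_check` and `strict_check` are hypothetical, and the sketch assumes a phone number is nothing more than digits, a hyphen, and digits, as the comment in the original code suggests.

```python
import re

# Same shape as the Perl pattern above, restricted to ASCII digits.
PHONE = re.compile(r"[0-9]+-[0-9]+")


def loose_check(phone: str) -> bool:
    # search() succeeds if the pattern occurs anywhere in the input,
    # which is how the unanchored check in the example behaves.
    return PHONE.search(phone) is not None


def strict_check(phone: str) -> bool:
    # fullmatch() succeeds only if the pattern consumes the whole input.
    return PHONE.fullmatch(phone) is not None


if __name__ == "__main__":
    attack = "; ls -l ; echo 123-456"
    for candidate in ("123-456", attack):
        print(repr(candidate), loose_check(candidate), strict_check(candidate))
```

Independently of the regular expression, passing the value to the external program as a separate argument rather than interpolating it into a shell command line removes the shell from the picture.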

This code uses a regular expression to validate an IP string prior to using it in a call to the "ping" command.

Bad

import subprocess
import re

def validate_ip_regex(ip: str):
    ip_validator = re.compile(r"((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}")
    if ip_validator.match(ip):
        return ip
    else:
        raise ValueError("IP address does not match valid pattern.")

def run_ping_regex(ip: str):
    validated = validate_ip_regex(ip)
    # The ping command treats zero-prepended IP addresses as octal
    result = subprocess.call(["ping", validated])
    print(result)

Because the regular expression has no anchors (CWE-777), i.e. it is not bounded by the ^ or $ characters, prepending a 0 or 0x to the IP address still results in a match. Because the ping command accepts octal- and hex-prefixed IP addresses, it will use the unexpectedly valid IP address (CWE-1389). For example, "0x63.63.63.63" would be considered equivalent to "99.63.63.63". As a result, the attacker could potentially ping systems that the attacker cannot reach directly.
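One possible stricter validator is sketched below. It is an illustration, not the fix prescribed by the CWE entry: the name `validate_ipv4_strict` and the octet pattern are assumptions, chosen so that only plain dotted-decimal addresses without leading zeros are accepted.

```python
import re

# Hypothetical strict pattern: one octet is 0-255 with no leading zeros.
_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
_IPV4 = re.compile(rf"{_OCTET}(?:\.{_OCTET}){{3}}")


def validate_ipv4_strict(ip: str) -> str:
    """Return ip unchanged if it is a plain dotted-decimal IPv4 address."""
    # fullmatch() succeeds only when the pattern consumes the entire string,
    # so a "0" or "0x" prefix and any trailing text are rejected.
    if _IPV4.fullmatch(ip):
        return ip
    raise ValueError("IP address does not match valid pattern.")


if __name__ == "__main__":
    for candidate in ("99.63.63.63", "0x63.63.63.63", "063.63.63.63"):
        try:
            print(candidate, "->", validate_ipv4_strict(candidate))
        except ValueError as exc:
            print(candidate, "->", exc)
```

Python's standard `ipaddress` module is another option for parsing addresses instead of maintaining a hand-written pattern.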

Mitigations & Prevention

Implementation

Regular expressions can become error-prone when they define a complex language, even for those experienced in writing grammars. Determine whether several smaller regular expressions can replace one large regular expression. Also, subject the regular expression to thorough testing techniques such as equivalence partitioning, boundary value analysis, and robustness testing. Even after testing achieves a reasonable level of confidence, a regular expression may not be foolproof. If an exploit is allowed to slip through,
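As a rough sketch of both suggestions, the example below splits IPv4 validation into two small expressions plus a numeric range check, and pairs it with a table of equivalence classes and boundary values. All names (`is_valid_octet`, `is_valid_ipv4`, `CASES`, `run_cases`) and the chosen test inputs are illustrative assumptions.

```python
import re

# One small expression per concern instead of one large expression.
_DECIMAL = re.compile(r"[0-9]{1,3}")    # one to three ASCII digits
_LEADING_ZERO = re.compile(r"0[0-9]+")  # "00", "01", "007", ...


def is_valid_octet(text: str) -> bool:
    """Check one dotted-decimal octet: 0-255 with no leading zeros."""
    if not _DECIMAL.fullmatch(text):
        return False
    if _LEADING_ZERO.fullmatch(text):
        return False
    return int(text) <= 255


def is_valid_ipv4(text: str) -> bool:
    parts = text.split(".")
    return len(parts) == 4 and all(is_valid_octet(p) for p in parts)


# Equivalence classes and boundary values, each with the expected result.
CASES = {
    "0.0.0.0": True,           # lower boundary
    "255.255.255.255": True,   # upper boundary
    "256.0.0.0": False,        # just past the upper boundary
    "1.2.3": False,            # too few octets
    "1.2.3.4.5": False,        # too many octets
    "1..3.4": False,           # empty octet
    "01.2.3.4": False,         # leading zero
    "0x1.2.3.4": False,        # hex prefix
    "1.2.3.4\n": False,        # trailing newline
    "": False,                 # empty input
}


def run_cases() -> list:
    """Return the inputs whose result differs from the expectation."""
    return [s for s, expected in CASES.items() if is_valid_ipv4(s) != expected]


if __name__ == "__main__":
    print("unexpected results:", run_cases())
```

Keeping the cases in a table makes it easy to add an input whenever a bypass is discovered and to rerun the whole set after the expression changes.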

Detection Methods

Automated Static Analysis High — Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then sea

Real-World CVE Examples

CVE ID	Description
CVE-2002-2109	Regexp isn't "anchored" to the beginning or end, which allows spoofed values that have trusted values as substrings.
CVE-2005-1949	Regexp for IP address isn't anchored at the end, allowing appending of shell metacharacters.
CVE-2001-1072	Bypass access restrictions via multiple leading slash, which causes a regular expression to fail.
CVE-2000-0115	Local user DoS via invalid regular expressions.
CVE-2002-1527	chain: Malformed input generates a regular expression error that leads to information exposure.
CVE-2005-1061	Certain strings are later used in a regexp, leading to a resultant crash.
CVE-2005-2169	MFV. Regular expression intended to protect against directory traversal reduces ".../...//" to "../".
CVE-2005-0603	Malformed regexp syntax leads to information exposure in error message.
CVE-2005-1820	Code injection due to improper quoting of regular expression.
CVE-2005-3153	Null byte bypasses PHP regexp check.
CVE-2005-4155	Null byte bypasses PHP regexp check.
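The reduction described for CVE-2005-2169 can be reproduced with a small sketch. The actual expression used by the affected product is not given here, so the filter below is an assumption: it strips every "../" in a single pass, which is one way such a reduction can arise. The function names are hypothetical.

```python
import re

_TRAVERSAL = re.compile(r"\.\./")


def strip_traversal_once(path: str) -> str:
    # Single left-to-right pass: each "../" found is removed, but the
    # text that results from the removals is never scanned again.
    return _TRAVERSAL.sub("", path)


def strip_traversal_until_stable(path: str) -> str:
    # Repeat the substitution until it no longer changes the input.
    while True:
        cleaned = _TRAVERSAL.sub("", path)
        if cleaned == path:
            return cleaned
        path = cleaned


if __name__ == "__main__":
    crafted = ".../...//"
    print(repr(strip_traversal_once(crafted)))          # '../'
    print(repr(strip_traversal_until_stable(crafted)))  # ''
```

Rejecting input that contains a traversal sequence, rather than rewriting it, avoids this class of problem entirely, since removal can assemble a new sequence from the surrounding characters.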

Taxonomy Mappings

PLOVER: Regular Expression Error

Frequently Asked Questions

What is CWE-185?

CWE-185 (Incorrect Regular Expression) is a software weakness identified by MITRE's Common Weakness Enumeration. It is classified as a Class-level weakness. The product specifies a regular expression in a way that causes data to be improperly matched or compared.

How can CWE-185 be exploited?

Attackers can exploit CWE-185 (Incorrect Regular Expression) to put the product into an unexpected state or to bypass a protection mechanism; the exact impact varies by context. This weakness is typically introduced during the Implementation phase of software development.

How do I prevent CWE-185?

Key mitigations include: Regular expressions can become error prone when defining a complex language even for those experienced in writing grammars. Determine if several smaller regular expressions simplify one large regular

What is the severity of CWE-185?

CWE-185 is classified as a Class-level weakness (High abstraction). It has been observed in 11 real-world CVEs.