Variant · Low-Medium

CWE-777: Regular Expression without Anchors

The product uses a regular expression to perform neutralization, but the regular expression is not anchored and may allow malicious or malformed data to slip through.

CWE-777 · Variant Level · 1 CVE · 1 Mitigation

Description

When performing tasks such as validating against a set of allowed inputs (allowlist), data is examined and possibly modified to ensure that it is well-formed and adheres to a list of safe values. If the regular expression is not anchored, malicious or malformed data may be included before or after any string matching the regular expression. The type of malicious data that is allowed will depend on the context of the application and which anchors are omitted from the regular expression.
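A minimal Python sketch of how each omitted anchor changes what gets through. The pattern, the helper name `accepts`, and the sample inputs are illustrative assumptions, not taken from any particular product.

```python
import re

# Hypothetical allowlist pattern: one or more lowercase letters.
PATTERN = r"[a-z]+"

def accepts(regex: str, value: str) -> bool:
    """Return True if the regex matches anywhere in value."""
    return re.search(regex, value) is not None

samples = ["english", "../english", "english/..", "../english/.."]

variants = [
    ("no anchors", PATTERN),
    ("start only", r"\A" + PATTERN),
    ("end only", PATTERN + r"\Z"),
    ("both anchors", r"\A" + PATTERN + r"\Z"),
]

for label, regex in variants:
    passed = [s for s in samples if accepts(regex, s)]
    print(f"{label}: {passed}")
```

With no anchors, all four samples pass. Anchoring only the start still admits data after the match, anchoring only the end still admits data before it, and only the fully anchored pattern restricts the input to "english".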

Potential Impact

Availability, Confidentiality, Access Control

Bypass Protection Mechanism

Demonstrative Examples

Consider a web application that supports multiple languages. It selects messages for an appropriate language by using the lang parameter.
Bad
$dir = "/home/cwe/languages";
$lang = $_GET['lang'];
if (preg_match("/[A-Za-z0-9]+/", $lang)) {
    include("$dir/$lang");
}
else {
    echo "You shall not pass!\n";
}
The code above attempts to match only alphanumeric values, so that language values such as "english" and "french" are valid, while also protecting against path traversal (CWE-22). However, the regular expression anchors are omitted, so any text containing at least one alphanumeric character passes the validation step. For example, the attack string below matches the regular expression.
Attack
../../etc/passwd
If the attacker can inject code sequences into a file, such as the web server's HTTP request log, then the attacker may be able to redirect the lang parameter to the log file and execute arbitrary code.
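For comparison, the same allowlist check can be sketched in Python, where `re.fullmatch` requires the entire string to match the pattern. The helper names `is_safe_lang` and `is_safe_lang_unanchored` are hypothetical and exist only to contrast the two behaviors.

```python
import re

# Allowlist: ASCII letters and digits only.
LANG_RE = re.compile(r"[A-Za-z0-9]+")

def is_safe_lang(lang: str) -> bool:
    """Hypothetical helper: the whole value must be alphanumeric."""
    return LANG_RE.fullmatch(lang) is not None

def is_safe_lang_unanchored(lang: str) -> bool:
    """Mirrors the unanchored check: any alphanumeric run is enough."""
    return LANG_RE.search(lang) is not None

attack = "../../etc/passwd"
print(is_safe_lang_unanchored(attack))  # the traversal string is accepted
print(is_safe_lang(attack))             # the traversal string is rejected
print(is_safe_lang("english"))          # a plain language name is accepted
```

The only difference between the two helpers is whether the pattern has to cover the full input, which is exactly what the missing anchors would have enforced.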
This code uses a regular expression to validate an IP string prior to using it in a call to the "ping" command.
Bad
import subprocess
import re

def validate_ip_regex(ip: str):
    ip_validator = re.compile(r"((25[0-5]|(2[0-4]|1\d|[1-9]|)\d)\.?\b){4}")
    if ip_validator.match(ip):
        return ip
    else:
        raise ValueError("IP address does not match valid pattern.")

def run_ping_regex(ip: str):
    validated = validate_ip_regex(ip)
    # The ping command treats zero-prepended IP addresses as octal
    result = subprocess.call(["ping", validated])
    print(result)
Because the regular expression has no anchors (CWE-777), that is, it is not bounded by ^ and $, an IP address with 0 or 0x prepended can still produce a match whenever the pattern is allowed to match anywhere in the string. Note that Python's re.match anchors only the start of the string, so the exact inputs that pass depend on the matching function: re.search would accept the prefixed forms, while re.match can still accept trailing data after a valid address. Because the ping command accepts octal and hex forms of IP addresses, it will use the unexpectedly valid address (CWE-1389). For example, "0x63.63.63.63" would be treated as equivalent to "99.63.63.63". As a result, the attacker could potentially ping systems that the attacker cannot reach directly.
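One possible hardening, sketched under the assumption that only plain dotted-decimal IPv4 addresses without leading zeros should be accepted. The pattern and the helper name `is_valid_ipv4` are illustrative, not part of the original example. Using `re.fullmatch` means the pattern must cover the whole input, so neither a prefix such as 0x nor trailing data can pass.

```python
import re

# One octet in the range 0-255 with no leading zeros.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# Four octets separated by dots; used with fullmatch, so the pattern
# must account for every character of the input.
IPV4_RE = re.compile(OCTET + r"(?:\." + OCTET + r"){3}")

def is_valid_ipv4(ip: str) -> bool:
    """Hypothetical helper: strict dotted-decimal IPv4 check."""
    return IPV4_RE.fullmatch(ip) is not None

for candidate in ["99.63.63.63", "0x63.63.63.63", "063.63.63.63", "127.0.0.1.5"]:
    print(candidate, is_valid_ipv4(candidate))
```

Only "99.63.63.63" is accepted here. The hex-prefixed, zero-prefixed, and trailing-data variants are all rejected before they can reach the ping call.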

Mitigations & Prevention

Implementation

Be sure to understand both what will be matched and what will not be matched by a regular expression. Anchoring the ends of the expression will allow the programmer to define an allowlist strictly limited to what is matched by the text in the regular expression. If you are using a package that only matches one line by default, ensure that you can match multi-line inputs if necessary.
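The line-versus-string distinction can be sketched in Python as follows. The sample values are assumptions chosen for illustration. In Python's `re` module, `^` and `$` match at every line boundary under `re.MULTILINE`, and `$` matches just before a trailing newline even without that flag, whereas `\A`, `\Z`, and `re.fullmatch` refer to the whole string.

```python
import re

value = "english\n../../etc/passwd"

# With re.MULTILINE, ^ and $ match at each line boundary,
# so a single well-formed line is enough for the check to pass.
per_line = re.search(r"^[a-z]+$", value, re.MULTILINE) is not None

# \A and \Z anchor to the start and end of the whole string.
whole_string = re.search(r"\A[a-z]+\Z", value) is not None

# Without MULTILINE, $ still matches just before a trailing newline.
trailing_newline = re.search(r"^[a-z]+$", "english\n") is not None

# fullmatch requires the pattern to cover every character.
strict_trailing = re.fullmatch(r"[a-z]+", "english\n") is not None

print(per_line, whole_string, trailing_newline, strict_trailing)
# prints: True False True False
```

Other regex engines define these anchors differently, so the equivalent check should be confirmed against the documentation of the package in use.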

Real-World CVE Examples

CVE ID | Description
CVE-2022-30034 | Chain: Web UI for a Python RPC framework does not use regex anchors to validate user login emails (CWE-777), potentially allowing bypass of OAuth (CWE-1390).

Frequently Asked Questions

What is CWE-777?

CWE-777 (Regular Expression without Anchors) is a software weakness identified by MITRE's Common Weakness Enumeration. It is classified as a Variant-level weakness. The product uses a regular expression to perform neutralization, but the regular expression is not anchored and may allow malicious or malformed data to slip through.

How can CWE-777 be exploited?

Attackers can exploit CWE-777 (Regular Expression without Anchors) to bypass a protection mechanism. This weakness is typically introduced during the Implementation phase of software development.

How do I prevent CWE-777?

Key mitigations include: Be sure to understand both what will be matched and what will not be matched by a regular expression. Anchoring the ends of the expression will allow the programmer to define an allowlist strictly limited to what is matched by the text in the regular expression.

What is the severity of CWE-777?

CWE-777 is classified as a Variant-level weakness (Low-Medium abstraction). It has been observed in 1 real-world CVE.