CWE-94: Improper Control of Generation of Code ('Code Injection')

Description

The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.

Potential Impact

Access Control: Bypass Protection Mechanism

Access Control: Gain Privileges or Assume Identity

Integrity, Confidentiality, Availability: Execute Unauthorized Code or Commands

Non-Repudiation: Hide Activities

Demonstrative Examples

This example attempts to write user messages to a message file and allow users to view them.

Bad

$MessageFile = "messages.out";
if ($_GET["action"] == "NewMessage") {
    $name = $_GET["name"];
    $message = $_GET["message"];
    $handle = fopen($MessageFile, "a+");
    fwrite($handle, "<b>$name</b> says '$message'<hr>\n");
    fclose($handle);
    echo "Message Saved!<p>\n";
}
else if ($_GET["action"] == "ViewMessages") {
    include($MessageFile);
}

While the programmer intends for the MessageFile to only include data, an attacker can provide a message such as:

Attack

name=h4x0r&message=%3C?php%20system(%22/bin/ls%20-l%22);?%3E

which will decode to the following:

Attack

<?php system("/bin/ls -l");?>

The programmer thought they were just including the contents of a regular data file, but PHP parsed it and executed the code. Now, this code is executed any time people view messages.

Notice that XSS (CWE-79) is also possible in this situation.
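One way to avoid both problems is to treat the message file strictly as data and to escape user-controlled fields when rendering them. The following is a minimal sketch in Python rather than PHP; the function names and the JSON-lines storage format are assumptions made for illustration and are not part of the original example.

```python
import html
import json
from pathlib import Path


def save_message(path: Path, name: str, message: str) -> None:
    """Append one message as a JSON line; the file only ever holds data."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps({"name": name, "message": message}) + "\n")


def render_messages(path: Path) -> str:
    """Read the file as text (never execute it) and HTML-escape each field."""
    if not path.exists():
        return ""
    parts = []
    for line in path.read_text(encoding="utf-8").splitlines():
        record = json.loads(line)
        name = html.escape(record["name"])
        message = html.escape(record["message"])
        parts.append(f"<b>{name}</b> says '{message}'<hr>")
    return "\n".join(parts)
```

Because the stored file is only parsed as JSON and never passed to an interpreter, a message containing PHP or script tags is displayed as text instead of being run.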

edit-config.pl: This CGI script is used to modify settings in a configuration file.

Bad

use CGI qw(:standard);

sub config_file_add_key {
    my ($fname, $key, $arg) = @_;
    # code to add a field/key to a file goes here
}

sub config_file_set_key {
    my ($fname, $key, $arg) = @_;
    # code to set key to a particular file goes here
}

sub config_file_delete_key {
    my ($fname, $key, $arg) = @_;
    # code to delete key from a particular file goes here
}

sub handleConfigAction {
    my ($fname, $action) = @_;
    my $key = param('key');
    my $val = param('val');

    # this is super-efficient code, especially if you have to invoke
    # any one of dozens of different functions!
    # (the braces keep Perl from interpolating a variable named $action_key)
    my $code = "config_file_${action}_key(\$fname, \$key, \$val);";
    eval($code);
}

$configfile = "/home/cwe/config.txt";
print header;
if (defined(param('action'))) {
    handleConfigAction($configfile, param('action'));
}
else {
    print "No action specified!\n";
}

The script intends to take the 'action' parameter and invoke one of a variety of functions based on the value of that parameter - config_file_add_key(), config_file_set_key(), or config_file_delete_key(). It could set up a conditional to invoke each function separately, but eval() is a powerful way of doing the same thing in fewer lines of code, especially when a large number of functions or variables are involved. Unfortunately, in this case, the attacker can provide other values in the action parameter, such as:

Attack

add_key(",","); system("/bin/ls");

This would produce the following string in handleConfigAction():

Result

config_file_add_key(",","); system("/bin/ls");

Any arbitrary Perl code could be added after the attacker has "closed off" the construction of the original function call, in order to prevent parsing errors from causing the malicious eval() to fail before the attacker's payload is activated. This particular manipulation would fail after the system() call, because the "_key(\$fname, \$key, \$val)" portion of the string would cause an error, but this is irrelevant to the attack because the payload has already been activated.
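The same "many functions, one entry point" goal can be met without eval() by looking the action up in a fixed dispatch table. The sketch below uses Python rather than Perl; the handler bodies are placeholders and the `ACTIONS` table and `handle_config_action` name are assumptions for illustration.

```python
from typing import Callable, Dict


def config_file_add_key(fname: str, key: str, val: str) -> str:
    # Placeholder body, mirroring the stubbed subroutines in the Perl script.
    return f"add {key}={val} in {fname}"


def config_file_set_key(fname: str, key: str, val: str) -> str:
    return f"set {key}={val} in {fname}"


def config_file_delete_key(fname: str, key: str, val: str) -> str:
    return f"delete {key} in {fname}"


# Only these action names are accepted; anything else is rejected outright.
ACTIONS: Dict[str, Callable[[str, str, str], str]] = {
    "add": config_file_add_key,
    "set": config_file_set_key,
    "delete": config_file_delete_key,
}


def handle_config_action(fname: str, action: str, key: str, val: str) -> str:
    handler = ACTIONS.get(action)
    if handler is None:
        raise ValueError(f"unsupported action: {action!r}")
    # The action string selects a function; it is never parsed as code.
    return handler(fname, key, val)
```

With this structure the attack string above is simply an unknown dictionary key, so it raises an error instead of reaching an interpreter.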

This simple Python 3 script asks a user to supply a comma-separated list of numbers as input and adds them together.

Bad

def main():
    sum = 0
    try:
        numbers = eval(input("Enter a comma-separated list of numbers: "))
    except SyntaxError:
        print("Error: invalid input")
        return
    for num in numbers:
        sum = sum + num
    print(f"Sum of {numbers} = {sum}")

main()

The eval() function can take the user-supplied text and convert it into a Python sequence object (a tuple for input such as 1,2,3), which lets the programmer iterate over the values directly. However, if code is supplied to the eval() function, it will execute that code. For example, a malicious user could supply the following string:

Attack

__import__('subprocess').getoutput('rm -r *')

This would delete all the files in the current directory. For this reason, it is not recommended to use eval() with untrusted input.

A way to accomplish this without the use of eval() is to apply an integer conversion on the input within a try/except block. If the user-supplied input is not numeric, this will raise a ValueError. By avoiding eval(), there is no opportunity for the input string to be executed as code.

Good

def main():
    sum = 0
    numbers = input("Enter a comma-separated list of numbers: ").split(",")
    try:
        for num in numbers:
            sum = sum + int(num)
        print(f"Sum of {numbers} = {sum}")
    except ValueError:
        print("Error: invalid input")

main()

An alternative, commonly-cited mitigation for this kind of weakness is to use the ast.literal_eval() function, since it is intentionally designed to avoid executing code. However, an adversary could still cause excessive memory or stack consumption via deeply nested structures [REF-1372], so the Python documentation discourages use of ast.literal_eval() on untrusted data [REF-1373].
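To illustrate that trade-off, the sketch below shows ast.literal_eval() rejecting the payload from the attack above while still bounding the input size before parsing. The `parse_number_list` helper and the 1000-character limit are assumptions chosen for illustration; a length cap reduces, but may not eliminate, the resource-consumption concern, so the int() conversion shown earlier remains the simpler choice.

```python
import ast

MAX_INPUT_CHARS = 1000  # assumed limit; tune for the application


def parse_number_list(text: str) -> list:
    """Parse input such as '1, 2, 3' into a list of numbers without eval()."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError("input too long")
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError, MemoryError, RecursionError) as exc:
        # Function calls, attribute access, and names are not literals.
        raise ValueError("invalid input") from exc
    if isinstance(value, (int, float)):
        value = (value,)  # a single number is accepted as a one-item list
    if not isinstance(value, (tuple, list)):
        raise ValueError("expected a comma-separated list of numbers")
    for item in value:
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(item, bool) or not isinstance(item, (int, float)):
            raise ValueError("expected a comma-separated list of numbers")
    return list(value)
```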

The following code is a workflow job written using YAML. The code attempts to download pull request artifacts, unzip the artifact called pr.zip, and extract the value of the file NR into a variable "pr_number" that will be used later in another job. It attempts to create a GitHub workflow environment variable by writing to $GITHUB_ENV. The environment variable value is retrieved from an external resource.

Bad

name: Deploy Preview
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: 'Download artifact'
        uses: actions/github-script
        with:
          script: |
            var artifacts = await github.actions.listWorkflowRunArtifacts({
              owner: context.repo.owner,
              repo: context.repo.repo,
              run_id: ${{ github.event.workflow_run.id }},
            });
            var matchPrArtifact = artifacts.data.artifacts.filter((artifact) => {
              return artifact.name == "pr"
            })[0];
            var downloadPr = await github.actions.downloadArtifact({
              owner: context.repo.owner,
              repo: context.repo.repo,
              artifact_id: matchPrArtifact.id,
              archive_format: 'zip',
            });
            var fs = require('fs');
            fs.writeFileSync('${{github.workspace}}/pr.zip', Buffer.from(downloadPr.data));
      - run: |
          unzip pr.zip
          echo "pr_number=$(cat NR)" >> $GITHUB_ENV

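Because the contents of NR are written to $GITHUB_ENV without validation, an attacker who controls the artifact can embed a newline in NR and define additional environment variables, for example:

Attack

\nNODE_OPTIONS="--experimental-modules --experimental-loader=data:text/javascript,console.log('injected code');//"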

Good

The code could be modified to validate that the NR file only contains a numeric value, or the code could retrieve the PR number from a more trusted source.
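The numeric check can be sketched as follows. This is a Python illustration of the validation idea, not part of the workflow itself; the `read_pr_number` and `github_env_line` helpers and the 10-digit cap are assumptions.

```python
import re
from pathlib import Path

# ASCII digits only, with an assumed upper bound on length.
PR_NUMBER_PATTERN = re.compile(r"[0-9]{1,10}")


def read_pr_number(nr_path: Path) -> int:
    """Return the PR number stored in the NR file, or raise ValueError."""
    candidate = nr_path.read_text(encoding="utf-8").strip()
    if PR_NUMBER_PATTERN.fullmatch(candidate) is None:
        raise ValueError("NR file does not contain a plain PR number")
    return int(candidate)


def github_env_line(nr_path: Path) -> str:
    """Build the single line to append to the environment file."""
    # Only the validated integer is emitted, so no extra lines can be injected.
    return f"pr_number={read_pr_number(nr_path)}\n"
```

An NR file holding "42" yields one `pr_number=42` line, while a file that carries an embedded newline followed by `NODE_OPTIONS=...` is rejected before anything is written.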

Mitigations & Prevention

Architecture and Design

Refactor your program so that you do not have to dynamically generate code.

Architecture and Design

Run your code in a "jail" or similar sandbox environment that enforces strict boundaries between the process and the operating system. This may effectively restrict which code can be executed by your product. Examples include the Unix chroot jail and AppArmor. In general, managed code may provide some protection. This may not be a feasible solution, and it only limits the impact to the operating system; the rest of your application may still be subject to

Implementation

Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across relat
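An "accept known good" check for the configuration-editing scenario might look like the sketch below. The allowed action names, the key pattern, and the length limit are all assumptions chosen for illustration; a real specification would define its own.

```python
import re
from typing import Tuple

ALLOWED_ACTIONS = frozenset({"add", "set", "delete"})    # full range of values
KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,31}")  # syntax and length
MAX_VALUE_CHARS = 256                                     # assumed length cap


def validate_config_request(action, key, val) -> Tuple[str, str, str]:
    """Return the inputs unchanged if they conform; otherwise raise ValueError."""
    # Type check also covers missing inputs that arrive as None.
    if not all(isinstance(item, str) for item in (action, key, val)):
        raise ValueError("missing or non-text input")
    if action not in ALLOWED_ACTIONS:
        raise ValueError("action is not on the allow-list")
    if KEY_PATTERN.fullmatch(key) is None:
        raise ValueError("key does not match the expected syntax")
    if len(val) > MAX_VALUE_CHARS or not val.isprintable():
        raise ValueError("value is too long or contains control characters")
    return action, key, val
```

The function rejects anything that does not conform instead of trying to strip out dangerous characters, which is the distinction the allow-list strategy relies on.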

Testing

Use dynamic tools and techniques that interact with the product using large test suites with many diverse inputs, such as fuzz testing (fuzzing), robustness testing, and fault injection. The product's operation may slow down, but it should not become unstable, crash, or generate incorrect results.

Operation

Run the code in an environment that performs automatic taint propagation and prevents any command execution that uses tainted variables, such as Perl's "-T" switch. This will force the program to perform validation steps that remove the taint, although you must be careful to correctly validate your inputs so that you do not accidentally mark dangerous inputs as untainted (see CWE-183 and CWE-184).

Implementation (Effectiveness: Discouraged Common Practice)

For Python programs, it is frequently encouraged to use the ast.literal_eval() function instead of eval(), since it is intentionally designed to avoid executing code. However, an adversary could still cause excessive memory or stack consumption via deeply nested structures [REF-1372], so the Python documentation discourages use of ast.literal_eval() on untrusted data [REF-1373].

Detection Methods

Automated Static Analysis (Effectiveness: High)

Automated static analysis, commonly referred to as Static Application Security Testing (SAST), can find some instances of this weakness by analyzing source code (or binary/compiled code) without having to execute it. Typically, this is done by building a model of data flow and control flow, then sea

Real-World CVE Examples

CVE ID	Description
CVE-2023-29374	Math component in an LLM framework translates user input into a Python expression that is input into the Python exec() method, allowing code execution - one variant of a "prompt injection"
CVE-2024-5565	Python-based library uses an LLM prompt containing user input to dynamically generate code that is then fed as input into the Python exec() method, allowing code execution - one variant of
CVE-2024-4181	Framework for LLM applications allows eval injection via a crafted response from a hosting provider.
CVE-2022-2054	Python compiler uses eval() to execute malicious strings as Python code.
CVE-2021-22204	Chain: regex in EXIF processor code does not correctly determine where a string ends (CWE-625), enabling eval injection (CWE-95), as exploited in the wild per CISA KEV.
CVE-2020-8218	"Code injection" in VPN product, as exploited in the wild per CISA KEV.
CVE-2008-5071	Eval injection in PHP program.
CVE-2002-1750	Eval injection in Perl program.
CVE-2008-5305	Eval injection in Perl program using an ID that should only contain hyphens and numbers.
CVE-2002-1752	Direct code injection into Perl eval function.
CVE-2002-1753	Eval injection in Perl program.
CVE-2005-1527	Direct code injection into Perl eval function.
CVE-2005-2837	Direct code injection into Perl eval function.
CVE-2005-1921	MFV. code injection into PHP eval statement using nested constructs that should not be nested.
CVE-2005-2498	MFV. code injection into PHP eval statement using nested constructs that should not be nested.

Showing 15 of 22 observed examples.

Taxonomy Mappings

PLOVER: CODE — Code Evaluation and Injection
ISA/IEC 62443: Part 4-2 — Req CR 3.5
ISA/IEC 62443: Part 3-3 — Req SR 3.5
ISA/IEC 62443: Part 4-1 — Req SVV-1
ISA/IEC 62443: Part 4-1 — Req SVV-3

Frequently Asked Questions

What is CWE-94?

CWE-94 (Improper Control of Generation of Code ('Code Injection')) is a software weakness identified by MITRE's Common Weakness Enumeration. It is classified as a Base-level weakness. The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could mod...

How can CWE-94 be exploited?

Attackers can exploit CWE-94 (Improper Control of Generation of Code ('Code Injection')) to bypass protection mechanisms, gain privileges or assume an identity, execute unauthorized code or commands, and hide their activities. This weakness is typically introduced during the Implementation phase of software development.

How do I prevent CWE-94?

Key mitigations include: Refactor your program so that you do not have to dynamically generate code.

What is the severity of CWE-94?

CWE-94 is classified as a Base-level weakness (Medium abstraction). It has been observed in 22 real-world CVEs.

Related Weaknesses

CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')

CWE-913: Improper Control of Dynamically-Managed Code Resources
