CWE-134: Use of Externally-Controlled Format String

Description

The product uses a function that accepts a format string as an argument, but the format string originates from an external source.

Potential Impact

Confidentiality

Read Memory

Integrity, Confidentiality, Availability

Modify Memory, Execute Unauthorized Code or Commands

Demonstrative Examples

The following program prints a string provided as an argument.

Bad

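#include <stdio.h>
#include <string.h>

/* Vulnerable: the caller-supplied string is used as the format string. */
void printWrapper(char *string) {
    printf(string);
}

int main(int argc, char **argv) {
    char buf[5012];
    /* Copies a fixed 5012 bytes regardless of the argument's length. */
    memcpy(buf, argv[1], 5012);
    printWrapper(argv[1]);
    return (0);
}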

The example is exploitable because the call to printf() in the printWrapper() function uses the caller-supplied string as the format string. Note: The stack buffer was added to make exploitation simpler.
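A minimal sketch of how the wrapper could be made safe, assuming the goal is simply to print the argument verbatim. The names print_wrapper_safe() and print_wrapper_fputs() are hypothetical and do not come from the original example.

```c
#include <stdio.h>

/* The format string is a literal, so any directives inside `string`
 * are printed as ordinary text instead of being interpreted.
 * Returns the number of bytes printed, or a negative value on error. */
int print_wrapper_safe(const char *string) {
    return printf("%s", string);
}

/* Alternative that involves no format parsing at all.
 * Returns a non-negative value on success, EOF on error. */
int print_wrapper_fputs(const char *string) {
    return fputs(string, stdout);
}
```

With either function, an argument such as "%x%n" is written to standard output as the four characters it contains.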

The following code copies a command line argument into a buffer using snprintf().

Bad

int main(int argc, char **argv) {
    char buf[128];
    ...
    snprintf(buf, 128, argv[1]);
}

This code allows an attacker to view the contents of the stack and write to the stack using a command line argument containing a sequence of formatting directives. The attacker can read from the stack by providing more formatting directives, such as %x, than the function takes as arguments to be formatted. (In this example, the function takes no arguments to be formatted.) By using the %n formatting directive, the attacker can write to the stack, causing snprintf() to write the number of bytes output thus far to the specified argument (rather than reading a value from the argument, which is the intended behavior). A sophisticated version of this attack will use four staggered writes to completely control the value of a pointer on the stack.
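The write behavior of %n described above can be observed harmlessly with a literal format string. The sketch below assumes a C library that honors %n (some runtimes disable it by default); the function name bytes_before_marker() is hypothetical.

```c
#include <stdio.h>

/* Benign illustration of %n with a literal format string: the int that
 * the matching argument points to receives the number of bytes produced
 * so far. Returns that count, or -1 if formatting failed. */
int bytes_before_marker(void) {
    char buf[32];
    int count = -1;

    /* "abcd" (4 bytes) is produced before %n is reached. */
    if (snprintf(buf, sizeof buf, "abcd%nefgh", &count) < 0) {
        return -1;
    }
    return count;
}
```

Here the destination of the write is chosen by the programmer. The weakness arises when an attacker supplies the format string and therefore decides whether and where %n appears.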

Certain implementations make more advanced attacks even easier by providing format directives that control the location in memory to read from or write to. An example of these directives is shown in the following code, written for glibc:

Bad

printf("%d %d %1$d %1$d\n", 5, 9);

This code produces the following output:

5 9 5 5

It is also possible to use half-writes (%hn) to accurately control arbitrary DWORDs in memory, which greatly reduces the complexity of an attack that would otherwise require four staggered writes, such as the one described in the previous example.
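The two directive families mentioned here can be tried in isolation with literal format strings. This is a sketch under the assumption of a POSIX-style C library such as glibc; both function names are hypothetical, and the first uses positional notation for every argument rather than mixing the two styles.

```c
#include <stdio.h>

/* Positional directives: argument 1 is referenced three times and
 * argument 2 once, so "5 9 5 5" is written into `out`. */
int format_positional(char *out, size_t out_size) {
    return snprintf(out, out_size, "%1$d %2$d %1$d %1$d", 5, 9);
}

/* %hn stores the running byte count into a short. The field width (300)
 * pads the output, so the padding determines the value that is stored. */
short half_write_value(void) {
    char buf[512];
    short half = 0;

    snprintf(buf, sizeof buf, "%300d%hn", 7, &half);
    return half;
}
```

The second function shows why field widths matter to an attacker: the value written by %hn is simply the amount of output generated so far.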

Mitigations & Prevention

Requirements

Choose a language that is not subject to this flaw.

Implementation

Ensure that all format string functions are passed a static string that cannot be controlled by the user, and that the proper number of arguments is always passed to that function as well. If at all possible, use functions that do not support the %n directive in format strings. [REF-116] [REF-117]
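One way to follow this advice is to let callers select a message by identifier while every format string stays a literal in the source. The sketch below is illustrative only: the enum msg_id, its members, the message texts, and format_message() are all hypothetical.

```c
#include <stdio.h>

/* Hypothetical message IDs: callers choose an ID, never a format string. */
enum msg_id { MSG_LOGIN_FAILED, MSG_FILE_MISSING };

/* Each format is a literal with exactly one %s, and the untrusted
 * `detail` text is only ever passed as the matching argument.
 * Returns the snprintf() result, or -1 for an unknown ID. */
int format_message(char *out, size_t out_size, enum msg_id id,
                   const char *detail) {
    switch (id) {
    case MSG_LOGIN_FAILED:
        return snprintf(out, out_size, "login failed for user '%s'", detail);
    case MSG_FILE_MISSING:
        return snprintf(out, out_size, "file not found: %s", detail);
    }
    return -1;
}
```

A detail string such as "%n%n.bmp" is copied into the output unchanged, because it never reaches the format-string position.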

Build and Compilation

Run compilers and linkers with high warning levels, since they may detect incorrect usage.
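As a sketch of what this can look like with GCC or Clang, a project's own printf-style helpers can be annotated so that the compiler checks their call sites too. The macro PRINTF_LIKE and the function log_to_buffer() are hypothetical; the format attribute and warning options such as -Wformat, -Wformat-security, and -Wformat=2 are assumed to be available in the toolchain in use.

```c
#include <stdarg.h>
#include <stdio.h>

/* On GCC and Clang, mark a function as taking a printf-style format so
 * that format warnings apply to its callers. Expands to nothing elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt_idx, args_idx) \
    __attribute__((format(printf, fmt_idx, args_idx)))
#else
#define PRINTF_LIKE(fmt_idx, args_idx)
#endif

/* Parameter 3 is the format string; variadic arguments start at 4. */
int log_to_buffer(char *out, size_t out_size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int log_to_buffer(char *out, size_t out_size, const char *fmt, ...) {
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out, out_size, fmt, ap);
    va_end(ap);
    return n;
}
```

With the annotation in place, a call that passes a non-literal string as fmt with no further arguments is the kind of usage these warning options are designed to report.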

Detection Methods

Automated Static Analysis — This weakness can often be detected using automated static analysis tools. Many modern tools use data flow analysis or constraint-based techniques to minimize the number of false positives.
Black Box Limited — Since format string problems often occur in rarely occurring error conditions (e.g., error message logging), they can be difficult to detect using black box methods. It is highly likely that many latent issues exist in executables that do not have associated source code (or equivalent source).
Automated Static Analysis - Binary or Bytecode High — According to SOAR [REF-1479], detection techniques in this category may be useful.
Manual Static Analysis - Binary or Bytecode SOAR Partial — According to SOAR [REF-1479], detection techniques in this category may be useful.
Dynamic Analysis with Automated Results Interpretation SOAR Partial — According to SOAR [REF-1479], detection techniques in this category may be useful.
Dynamic Analysis with Manual Results Interpretation SOAR Partial — According to SOAR [REF-1479], detection techniques in this category may be useful.

Real-World CVE Examples

CVE ID	Description
CVE-2002-1825	format string in Perl program
CVE-2001-0717	format string in bad call to syslog function
CVE-2002-0573	format string in bad call to syslog function
CVE-2002-1788	format strings in NNTP server responses
CVE-2006-2480	Format string vulnerability exploited by triggering errors or warnings, as demonstrated via format string specifiers in a .bmp filename.
CVE-2007-2027	Chain: untrusted search path enabling resultant format string by loading malicious internationalization messages

Taxonomy Mappings

PLOVER: — Format string vulnerability
7 Pernicious Kingdoms: — Format String
CLASP: — Format string problem
CERT C Secure Coding: FIO30-C — Exclude user input from format strings
CERT C Secure Coding: FIO47-C — Use valid format strings
OWASP Top Ten 2004: A1 — Unvalidated Input
WASC: 6 — Format String
The CERT Oracle Secure Coding Standard for Java (2011): IDS06-J — Exclude user input from format strings
SEI CERT Perl Coding Standard: IDS30-PL — Exclude user input from format strings
Software Fault Patterns: SFP24 — Tainted input to command
OMG ASCSM: ASCSM-CWE-134

Frequently Asked Questions

What is CWE-134?

CWE-134 (Use of Externally-Controlled Format String) is a software weakness identified by MITRE's Common Weakness Enumeration. It is classified as a Base-level weakness. The product uses a function that accepts a format string as an argument, but the format string originates from an external source.

How can CWE-134 be exploited?

Attackers can exploit CWE-134 (Use of Externally-Controlled Format String) to read memory, modify memory, or execute unauthorized code or commands. This weakness is typically introduced during the Implementation phase of software development.

How do I prevent CWE-134?

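Key mitigations include: choose a language that is not subject to this flaw; pass only static format strings that the user cannot control, with the proper number of arguments; and compile with high warning levels, which may detect incorrect usage.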

What is the severity of CWE-134?

CWE-134 is classified as a Base-level weakness (Medium abstraction). It has been observed in 6 real-world CVEs.