HOWTO: Prevent Cross-Site Scripting Security Issues

ID: Q252985


The information in this article applies to:
  • Dynamically Generated HTML Pages


SUMMARY

Dynamically generated HTML pages can introduce security risks if inputs are not validated either on the way in or on the way out. Malicious script can be embedded within input submitted to Web pages and appear to browsers as originating from a trusted source. This problem is referred to as a cross-site scripting security issue. This article discusses cross-site scripting security issues, their ramifications, and their prevention.


MORE INFORMATION

The Problem

The underlying problem is that many Web pages display input that is not validated. If input is not validated, malicious script can be embedded within the input. If a server-side script then displays this non-validated input, the script runs in the browser as though the trusted site generated it.
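
The following is a minimal sketch of the problem, written as plain JavaScript rather than as an ASP page. The function name renderSearchResultsUnsafe and the sample input are hypothetical and serve only as an illustration. The point is that input concatenated into HTML without validation or encoding reaches the browser unchanged.

```javascript
// Hypothetical handler: builds a results page from a search term.
// The term is concatenated into the HTML with no validation or encoding.
function renderSearchResultsUnsafe(searchTerm) {
    return "<P>Results for: " + searchTerm + "</P>";
}

// Input that an attacker might place in a link to the trusted site.
var maliciousInput = "<SCRIPT>alert(document.cookie)</SCRIPT>";

// The SCRIPT element reaches the browser intact, so the browser
// would run it as though the trusted site had written it.
var page = renderSearchResultsUnsafe(maliciousInput);
console.log(page);
```

Because the page is served by the trusted site, the browser has no way to tell the injected SCRIPT element apart from script that the site intended to send.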

Ramifications

If input to your dynamic Web pages is not validated, you may encounter the following:
  • Data integrity can be compromised.
  • Cookies can be set and read.
  • User input can be intercepted.
  • Malicious scripts can be executed by the client in the context of the trusted source.

Which Web pages are at risk? Essentially, the problem affects dynamic page creation based on input that was not validated. Typical examples include the following types of Web pages:
  • Search engines that return results pages based on user input
  • Login pages that store user accounts in databases, cookies, and so forth, and later write the user name out to the client
  • Web forms that process credit-card information

Prevention

Presented here are a few approaches to preventing cross-site scripting security attacks. Evaluate your specific situation to determine which techniques will work best for you. It is important to note that in all techniques, you are validating data that you receive from input, and not your trusted script. Essentially, prevention means that you follow good coding practice by running sanity checks on the input to your routines.

The following general approaches for preventing cross-site scripting attacks are presented here:
  • Encode output based on input parameters.
  • Filter input parameters for special characters.
  • Filter output based on input parameters for special characters.

When filtering or encoding, you need to specify a character set for your Web pages so that your filter checks for the appropriate special characters. Byte sequences that are considered special in that character set should be filtered out of the data inserted into your Web pages. A popular charset is ISO 8859-1, which was the default in early versions of HTML and HTTP. You must take localization issues into account when changing these parameters.
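
As a rough illustration of declaring a character set, the sketch below wraps body markup in a page whose META element names the charset. The helper name wrapPage is hypothetical. In an ASP page, setting the Response.Charset property is one way to declare the charset in the HTTP header instead. Treat this as a sketch under those assumptions, not as the only way to declare a charset.

```javascript
// Hypothetical helper: wraps body markup in a page that declares its
// charset, so the browser does not have to guess how to read the bytes.
function wrapPage(charset, bodyHtml) {
    return "<HTML><HEAD>" +
        "<META http-equiv=\"Content-Type\" content=\"text/html; charset=" +
        charset + "\">" +
        "</HEAD><BODY>" + bodyHtml + "</BODY></HTML>";
}

// The charset value comes from trusted code, never from user input.
var page = wrapPage("ISO-8859-1", "<P>Hi</P>");
console.log(page);
```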

Encode Output Based on Input Parameters for Special Characters
Encode data received as input when you write it out as HTML. This technique is effective on data that was not validated for some reason during input. By using techniques such as URLEncode and HTMLEncode, you can prevent malicious script from executing.

The following code snippets show how to use URLEncode and HTMLEncode from Active Server Pages (ASP) pages:

<%
      // Encode the query-string value before placing it in a URL.
      var BaseURL = "http://www.mysite.com/search2.asp?searchagain=";
      Response.Write("<a href=\"" + BaseURL +
      Server.URLEncode(Request.QueryString("SearchString")) +
      "\">click-me</a>");
%>
<%
      // Encode the form value before writing it into the HTML body.
      Response.Write("Hello visitor <I>" +
      Server.HTMLEncode(Request.Form("UserName")) +
      "</I>");
%>

If you encode the HTML and URLs, you may need to specify the code page as you would if you were to filter data.

It is important to note that calling HTMLEncode on the string that is about to be displayed prevents any script in it from being executed, and thus prevents the problem.
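
To show what encoding does to the data, here is a small stand-alone JavaScript sketch. The htmlEncode function is a hypothetical approximation of the idea behind Server.HTMLEncode, not its actual implementation. The built-in encodeURIComponent function stands in for Server.URLEncode, and its output may differ in details such as how spaces are encoded.

```javascript
// Hypothetical stand-in for HTML encoding: replaces the characters that
// let input break out of text and into markup. "&" must be replaced first.
function htmlEncode(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// HTML context: the SCRIPT element is rendered as text, not executed.
var userName = "<SCRIPT>alert('hi')</SCRIPT>";
var greeting = "Hello visitor <I>" + htmlEncode(userName) + "</I>";
console.log(greeting);
// Hello visitor <I>&lt;SCRIPT&gt;alert('hi')&lt;/SCRIPT&gt;</I>

// URL context: special characters become %XX escapes.
var searchString = "shoes & <socks>";
var query = encodeURIComponent(searchString);
console.log(query);
// shoes%20%26%20%3Csocks%3E
```

The encoding has to match the context. HTML encoding belongs in page text, and URL encoding belongs in query-string values.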

Filter Input Parameters for Special Characters
Filtering input works by removing some or all special characters from your input. Special characters are characters that enable script to be generated within an HTML stream. Special characters include the following:

< > " ' % ; ) ( & + - 
Note that your individual situation may warrant the filtering of additional characters or strings beyond the special characters.

While filtering can be an effective technique, there are a few caveats:
  • Filtering may not be appropriate for some input. For example, in scenarios where you are receiving <TEXT> input from an HTML form, you may instead choose a method such as encoding (described above).
  • Some filtered characters may actually be required input to server-side script.

Here is a sample filter written in JavaScript demonstrating how to remove special characters:

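// Removes the special characters  < > " ' % ; ( ) & + -  from a value.
function RemoveBad(strTemp) {
    // Coerce to a string first, in case the caller passes an ASP
    // request item rather than a JScript string.
    strTemp = String(strTemp).replace(/\<|\>|\"|\'|\%|\;|\(|\)|\&|\+|\-/g, "");
    return strTemp;
}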

 
The following ASP code uses RemoveBad to process user input before storing it for later use:

<%
      // Filter each input source before storing or using it.
      Session("StoredPreference") = RemoveBad(Request.Cookies("UserColor"));
      var TempStr = RemoveBad(Request.QueryString("UserName"));
      var SQLString = "INSERT TableName VALUES ('" +
      RemoveBad(Request.Form("UserName")) + "')";
%>
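
To see what the sample filter does to typical input, the stand-alone sketch below repeats the filter's regular expression and runs it on two made-up strings. The sample inputs are assumptions chosen for illustration. The second one shows the caveat noted earlier: characters that are legitimate in a name are removed as well.

```javascript
// Copy of the sample filter's pattern so that this sketch runs on its own.
function RemoveBad(strTemp) {
    strTemp = strTemp.replace(/\<|\>|\"|\'|\%|\;|\(|\)|\&|\+|\-/g, "");
    return strTemp;
}

// Script markup loses the characters that make it executable.
var filteredScript = RemoveBad("<SCRIPT>alert('x');</SCRIPT>");
console.log(filteredScript); // SCRIPTalertx/SCRIPT

// A legitimate name loses its hyphen and apostrophe.
var filteredName = RemoveBad("Anne-Marie O'Brien");
console.log(filteredName);   // AnneMarie OBrien
```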
 
Filter Output Based on Input Parameters for Special Characters
This technique is similar to filtering input except that you filter characters that are written out to the client. While this can be an effective technique, it may present a problem for Web pages that write out HTML elements.

For example, on a page that writes out <TABLE> elements, a generic function that removes the special characters would strip the < and > characters, ruining the <TABLE> tag. Therefore, for this technique to be useful, filter only data that was passed in or data that was previously entered by a user and stored in a database.
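
A minimal sketch of this idea follows. The function name renderNameTable is hypothetical, and RemoveBad repeats the sample filter's pattern so that the sketch runs on its own. Only the user-supplied value passes through the filter, and the markup written by the trusted script is left alone.

```javascript
// Copy of the sample filter's pattern so that this sketch runs on its own.
function RemoveBad(strTemp) {
    strTemp = strTemp.replace(/\<|\>|\"|\'|\%|\;|\(|\)|\&|\+|\-/g, "");
    return strTemp;
}

// Hypothetical page fragment: the TABLE markup is trusted and unfiltered;
// only the value that came from the user is filtered.
function renderNameTable(userName) {
    return "<TABLE><TR><TD>" + RemoveBad(userName) + "</TD></TR></TABLE>";
}

var html = renderNameTable("<B>Pat</B>");
console.log(html); // <TABLE><TR><TD>BPat/B</TD></TR></TABLE>
```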

Possible Sources of Malicious Data
While the problem applies to any page that uses input to dynamically generate HTML, the following are some possible sources of malicious data to help you spot check for potential security risks:
  • Query String
  • Cookies
  • Posted data
  • URLs and pieces of URLs, such as PATH_INFO
  • Data retrieved from users that is persisted in some fashion such as in a database
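
Persisted data is the easiest of these sources to overlook, because it may be written out long after it was submitted. The sketch below assumes a hypothetical array, storedComments, standing in for rows read from a database, and a hypothetical htmlEncode helper. It encodes each stored value at the point where it is written into the page.

```javascript
// Hypothetical encoder: replaces characters that are special in HTML.
function htmlEncode(text) {
    return String(text)
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;");
}

// Stand-in for rows previously entered by users and read from a database.
var storedComments = ["Nice site!", "<SCRIPT>alert(1)</SCRIPT>"];

// Encode every stored value as it is written out, whatever its origin.
function renderComments(comments) {
    var html = "<UL>";
    for (var i = 0; i < comments.length; i++) {
        html += "<LI>" + htmlEncode(comments[i]) + "</LI>";
    }
    return html + "</UL>";
}

var list = renderComments(storedComments);
console.log(list);
```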

Conclusion

In conclusion, the following are key points to remember regarding the cross-site scripting security problem:
  • The problem affects dynamic page creation based on input that was not validated.
  • Omission of a sanity check on input data can have unexpected security implications. The problem is preventable through good development standards such as input validation.
  • You need to evaluate solutions on a per site, page, and even field basis and use a technique that makes sense.


REFERENCES

For more information, see the following advisory from the Computer Emergency Response Team (CERT) at Carnegie Mellon University:

http://www.cert.org/advisories/CA-2000-02.html

For additional information, see the following articles in the Microsoft Knowledge Base:
Q253117 HOWTO: Prevent Internet Explorer and Outlook Express CSSI Vulnerability
Q253119 HOWTO: Review ASP Code for CSSI Vulnerability
Q253120 HOWTO: Review Visual InterDev Generated Code for CSSI Vulnerability
Q253121 HOWTO: Review MTS/ASP Code for CSSI Vulnerability

Keywords : kbGrpASP kbDSupport
Issue type : kbhowto


Last Reviewed: February 3, 2000
© 2000 Microsoft Corporation. All rights reserved.