Beware: WebUtility.UrlEncode vs HttpUtility.UrlEncode

Whilst experimenting with hash-based message authentication code (HMAC) request signing for a REST API I’m working on, I noticed that a signature would sometimes fail to validate server side, despite both ends following the exact same algorithm. Upon closer inspection, it turned out that the client-side URL encoding method was returning lowercase hex values, whereas the server side, when building the string to hash, was producing uppercase hex values. Fair enough! The client was written in PHP and the server is .Net. RFC 3986 treats %aa and %AA as equivalent and only recommends uppercase, so there’s no hard requirement for the casing of URL-encoded values and some differences should be expected across platforms.

But would you expect that difference between two methods in the .Net framework? Experience tells you yes, but we can be hopeful all the same!

In most cases it makes no difference whether you have %AA or %aa, but when you’re hashing a string, it makes all the difference. Armed with this knowledge, I simply made it a requirement that all URL-encoded values used to calculate signatures be in uppercase, à la Amazon’s approach (scroll down to the table under the heading ‘Calculating a Signature’). For those of you screaming that lowercase is generally safer than uppercase: in this case we’re OK because we’re only concerned with hex characters (A-F), and at that point in time it wasn’t practical to rewrite the server-side processing just for this one test.
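To make the point concrete, here is a minimal sketch of signing a string with an HMAC. HMAC-SHA256, the Base64 output, the class name and the key are all assumptions made for illustration; they are not the scheme actually used in the API described here.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SignatureCasingDemo
{
    // Computes an HMAC-SHA256 over the UTF-8 bytes of stringToSign
    // and returns the result as a Base64 string.
    // The algorithm and output format are assumptions for this sketch.
    public static string Sign(string stringToSign, string secretKey)
    {
        using (var hmac = new HMACSHA256(Encoding.UTF8.GetBytes(secretKey)))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign));
            return Convert.ToBase64String(hash);
        }
    }
}
```

Calling Sign("param%3Dvalue%2F1", key) and Sign("param%3dvalue%2f1", key) with the same key gives two different signatures, even though both inputs decode to the same value.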

Fast forward a few months and I’m at a stage where I need to use this approach for real. In this case it’s for calls between an ASP.Net MVC app and a WCF service, both .Net 4.5 projects running on Azure. I moved the code for calculating and verifying signatures into a common library so that it can be shared between the roles. I didn’t want to reference the System.Web assembly from this shared library just for the sake of the encode/decode methods, and found that in .Net 4.5 the System.Net.WebUtility class (introduced in .Net 4.0) gained methods for encoding and decoding URLs. Perfect! Except now all my tests failed. The code had worked perfectly in my tests before the change, but now the signatures were failing to validate.

In my tests, I had created the expected hash using System.Web.HttpUtility.UrlEncode, as it was already used in other places; when I changed the tests to use System.Net.WebUtility.UrlEncode as well, they passed. WTF. We have two methods, both part of the .Net framework, both encoding URL values. What causes the difference?

Digging into the code for each implementation, I found that each version uses its own method for converting an int to a hex value (and a few other utility functions). Take a look at the code for each of those methods…

System.Net.WebUtility

private static char IntToHex(int n)
{
    if (n <= 9)
        return (char) (n + 48);
    else
        return (char) (n - 10 + 65);
}

System.Web.Util.HttpEncoderUtility (an internal class)

public static char IntToHex(int n)
{
    if (n <= 9)
        return (char) (n + 48);
    else
        return (char) (n - 10 + 97);
}

Spot the difference? That ‘+ 65’ in the first code block will result in an uppercase alpha character (65 is the ASCII code for ‘A’), but ‘+ 97’ will result in a lowercase character (97 is ‘a’). You can perform a very simple test and see the results for yourself. The following code:

var test1 = WebUtility.UrlEncode("http://www.test.com/?param1=22&param2=there@is<a space");
var test2 = HttpUtility.UrlEncode("http://www.test.com/?param1=22&param2=there@is<a space");

Will result in:

test1 -> http%3A%2F%2Fwww.test.com%2F%3Fparam1%3D22%26param2%3Dthere%40is%3Ca+space
test2 -> http%3a%2f%2fwww.test.com%2f%3fparam1%3d22%26param2%3dthere%40is%3ca+space

Curiously, the help pages for both methods state the following:

For example, when embedded in a block of text to be transmitted in a URL, the characters < and > are encoded as %3c and %3e.

Which would lead one to believe that both methods return lowercase characters for any hex values.

So what’s the solution here? Well you can just choose one method and stick to it, hoping the implementation never changes in the future. Or, you can massage the result to fit your requirements and make your code resilient to changes in casing. E.g.

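var queryString = WebUtility.UrlEncode("http://www.test.com/?key1=something@something<something!");
// Match escapes in either casing: WebUtility emits uppercase, so [0-9a-f] alone would miss e.g. %3A
queryString = Regex.Replace(queryString, "%[0-9A-Fa-f]{2}", c => c.Value.ToLowerInvariant());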

Remember that you still need to choose a casing and stick to it. This regex won’t help you if your client creates a signature based on lowercase hex values and your server uses uppercase.
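If you settle on uppercase, as in the signing requirement described earlier, one option is a small helper that both ends apply before hashing. The following is only a sketch; the class and method names are made up for illustration.

```csharp
using System.Text.RegularExpressions;

public static class PercentEncoding
{
    // Matches a percent-escape such as %3a or %3A, in either casing.
    private static readonly Regex Escape =
        new Regex("%[0-9A-Fa-f]{2}", RegexOptions.CultureInvariant);

    // Upper-cases the hex digits of every percent-escape and leaves
    // all other characters (including unescaped letters) untouched.
    public static string NormalizeToUpper(string urlEncoded)
    {
        return Escape.Replace(urlEncoded, m => m.Value.ToUpperInvariant());
    }
}
```

Passing the lowercase HttpUtility output shown earlier through NormalizeToUpper yields exactly the WebUtility output, so both strings hash to the same signature. Note that only the escapes change: in ‘%3ca+space’ the ‘a’ following the escape stays lowercase.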

For those of you still reading, another two gotchas when encoding / decoding URL values.

  1. When deciding which of the two implementations to use, note that HttpUtility has an overload that allows you to specify which encoding to use, whereas WebUtility will always use UTF-8. HttpUtility uses UTF-8 by default, but in some cases you may need to override this.
  2. You also need to be aware that some implementations will convert a space character into a ‘+’ and others will use ‘%20’. The ‘+’ is generally used in form data, although both of the .Net methods above use ‘+’.
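The space handling in the second point is easy to check for yourself. The sketch below compares WebUtility.UrlEncode with Uri.EscapeDataString, a third framework method not discussed above, which emits ‘%20’ rather than ‘+’. The wrapper class and method names are hypothetical.

```csharp
using System;
using System.Net;

public static class SpaceEncoding
{
    // Form-style encoding: a space becomes '+'.
    public static string EncodeFormStyle(string value)
    {
        return WebUtility.UrlEncode(value);
    }

    // Uri.EscapeDataString encodes a space as "%20" instead.
    public static string EncodePercent20(string value)
    {
        return Uri.EscapeDataString(value);
    }
}
```

EncodeFormStyle("a space") gives ‘a+space’, whereas EncodePercent20("a space") gives ‘a%20space’. The two methods may also differ in which other characters they leave unescaped, so as with casing, pick one for everything that feeds into a signature.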

Update – 26th May 2014

Thanks to Reddit user DaRKoN_ for mentioning this:

“MVC 6 has no dependency on System.Web. The result is a leaner framework, with faster startup time and lower memory consumption.”

http://www.asp.net/vnext/overview/aspnet-vnext/overview

This will help reduce the confusion for newer projects!

Posted in .Net 4.5, HMAC
Sam Noble
