User Agent Sniffing at Google Libraries CDN

I recently took a closer look at how Google Libraries, Google's content delivery network (CDN) for various JavaScript libraries, handles HTTP compression. I started with a simple test:

[sourcecode syn=”plain”]
curl -O -v --compressed http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

This downloads a minified version of jQuery 1.4.3. The --compressed option tells curl to ask for a compressed response. The HTTP request looked like:

[sourcecode syn=”plain”]
> GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
> User-Agent: curl/7.16.4 (i386-apple-darwin9.0) libcurl/7.16.4 OpenSSL/0.9.7l zlib/1.2.3
> Host: ajax.googleapis.com
> Accept: */*
> Accept-Encoding: deflate, gzip
>
[/sourcecode]

The response from Google was:

[sourcecode syn=”plain”]
< HTTP/1.1 200 OK
< Content-Type: text/javascript; charset=UTF-8
< Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
< Date: Fri, 29 Oct 2010 03:27:16 GMT
< Expires: Sat, 29 Oct 2011 03:27:16 GMT
< Vary: Accept-Encoding
< X-Content-Type-Options: nosniff
< Server: sffe
< Cache-Control: public, max-age=31536000
< Age: 145355
< Transfer-Encoding: chunked
<
[/sourcecode]

I was surprised that there was no Content-Encoding: gzip header in the response, meaning the response was NOT compressed. I wasn’t quite sure what to make of this at first. No way would Google forget to turn on HTTP compression; I must have missed something. I stared at the HTTP response for some time, trying to figure out what I was missing. Nothing came to mind, so I ran another test.
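For anyone repeating this test, a small helper makes the check less error-prone than reading the -v output by eye. The sketch below is an illustration rather than part of the original test. The function name content_encoding is made up, and it assumes the raw response headers arrive on standard input (for example from curl's -D - option).

```shell
# content_encoding: read raw HTTP response headers on stdin and print the
# value of the Content-Encoding header, or "none" if the header is absent.
# (Hypothetical helper name; not part of curl.)
content_encoding() {
    awk -F': *' '
        { sub(/\r$/, "") }    # drop the CR from CRLF line endings
        tolower($1) == "content-encoding" { print tolower($2); found = 1 }
        END { if (!found) print "none" }
    '
}

# Usage (needs network access):
#   curl -s -o /dev/null -D - --compressed \
#       http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js \
#       | content_encoding
```

Header names are compared case-insensitively, since HTTP does not guarantee a particular capitalization.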

This time I made a request for http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js in Firefox 3.6.12 on Mac OS X and used Firebug to inspect the HTTP transaction. The request:

[sourcecode lang=”plain”]
GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
Host: ajax.googleapis.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
[/sourcecode]

and the response:

[sourcecode lang=”plain”]
HTTP/1.1 200 OK
Content-Type: text/javascript; charset=UTF-8
Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
Date: Fri, 29 Oct 2010 03:12:35 GMT
Expires: Sat, 29 Oct 2011 03:12:35 GMT
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
Server: sffe
Content-Encoding: gzip
Cache-Control: public, max-age=31536000
Content-Length: 26769
Age: 147128
[/sourcecode]

This time the content was compressed. There were several differences in the request headers between curl and Firefox, so I decided to start with just one: the “User-Agent”. I modified my initial curl request to include the User-Agent string from Firefox:

[sourcecode syn=”plain”]
curl -O -v --compressed --user-agent "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

The request:

[sourcecode syn=”plain”]
> GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
> User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
> Host: ajax.googleapis.com
> Accept: */*
> Accept-Encoding: deflate, gzip
>
[/sourcecode]

and the response:

[sourcecode syn=”plain”]
< HTTP/1.1 200 OK
< Content-Type: text/javascript; charset=UTF-8
< Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
< Date: Fri, 29 Oct 2010 03:33:09 GMT
< Expires: Sat, 29 Oct 2011 03:33:09 GMT
< Vary: Accept-Encoding
< X-Content-Type-Options: nosniff
< Server: sffe
< Content-Encoding: gzip
< Cache-Control: public, max-age=31536000
< Content-Length: 26769
< Age: 147018
<
[/sourcecode]

Sure enough, I got back a compressed response. Google was sniffing the User-Agent string to determine if a compressed response should be sent. Asking for a compressed response (Accept-Encoding: deflate, gzip) was not enough on its own. What still wasn’t clear was whether this was a black list approach (singling out curl) or a white list approach (Firefox is okay). So I tried a few other requests with various User-Agent strings. First up, no User-Agent set at all:

[sourcecode syn=”plain”]
curl -O -v --compressed --user-agent "" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

Not compressed. Next, a made-up string:

[sourcecode syn=”plain”]
curl -O -v --compressed --user-agent "JosephScott/1.0 test/2.0" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

Not compressed. At this point I think Google is using a white list approach: if you aren’t on the list of approved User-Agent strings for getting a compressed response, then you won’t get one, no matter how nicely you ask.

I collected a few more browser samples as well, just to be sure:

  • Safari 5.0.2 on Mac OS X – compressed
  • IE 8 on Windows XP – compressed
  • Firefox 3.6.12 on Windows XP – compressed
  • Chrome 7.0.517.41 beta on Windows XP – compressed
  • Opera 10.63 on Windows XP – NOT compressed
  • Safari 5.0.2 on Windows XP – compressed

One more time, curl using the IE 8 User-Agent string:

[sourcecode syn=”plain”]
curl -O -v --compressed --user-agent "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET4.0C;" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

Compressed.
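Running these one-off commands by hand gets tedious, so the same probe can be wrapped in a function. This is a minimal sketch, not the exact commands used above: probe_user_agent and the CURL override variable are names invented for illustration, and it only checks whether a Content-Encoding header is present. Results depend on how the CDN behaves at the time you run it.

```shell
URL="http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js"
CURL="${CURL:-curl}"   # hypothetical override, e.g. to swap in a stub when offline

# probe_user_agent UA: request URL with the given User-Agent string and report
# whether the response carried a Content-Encoding header.
probe_user_agent() {
    if "$CURL" -s -o /dev/null -D - --compressed --user-agent "$1" "$URL" \
        | grep -qi '^content-encoding:'; then
        echo "compressed"
    else
        echo "NOT compressed"
    fi
}

# Usage (needs network access):
#   for ua in "" \
#             "JosephScott/1.0 test/2.0" \
#             "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12"
#   do
#       printf '%s => %s\n' "${ua:-<none>}" "$(probe_user_agent "$ua")"
#   done
```

The -D - option writes the response headers to standard output while -o /dev/null discards the body, so only the headers reach grep.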

Since I can manipulate the response based on the User-Agent value, I’m left to conclude that the Google Libraries CDN sniffs the User-Agent string to determine if it will respond with a compressed result. From what I’ve seen so far, Google Libraries keeps a white list of approved User-Agent patterns that it checks against to determine if it will honor the compression request.
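To make the white list idea concrete, here is a toy model of the behavior observed in these tests. It is only a guess at the shape of the logic. The function name would_compress and every pattern in it are assumptions drawn from the handful of results in this post, not Google's actual list or code. It also assumes the server sends gzip only when the client asks for it, which these tests did not check.

```shell
# would_compress UA ACCEPT_ENCODING: print "yes" if this toy model would send
# a gzip response, "no" otherwise. Hypothetical patterns; not Google's list.
would_compress() {
    ua=$1
    accept_encoding=$2

    # Assumption: the client still has to ask for gzip in the first place.
    case "$accept_encoding" in
        *gzip*) ;;
        *) echo "no"; return ;;
    esac

    # Asking is not enough: the User-Agent must also match the white list.
    case "$ua" in
        *Opera*)                               echo "no"  ;;  # observed: Opera 10.63 not compressed
        *Firefox/*|*MSIE*|*Chrome/*|*Safari/*) echo "yes" ;;  # observed: compressed
        *)                                     echo "no"  ;;  # curl, empty, and made-up strings
    esac
}

# Usage:
#   would_compress "JosephScott/1.0 test/2.0" "deflate, gzip"
```

A black list would invert the second case statement: everything compressed by default, with only known-bad clients excluded. The made-up User-Agent result above is what argues against that reading.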

If you are on a current version of one of the popular browsers, you will get a compressed response. If you are using anything else, you’ll have to test to confirm whether Google Libraries will honor your request for compressed content. Opera users are just plain out of luck: even the most recent version I tested (10.63) gets an uncompressed response.