User Agent Sniffing at Google Libraries CDN

I recently took a closer look at Google Libraries, their content delivery network (CDN) for various JavaScript libraries, and at how it handles HTTP compression. I started with a simple test:

[sourcecode syn=”plain”]
curl -O -v --compressed http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

This downloads the minified version of jQuery 1.4.3. The --compressed option tells curl to send an Accept-Encoding header, asking the server for a compressed response. The HTTP request looked like this:

[sourcecode syn=”plain”]
> GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
> User-Agent: curl/7.16.4 (i386-apple-darwin9.0) libcurl/7.16.4 OpenSSL/0.9.7l zlib/1.2.3
> Host: ajax.googleapis.com
> Accept: */*
> Accept-Encoding: deflate, gzip
>
[/sourcecode]

The response from Google was:

[sourcecode syn=”plain”]
< HTTP/1.1 200 OK
< Content-Type: text/javascript; charset=UTF-8
< Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
< Date: Fri, 29 Oct 2010 03:27:16 GMT
< Expires: Sat, 29 Oct 2011 03:27:16 GMT
< Vary: Accept-Encoding
< X-Content-Type-Options: nosniff
< Server: sffe
< Cache-Control: public, max-age=31536000
< Age: 145355
< Transfer-Encoding: chunked
<
[/sourcecode]

I was surprised that there was no Content-Encoding: gzip header in the response, meaning the response was NOT compressed. I wasn’t quite sure what to make of this at first. There was no way Google would forget to turn on HTTP compression, so I must have missed something. I stared at the HTTP response for some time, trying to figure out what I was missing. Nothing came to mind, so I ran another test.
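Checking for that header by eye gets tedious, so here is a minimal sketch of how the check could be scripted. The helper name is_compressed is my own invention, not part of curl; it only assumes that raw response headers arrive on stdin, either plain or with the "< " prefix that curl -v prints.

```shell
#!/bin/sh
# is_compressed: hypothetical helper, not part of curl.
# Reads raw HTTP response headers on stdin and exits 0 if a
# Content-Encoding header of gzip or deflate is present, 1 otherwise.
is_compressed() {
    # HTTP header lines end in CRLF, so drop the carriage returns first.
    # The optional "< " prefix matches header lines copied from curl -v output.
    tr -d '\r' | grep -i -q -E '^(< )?Content-Encoding:[[:space:]]*(gzip|deflate)'
}
```

One way to feed it a live response would be `curl -s -D - -o /dev/null --compressed URL | is_compressed && echo compressed`, where -D - prints the response headers to stdout and -o /dev/null discards the body.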

This time I made a request for http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js in Firefox 3.6.12 on Mac OS X and used Firebug to inspect the HTTP transaction. The request:

[sourcecode lang=”plain”]
GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
Host: ajax.googleapis.com
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
[/sourcecode]

and the response:

[sourcecode lang=”plain”]
HTTP/1.1 200 OK
Content-Type: text/javascript; charset=UTF-8
Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
Date: Fri, 29 Oct 2010 03:12:35 GMT
Expires: Sat, 29 Oct 2011 03:12:35 GMT
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
Server: sffe
Content-Encoding: gzip
Cache-Control: public, max-age=31536000
Content-Length: 26769
Age: 147128
[/sourcecode]

This time the content was compressed. There were several differences between the curl and Firefox request headers, so I decided to start with just one: the User-Agent. I modified my initial curl request to include the User-Agent string from Firefox:

[sourcecode syn=”plain”]
curl -O -v --compressed --user-agent "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

The request:

[sourcecode syn=”plain”]
> GET /ajax/libs/jquery/1.4.3/jquery.min.js HTTP/1.1
> User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12
> Host: ajax.googleapis.com
> Accept: */*
> Accept-Encoding: deflate, gzip
>
[/sourcecode]

and the response:

[sourcecode syn=”plain”]
< HTTP/1.1 200 OK
< Content-Type: text/javascript; charset=UTF-8
< Last-Modified: Fri, 15 Oct 2010 18:25:24 GMT
< Date: Fri, 29 Oct 2010 03:33:09 GMT
< Expires: Sat, 29 Oct 2011 03:33:09 GMT
< Vary: Accept-Encoding
< X-Content-Type-Options: nosniff
< Server: sffe
< Content-Encoding: gzip
< Cache-Control: public, max-age=31536000
< Content-Length: 26769
< Age: 147018
<
[/sourcecode]

Sure enough, I got back a compressed response. Google was sniffing the User-Agent string to determine whether a compressed response should be sent. Asking for one (Accept-Encoding: deflate, gzip) was not enough on its own: the same request without a recognized User-Agent came back uncompressed. What still wasn’t clear was whether this was a black list approach (singling out curl) or a white list approach (Firefox is okay). So I tried a few other requests with various User-Agent strings. First up, no User-Agent set at all:

[sourcecode syn=”plain”]
curl -O -v --compressed --user-agent "" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

Not compressed. Next, a made-up string:

[sourcecode syn=”plain”]
curl -O -v --compressed --user-agent "JosephScott/1.0 test/2.0" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

Not compressed. At this point I think Google is using a white list approach: if you aren’t on the list of approved User-Agent strings for getting a compressed response, then you won’t get one, no matter how nicely you ask.

I collected a few more browser samples as well, just to be sure:

  • Safari 5.0.2 on Mac OS X – compressed
  • IE 8 on Windows XP – compressed
  • Firefox 3.6.12 on Windows XP – compressed
  • Chrome 7.0.517.41 beta on Windows XP – compressed
  • Opera 10.63 on Windows XP – NOT compressed
  • Safari 5.0.2 on Windows XP – compressed

One more time, curl using the IE 8 User-Agent string:

[sourcecode syn=”plain”]
curl -O -v --compressed --user-agent "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET4.0C;" http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js
[/sourcecode]

Compressed.
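Running these one at a time is repetitive, so the tests above could be scripted roughly as follows. This is a sketch, not the exact commands I ran: check_ua and check_all are hypothetical helper names, and it uses -D - with -o /dev/null to print only the response headers instead of -O -v.

```shell
#!/bin/sh
# The same URL used in the tests above.
URL="http://ajax.googleapis.com/ajax/libs/jquery/1.4.3/jquery.min.js"

# check_ua USER_AGENT (hypothetical helper)
# Requests URL with the given User-Agent, always asking for compression,
# and prints "compressed" or "NOT compressed" based on the response headers.
check_ua() {
    ua="$1"
    # -s: no progress meter, -D -: response headers to stdout,
    # -o /dev/null: throw the body away. stdin is redirected so curl
    # cannot consume the list that check_all is reading.
    headers=$(curl -s -D - -o /dev/null --compressed --user-agent "$ua" "$URL" < /dev/null | tr -d '\r')
    if printf '%s\n' "$headers" | grep -i -q -E '^Content-Encoding:[[:space:]]*(gzip|deflate)'; then
        echo "compressed"
    else
        echo "NOT compressed"
    fi
}

# check_all (hypothetical helper)
# Reads one User-Agent string per line on stdin and prints a result for each.
check_all() {
    while IFS= read -r ua; do
        printf '%s => %s\n' "$ua" "$(check_ua "$ua")"
    done
}
```

Feeding it a file with one User-Agent string per line (`check_all < user-agents.txt`, where the file name is just an example) prints a compressed / NOT compressed verdict next to each string.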

Since I can change the response just by changing the User-Agent value, I’m left to conclude that the Google Libraries CDN sniffs the User-Agent string to determine whether it will respond with a compressed result. From what I’ve seen so far, Google Libraries keeps a white list of approved User-Agent patterns that it checks against to determine whether it will honor the compression request.

If you are on a current version of one of the popular browsers you will get a compressed response. If you are using anything else, you’ll have to test to confirm whether Google Libraries will honor your request for compressed content. Opera users are just plain out of luck: Opera 10.63, the one version I tested, got an uncompressed response.
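To make the white list idea concrete, here is a rough sketch of the kind of decision I suspect is being made on the server. To be clear, this is a guess: should_compress is a made-up function, and the patterns are inferred only from the handful of responses I observed, not from anything Google has published.

```shell
#!/bin/sh
# should_compress ACCEPT_ENCODING USER_AGENT (hypothetical)
# Exits 0 (send a compressed response) only when the client asks for gzip
# AND its User-Agent matches a pattern on the white list.
# The patterns below are guesses based on my tests, NOT Google's actual list.
should_compress() {
    accept_encoding="$1"
    user_agent="$2"

    # The client has to ask for gzip in the first place.
    # Simplification: this ignores q-values such as "gzip;q=0".
    case "$accept_encoding" in
        *gzip*) ;;
        *) return 1 ;;
    esac

    # White list: anything that is not explicitly recognized gets no compression.
    case "$user_agent" in
        *Firefox/*|*MSIE*|*Chrome/*|*Safari/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With a white list, the fall-through case is "no compression", which is consistent with what curl, an empty User-Agent, and a made-up User-Agent all received. A black list would flip the default: compress unless the User-Agent matches a known broken client.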

11 replies on “User Agent Sniffing at Google Libraries CDN”

We actually found out that many servers do similar things. We have the same pattern with the ‘Connection: Close’ header. Some servers will automatically close the connection based on the UA (browsers mostly), or will keep it alive (Bots).

I’m not sure if it is reason enough to avoid using the Google Library CDN, but it does raise a few more questions.

As for Hoffman’s post, he brings up good points, but tries to apply them way too broadly. To say that avoiding a CDN for your JavaScript files is the right answer for every site on the web is oversimplifying by a long (LONG) shot.

Great post, Joseph! Using a whitelist even in the presence of Accept-Encoding has merit (I discuss this in HPWS). However, Opera should be in the whitelist afaik. I’ll contact the Google Libraries API folks.

Thanks. As for a whitelist approach, are there really that many new HTTP clients coming out that have broken compression support? If most of the broken ones are older then it seems like less work to simply black list known problem clients. As the Opera item highlights, there is often little motivation in updating a white list. A black list might cause more pain in the short term (when a new broken client shows up), but that pain would lead to motivation in making sure the list is up to date.

Making the white list that Google Library is using to determine feature support public would be really helpful. Including test methodologies and contact info for sending in updates would allow others to help keep the list up to date.

Does Google’s CDN work better with a particular browser? (I mean Chrome, of course.) I recently found that my site’s speed differs from browser to browser. I’m using cdnsun, and my site only performs well in Mozilla. So tell me, guys: does Google’s CDN perform best in Chrome? Thanks.
