
RSS Scaling Problems

Recently the problem of RSS scaling has been brought up by InfoWorld’s Chad Dickerson, Mark Fletcher of Bloglines and Jeremy Zawodny. If you are into trends, the obvious thing to say is something like “can’t we just add in some sort of BitTorrent feature?” Honestly, this was the very first thing that popped into my head too, but it felt like it would introduce too many complications for a problem that needs a relatively simple solution. The next thing I thought of was FreeCache.

From what I can tell, FreeCache (or something like it) best fits the bill for distributing the RSS load. Aggregators won’t have to understand any new protocols and everything continues to be plain HTTP; only the URLs change. The change would be easy for everyone to adapt to: simply add http://freecache.org/ to the start of RSS URLs. Now that I think about it, this has another huge advantage: aggregator authors could make use of it right now by prepending that prefix to the RSS URLs entered by a user. A really smart aggregator could check whether the RSS URL begins with http://freecache.org/ and, if it doesn’t, add the prefix automatically to take advantage of FreeCache.
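As a rough illustration of that last idea, here is a minimal Python sketch of the check an aggregator might run on each subscribed URL. The function name `with_freecache` and the example feed URL are made up for this post, and the sketch assumes the prefixing scheme works exactly as described above (the full original URL appended after http://freecache.org/).

```python
FREECACHE_PREFIX = "http://freecache.org/"

def with_freecache(feed_url: str) -> str:
    """Return feed_url routed through FreeCache, adding the prefix only once."""
    url = feed_url.strip()
    if url.startswith(FREECACHE_PREFIX):
        # Already points at the cache, so leave it alone.
        return url
    # Assumed scheme: the cache prefix followed by the full original URL.
    return FREECACHE_PREFIX + url

# Hypothetical feed URL, for illustration only.
print(with_freecache("http://example.com/index.rss"))
```

Because the function leaves already-prefixed URLs untouched, an aggregator could apply it to every subscription without worrying about double prefixes.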

This picture isn’t perfect, though. The problem that immediately jumps out at me is update frequency: for RSS feeds that update often, a caching system might not be able to keep up. Still, if the bulk of the problem is at one single point, like the top of every hour because of desktop aggregators, even frequently updated RSS feeds would benefit from caching.

So to answer Chad, Mark and Jeremy’s question about RSS scaling problems, I suggest taking a hard look at FreeCache. Even if it isn’t the best solution right now, it may still be a good starting point.

2 replies on “RSS Scaling Problems”

Did you read the Slashdot article? NNTP is designed for the sole purpose of this kind of topology. Also, if aggregators were designed better, they would use the HTTP Last-Modified header and cause significantly less traffic. Algorithms for randomizing subscription requests could also help.

Have fun! Also, check out http://writtorrent.sf.net/

From what I read the biggest problem is not bandwidth, it is the sudden load spike on the systems hosting the RSS feeds when many clients try to get updates at the top of each hour. To reduce the load the obvious thing to do is spread it out, not to reduce bandwidth but to reduce system load. Using something like FreeCache would reduce both load and bandwidth.
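One way a client could spread its requests out, sketched in Python under some assumptions: each aggregator install has some stable identifier (the `client_id` here is hypothetical), and it polls once per interval at a fixed offset derived from a hash instead of at the top of the hour. This is only an illustration of the "spread it out" idea, not something any existing aggregator is known to do.

```python
import hashlib

def poll_offset_seconds(client_id: str, feed_url: str, interval: int = 3600) -> int:
    """Map a (client, feed) pair to a stable offset within the polling interval."""
    key = f"{client_id}|{feed_url}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the hash as an integer, then wrap it into the interval.
    return int.from_bytes(digest[:8], "big") % interval

# Hypothetical identifiers, for illustration only.
offset = poll_offset_seconds("install-1234", "http://example.com/index.rss")
print(f"poll {offset} seconds past the hour")
```

The offset stays the same from run to run for a given client and feed, so each client keeps a regular hourly schedule while different clients land at different points in the hour.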

Another really important factor that no one seems to have mentioned is that this should be transparent to the aggregators. Throwing in things like BitTorrent and new HTTP extensions is a great idea, but getting aggregators to support them may never happen. Again, using something like FreeCache requires no changes to any of the RSS/Atom clients out there. It’s a solution that will work right now, which is generally better than proposing solutions that will only work when everyone rewrites their software.
