
RSSCloud For WordPress

RSSCloud support has been enabled on all WordPress.com blogs. If you are running a WordPress.org powered blog you can do the same thing with the RSSCloud plugin.

So what does this really mean?

From the point of view of WordPress and this plugin there are three main additions:

  1. Adds the <cloud> element to your RSS2 feed (more details here and here) which tells clients where and how to sign up for notification requests.
  2. Registers a URL handler with WordPress to process the notification signups.
  3. Sends out notification updates when a new post is published.

The cloud element looks like this:

[sourcecode lang="html"]
<cloud domain='josephscott.org' port='80' path='/?rsscloud=notify'
registerProcedure='' protocol='http-post' />
[/sourcecode]

The domain, port and path attributes combine to form a URL, http://josephscott.org:80/?rsscloud=notify in this case, where others sign up for notifications. The registerProcedure attribute is the XML-RPC method to be called if the protocol attribute was xmlrpc. Since the plugin uses http-post for the protocol, the registerProcedure field is blank.
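As a rough sketch of how a client might assemble that URL, the snippet below reads the `<cloud>` element with SimpleXML and joins the three attributes. The helper name `cloud_signup_url()` and the inline sample feed are made up for illustration, and the `http://` scheme is an assumption, since the element itself does not carry one.

```php
<?php
// Sketch: build the signup URL from a feed's <cloud> element.
// cloud_signup_url() is a hypothetical helper, not part of the plugin.
function cloud_signup_url( string $rss ): ?string {
    $feed = simplexml_load_string( $rss );
    if ( false === $feed || ! isset( $feed->channel->cloud ) ) {
        return null; // not parseable, or no <cloud> element
    }

    $cloud = $feed->channel->cloud;

    // Attributes come back as SimpleXMLElement objects, so cast to strings.
    $domain = (string) $cloud['domain'];
    $port   = (string) $cloud['port'];
    $path   = (string) $cloud['path'];

    // Assumption: plain http, because <cloud> does not specify a scheme.
    return "http://{$domain}:{$port}{$path}";
}

// A trimmed-down sample feed containing the element shown above.
$rss = <<<XML
<rss version="2.0">
  <channel>
    <title>Example</title>
    <cloud domain='josephscott.org' port='80' path='/?rsscloud=notify'
      registerProcedure='' protocol='http-post' />
  </channel>
</rss>
XML;

echo cloud_signup_url( $rss ), "\n";
```

For the sample feed this prints the same URL as above, http://josephscott.org:80/?rsscloud=notify.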

Using this same example here is a small chunk of PHP code that uses the cURL library to sign up for notifications:

[sourcecode lang="php"]
$curl = curl_init();

// Where to sign up: the URL built from the <cloud> element.
curl_setopt( $curl, CURLOPT_URL, 'http://josephscott.org/?rsscloud=notify' );
// Where notifications should be sent, and for which feed.
curl_setopt( $curl, CURLOPT_POSTFIELDS, 'notifyProcedure=&protocol=http-post&port=80&path=/~joseph/rsscloud/&url1=http://josephscott.org/feed/' );
curl_setopt( $curl, CURLOPT_VERBOSE, 1 );
curl_setopt( $curl, CURLOPT_POST, 1 );

curl_exec( $curl );
print_r( curl_getinfo( $curl ) );
curl_close( $curl );
[/sourcecode]

This code sends an HTTP POST request to http://josephscott.org/?rsscloud=notify asking to get a notification when the http://josephscott.org/feed/ feed is updated. The notification is to be sent to the remote IP used in the request (this means notification requests must be sent from the IP that will be receiving the notifications), port 80 with a path of /~joseph/rsscloud/ and it will be given the update data via an HTTP POST. The notification script will get $_POST data that looks like:

[sourcecode lang="php"]
Array
(
    [url] => http://josephscott.org/feed/
)
[/sourcecode]

It is then up to the notification script to turn around and fetch the updated feed.
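A minimal sketch of what such a notification script might look like is below. The function names, the list of watched feeds and the 10 second timeout are all assumptions for illustration; the only part taken from the example above is that the feed URL arrives in the `url` field of the POST data.

```php
<?php
// Sketch of a notification receiver, e.g. the script at /~joseph/rsscloud/.
// handle_notification() and fetch_feed() are hypothetical names.
function handle_notification( array $post, array $watched, callable $fetch ): bool {
    $url = isset( $post['url'] ) ? (string) $post['url'] : '';

    // Only act on feeds we actually signed up for.
    if ( ! in_array( $url, $watched, true ) ) {
        return false;
    }

    $fetch( $url ); // turn around and fetch the updated feed
    return true;
}

// Fetch the feed body with cURL; returns the body, or false on failure.
function fetch_feed( string $url ) {
    $curl = curl_init( $url );
    curl_setopt( $curl, CURLOPT_RETURNTRANSFER, true );
    curl_setopt( $curl, CURLOPT_TIMEOUT, 10 ); // assumed timeout
    return curl_exec( $curl );
}

// When served over the web, handle the incoming POST.
if ( 'cli' !== PHP_SAPI ) {
    $watched = array( 'http://josephscott.org/feed/' );
    handle_notification( $_POST, $watched, 'fetch_feed' );
}
```

Checking the URL against a list of feeds you registered for is a design choice in this sketch, so that a stray POST cannot make the script fetch arbitrary URLs.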

How fast does all this happen?

It depends 🙂

On WordPress.com the notifications happen through the jobs system, which means they will be sent out very, very quickly. On a WordPress.org powered blog, the plugin schedules notifications to be sent out as soon as possible with the wp_schedule_single_event() function. Scheduled events in WordPress are checked on each page load, so if you publish a new post and then view it on the front page of your blog the notifications will get sent out pretty quickly.
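To give a feel for the scheduling approach, here is a small sketch of how wp_schedule_single_event() can be used to queue a one-off task. The hook name and both function names are hypothetical and are not taken from the plugin; only wp_schedule_single_event() and add_action() are real WordPress functions.

```php
<?php
// Sketch only: the hook and function names below are hypothetical and are
// not taken from the RSSCloud plugin.

// Queue a one-off event to run the next time WP-Cron is triggered.
function example_schedule_notifications( $feed_url ) {
    wp_schedule_single_event( time(), 'example_send_notifications', array( $feed_url ) );
}

// Runs later, on whichever page load triggers WP-Cron.
function example_send_notifications( $feed_url ) {
    // ... POST array( 'url' => $feed_url ) to each registered subscriber ...
}

// Only wire things up when WordPress is actually loaded.
if ( function_exists( 'add_action' ) ) {
    add_action( 'example_send_notifications', 'example_send_notifications' );
}
```

Passing time() as the timestamp asks for the event to run at the next opportunity, which is why viewing a page shortly after publishing is enough to send things on their way.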

I think for most blogs these approaches will work fine and send out notifications with very little delay.

What does this mean for feed readers like Bloglines, Google Reader, etc.?

I believe that many (most?) public feed readers like Bloglines and Google Reader already listen for feed updates via pings (like those sent to Ping-O-Matic). With an RSSCloud enabled WordPress blog they can register for updates to specific feeds. Why would they do this if they are already getting ping updates? Unfortunately the ping updates are similar to email: they have massive amounts of spam in them. Since RSSCloud isn’t a stream of everything, but a specific request for specific updates, they could sign up for updates to those feeds that they believe are more likely to be legitimate.

Signing up is simple for a feed reader (or anyone/thing) to do:

  1. Look for the <cloud> element in the RSS feed
  2. Sign up for notifications using the data from the <cloud> element
  3. Process notifications that are sent to it from WordPress
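The second step can be sketched as a small helper that turns the `<cloud>` attributes into an http-post signup request. The helper name `build_signup_request()` is hypothetical; the field names are the ones used in the cURL example earlier in this post, and the `http://` scheme is again an assumption.

```php
<?php
// Sketch: turn <cloud> attributes into an http-post signup request.
// build_signup_request() is a hypothetical helper; the field names mirror
// the ones used in the cURL example in this post.
function build_signup_request( array $cloud, string $feed_url, int $port, string $path ): ?array {
    if ( ( $cloud['protocol'] ?? '' ) !== 'http-post' ) {
        return null; // this sketch only covers the http-post protocol
    }

    $fields = array(
        'notifyProcedure' => '',          // blank for http-post
        'protocol'        => 'http-post',
        'port'            => $port,       // where the reader listens
        'path'            => $path,       // path of the reader's notification script
        'url1'            => $feed_url,   // feed to be notified about
    );

    return array(
        'url'  => "http://{$cloud['domain']}:{$cloud['port']}{$cloud['path']}",
        'body' => http_build_query( $fields ), // URL-encodes each value
    );
}

$request = build_signup_request(
    array(
        'domain'   => 'josephscott.org',
        'port'     => '80',
        'path'     => '/?rsscloud=notify',
        'protocol' => 'http-post',
    ),
    'http://josephscott.org/feed/',
    80,
    '/~joseph/rsscloud/'
);
print_r( $request );
```

The resulting `url` and `body` could then be handed to cURL as CURLOPT_URL and CURLOPT_POSTFIELDS.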

Right now I believe the only feed reader that supports RSSCloud is Dave Winer’s River2.

If you are working on RSSCloud support in your feed reader let me know, I’ll be watching the RSSCloud stats on WordPress.com. And of course if you run into problems with RSSCloud on a WordPress blog (ORG or COM) I’m happy to help track down any bugs in our implementation.

62 replies on “RSSCloud For WordPress”

I don’t know what Feedburner does with that element. It’s possible that they will strip it out. I’d be interested to hear from someone who has their feeds through Feedburner and has this enabled.

Very cool! Congrats on getting this out and for all the buzz it is getting. I’m the author of the WP plugin for PubSubHubbub and am curious if you would be interested in combining our code and releasing a hybrid plugin that adds support for both PubSubHubbub and rssCloud at the same time. As I see it, it’s still up in the air which protocol will win, but both represent huge progress for the internet. Let me know if you are interested in working together. I will probably try and add rssCloud support either way, but I’d love to have you on board.

Another nice plugin, I have feeds through Feedburner, and I’m gonna try your plugin pretty soon to see what happens.

Thanks

In its current setup it will try to pump them all out. There is a timeout on each notification so that one super slow notification can’t try to hold on for 15 minutes before responding.

I see. So in a possible worst-case scenario each responsive consumer of notifications would block wp_cron() for RSSCLOUD_HTTP_TIMEOUT, i.e. 3 seconds.

Does WP.com run any instrumentation to track the cost of transmitting into the cloud in real world usage?

There is a threshold where the plugin will mark a notification URL as inactive after a certain number of failures. This is another layer of protection against problem notification URLs.
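To illustrate the idea (not the plugin's actual code), failure bookkeeping of this kind might look something like the sketch below. The threshold of 5, the array layout and the choice to reset the counter on success are all assumptions.

```php
<?php
// Sketch of failure-threshold bookkeeping. The threshold value and the
// array layout are assumptions, not the plugin's actual storage format.
define( 'EXAMPLE_MAX_FAILURES', 5 );

function record_notification_result( array $subscriber, bool $ok ): array {
    if ( $ok ) {
        $subscriber['failures'] = 0; // assumption: a success clears the count
        return $subscriber;
    }

    $subscriber['failures']++;
    if ( $subscriber['failures'] >= EXAMPLE_MAX_FAILURES ) {
        $subscriber['active'] = false; // stop notifying this URL
    }
    return $subscriber;
}

// Five failures in a row deactivate the subscriber.
$subscriber = array( 'url' => 'http://example.com/rsscloud/', 'failures' => 0, 'active' => true );
for ( $i = 0; $i < 5; $i++ ) {
    $subscriber = record_notification_result( $subscriber, false );
}
var_dump( $subscriber['active'] ); // bool(false)
```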

I’ve been adding various stats around the RSSCloud feature to WordPress.com so that we can track trends for things like that.

Thanks for the plugin. I have a question. I have the FeedBurner FeedSmith plugin running on my site. Will the two conflict, so that one keeps the other from working?

Thanks.

Some questions/suggestions

1. Is it going to make it into the core in 2.9 or 2.10? IMHO it should.

2. Glancing at the plugin code it seems like the RSSCloud is not enabled for comment feeds. Is it possible to make it configurable in the next version of the plugin?

Using wp-cron makes it so that there’s no way for the resulting network to be loosely coupled!

As more and more people subscribe to my cloud instead of going to my blog, the cron job will run only when YAHOO and Google come by to index my page or every 24 hours when the cloud subscribers resubscribe.

wp_cron is in no way a robust system and while I gave up advocating against it in WordPress, I think that this plug-in needs a way for the normal user to delegate the cloud-pinging to an external service. My idea would be the following:

* Create an additional XML element in its own namespace where a central cloud server can get an OPML list of all subscribers
* Make it possible for the <cloud>-element to point to the central cloud server
* Make it possible for the blog-owner to delegate the actual cloud-pinging to the central service

The only real alternative to that is to make it possible out-of-the-box to ping everyone in the cloud while submitting a new post.

“Will work for most users” is in my opinion code for “might take up to 24 hours for a message to reach your friends”, which does not really work.

Or am I wrong?

My travel blog gets about 30 hits a month from relatives and does not allow for indexing through external robots. How would I enable a realtime web-worthy cloud element on it?

1. Haven’t talked about making it part of WP core yet. I think while things are still new it is best to leave it as a plugin and then consider the question again later on.

2. Correct, it currently only supports your post RSS feed. Adding support for the comment feed is coming.

It can indeed be easily loosely coupled. The plugin has many parts that you can override. This is exactly what we are doing on WordPress.com, where the notifications are sent out via our internal jobs system.

My original design of the plugin did include a configuration screen to set the cloud element data, making it easy to send the updates to an external hub that would then send out the actual notifications. To keep things simple at first I chose to delay that for a future release.

As for how soon it will reach your friends, there are many more pieces than just WordPress that are involved in that. What RSSCloud targets is being able to notify others (usually other systems, specifically feed readers) that a feed has been updated. Making wp_cron fire in WordPress is as simple as viewing your post after you hit the publish button.

Very Nice! I’m also curious about which code plugin you use, as I really like the fact that it wraps long lines; mine doesn’t! Could you recommend the one you use?

I can confirm it. The cloud element passes through feedburner untouched.

I am concerned about the URL being output in the cloud element itself, however. With a feedburner feed, the URL of the feed may not be on the main server itself. So using a relative URL instead of an absolute and fully qualified one concerns me.

It remains unclear to a feed client which server should be used in such a case. The correct answer is obviously my blog, but that information may not be contained in the feed itself. A full URL would be more comforting there.

Actually, wp-cron fires on any admin page as well. So after you publish, and the redirect back to the edit screen happens, wp-cron should fire off then, even before you view the post.

Never mind. I see that it contains the domain separately. Which strikes me as weird, but fine.

BTW, your AJAX Edit comments appears to be broken. I just get a blank frame, nothing in it.

I think there is more that needs to be explored for services like FeedBurner. For instance, if the RSSCloud portion lists the FB feed URL, but sends out notifications that the feed has been updated before FB has polled for updates, then clients that got the notification would grab the feed from FB and find that nothing has updated.

I’m not sure yet what the most correct answer is in these situations (there are a few different possibilities), but it’s worth having the discussion to get more input and decide on a recommendation.

then I’m sorry for my panicking in the post above 🙂

I really don’t see this in the terms of a notification system but a loosely-coupled/instant/asymmetric messaging system, aka “cloud-twitter”. So I’d like to take a whack at decoupling the plug-in from WordPress’ post infrastructure anyway… I’m guessing there is huge potential for this.

Anyway, as I didn’t say it before, let me say it now: thanks for publishing your work!

I have a blog with 16,000 subscribers that is updated 20 times a day. Around 1,000 of those subscribers use their own RSS software.

If all of them had RSS cloud-enabled clients, that’s 20,000 update notifications being sent in one day over XML-RPC, SOAP or REST. Most of them going to individual desktop computers that might be stuck behind firewalls, go offline, or just timeout the connection.

How do you envision WordPress making all of the notification requests required for 7.5 million blogs? RSSCloud is an interesting approach, but the only company to try it shut down its cloud notification servers several years ago because of scaling woes. I go into this more on my blog at http://workbench.cadenhead.org/news/3555 .

Currently, there’s no reliable way to override the overridable functions from within another plugin because those functions are defined when their particular files are loaded, and their particular files are loaded when the plugin file itself is loaded (in alphabetical order of all WP plugins, I think?).

Instead, could you put the require statements in individual functions that are callbacks for something like the init event? Then another plugin could remove those callbacks and use its own.

I think the use case of desktop apps is certainly different. A desktop app that doesn’t have some web/server based component (even if it just tells it when there are updates) isn’t really addressed well in the current RSSCloud arrangement (from what I’ve seen/read). There are other approaches (as you mentioned in your post) that will likely work better in those situations.

On WordPress.com we’ve been dealing with growing scalable systems and I don’t think we’ll see a problem there from this. As Matt mentioned in a comment on your post we are already processing large amounts of data (posts, comments, email, XML-RPC requests, etc.) for our blogs. You can see some of the stats for WordPress.com at http://en.wordpress.com/stats/

The credit for making our large scale operations possible goes entirely to our sysadmins. Top notch, awesome people!

To your first point about overriding the functions. Yes and no. There is a reliable way, but it might not be obvious. WP in recent versions added support for the mu-plugins directory, from WPMU. You can easily drop in your own specific replacement functions in mu-plugins and those will get used instead.

Using an action handler instead would also provide a way to override code in the plugin. But if it was all still done with plugins then you could still run into a race issue depending on the order that the plugins were loaded. I’ll have to think about this more.

You can easily drop in your own specific replacement functions in mu-plugins and those will get used instead.

That’s true, but I’m trying to keep it easy for users. So, for example, I could distribute a supplement plugin that would provide a more robust data storage for high-capacity sites. If you go the mu-plugins route you lose most of the benefits of the WP plugin repository.

Using an action handler instead would also provide a way to override code in the plugin. But if it was all still done with plugins then you could still run into a race issue depending on the order that the plugins were loaded. I’ll have to think about this more.

Unless I’m missing something, I don’t think there would be a race condition except in the case of two other plugins competing to define, e.g., rsscloud_hub_process_notification_request(). Otherwise, if your plugin defines that function from a callback attached to “init”, another plugin simply has to call remove_action() at “plugins_loaded” in order to replace your plugin’s callback with its own.
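A sketch of that proposal is below. It is purely hypothetical: the plugin does not currently load its functions this way, and every function name here other than add_action() and remove_action() is made up for illustration.

```php
<?php
// Hypothetical sketch of the proposal above; none of these loader
// functions exist in the RSSCloud plugin.

// In the RSSCloud plugin: load the overridable functions from an init callback.
function rsscloud_load_notification_request() {
    // ... require the file that defines the overridable functions ...
}

// In the other plugin: a replacement loader with its own definitions.
function myplugin_load_notification_request() {
    // ... define the replacement functions here ...
}

// Swap the loaders. plugins_loaded fires before init, so the original
// callback is removed before it ever runs.
function myplugin_replace_loader() {
    remove_action( 'init', 'rsscloud_load_notification_request' );
    add_action( 'init', 'myplugin_load_notification_request' );
}

// Only register hooks when WordPress is actually loaded.
if ( function_exists( 'add_action' ) ) {
    add_action( 'init', 'rsscloud_load_notification_request' );
    add_action( 'plugins_loaded', 'myplugin_replace_loader' );
}
```

Because both callbacks use the default priority, remove_action() needs no extra arguments to match the original registration.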

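[…] A plugin called “RSS Cloud” has been released that enables RSSCloud on WordPress installed on your own site as well (or rather, it adds the element that begins with <cloud ~ />). For details, such as what exactly it does, please see the article on the site of its author, Joseph Scott. […]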

This comes down to what problem you are trying to solve. If the problem is that you want to offload the management of notifications (sending them and processing new requests) then a separate, stand alone hub would probably be the best way forward. My initial design for the plugin included this ability (I think I mentioned this in another comment here), although I chose to not worry about it at this point in favor of making things simple and functional. I expect that a future version of this plugin will have a simple settings page to configure a separate hub to use.

I’ve just been looking at the source code for the WordPress rssCloud plugin, and in the notification_request function, I see you check that the url being requested exactly matches the rss2 url for the blog and refuse to accept the registration if not. The problem is that the check is made after calling back to the client’s notify url and you always return a “Registration successful” regardless. As a result there is no way for the client to tell that the registration has failed.

This is a fairly common problem for people that use feedburner. In many cases the client will be subscribed to the feedburner url rather than the original WordPress feed url, which means the cloud registration is guaranteed to fail silently (assuming I’ve understood the code correctly).

One possible fix would be to improve the error checking so the client is at least informed of the failure. A nicer solution would be to accept any url that was requested in the registration since, at least in your case, the registration url is always unique for a particular blog is it not? The feed url that the client is using shouldn’t matter.

Of course the next problem with feedburner feeds is that there is a delay between when the blog is updated and the feedburner feed actually reflects those changes. If a client receives a notification that the feed has updated and immediately refreshes that feed, the data it gets back will be out of date, so it won’t see the update.

It’s possible this can be solved by WordPress automatically pinging feedburner (if it doesn’t already) and a short delay in the client to allow time for the feedburner version of the feed to refresh. However we won’t know how workable that solution is until problem number one has been addressed.

The FeedBurner case is one I’m looking at. WordPress does provide a way for plugins & themes to change the feed URL, so the plugin could potentially just look there instead. That doesn’t solve the problem of outdated data being provided by FeedBurner (as you described).

I should have an update to the rssCloud plugin this week that will be better about matching the feed URL to blog and provide more helpful error messages.

Joseph,

Thank you very much for your post. I have several blogs using wordpress.com and just recently decided to take the plunge and try to host my own wordpress.org installation.

My initial reaction was that I was very upset that all of my posts on wordpress.com were updated to Google instantly, but that it would sometimes take days for the posts on my new site josh.net to be recognized by Google.

I did a lot of searching and could not figure it out. I double checked and ping-o-matic was working, but did not make a difference, then as I was about to give up I came across your blog.

So I wanted to thank you very, very much!! RSSCloud seems to be working fantastic for me. I’ll be sure to let you know if I have any additional feedback.

Best regards,
Josh
