
Problems With libxml2 For WordPress XML-RPC Users

Updates
3 Feb 2009 @ 2:00pm : Update on libxml2 issues

4 Mar 2009 @ 2:38pm : Conclusion of libxml2 Issues – Use PHP 5.2.9 & libxml2 2.7.3

17 Mar 2009 @ 3:05pm : WordPress & libxml2 Episode IV: A New Plugin

A growing number of people have run into a very odd problem using XML-RPC methods in WordPress, where the left angle bracket ( < ) gets stripped. There's been a fair bit of discussion about this on ticket #7771. The bottom line: the behavior of the PHP XML extension changed when it is built against newer versions of libxml2, such that left angle brackets get stripped when parsing XML.
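To make the symptom concrete, here is a minimal round-trip check. It is only a sketch: it uses Python's standard xmlrpc.client module rather than the PHP XML extension discussed here, and the method name demo.echo is a made-up placeholder. On a healthy parser the left angle bracket travels as the entity &lt; and comes back intact; the problem described above is the equivalent of this check failing on the PHP side.

```python
import xmlrpc.client


def roundtrip(text):
    """Serialize text as an XML-RPC call, parse it back, return the string."""
    # "demo.echo" is a hypothetical method name used only for illustration.
    payload = xmlrpc.client.dumps((text,), methodname="demo.echo")
    params, _method = xmlrpc.client.loads(payload)
    return params[0]


def survives_roundtrip(text):
    """True when the parsed value is identical to what was sent."""
    return roundtrip(text) == text


if __name__ == "__main__":
    sample = "<p>Hello</p>"
    print("angle brackets preserved:", survives_roundtrip(sample))
```

The same idea (send a string containing < through the XML-RPC layer and compare what comes back) can be applied to a WordPress install to see whether it is affected.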

There’s been some back and forth between the libxml2 folks (email list) and the PHP folks (bug 45996), with no real solution for those using the tainted versions of libxml2. So what are your options if you’ve got this problem? Here are two:

  • Stick with older, known-to-work versions of libxml2. Others have reported that libxml2 versions <= 2.6.32 work. I've personally only tested up to 2.6.30, which has been working fine for me.
  • Build the PHP XML module against the expat parser instead of libxml2.
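The version guidance in the first option can be written down as a small check. This is a sketch in Python, not something from the libxml2 or PHP projects: the threshold comes only from the reports mentioned above (<= 2.6.32 said to work), and the function names are made up for illustration. It says nothing about versions released after this post.

```python
# Highest libxml2 version reported as working in the post (an assumption
# based on user reports, not an official compatibility statement).
LAST_REPORTED_GOOD = (2, 6, 32)


def parse_version(version):
    """Turn a dotted version string such as '2.6.30' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def reported_working(version):
    """True if the version is at or below the last one reported to work."""
    return parse_version(version) <= LAST_REPORTED_GOOD


if __name__ == "__main__":
    for v in ("2.6.30", "2.6.32", "2.7.1"):
        status = "reported working" if reported_working(v) else "not reported working"
        print(v, "->", status)
```

Note that the version to feed in is the one PHP was actually built against (as shown by phpinfo), which may differ from the libxml2 package installed on the system.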

Both of these options require some server admin abilities and know-how, making them unrealistic for many WordPress users. Undoubtedly many hosting services will roll out these newer versions of libxml2 as part of their regular updates. This will leave some WordPress users with sudden errors that weren’t there before.

As this was spurred by a change in behavior by libxml2, I think the ideal solution would be to provide a backwards compatible mode that would restore the old parsing mechanism (you know, the one that doesn’t strip angle brackets). Short of that happening, perhaps the XML extension for PHP will need to change to work correctly with the new way that libxml2 works. Either way, I’d like to see PHP XML parsing work correctly again.

If you aren’t having any of these problems right now, I recommend NOT upgrading libxml2 on your system until this has been sorted out.

54 replies on “Problems With libxml2 For WordPress XML-RPC Users”

“As this was spurred by a change in behavior by libxml2, I think the ideal solution would be to provide a backwards compatible mode that would restore the old parsing mechanism (you know, the one that doesn’t strip angle brackets).”

In the thread you linked to, Rob Richards mentioned[1] that it was due to a gross hack in the PHP XML extension that this broke:

“So basically the extension was using voodoo code to get the entities to work as it wanted them to and it has finally caught up with it.”

Instead of being snippy at the libxml2 guys, you should direct your anger towards the PHP maintainers.

[1]: http://thread.gmane.org/gmane.comp.gnome.lib.xml.general/14595/focus=14610

I was going to say that large chunks of your post were missing, making it unreadable in my RSS reader (Bloglines), but after clicking through here I found that Bloglines has a bug where it strips left angle brackets, and most of the paragraphs that follow them.

I found that strangely ironic…

While I haven’t delved too deeply, it seems like both the libxml and PHP camps admit it was because PHP’s code was a hack that finally caught up with them. So to me it seems like they should probably be the ones to either update their code to handle the fact that libxml is exhibiting more proper behavior and isn’t so easy to fool into acting like expat, or provide a patch that enables a compatible mode. Either way it would be nice to see a fix; it’s been a known issue for way too long.

I have some sites on Joyent; switching the PHP version from 5 to 4 kept this vital feature working in the meantime.

It may be the case that PHP is using a gross hack, but that doesn’t change the fact that libxml2 2.6.30 works fine and libxml2 2.7.1 doesn’t, with the exact same version of PHP. Sounds like you didn’t read the rest of my post, where I clearly stated that the bottom line was to make PHP XML parsing work correctly. I don’t have an interest in “being snippy”, simply in getting things working again.

At this point the problem is only going to get worse. As more hosts update libxml2, we’ll see more and more people complain that XML-RPC in WordPress is suddenly broken.

If your current setup is working, don’t upgrade libxml2. If it isn’t working, employ one of the workarounds (libxml2 downgrade, or build against expat) and start bugging the PHP and libxml2 folks to get this sorted out. Someone from libxml2 and/or PHP needs to get this addressed.

No problem. I ran the setup as mentioned in the link on my VPS and everything works perfectly now.

Unfortunately, if you don’t have root access you’re kind of stuck 🙁

I’ve tried the patch on a test blog and it seems to work OK.

I noticed, however, that … even though my system is reporting that libxml2 2.7.2 is installed, PHP is reporting (via phpinfo) that it thinks 2.6.32 is installed.

I have no idea why this is.

My concern with patches that do a global search and replace like that is the potential to mess up other types of data.

Have you tried restarting your web server? That might get PHP to pick up the correct library version.

Hi Joseph,

Considering that no solution from the PHP folks or the libxml folks is likely in the near future, is it possible for a similar workaround to be put into the WordPress core?

It’s possible, but awkward. In general trying to fix your foundation while standing on it is not an approach I’d like to take. I’ll try to make some time to test out those patches.

Has anyone had any luck with the patches? I’ve tried both manually editing the files (the HOO FOO method, I call it) and the zip file method (the AJAY method), and neither works. Can anyone shed some light my way?

Double check that your PHP is actually built against the proper version of libxml2. I’ve seen some cases where libxml2 was updated, but PHP was still using an older version.

Since you are using fetch_rss(), did you check to see if it was using a cached copy of the feed? If it cached the feed while having libxml2 problems, then it might still be using it.

very interesting !

I replaced the RSS URL with another one from the same user's Picasa account, and this time WP took it.

Caching of the old RSS may be possible, but I have no idea how to clear it.

Thanks for the help and responses.

I figured out that it caches in the DB, in the wp_options table. I found my record, but first I changed the RSS URL to see whether it was actually caching there. A new record showed up, awesome.

So I removed the new record and that bad one.

Now it does not cache at all; my RSS feed won’t show up because fetch_rss() returns false.

I wonder why it doesn’t want to cache new feeds.

Any ideas?

I know that this is not the right place to ask about that, but you may know the solution.

Thanks

I’d have to look through that chunk of code to figure out why it isn’t trying to refresh the cache for that feed. If I were to guess, there might be another option that lists which feeds have been cached and where in the options table they are. If that’s the case and you only deleted the cached feeds, then it might think that the feed is cached, but then find nothing in the cache.

Yeah, there is another option, a timestamp for that feed. Deleting it didn’t help. I will look at it again tomorrow.

I just thought that there are WP functions that deal with caching.

I also played with the code a little bit and found cache_age vars and others that may be responsible for keeping the cache, but there must be something else, like you said.

wheee

I finally figured it out. I went through the code and displayed the status var of the rss object, which said 500. Then I could finally display the error, which reported a 2s timeout limit for fetching the RSS. That hit me, because this RSS feed is pretty big, so increasing the timeout to 5s fixed the problem 🙂

Thanks, Joseph, for helping figure out the problem.
