Problems With libxml2 For WordPress XML-RPC Users

Updates
3 Feb 2009 @ 2:00pm : Update on libxml2 issues

4 Mar 2009 @ 2:38pm : Conclusion of libxml2 Issues – Use PHP 5.2.9 & libxml2 2.7.3

17 Mar 2009 @ 3:05pm : WordPress & libxml2 Episode IV: A New Plugin

A gradually growing list of people have run into a very odd problem using XML-RPC methods in WordPress, where the left angle bracket ( < ) gets stripped. There's been a fair bit of discussion about this on ticket #7771. The bottom line: when the PHP XML extension is built against newer versions of libxml2, its behavior changes such that left angle brackets get stripped when parsing XML.
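To make the symptom concrete, here is a minimal sketch in Python (not PHP) of what a correctly behaving event-based parser does with escaped angle brackets in an XML-RPC style string value. It uses the standard library's expat bindings. The helper names `parse_string_value` and `looks_stripped`, and the sample payload, are assumptions made up for illustration; they are not part of WordPress, PHP, or libxml2.

```python
import xml.parsers.expat


def parse_string_value(xml_text: str) -> str:
    """Return the character data found inside <string> elements."""
    chunks = []
    inside = False

    def start(name, attrs):
        nonlocal inside
        if name == "string":
            inside = True

    def end(name):
        nonlocal inside
        if name == "string":
            inside = False

    def chars(data):
        # expat may deliver text in several pieces, so collect and join them
        if inside:
            chunks.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return "".join(chunks)


def looks_stripped(sent: str, received: str) -> bool:
    """True when `received` is `sent` with every '<' removed (the reported symptom)."""
    return "<" in sent and received == sent.replace("<", "")


# Hypothetical payload: HTML content escaped inside an XML-RPC string value
payload = "<value><string>&lt;b&gt;bold&lt;/b&gt;</string></value>"
decoded = parse_string_value(payload)
```

With a working parser, `decoded` should match the original HTML, and `looks_stripped` can be used to flag a round trip where only the left angle brackets went missing.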

There’s been some back and forth between libxml2 folks (email list) and the PHP folks (bug 45996), with no real solution for those using the tainted versions of libxml2. So what are your options if you’ve got this problem? Here are two:

  • Stick with older, known-to-work versions of libxml2. Others have reported that libxml2 <= 2.6.32 works. I've personally only tested up to 2.6.30, which has been working fine for me.
  • Build the PHP XML module against the expat parser instead of libxml2.

Both of these options require some server admin abilities and know-how, making them unrealistic options for many WordPress users. Undoubtedly many hosting services will roll out these newer versions of libxml2 as part of their regular updates. This will leave some WordPress users with sudden errors that weren’t there before.

As this was spurred by a change in behavior by libxml2, I think the ideal solution would be to provide a backwards compatible mode that would restore the old parsing mechanism (you know, the one that doesn’t strip angle brackets). Short of that happening, perhaps the XML extension for PHP will need to be updated to work correctly with the new way that libxml2 works. Either way, I’d like to see PHP XML parsing work correctly again.

If you aren’t having any of these problems right now I recommend NOT upgrading libxml2 on your system until this has been sorted out.
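As a rough aid for deciding whether a given libxml2 version is worth worrying about, here is a small Python sketch that sorts version strings into three buckets based only on the reports mentioned in this post: versions up to 2.6.32 reported as working, and 2.7.3 (together with PHP 5.2.9) named in the updates above as the resolution. Anything in between is treated as suspect. The function names are hypothetical and the thresholds are assumptions drawn from those reports, not an authoritative compatibility list.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.6.30' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def libxml2_status(version):
    """Classify a libxml2 version using the reports in this post (assumed thresholds)."""
    v = parse_version(version)
    if v <= (2, 6, 32):
        return "reported working"
    if v >= (2, 7, 3):
        return "reported fixed with PHP 5.2.9"
    # Versions between the two thresholds are where the stripping was reported
    return "suspect"
```

For example, `libxml2_status("2.6.30")` falls in the working bucket, while `libxml2_status("2.7.1")` comes back as suspect.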

38 Comments

  1. “As this was spurred by a change in behavior by libxml2, I think the ideal solution would be to provide a backwards compatible mode that would restore the old parsing mechanism (you know, the one that doesn’t strip angle brackets).”

    In the thread you linked to, Rob Richards mentioned[1] that it was due to a gross hack in the PHP XML extension that this broke:

    “So basically the extension was using voodoo code to get the entities to work as it wanted them to and it has finally caught up with it.”

    Instead of being snippy at the libxml2 guys, you should direct your anger towards the PHP maintainers.

    [1]: http://thread.gmane.org/gmane.comp.gnome.lib.xml.general/14595/focus=14610

  2. I was going to say that large chunks of your post were missing, making it unreadable in my RSS reader (Bloglines), but after clicking through here I found that Bloglines has a bug where it strips left angle brackets, and most of the paragraphs that follow them.

    I found that strangely ironic…

  3. While I haven’t delved too deeply, it seems like both the libxml and PHP camps admit it was because PHP’s code was a hack that finally caught up with them. So to me it seems like they should probably be the ones to either update their code to handle the fact that libxml is exhibiting more proper behavior and isn’t so easy to fool into acting like expat, or provide a patch that can enable a compatible mode. Either way it would be nice to see a fix; it’s been a known issue for way too long.

    I have some sites on Joyent; switching the PHP version from 5 to 4 kept this vital feature working in the meantime.

  4. That may be the case that PHP is using a gross hack, but that doesn’t change the fact that libxml2 2.6.30 works fine and libxml2 2.7.1 doesn’t, with the exact same version of PHP. Sounds like you didn’t read the rest of my post, where I clearly stated that the bottom line was to make PHP XML parsing work correctly. I don’t have an interest in “being snippy”, simply getting things working again.

    At this point the problem is only going to get worse. As more hosts update libxml2, we’ll see more and more people complain that XML-RPC in WordPress is suddenly broken.

  5. Agreed, it just needs to get fixed.

  6. And people say you shouldn’t use regex for these things. 😉

  7. For parsing XML-RPC, regexes probably wouldn’t be too bad. For XML in general? That would probably be one monster of a regex.

  8. And who is working on it? WordPress? or other developers?

  9. From what I can see it isn’t clear that anyone is working on fixing the situation with PHP and libxml2.

  10. Hi Joseph,

    Is there a solution to this in the end, or do we just wait it out?

  11. If your current setup is working, don’t upgrade libxml2. If it isn’t working, employ one of the workarounds (libxml2 downgrade, or build against expat) and start bugging the PHP and libxml2 folks to get this sorted out. Someone from libxml2 and/or PHP needs to get this addressed.

  12. Current setup is messed up.

    For those interested, I came across http://blog.code-head.com/fixing-libxml-php-bug-and-issues-with-html-entities-downgrading-libxml which details the process of downgrading on cPanel servers.

  13. Here are my notes on downgrading libxml2 for OpenSuse 11.1. This assumes you have administrative rights on the box:

    http://www.peteware.com/blog/2009/01/fixing-libxml2-php-wordpress-and-the-missing-angle-brackets/

  14. Thanks for the link. Since downgrading libxml2 is a common fix for this issue it’s helpful to have how-to guides for various systems.

  15. Thanks, looks like a good resource for cPanel users who run into this.

  16. No problem. I ran the setup as mentioned in the link on my VPS and everything works perfectly now.

    Unfortunately, if you don’t have root access you’re kind of stuck 🙁

  17. This is a workaround patch for WordPress users, for when you cannot downgrade your libxml2 version or cannot wait for libxml2 to fix it later.

    http://blog.hoofoo.net/2009/01/14/wordpress-patch-for-problamatic-libxml2-version/

  18. I’ve not tested these patches to see if they cause other problems, so be careful.

  19. I’ve modified the three zip files and provided them as a single download.

  20. Just be careful, doing transformations like that might have other consequences.

  21. I’ve tried the patch on a test blog and it seems to work OK.

    I noticed, however, that … even though my system is reporting that libxml2 2.7.2 is installed, php is reporting (via phpinfo) that it thinks that 2.6.32 is installed.

    I have no idea why this is.

  22. My concern with these patches that do global search and replace like that is the potential to mess up other types of data.

    Have you tried restarting your web server? That might get PHP to pick up on the correct library version.

  23. Hi Joseph,

    Considering that there is no solution from PHP folks and libxml folks in the near future, is it possible for a similar workaround to be put into the WordPress core?

  24. It’s possible, but awkward. In general trying to fix your foundation while standing on it is not an approach I’d like to take. I’ll try to make some time to test out those patches.

  25. Anyone have any luck with the patches? I’ve tried both manually editing the files (the HOO FOO method I call it) and the zip file method (the AJAY method) and neither works. Can anyone shed some light my way?

  26. So far I’ve seen conflicting reports as to how well these patches work. I’m going to set up a test system to try them out and see for myself.

  27. I’m seeing more failures than successes so far… 🙁

  28. I’ll contact my webhost to update the library but came by to thank you for your plugin. This works!
    http://wordpress.org/extend/plugins/libxml2-fix/

    I found several links to manual fixes which comment out certain code but that didn’t work out for me.

    So thank you!

  29. I still have this issue with the correct versions.

    Please check this topic:
    http://wordpress.org/support/topic/193720?replies=8

    or go here to check how the rss appears:

    http://maugustyniak.corpface.com/picasa/

    thanks for any suggestions

  30. Double check that your PHP is actually built against the proper version of libxml2. I’ve seen some cases where libxml2 was updated, but PHP was still using an older version.

  31. I checked that in phpinfo().

    It uses the correct version.

  32. Since you are using fetch_rss() did you check to see if it was using a cached copy of the feed? If it cached the feed while having libxml2 problems then it might be still using it.

  33. Very interesting!

    I replaced the RSS URL with another one from the same user's Picasa account, and WP took it.

    Caching of the old RSS may be possible, but I have no idea how to clear it.

    Thanks for the help and responses.

  34. I figured out that it caches in the DB, in the wp_options table. I found my record, but first I changed the URL of the RSS feed to see whether it was actually being cached there. A new record showed up, awesome.

    So I removed the new record and the bad one.

    Now it does not cache at all, and my RSS feed won’t show up because fetch_rss() returns false.

    I wonder why it doesn’t want to cache new feeds.

    Any ideas?

    I know that this is not the right place to ask about that, but you may know the solution.

    Thanks

  35. I’d have to look through that chunk of code to figure out why it isn’t trying to refresh the cache for that feed. If I were to guess there might be another option that lists what feeds have been cached and where in the options table they are. If that’s the case and you only deleted the cached feeds then it might think that the feed is cached, but then find nothing in the cache.

  36. Yeah, there is another option, a timestamp for that feed. Deleting it didn’t help; I will look at it again tomorrow.

    I just thought that there were WP functions that deal with caching.

    I also played with the code a little bit and found cache_age vars and others which may be responsible for keeping the cache, but there must be something else, like you said.

  37. wheee

    I finally figured it out. I went through the code and displayed the status var of the RSS object; it said 500. Then I could finally display the error, and it said the 2s timeout limit for fetching the RSS had been hit. That made sense, because this RSS feed is pretty big, so increasing the timeout to 5s fixed the problem 🙂

    Thanks, Joseph, for helping figure out the problem.

© 2018 Joseph Scott