pressfs – A WordPress Filesystem

Here is something else I’ve been toying with: pressfs, a WordPress filesystem. Currently it exposes user, post, tag, and category data in a read-only filesystem.

The Short Version

This is a Python script that uses FUSE to expose data from a WordPress site as a filesystem.

For the impatient you can give this a try in just a few steps:

  • get the pressfs source code from Github
  • install and activate the pressfs WordPress plugin
  • copy example-config.ini to config.ini
  • edit config.ini, set values in WordPress section
  • python pressfs.py /your/mount/point/

This code is still roughly beta quality, so try it out on a test/dev WordPress install first. Authentication is just HTTP basic, so please use it over SSL. It also needs an administrator-level WordPress account (something that might change in the future).
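To make the "HTTP basic over SSL" point concrete, here is a minimal sketch of how a client could attach basic credentials and refuse to send them over plain HTTP. This is not the pressfs code (which uses httplib2); it assumes Python 3 and the standard library only, and the function name `build_request` and the example URL are made up for illustration.

```python
import base64
import urllib.request


def build_request(url, username, password):
    """Return a urllib Request carrying an HTTP basic Authorization header.

    Hypothetical helper: basic auth only base64-encodes the credentials,
    so anything other than https:// would expose them on the wire.
    """
    if not url.lower().startswith("https://"):
        raise ValueError("refusing to send basic auth credentials over plain HTTP")
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


if __name__ == "__main__":
    # Placeholder URL and credentials, not a real pressfs endpoint.
    req = build_request("https://example.com/api", "admin", "secret")
    print(req.get_header("Authorization"))
```

The header value is trivially reversible, which is why the SSL advice matters more than the choice of password.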

History

A little over two years ago I started thinking about writing a WordPress plugin to act as a WebDAV server so that we could expose media uploads as an easily mounted filesystem. I poked around a bit, but quickly put the idea on the shelf and worked on other things. Fast forward to early 2011: after I got raifbot into a functional state, I started taking a more serious look at WordPress + WebDAV. As a concept this is definitely plausible, but after trying several different approaches I decided I wasn’t interested in dealing with the edge cases I ran into (nginx not having built-in support for chunked encoding was particularly irksome).

Although I wasn’t happy with the WebDAV issues, I still kept thinking about ways we could expose WordPress resources as a filesystem. A few weeks later I started experimenting with the idea of using FUSE to expose WordPress data as a filesystem. It was a rough start, but I finally got it to a point where it was time to share with others and get more feedback.

Python

I don’t have much experience writing Python code, which isn’t surprising since my full time job has involved WordPress and WordPress related development (lots of PHP code). Sure, I’ve tweaked a few Python scripts from time to time, read Dive Into Python and toyed with some of the examples, but I’ve never sat down to write new code for a new project in Python until now. I welcome feedback on improving the code, just remember this is my first time sailing this ship, so be gentle 🙂

The requirements for running pressfs aren’t terribly exotic. Obviously you need to have the FUSE library and the fuse-python bindings installed. Other external libraries it uses include httplib2 and simplejson. I think all of the other imported modules are standard for most Python installs.

There are only a handful of examples and tutorials on getting started with fuse-python, which I had to experiment with repeatedly (along with tons of debugging) to figure out how the pieces fit together. I’ll likely be posting tutorials and how-tos about fuse-python, both to help others and to make sure I have a clear idea of how the various features work.

The WordPress Plugin

I chose to write a new WordPress plugin to expose the data I needed because I wanted to be able to tweak it on a whim. Using JSON to exchange data was a nice bonus as well. So for now at least you’ll need to have this active on any site that you want to use pressfs with.

Directory Layout

If you have pressfs mounted on /var/wordpress you’ll see something like this:

  • /var/wordpress/categories/
  • /var/wordpress/posts/
  • /var/wordpress/tags/
  • /var/wordpress/users/

The categories directory will list all of the categories, using the category slug. Each individual category is a directory with the following files in it: count, description, id, name, parent, and slug. These all contain what you would expect.

The posts directory lists the posts on your site, with the format of <POSTID>-<NAME> (or if NAME isn’t available <POSTID>-<TITLE>). Each post directory has the following files in it: content, date-gmt, id, name, password, status, title, type, and url. Shouldn’t be any surprises about the contents of each of those files.

The tags directory lists the tags on your site, using the slug. Each tag directory contains: count, description, id, name, and slug.

The users directory lists all the users on your site, with directory entries using the username field. Each user directory contains these files: display-name, email, id, login, nice-name, registered, and url.

Want to see the description field for category photos:
more /var/wordpress/categories/photos/description

Want to see the post body for the post id 123, which has the name summer-vacation:
more /var/wordpress/posts/123-summer-vacation/content

Want the email field from user josephscott:
more /var/wordpress/users/josephscott/email
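
Since everything is exposed as plain files, the same lookups can be scripted. The following is only a sketch under a few assumptions: Python 3, a mount at /var/wordpress as in the examples above, and two helper names (`read_fields`, `split_post_dirname`) that I made up for illustration and that are not part of pressfs.

```python
import os


def read_fields(entry_dir):
    """Return {filename: contents} for every regular file in entry_dir.

    Works for any category, post, tag, or user directory, since each
    one is just a directory of small text files.
    """
    fields = {}
    for name in sorted(os.listdir(entry_dir)):
        path = os.path.join(entry_dir, name)
        if os.path.isfile(path):
            with open(path, encoding="utf-8") as handle:
                fields[name] = handle.read()
    return fields


def split_post_dirname(dirname):
    """Split a '<POSTID>-<NAME>' directory name into (post_id, name)."""
    post_id, _, name = dirname.partition("-")
    return int(post_id), name


if __name__ == "__main__":
    # Assumed mount point, taken from the examples above.
    post_dir = "/var/wordpress/posts/123-summer-vacation"
    if os.path.isdir(post_dir):
        post = read_fields(post_dir)
        print(split_post_dirname(os.path.basename(post_dir)), post.get("title"))
```

Only the first hyphen is treated as the separator, so a name like summer-vacation stays intact.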

The Future

The largest limitation that I expect people to ask about is moving from read-only to allowing write operations. The good news is that I already have experimental code that allows for writes of specific fields. I have more testing and cleaning up of the code to do, but I intend to add write support in a future version.

Next up is exposing more WordPress data. I already have test code for exposing media upload files, so that will get added eventually. Beyond that I’d like to expose other less obvious items as well, like settings.

While I’ve revised this a few times before getting to this point, I’m not sure I’ve got everything just right. For instance, is the current directory layout really the best way to go? Don’t be surprised if there are significant changes as development goes along.

Conclusion

Now that I’ve gotten my feet wet with WordPress + python-fuse I’m excited about doing more with this. This was also a good way to start learning more Python, for something that resembles a real application, which I’ve wanted to do for some time now.

In the short term I’d like to hear how this works (or doesn’t) for others. If you run into a problem, re-mount pressfs with python pressfs.py /your/mount/point/ -d, which will keep the process running in the foreground and print details on what each operation does (including errors). That should provide enough detail to figure out where the problem is.

26 replies on “pressfs – A WordPress Filesystem”

Cool 🙂

At first, I wondered why a plugin on the WordPress side was needed (since you could have used XML-RPC). But then when you said you were using JSON, I understood. JSON is much easier to deal with, overall.

I’m sure you’ll be adding other stuff like post-meta, post categories and tags, and such later. Have you thought about handling Multisite networks as a single mount-point with sub-dirs for each sub-site? 🙂

Hmm, using this, with updatedb/locate, find, grep, etc., there could be interesting custom search engine possibilities…

Mounting all of the sites on a multisite install was actually one of the first things that I thought about when I first started with the fuse-python approach. I chose to put that off for now, mostly for the sake of simplicity. Definitely something I’d look at adding support for in the future.

It would be great to see people come up with new ways to leverage WordPress data as a filesystem.

A large portion of this was just to experiment, so we’ll see. One area where I think this approach will be really helpful is managing media files (which it doesn’t support yet). Especially if you can upload a new image by just copying it into a special directory.

I’ve got most of the technical issues worked out for write support. Now it is more about trying to figure out the best way to expose and deal with the user level interactions. At some point, after more testing, I’ll probably just get it out there so that people can try it out. If I need to change it afterwards, so be it.

+1 for releasing the write version!
This sounds really great. Does it mean you would be able to edit/view posts as text files? And add/remove taxonomy terms by dragging things into different folders?

There will be a write version, needs more testing and clean up.

Exactly how you’d interact with the different pieces of data is one of the issues that needs to be sorted out. For the most part data will be exposed as plain text files, so you’ll likely be able to just edit them and have them update when you save & close.

This is a great idea, especially if we can move it into the murky world of WP media management that doesn’t seem to be getting an overhaul in 3.2 (wahh)

Any particular reason for doing this in Python vs. PHP?

I’m currently testing code to expose media uploads, first rev will be read only.

As for why this was done in Python, two reasons really:
– while there are PHP extensions for FUSE, the Python extensions looked much more mature
– good excuse to use Python for a real project

Would be really cool to easily add a set of photos. Just export from photo management app and copy to pressfs mount.

Is any caching of the posts or archive pages done? If it could cope with the strain there’s no reason you couldn’t serve the content using mod_rewrite rules like Supercache does …

There is a very tiny amount of caching done on the pressfs side, under the Cache section in config.ini – “req_expire = 3”. What this does is tell the request method to cache the results of an API call for 3 seconds, so any requests during that time for the exact same API call will get the cached results back. I set this at 3 by default because I wanted just enough caching so that pressfs could get filesystem operations done with a minimum number of API calls, but not have to worry too much about serving stale data.
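
The idea behind req_expire can be sketched roughly like this. It is an illustration of a short-lived, per-call cache, not the actual pressfs request method; the class name `ExpiringCache`, its parameters, and the example key are assumptions made for the example, and it is written for Python 3.

```python
import time


class ExpiringCache:
    """Cache results per key for a few seconds (hypothetical sketch)."""

    def __init__(self, expire_seconds=3, clock=time.monotonic):
        self._expire = expire_seconds
        self._clock = clock
        self._entries = {}  # key -> (time stored, value)

    def get(self, key, fetch):
        """Return the cached value for key, calling fetch() if it is stale."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._expire:
            return hit[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value


if __name__ == "__main__":
    cache = ExpiringCache(expire_seconds=3)
    # The key stands in for "the exact same API call"; the lambda stands
    # in for the real HTTP request.
    first = cache.get("GET /posts", lambda: ["post data"])
    second = cache.get("GET /posts", lambda: ["never called within 3s"])
    print(first is second)
```

A single ls or stat can trigger several identical lookups in quick succession, so even a three second window removes most duplicate API calls while keeping the data close to fresh.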

Caching is definitely an area that needs some exploration, my thoughts have mainly been around doing the caching inside WP.

Given write capability, this solves a stupendous number of problems for me. In fact, I’ve been toying with Jekyll and other ruby-based blogging and template engines for which I can deploy simply with ‘git push myblog master’. I want this *bad*.

How is this going as a released plugin? Is there anything available as a webdav file directory to handle media library uploads or the like?
