
Mailman Archives, Why Not Mbox?

I recently went hunting for archives of some of the FreeBSD mailing lists. They use Mailman to manage the lists, so I figured I’d start with the Mailman archive of one of them. In addition to letting you see emails sorted by thread, subject, author, and date, the archive lets you download a compressed file of all messages for a given month. I naively assumed that this file would be in mbox format. It’s close, but not quite.

There is no blank line before the From_ line that begins each email. Unfortunately, I found this out the hard way, by trying to parse each email out of the file. The results were extremely disappointing, so I looked at the original file a little closer. I was less than thrilled to discover that this file has no discernible format: each email is simply appended to the previous one, which makes it close to mbox but not actually mbox. If you are going to do something so close to mbox, why not go ahead and actually do mbox?
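If you run into the same problem, one workaround is to put the missing blank lines back before handing the file to a parser. Here is a minimal Python sketch of that idea. The function name and the regular expression are my own assumptions, not anything Mailman provides: the pattern guesses that a From_ line starts with "From " and ends with a four-digit year, so a body line that happens to look the same would be treated as a separator too.

```python
import re

# Assumed shape of a From_ separator line: starts with "From " and ends
# with a four-digit year. This is a heuristic, not a formal definition.
FROM_LINE = re.compile(r"^From \S.* \d{4}$")


def add_missing_blank_lines(text):
    """Insert a blank line before each From_ line that lacks one."""
    out = []
    for line in text.splitlines():
        # Skip the first line of the file and lines already preceded
        # by a blank line.
        if FROM_LINE.match(line) and out and out[-1] != "":
            out.append("")
        out.append(line)
    return "\n".join(out) + "\n"
```

Running the function on an already repaired file leaves it unchanged, so it is safe to apply more than once.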

The plot thickens from here though. Mailman stores all of the emails sent to a list in order to build the web-accessible archives. Take a wild guess what format is used for these internal archives? If you said mbox, you get a gold star. So Mailman does store archives in mbox format, but your list subscribers can’t access them. Bummer.