Caching is one of the techniques used in software and hardware to make things faster. We usually like things to be fast, so we are usually happy to use systems that cache in a variety of ways. Let’s face it, everybody caches.
If you’ve been living on this planet for a while now you’ll recognize that it is pretty rare that you get something for nothing. It’s no different with caching. The OS on your computer caches reads and writes to your hard drive, and from time to time will flush that cache out to disk. It’s important that this flush happens so that when you lose power your data is on the hard drive, not in memory. Except that everybody caches, and this includes your hard drives. So when your OS goes to flush data to the disk, the drive caches it in its own memory and eventually flushes it to the actual platters. Once again this is all done in the name of speed. And once again the biggest risk is loss of data when the power goes out. This is why many RAID controllers have a small battery that keeps the data they’ve cached in memory alive for a day or two, by which time we hope the power will have been restored.
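To make the OS-level flush concrete, here is a minimal sketch in Python of how a program asks for it. The function name write_durably is made up for illustration; flush() and os.fsync() are the standard calls. Note what this does and doesn't promise: os.fsync() asks the OS to push its cache to the device, but as described above the drive may still be holding that data in its own cache when the call returns.

```python
import os


def write_durably(path, data):
    """Write bytes to path and ask the OS to flush its cache to the drive.

    Hypothetical helper for illustration. It covers the OS cache only;
    the drive's own write cache may still hold the data afterwards.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # move data from Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push its cache to the device
    return len(data)
```

Calling write_durably("journal.dat", b"important") gets you past the first layer of caching. Whether the bytes are on the platters at that point depends on the drive.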
Hopefully none of the above is news; this has been the situation for computers and hard drives for years. It is news for at least some folks though, otherwise the Slashdot crowd wouldn’t be so surprised that your hard drive lies to you. This discussion seems to have been touched off by a tool that demos your hard drive’s caching and shows how you can turn it off in Linux. Brad wrote this tool as a result of LiveJournal’s outage, where data loss due to drive write caching seems to have been a pretty major problem for them. Not long after LJ’s problems Wikipedia was offline for the same reason.
Some of the comments on the Slashdot article about hard drive caching did provide good information. One comment pointed out caching issues mentioned in various man pages from Mac OS X, Linux and FreeBSD. This comment was interesting because it references another point in time when there was a lot of discussion about what to do with hard drive caching and the risks thereof. The FreeBSD man page quoted in the comment references FreeBSD 4.3, which was released four years ago. I remember the huge discussion that broke out about disabling hard drive caching in FreeBSD. Many argued that the risk of data loss was just too great, but in the end the huge performance loss from turning off hard drive caching was just too much to bear.
Thankfully another comment also pointed to an Apple email about hard drive caching. If you deal with file storage go read that; it is very informative. The lesson learned is that for plain, average systems with off-the-shelf hard drives, you may or may not be able to convince your hard drive to put data onto the disk on demand.
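For the curious, here is a rough sketch of what "trying to convince the drive" can look like from Python on a Unix-like system. It assumes the F_FULLFSYNC fcntl command that Mac OS X provides for asking the drive to flush its own cache; Python's fcntl module only defines that constant on platforms that have it, so the code falls back to plain os.fsync() elsewhere. The name full_sync is made up for this example, and even when the request succeeds, a drive that ignores flush commands can still lie to you.

```python
import fcntl  # Unix-only module
import os


def full_sync(fd):
    """Flush fd as far down the stack as this platform lets us ask for.

    Hypothetical helper. Returns the name of the mechanism that was used.
    """
    full = getattr(fcntl, "F_FULLFSYNC", None)  # only defined where the OS has it
    if full is not None:
        try:
            fcntl.fcntl(fd, full)  # ask the drive to flush its own cache too
            return "F_FULLFSYNC"
        except OSError:
            pass  # not supported for this file; fall back below
    os.fsync(fd)  # flushes the OS cache; the drive cache is not guaranteed
    return "fsync"
```

The return value is only there so you can log which path you actually got. On a system where it says "fsync", plan as if the drive is still caching.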
Just remember, everybody caches, so plan accordingly.