
Everybody Caches

Caching is one of the techniques used in software and hardware to make things faster. We like things to be fast, so we are usually happy to use systems that cache in a variety of ways. Let’s face it, everybody caches.

If you’ve been living on this planet for a while now, you’ll recognize that you rarely get something for nothing. It’s no different with caching. The OS on your computer caches reads and writes to your hard drive, and from time to time it flushes that cache out to disk. It’s important that this flush happens so that when you lose power your data is on the hard drive, not in memory. Except that everybody caches, and that includes your hard drive. So when your OS flushes data to the disk, the drive caches it in its own memory and eventually flushes it to the actual platters. Once again this is all done in the name of speed, and once again the biggest risk is loss of data when the power goes out. This is why many RAID controllers have a small battery to keep the data cached in their memory alive for a day or two, by which time we hope the power will have been restored.
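To make the layering concrete, here is a minimal C sketch of the first half of that chain: write() only hands data to the OS cache, and fsync() asks the OS to push it down to the drive. The file name journal.log is just an illustration.

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        const char *msg = "important data\n";
        int fd = open("journal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (fd == -1) { perror("open"); return 1; }

        /* write() succeeds as soon as the OS has the data in its cache;
           nothing is guaranteed to be on the drive yet. */
        if (write(fd, msg, strlen(msg)) == -1) { perror("write"); return 1; }

        /* fsync() asks the OS to flush its cache out to the drive. The
           drive may still be holding the data in its own volatile cache. */
        if (fsync(fd) == -1) { perror("fsync"); return 1; }

        close(fd);
        return 0;
    }

Note that even a successful fsync() only gets you as far as the drive’s cache, which is exactly the gap that battery on a RAID controller papers over.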

Hopefully none of the above is news; this has been the situation for computers and hard drives for years. It is news for at least some folks, though, otherwise the Slashdot crowd wouldn’t be so surprised that your hard drive lies to you. This discussion seems to have been touched off by a tool that demonstrates your hard drive’s caching and shows how you can turn it off in Linux. Brad wrote the tool as a result of LiveJournal’s outage, where data loss due to drive write caching seems to have been a pretty major problem for them. Not long after LJ’s problems, Wikipedia was offline for the same reason.

Some of the comments on the Slashdot article about hard drive caching did provide good information. One comment pointed out caching issues mentioned in various man pages from Mac OS X, Linux, and FreeBSD. The comment was interesting because it references another point in time when there was a lot of discussion about what to do with hard drive caching and the risks thereof. The FreeBSD man page quoted in the comment references FreeBSD 4.3, which was released four years ago. I remember the huge discussion that broke out about disabling hard drive caching in FreeBSD. Many argued that the risk of data loss was just too great, but in the end the huge performance hit from turning off the drive’s write cache was too much to bear.
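For the curious, FreeBSD of that era exposed the choice as the hw.ata.wc tunable (1 = drive write caching on, 0 = off). Assuming that tunable is present, a small C program can report the current setting:

    #include <sys/types.h>
    #include <sys/sysctl.h>
    #include <stdio.h>

    int main(void) {
        int wc;
        size_t len = sizeof(wc);

        /* hw.ata.wc: 1 = ATA write caching enabled, 0 = disabled.
           Assumption: the ata(4) driver of this era exposes this tunable. */
        if (sysctlbyname("hw.ata.wc", &wc, &len, NULL, 0) == -1) {
            perror("sysctlbyname(hw.ata.wc)");
            return 1;
        }
        printf("ATA write caching is %s\n", wc ? "enabled" : "disabled");
        return 0;
    }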

Thankfully, another comment pointed to an Apple email about hard drive caching. If you deal with file storage, go read it; it is very informative. The lesson is that on a plain average system with an off-the-shelf hard drive, you may or may not be able to convince your hard drive to actually commit data to the disk on demand.
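On Mac OS X, the knob for this is the F_FULLFSYNC fcntl, which asks the drive itself to empty its write cache, something a plain fsync() does not do there. A minimal sketch (data.bin is just a placeholder name), keeping in mind that not every drive honors the request:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void) {
        const char *msg = "must survive power loss\n";
        int fd = open("data.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1) { perror("open"); return 1; }
        if (write(fd, msg, strlen(msg)) == -1) { perror("write"); return 1; }

        /* fsync() flushes the OS cache to the drive... */
        if (fsync(fd) == -1) { perror("fsync"); return 1; }

        /* ...and F_FULLFSYNC (Mac OS X) additionally asks the drive to
           flush its own write cache to the platters. A drive that lies
           can still ignore this, hence "may or may not" above. */
        if (fcntl(fd, F_FULLFSYNC) == -1) perror("fcntl(F_FULLFSYNC)");

        close(fd);
        return 0;
    }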

Just remember, everybody caches, so plan accordingly.

One reply on “Everybody Caches”

As a SuSE newbie, we ran into the following problem dealing with a data acquisition card under development, based on the PLX housekeeping chip. The problem appears to be associated with the OS and not our code, as we release memory at close. To wit:
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
My CS-major co-op students have developed a C++ program that works just fine. We use either one or two DMA buffers (two different programs – one uses hyperthreading) that set aside just enough memory for a DMA transfer for a single pulse echo, 512 2-byte words. When I watch the hard disk light during acquisition, it blinks about every five seconds, suggesting caching is going on. We collect data for, say, 15 minutes in the extreme case, creating data files of 800 MB or so. The computer has 1 GB of memory. The first acquisition goes just fine. Checking the cache buildup during the acquisition, it appears that the OS sets up a new block of cache for each DMA transfer, which I guess makes sense from its perspective, but then uses 800 MB of memory in the process. When the program runs a second time and builds another 800 MB file, I get gaps in the output data, as evidenced by the image we create, losing azimuth sectors of data, suggesting trigger drops. My take on this is that it occurs when the OS runs out of fresh memory for caching, then has to go back and release previous cache, ignoring triggers while it releases cache and thus missing new data transfers.

Our solution has been to simply reboot the system after each acquisition. The collection occurs on the hour/half-hour using cron. You don’t even have to log on, and the collection runs regularly, then reboots. Clearly this is a clumsy workaround. Any suggestions?
