> Cue Jeopardy music and image of Alex Trebek <
Space, mixed case, slash, backslash, question mark, colon, asterisk, quotation mark and control codes.
What are things that shouldn’t be in file names?
Okay, all kidding aside, having goofy file names can make life miserable. On most days I move between Mac OS X (HFS+), Windows XP (mostly NTFS, some FAT32), Windows 2003 (NTFS), FreeBSD (UFS/UFS2) and Linux (pick one). Most filesystem interaction falls into two camps: NTFS and Unix-like. Sadly, they have different limitations on file names, and worse yet, they have drastically different social norms for allowed file names.
So here is my list of things that should not appear in file names:
Spaces : Both camps support spaces in file names, but they are generally frowned upon in the Unix camp. Those using NTFS are generally encouraged to use spaces. Including spaces in a file name is a pain because they have to be escaped or quoted on the command line. Fortunately they are easy enough to spot.
Mixed case : The NTFS camp is case preserving but, as Windows normally uses it, case insensitive, while the Unix camp is case sensitive. Moving files from Unix to NTFS can be unpleasant if you have to rename several files because they differ only by case. Please just make your life easier and use lower case characters for file names unless you have a compelling reason not to (and there are some).
Slash and Backslash : NTFS uses backslash as a directory separator and Unix uses forward slash. Neither of them has any business being used in a file name. You’ve been warned.
Question Mark and Asterisk : Both of these characters are meant to be used as wild cards, not as characters in a file name.
Colon and Vertical Bar : I can understand why these may be tempting to use, but please don’t. Colon is a problem in NTFS and vertical bar is used for pipes.
Quotation Mark (double and single) : Quotes are used for grouping on the command line. These are worse than spaces because there really is no reason to use them in a file name.
Trailing Period : After my recent run-in with a trailing period I have a special dislike for them. On Windows systems the last period is usually followed by a three character extension, so having a period as the last character will only confuse things. I’d put leading period in here as well, but its use has long been established in Unix systems for semi-hidden files.
Trailing Space : Like the trailing period, I have a special place in my heart for trailing spaces. This merits a specific mention because detecting that a file name has one is not easy to do visually.
Greater Than and Less Than : Really, why would you do this to your poor sysadmin? These characters are used for redirecting input and output of a program on the command line.
And the worst possible offender:
Control Codes : Most Unix systems are kind enough to allow just about anything in a file name. Unfortunately this means that control codes (except for NULL) are allowed. To include one of these is just plain evil. I really don’t want to hear the BELL beep as part of the file name. Sure, it’s funny once; after that it is pure, unrefined annoyance.
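If you would rather let a script do the spotting, the list above can be turned into a small checker. This is only a rough sketch in Python, not a complete validator for any particular filesystem: the function name find_problems and the labels it returns are made up for illustration, and it treats any upper case letter as "mixed case" and any character below 32 (plus DEL, 127) as a control code.

```python
# Characters from the list above, mapped to a label for the report.
# The labels are illustrative; nothing here is a filesystem API.
FORBIDDEN = {
    "/": "slash",
    "\\": "backslash",
    "?": "question mark",
    "*": "asterisk",
    ":": "colon",
    "|": "vertical bar",
    '"': "double quote",
    "'": "single quote",
    "<": "less than",
    ">": "greater than",
}


def find_problems(name):
    """Return a list of complaints about a single file name (no directory part)."""
    problems = []
    if " " in name:
        problems.append("space")
    # Assumption: any upper case letter counts as "mixed case".
    if name != name.lower():
        problems.append("upper case")
    for char, label in FORBIDDEN.items():
        if char in name:
            problems.append(label)
    if name.endswith("."):
        problems.append("trailing period")
    if name.endswith(" "):
        problems.append("trailing space")
    # ASCII control codes are 0-31, plus DEL (127). BELL is 7.
    if any(ord(c) < 32 or ord(c) == 127 for c in name):
        problems.append("control code")
    return problems


if __name__ == "__main__":
    for candidate in ["notes-2008.txt", "My File.txt", "report.", "ding\x07.txt"]:
        # repr() makes trailing spaces and control codes visible.
        print(repr(candidate), find_problems(candidate))
```

Printing the name with repr() is deliberate: it is the easiest way to actually see a trailing space or a BELL hiding in a name.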
When it comes to naming your files you should be descriptive, brief and conservative. Ideally this means a simple series of lower case letters, possibly separated by a dash (-) or an underscore (_), that isn’t absurdly long. If you are using Windows then you’ll also include a period followed by three characters that are determined by the type of file you are naming.
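For names that already exist, one way to get to that conservative form is to rewrite them mechanically. The following is a minimal sketch in Python under my own assumptions: conservative_name is a hypothetical helper, it also keeps digits and periods (so extensions and leading-dot files survive), and it replaces everything else, including non-ASCII letters, with an underscore. It does not rename anything on disk, and it can return an empty string for a name made up entirely of periods or spaces.

```python
import re


def conservative_name(name):
    """Suggest a lower case, shell-friendly version of a file name."""
    # Drop leading/trailing whitespace and fold to lower case.
    cleaned = name.strip().lower()
    # Replace each run of characters outside a-z, 0-9, '.', '_' and '-'
    # with a single underscore. This covers spaces, quotes, wild cards,
    # redirection characters and control codes in one pass.
    cleaned = re.sub(r"[^a-z0-9._-]+", "_", cleaned)
    # Trailing periods go; a leading period (Unix semi-hidden file) stays.
    cleaned = cleaned.rstrip(".")
    return cleaned


if __name__ == "__main__":
    for original in ["My Summer: Photos?.JPG", "report. ", ".bashrc"]:
        print(repr(original), "->", repr(conservative_name(original)))
```

Two different originals can map to the same suggestion (for example, names that differ only by case), so check for collisions before renaming anything for real.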
By keeping your file names simple and consistent you’ll save yourself a lot of headaches.