One-liner to check for valid POSIX filenames
def isValidPosixFilename(name, NAME_MAX=255):
    """Checks for a valid POSIX filename

    Filename: a name consisting of 1 to {NAME_MAX} bytes used to name a file.
        The characters composing the name may be selected from the set of
        all character values excluding the slash character and the null byte.
        The filenames dot and dot-dot have special meaning.
        A filename is sometimes referred to as a "pathname component".

    name: (base)name of the file
    NAME_MAX: is defined in limits.h (implementation-defined constants)
        Maximum number of bytes in a filename
        (not including terminating null).
        Minimum Acceptable Value: {_POSIX_NAME_MAX}
    _POSIX_NAME_MAX: Maximum number of bytes in a filename
        (not including terminating null).
        Value: 14

    More information on http://www.opengroup.org/onlinepubs/009695399/toc.htm
    """
    return 1 <= len(name) <= NAME_MAX and "/" not in name and "\000" not in name
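The specification counts NAME_MAX in bytes, while `len()` on a Python 3 `str` counts characters. As a rough sketch, assuming Python 3 and the standard `os.fsencode` and `os.pathconf` calls, a byte-aware variant could look like this. The names `is_valid_posix_filename` and `name_max_for` are illustrative and not part of the recipe.

```python
import os


def is_valid_posix_filename(name, name_max=255):
    """Return True if name is a valid POSIX filename of at most name_max bytes."""
    # os.fsencode converts str to bytes using the filesystem encoding;
    # bytes input is returned unchanged.
    raw = os.fsencode(name)
    return 1 <= len(raw) <= name_max and b"/" not in raw and b"\0" not in raw


def name_max_for(directory="."):
    """Ask the OS for NAME_MAX in directory, falling back to 255 (an assumed default)."""
    try:
        return os.pathconf(directory, "PC_NAME_MAX")
    except (AttributeError, ValueError, OSError):
        # os.pathconf is not available on every platform (for example Windows).
        return 255
```

Passing `name_max_for(target_dir)` as `name_max` checks against the limit of the filesystem that will actually hold the file, rather than a hard-coded 255.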
Definitions and specifications: http://www.opengroup.org/onlinepubs/009695399/toc.htm
It would be nice if this recipe could be extended to the Windows and Mac platforms (and to the more popular filesystems found there).
Tags: files
Is that all? Hmm, I work mainly in Unix environments (Solaris and Linux), and I find a lot more characters mucking up file names:
" ' , ` * and a few others.
These are, of course, the characters that the shell treats specially, so putting them in file names makes command-line tasks rather difficult.
Invalid according to the standard. Valid doesn't mean it won't muck things up.
/ and '\0' are indeed the only invalid characters.
Of course, it's unwise to use many others, including the ones you mention.
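For names that are not merely valid but also safe on the command line, POSIX defines a much smaller "portable filename character set" (letters, digits, period, underscore and hyphen) and advises against a leading hyphen. A minimal sketch of a stricter check along those lines follows; `is_portable_filename` is a hypothetical helper name.

```python
import re

# POSIX portable filename character set: A-Z a-z 0-9 . _ -
# The first character excludes the hyphen, which shells and tools
# would otherwise mistake for an option.
_PORTABLE = re.compile(r"[A-Za-z0-9._][A-Za-z0-9._-]*")


def is_portable_filename(name):
    """Return True if name uses only the portable filename character set."""
    return _PORTABLE.fullmatch(name) is not None
```

This rejects every character mentioned above (quotes, comma, backtick, asterisk) while still accepting ordinary names such as `report_2.txt`.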
Adding Windows and others. Will 8.3 filesystems be supported? The valid character sets differ between 8.3 names and long filenames.
Should this function accept Unicode strings and detect mapping failures to filesystems which use a local encoding?
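One way to detect such a mapping failure is simply to attempt the encoding. The sketch below assumes the caller already knows the target filesystem's encoding and passes it in; `encode_for_filesystem` is an illustrative name, not an existing API.

```python
def encode_for_filesystem(name, encoding):
    """Return name encoded in the given encoding, or None if it cannot be mapped."""
    try:
        return name.encode(encoding)
    except UnicodeEncodeError:
        # At least one character has no representation in this encoding.
        return None
```

The returned bytes are also what a NAME_MAX length check should be applied to, since the limit is counted in bytes.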
Just one question. Why?
Isn't it easier to just try to create the file and catch any exception that occurs?
That catches an invalid name in a completely platform-neutral way, albeit a rather nonspecific one: you don't know whether the filename was invalid, the disk was full, etc.
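The code originally posted with this comment is not preserved here. A rough sketch of the try-and-create idea, under the assumption that the caller has a writable directory to probe, might look like this. `can_create` is a hypothetical helper.

```python
import os


def can_create(directory, name):
    """Return True if a new file called name can be created in directory right now."""
    # Plain concatenation instead of os.path.join, so that a name starting
    # with "/" cannot escape the probe directory.
    path = directory + os.sep + name
    try:
        # O_EXCL makes the call fail rather than touch an existing file.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except (OSError, ValueError):
        # ValueError: embedded null byte. OSError: everything else,
        # including causes unrelated to the name (permissions, full disk,
        # or a file of that name already existing).
        return False
    os.close(fd)
    os.remove(path)  # clean up the probe file
    return True
```

As the comment says, a `False` result does not tell you why creation failed, so this complements a syntactic check rather than replacing it.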
Invalid chars on non-POSIX systems. Have you ever tried to rename a file with a '?' in it on Windows?
I did -- and failed. It can happen if you access your ext2 file system from Windows.
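As a sketch of what a Windows counterpart to the recipe could check, the function below follows the commonly documented Win32 naming rules: the characters `< > : " / \ | ? *` and control characters are forbidden, names must not end in a space or period, and device names such as CON or NUL are reserved even with an extension. It is not from the recipe, `looks_valid_on_windows` is a hypothetical name, and length limits and 8.3 rules are deliberately left out.

```python
import re

# Characters Win32 rejects in a filename, plus ASCII control characters.
_WIN_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Reserved device names; they stay reserved when an extension is appended.
_WIN_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def looks_valid_on_windows(name):
    """Return True if name passes the basic Win32 filename rules sketched above."""
    if not name or _WIN_FORBIDDEN.search(name):
        return False
    if name[-1] in " .":
        return False
    stem = name.split(".", 1)[0].upper()
    return stem not in _WIN_RESERVED
```

A name such as `what?.txt`, perfectly valid on ext2, fails this check, which matches the rename failure described above.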
What would really come in handy is a function that strips unsafe characters (that is, unsafe on all major filesystems) from a list of filenames while keeping the names unambiguous.
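Simply deleting unsafe characters would let `a?b` and `a*b` collide. One possible approach, sketched here, is to escape rather than strip: percent-encoding with the standard `urllib.parse.quote` maps each distinct input to a distinct output and can be reversed with `unquote`. The function names are illustrative.

```python
from urllib.parse import quote, unquote


def escape_filenames(names):
    """Percent-encode every character outside letters, digits and '_.-~'."""
    # safe="" also escapes "/"; "%" itself is always escaped, so two
    # different inputs can never produce the same output.
    return [quote(name, safe="") for name in names]


def unescape_filenames(names):
    """Reverse escape_filenames."""
    return [unquote(name) for name in names]
```

This sketch does not handle reserved Windows device names, the special names `.` and `..`, the empty string, or case-insensitive filesystems where `A` and `a` collide, and escaping can lengthen a name past NAME_MAX.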