Saving the user's data is risky. If you write to a file directly, and an error occurs during the write, you may corrupt the file and lose the user's data. One approach to prevent this is to write to a temporary file, then only when you know the file has been written successfully, over-write the original. This function returns a context manager which can make this more convenient.
Update: this now uses os.replace when available, and hopefully will work better on Windows.
import contextlib
import os
import stat
import sys
import tempfile


@contextlib.contextmanager
def atomic_write(filename, text=True, keep=True,
                 owner=None, group=None, perms=None,
                 suffix='.bak', prefix='tmp'):
    """Context manager for overwriting a file atomically.

    Usage:

    >>> with atomic_write("myfile.txt") as f:  # doctest: +SKIP
    ...     f.write("data")

    The context manager opens a temporary file for writing in the same
    directory as `filename`. On cleanly exiting the with-block, the temp
    file is renamed to the given filename. If the original file already
    exists, it will be overwritten and any existing contents replaced.

    (On POSIX systems, the rename is atomic. Other operating systems may
    not support atomic renames, in which case the function name is
    misleading.)

    If an uncaught exception occurs inside the with-block, the original
    file is left untouched. By default the temporary file is also
    preserved, for diagnosis or data recovery. To delete the temp file,
    pass `keep=False`. Any errors in deleting the temp file are ignored.

    By default, the temp file is opened in text mode. To use binary mode,
    pass `text=False` as an argument. On some operating systems, this
    makes no difference.

    The temporary file is readable and writable only by the creating user.
    By default, the original ownership and access permissions of
    `filename` are restored after a successful rename. If `owner`, `group`
    or `perms` are specified and are not None, the file owner, group or
    permissions are set to the given numeric value(s). If they are not
    specified, or are None, the appropriate value is taken from the
    original file (which must exist).

    By default, the temp file will have a name starting with "tmp" and
    ending with ".bak". You can vary that by passing strings as the
    `suffix` and `prefix` arguments.
    """
    t = (uid, gid, mod) = (owner, group, perms)
    if any(x is None for x in t):
        # Fill in any missing values from the existing file.
        info = os.stat(filename)
        if uid is None:
            uid = info.st_uid
        if gid is None:
            gid = info.st_gid
        if mod is None:
            mod = stat.S_IMODE(info.st_mode)
    path = os.path.dirname(filename)
    fd, tmp = tempfile.mkstemp(
        suffix=suffix, prefix=prefix, dir=path, text=text)
    try:
        replace = os.replace  # Python 3.3 and better.
    except AttributeError:
        if sys.platform == 'win32':
            # FIXME This is definitely not atomic!
            # But it's how (for example) Mercurial does it, as of 2016-03-23
            # https://selenic.com/repo/hg/file/tip/mercurial/windows.py
            def replace(source, destination):
                assert sys.platform == 'win32'
                try:
                    os.rename(source, destination)
                except OSError as err:
                    # 183 is ERROR_ALREADY_EXISTS.
                    if err.winerror != 183:
                        raise
                    os.remove(destination)
                    os.rename(source, destination)
        else:
            # Atomic on POSIX. Not sure about Cygwin, OS/2 or others.
            replace = os.rename
    try:
        with os.fdopen(fd, 'w' if text else 'wb') as f:
            yield f
        # Perform an atomic rename (if possible). This will be atomic on
        # POSIX systems, and Windows for Python 3.3 or higher.
        replace(tmp, filename)
        tmp = None
        if hasattr(os, 'chown'):
            # os.chown is only available on Unix.
            os.chown(filename, uid, gid)
        os.chmod(filename, mod)
    finally:
        if (tmp is not None) and (not keep):
            # Silently delete the temporary file. Ignore any errors.
            try:
                os.unlink(tmp)
            except OSError:
                pass
When saving data to an existing file, we may want to guarantee that the file contents are consistent: either the save operation succeeded, completely replacing the old contents with the new, or it did not modify the file. The alternative is to risk losing data. Consider this naive approach:
with open("myfile.txt", "w") as f:
    # At this point, myfile has been opened for writing,
    # deleting all existing content.
    f.write(newdata)
If an error occurs while writing the new data, the old data has already been lost, and the new data may be left in a corrupt, inconsistent, or incomplete state. The same applies if an error occurs when the file is closed.
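To make the failure mode concrete, here is a small sketch that simulates an error part-way through a direct write. The names naive_save and failing_chunks are hypothetical helpers invented for this illustration, and the file lives in a throwaway temporary directory.

```python
import os
import tempfile


def naive_save(path, chunks):
    """Hypothetical helper: overwrite `path` directly, chunk by chunk."""
    with open(path, "w") as f:  # the existing contents are truncated here
        for chunk in chunks:
            f.write(chunk)


def failing_chunks():
    """Yield one chunk, then simulate an error in the middle of the save."""
    yield "partial "
    raise RuntimeError("simulated failure mid-write")


workdir = tempfile.mkdtemp()
demo_path = os.path.join(workdir, "myfile.txt")
with open(demo_path, "w") as f:
    f.write("precious old data")

try:
    naive_save(demo_path, failing_chunks())
except RuntimeError:
    pass

with open(demo_path) as f:
    leftover = f.read()
print(repr(leftover))  # only the first chunk survives; the old data is gone
```

After the simulated failure the file holds neither the old data nor the complete new data.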
Using the atomic_write function, we can be sure that either the rewrite will succeed completely, or it won't occur at all:
with atomic_write("myfile.txt") as f:
    f.write(newdata)
The trick is to write to a temporary file, and only if that succeeds do we replace the original with the temporary file using the os.rename function. The rename is guaranteed to be atomic on POSIX systems such as Linux and Unix (with a few provisos that are not important here). On other operating systems, such as Windows, it may or may not actually be atomic. By "atomic", I mean that during the renaming of old to new there is never a time where new does not exist.
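Stripped of the ownership and permission handling, the idea can be sketched in a few lines. This is a simplified illustration rather than the recipe itself: replace_contents is a hypothetical name, it always deletes the temp file on failure (the recipe keeps it by default), and it assumes os.replace is available (Python 3.3 or higher), which behaves like a rename on POSIX.

```python
import os
import tempfile


def replace_contents(path, data):
    """Hypothetical helper: write `data` to a temp file, then rename it over `path`."""
    # The temp file must be in the same directory so the rename stays
    # on one filesystem.
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
        # Reached only if the write and close succeeded.
        os.replace(tmp, path)
    except BaseException:
        # The original file has not been touched; discard the temp file.
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "myfile.txt")
with open(target, "w") as f:
    f.write("old")

replace_contents(target, "new")      # succeeds: contents are replaced
try:
    replace_contents(target, 12345)  # f.write() raises TypeError
except TypeError:
    pass

with open(target) as f:
    final_contents = f.read()
```

The second call fails while writing to the temp file, so myfile.txt still holds the result of the first call.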
UPDATE: This will now use the os.replace call in preference to os.rename if it is available (Python 3.3 or higher). This should be atomic for all supported operating systems, including Windows.
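As a quick illustration of the difference that matters here: os.replace overwrites an existing destination on every platform, whereas os.rename raises an error on Windows if the destination already exists. The file names below are made up for the demonstration.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "tmp_new.bak")
dst = os.path.join(workdir, "myfile.txt")

with open(dst, "w") as f:
    f.write("old contents")
with open(src, "w") as f:
    f.write("new contents")

# Feature detection, in the same spirit as the recipe's try/except.
has_replace = hasattr(os, "replace")  # True on Python 3.3 or higher

# The destination already exists; os.replace overwrites it without complaint.
os.replace(src, dst)

src_exists = os.path.exists(src)
with open(dst) as f:
    result = f.read()
```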
One side-effect of this is that it will disrupt hard links. If "yourfile" is a hard link to myfile.txt, then after the atomic write completes, the link will be broken: yourfile will continue to contain the old contents of the file, and myfile the new. (This is how hard links work.)
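The effect on hard links is easy to reproduce. The sketch below assumes a filesystem that supports hard links through os.link, and uses a throwaway directory with the file names from the paragraph above.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
myfile = os.path.join(workdir, "myfile.txt")
yourfile = os.path.join(workdir, "yourfile")

with open(myfile, "w") as f:
    f.write("old contents")
os.link(myfile, yourfile)  # yourfile is now a hard link to myfile.txt
same_before = os.path.samefile(myfile, yourfile)

# Replace myfile.txt the same way the recipe does: temp file, then rename.
fd, tmp = tempfile.mkstemp(dir=workdir)
with os.fdopen(fd, "w") as f:
    f.write("new contents")
os.replace(tmp, myfile)

same_after = os.path.samefile(myfile, yourfile)
with open(myfile) as f:
    my_contents = f.read()
with open(yourfile) as f:
    your_contents = f.read()
```

After the rename, the name myfile.txt points at a new file, while yourfile still refers to the old one.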
If an error occurs after renaming the file, but before the file permissions can be reset, you may have to manually reset the file permissions.
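One possible way to narrow that window, which is not what the recipe above does, is to apply the permission bits to the temp file before the rename, so that the destination appears with its final mode already in place. The sketch below is only an assumption about how that could look: replace_keeping_mode is a hypothetical name, and ownership (os.chown) is left out because changing it may need extra privileges.

```python
import os
import stat
import tempfile


def replace_keeping_mode(path, data):
    """Hypothetical helper: give the temp file `path`'s mode before renaming it."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
        os.chmod(tmp, mode)    # set permissions while it is still the temp file
        os.replace(tmp, path)  # a rename keeps the file's mode
    except BaseException:
        try:
            os.unlink(tmp)
        except OSError:
            pass
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "myfile.txt")
with open(target, "w") as f:
    f.write("old")
os.chmod(target, 0o640)

replace_keeping_mode(target, "new")
final_mode = stat.S_IMODE(os.stat(target).st_mode)
with open(target) as f:
    final_data = f.read()
```

The trade-off is that the temp file briefly carries the wider permissions of the original, instead of being private to the creating user until the rename.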
- os.chown is Unix-only
- if you call .close() on a TemporaryFile object it cleans itself up, no need to unlink
- os.fdopen takes only bytestrings IIRC. Unicode strings also ought to work, including on Windows (yes, that sucks)
Sorry, the last point is BS, confused with fopen
The documentation for tempfile.mkstemp states that "Caller is responsible for deleting the file when done with it." which is exactly the behaviour I want. If the temp file is deleted when closed, you cannot rename the temp file and overwrite the original, as it will have been deleted.
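The difference is easy to check. In this small comparison, tempfile.NamedTemporaryFile stands in for the self-deleting kind of temp file mentioned in the comment; with its default settings the file is removed when it is closed.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()

# mkstemp: the file stays on disk after the descriptor is closed,
# so it can still be renamed over the original.
fd, kept = tempfile.mkstemp(dir=workdir)
os.close(fd)
kept_exists = os.path.exists(kept)

# NamedTemporaryFile (default delete=True): the file is removed on close,
# so there would be nothing left to rename.
auto = tempfile.NamedTemporaryFile(dir=workdir)
auto_name = auto.name
auto.close()
auto_exists = os.path.exists(auto_name)

os.unlink(kept)  # with mkstemp, cleaning up is the caller's job
```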
Unfortunately, I don't have access to a Windows machine where I can test the behaviour under Windows, but as far as I can tell, this will not actually be atomic on Windows unless you're using Python 3.3 or better.
For some idea of how hard it is to get a cross-platform, file-system-independent atomic write, have a look at these issues on the bug tracker: