Before overwriting an existing file it's often desirable to make a backup. This recipe emulates the behavior of Emacs by saving versioned backups. It's also compatible with the marshal module, so you can save versioned output in "marshal" format.
#!/usr/bin/env python
"""This module provides a versioned output file.

When you write to such a file, it saves a versioned backup of any
existing file contents.

For usage examples see main()."""

import sys, os, glob, marshal


class VersionedOutputFile:
    """This is like a file object opened for output, but it makes
    versioned backups of anything it might otherwise overwrite."""

    def __init__(self, pathname, numSavedVersions=3):
        """Create a new output file.

        `pathname' is the name of the file to [over]write.
        `numSavedVersions' tells how many of the most recent versions
        of `pathname' to save."""
        # Set _outf first so close() and __del__ are safe if open() fails.
        self._outf = None
        self._pathname = pathname
        self._tmpPathname = "%s.~new~" % self._pathname
        self._numSavedVersions = numSavedVersions
        # All output goes to a temporary file until close() is called.
        self._outf = open(self._tmpPathname, "wb")

    def __del__(self):
        self.close()

    def close(self):
        if self._outf:
            self._outf.close()
            self._replaceCurrentFile()
            self._outf = None

    def asFile(self):
        """Return self's shadowed file object, since marshal is
        pretty insistent on working w. pure file objects."""
        return self._outf

    def __getattr__(self, attr):
        """Delegate most operations to self's open file object."""
        return getattr(self.__dict__['_outf'], attr)

    def _replaceCurrentFile(self):
        """Replace the current contents of self's named file."""
        self._backupCurrentFile()
        os.rename(self._tmpPathname, self._pathname)

    def _backupCurrentFile(self):
        """Save a numbered backup of self's named file."""
        # If the file doesn't already exist, there's nothing to do.
        if os.path.isfile(self._pathname):
            newName = self._versionedName(self._currentRevision() + 1)
            os.rename(self._pathname, newName)

            # Maybe get rid of old versions.
            if ((self._numSavedVersions is not None) and
                (self._numSavedVersions > 0)):
                self._deleteOldRevisions()

    def _versionedName(self, revision):
        """Get self's pathname with a revision number appended."""
        return "%s.~%s~" % (self._pathname, revision)

    def _currentRevision(self):
        """Get the revision number of self's largest existing backup."""
        revisions = [0] + self._revisions()
        return max(revisions)

    def _revisions(self):
        """Get the revision numbers of all of self's backups."""
        revisions = []
        backupNames = glob.glob("%s.~[0-9]*~" % (self._pathname))
        for name in backupNames:
            try:
                # The revision number sits between the last two tildes.
                revision = int(name.split("~")[-2])
                revisions.append(revision)
            except ValueError:
                # Some ~[0-9]*~ extensions may not be wholly numeric.
                pass
        revisions.sort()
        return revisions

    def _deleteOldRevisions(self):
        """Delete old versions of self's file, so that at most
        self._numSavedVersions versions are retained."""
        revisions = self._revisions()
        revisionsToDelete = revisions[:-self._numSavedVersions]
        for revision in revisionsToDelete:
            pathname = self._versionedName(revision)
            if os.path.isfile(pathname):
                os.remove(pathname)


def main():
    """Module mainline (for isolation testing)"""
    basename = "TestFile.txt"
    if os.path.exists(basename):
        os.remove(basename)
    for i in range(10):
        outf = VersionedOutputFile(basename)
        # The file is opened in binary mode, so write bytes.
        outf.write(("This is version %s.\n" % i).encode("ascii"))
        outf.close()

    # Now there should be just four versions of TestFile.txt:
    expectedSuffixes = ["", ".~7~", ".~8~", ".~9~"]
    expectedVersions = []
    for suffix in expectedSuffixes:
        expectedVersions.append("%s%s" % (basename, suffix))
    matchingFiles = glob.glob("%s*" % basename)
    for filename in matchingFiles:
        if filename not in expectedVersions:
            sys.stderr.write("Found unexpected file %s.\n" % filename)
        else:
            # Unit tests should clean up after themselves...
            os.remove(filename)

    # Finally, here's an example of how to use versioned
    # output files in concert with marshal.
    outf = VersionedOutputFile("marshal.dat")
    # Marshal out a sequence:
    marshal.dump([1, 2, 3], outf.asFile())
    outf.close()
    os.remove("marshal.dat")


if __name__ == "__main__":
    main()
When Emacs saves a file 'foo.txt' it first checks to see if 'foo.txt' already exists. If it does, then the current file contents are backed up. Emacs can be configured to use versioned backup files, so for example 'foo.txt' might be backed up to 'foo.txt.~1~'. If other versioned backups of the file already exist, Emacs saves to the next available version. For example, if the largest existing version number is 19, Emacs will save the new version to 'foo.txt.~20~'.
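The numbering rule described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the recipe's own code: the name `next_backup_name` and its `existing` parameter are hypothetical, and the list of existing file names is passed in rather than read from disk.

```python
import re


def next_backup_name(pathname, existing):
    """Return the next Emacs-style backup name for `pathname`.

    `existing` is an iterable of file names assumed to be present already
    (a hypothetical input; real code would list the directory instead).
    """
    # Match only names of the form <pathname>.~<digits>~
    pattern = re.compile(re.escape(pathname) + r"\.~(\d+)~$")
    revisions = [0]  # 0 stands for "no backups yet"
    for name in existing:
        match = pattern.match(name)
        if match:
            revisions.append(int(match.group(1)))
    return "%s.~%d~" % (pathname, max(revisions) + 1)


print(next_backup_name("foo.txt", ["foo.txt", "foo.txt.~18~", "foo.txt.~19~"]))
# foo.txt.~20~
print(next_backup_name("foo.txt", ["foo.txt"]))
# foo.txt.~1~
```

Taking the maximum existing number, rather than looking for the first unused one, means gaps left by deleted backups are never refilled, so higher numbers always denote newer backups.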
Emacs can also prompt you to "delete old versions" of your files. For example, if you save a file which has six backups, Emacs can be configured to delete all but the newest three backups.
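As a rough sketch of that pruning rule, the function below picks which backup numbers to discard. The name `revisions_to_delete` and its `keep` parameter are hypothetical, chosen for illustration; the function only computes the numbers and does not touch any files.

```python
def revisions_to_delete(revisions, keep=3):
    """Return the revision numbers to remove so that at most `keep` remain.

    A `keep` of None or <= 0 is treated as "never delete anything"
    (an assumption made for this sketch).
    """
    if keep is None or keep <= 0:
        return []
    # Sort ascending, then drop everything except the newest `keep` entries.
    return sorted(revisions)[:-keep]


print(revisions_to_delete([1, 2, 3, 4, 5, 6]))
# [1, 2, 3]
print(revisions_to_delete([4, 5]))
# []
```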
The code in this recipe emulates the versioning backup behavior of Emacs. It saves backups with version numbers, e.g. backing up 'foo.txt' to 'foo.txt.~n~' where the largest existing backup number is n - 1. It also lets you specify how many "old" versions of a file to save. (A value <= 0 means not to delete any old versions.)
The marshal module lets you marshal an object to a file by way of the dump() function. But dump() insists that the file object you provide actually is a Python file object, rather than being some arbitrary object conforming to the file-object interface. The versioned output file shown in this recipe provides an asFile() method for compatibility with marshal.dump().
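For reference, here is a minimal round trip through marshal using an ordinary binary file object, which is the kind of object asFile() hands back. This sketch assumes Python 3 and uses a temporary file rather than the recipe's class; the variable names are illustrative.

```python
import marshal
import os
import tempfile

# Create a scratch file and wrap its descriptor in a real file object.
fd, path = tempfile.mkstemp(suffix=".dat")
with os.fdopen(fd, "wb") as outf:
    marshal.dump([1, 2, 3], outf)

# Read the value back to confirm the data survived the round trip.
with open(path, "rb") as inf:
    restored = marshal.load(inf)

os.remove(path)
print(restored)
# [1, 2, 3]
```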