Welcome, guest | Sign In | My Account | Store | Cart

Before overwriting an existing file it's often desirable to make a backup. This recipe emulates the behavior of Emacs by saving versioned backups. It's also compatible with the marshal module, so you can save versioned output in "marshal" format.

Python, 131 lines
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
#!/usr/bin/env python
"""This module provides a versioned output file.

When you write to such a file, it saves a versioned backup of any
existing file contents.

For usage examples see main()."""

import sys, os, glob, string, marshal

class VersionedOutputFile:
    """This is like a file object opened for output, but it makes
    versioned backups of anything it might otherwise overwrite."""

    def __init__(self, pathname, numSavedVersions=3):
        """Create a new output file.
        
        `pathname' is the name of the file to [over]write.
        `numSavedVersions' tells how many of the most recent versions
        of `pathname' to save."""
        
        self._pathname = pathname
        self._tmpPathname = "%s.~new~" % self._pathname
        self._numSavedVersions = numSavedVersions
        self._outf = open(self._tmpPathname, "wb")

    def __del__(self):
        self.close()

    def close(self):
        if self._outf:
            self._outf.close()
            self._replaceCurrentFile()
            self._outf = None

    def asFile(self):
        """Return self's shadowed file object, since marshal is
        pretty insistent on working w. pure file objects."""
        return self._outf

    def __getattr__(self, attr):
        """Delegate most operations to self's open file object."""
        return getattr(self.__dict__['_outf'], attr)
    
    def _replaceCurrentFile(self):
        """Replace the current contents of self's named file."""
        self._backupCurrentFile()
        os.rename(self._tmpPathname, self._pathname)

    def _backupCurrentFile(self):
        """Save a numbered backup of self's named file."""
        # If the file doesn't already exist, there's nothing to do.
        if os.path.isfile(self._pathname):
            newName = self._versionedName(self._currentRevision() + 1)
            os.rename(self._pathname, newName)

            # Maybe get rid of old versions.
            if ((self._numSavedVersions is not None) and
                (self._numSavedVersions > 0)):
                self._deleteOldRevisions()

    def _versionedName(self, revision):
        """Get self's pathname with a revision number appended."""
        return "%s.~%s~" % (self._pathname, revision)
    
    def _currentRevision(self):
        """Get the revision number of self's largest existing backup."""
        revisions = [0] + self._revisions()
        return max(revisions)

    def _revisions(self):
        """Get the revision numbers of all of self's backups."""
        
        revisions = []
        backupNames = glob.glob("%s.~[0-9]*~" % (self._pathname))
        for name in backupNames:
            try:
                revision = int(string.split(name, "~")[-2])
                revisions.append(revision)
            except ValueError:
                # Some ~[0-9]*~ extensions may not be wholly numeric.
                pass
        revisions.sort()
        return revisions

    def _deleteOldRevisions(self):
        """Delete old versions of self's file, so that at most
        self._numSavedVersions versions are retained."""
        
        revisions = self._revisions()
        revisionsToDelete = revisions[:-self._numSavedVersions]
        for revision in revisionsToDelete:
            pathname = self._versionedName(revision)
            if os.path.isfile(pathname):
                os.remove(pathname)
                
def main():
    """Module mainline (for isolation testing)"""
    basename = "TestFile.txt"
    if os.path.exists(basename):
        os.remove(basename)
    for i in range(10):
        outf = VersionedOutputFile(basename)
        outf.write("This is version %s.\n" % i)
        outf.close()

    # Now there should be just four versions of TestFile.txt:
    expectedSuffixes = ["", ".~7~", ".~8~", ".~9~"]
    expectedVersions = []
    for suffix in expectedSuffixes:
        expectedVersions.append("%s%s" % (basename, suffix))
    matchingFiles = glob.glob("%s*" % basename)
    for filename in matchingFiles:
        if filename not in expectedVersions:
            sys.stderr.write("Found unexpected file %s.\n" % filename)
        else:
            # Unit tests should clean up after themselves...
            os.remove(filename)

    # Finally, here's an example of how to use versioned
    # output files in concert with marshal.
    import marshal

    outf = VersionedOutputFile("marshal.dat")
    # Marshal out a sequence:
    marshal.dump([1, 2, 3], outf.asFile())
    outf.close()
    os.remove("marshal.dat")

if __name__ == "__main__":
    main()

When Emacs saves a file 'foo.txt' it first checks to see if 'foo.txt' already exists. If it does, then the current file contents are backed up. Emacs can be configured to use versioned backup files, so for example 'foo.txt' might be backed up to 'foo.txt.~1~'. If other versioned backups of the file already exist, Emacs saves to the next available version. For example, if the largest existing version number is 19, Emacs will save the new version to 'foo.txt.~20~'.

Emacs can also prompt you to "delete old versions" of your files. For example, if you save a file which has six backups, Emacs can be configured to delete all but the newest three backups.

The code in this recipe emulates the versioning backup behavior of Emacs. It saves backups with version numbers, e.g. backing up 'foo.txt' to 'foo.txt.~n~' where the largest existing backup number is n - 1. It also lets you specify how many "old" versions of a file to save. (A value <= 0 means not to delete any old versions.)

The marshal module lets you marshal an object to a file by way of the dump() function. But dump() insists that the file object you provide actually is a Python file object, rather than being some arbitrary object conforming to the file-object interface. The versioned output file shown in this recipe provides an asFile() method for compatibility with marshal.dump().