
This module protects sets of files that must all be saved at the same time from becoming inconsistent with one another. It does this by creating backups and temporary files. It has been moderately tested for bugs.

Python, 111 lines
#!/usr/bin/python

"""
This module makes it possible to ensure that data is not corrupted
between files whose version numbers are not important but must be
synchronized. It protects these files from crashes. It does, however,
require that critical files do not disappear between executions.
It can only account for cases where the computer does not do what it
is told, not cases where it does what it is not told.

Copyright (c) 2008 Collin Ross Mechlowitz Stocks

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.
"""

import os

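class SyncFiles(object):
    def __init__(self,pathcons,*paths):
        #pathcons: marker file that exists between openwrite and update
        #paths: the data files that must stay consistent with each other
        self.pathcons=pathcons
        self.paths=list(paths)
        self.TEMP="-temp~"  #suffix for files being written
        self.BACK="-back~"  #suffix for the previous set of files
        self.PACKETSIZE=1<<20  #copy buffer size in bytes (1 MiB)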
    def _testfiles(self):
        if os.path.exists(self.pathcons):
            #update had not even been called yet, so temps are bad
            #and currs are not bad
            return False,False
        tempsaregood=True
        currsarebad1=False
        currsarebad2=False
        if not os.path.exists(self.pathcons+self.TEMP):
            tempsaregood=False
            currsarebad1=True
        else:
            currsarebad2=True
        for path in self.paths:
            if not os.path.exists(path+self.TEMP):
                tempsaregood=False
                currsarebad1=True
            else:
                currsarebad2=True
        return tempsaregood,(currsarebad1 and currsarebad2)
    def _safeupdatefrom(self,ext):
        #copy each path+ext over path in PACKETSIZE chunks; the with
        #blocks close both files even if a read or write raises
        for path in self.paths:
            with open(path+ext,"rb") as fread:
                with open(path,"wb") as fwrite:
                    data=fread.read(self.PACKETSIZE)
                    while data:
                        fwrite.write(data)
                        data=fread.read(self.PACKETSIZE)
    def openread(self):
        ret=[]
        tempsaregood,currsarebad=self._testfiles()
        if tempsaregood:
            self._safeupdatefrom(self.TEMP)
            for path in self.paths:
                ret.append(open(path,"rb"))
        elif currsarebad:
            for path in self.paths:
                ret.append(open(path+self.BACK,"rb"))
        else:
            for path in self.paths:
                ret.append(open(path,"rb"))
        return ret
    def openwrite(self):
        ret=[]
        firsttime=True
        for path in self.paths:
            if os.path.exists(path):
                firsttime=False
        if not firsttime:
            tempsaregood,currsarebad=self._testfiles()
            if tempsaregood:
                self._safeupdatefrom(self.TEMP)
            elif currsarebad:
                self._safeupdatefrom(self.BACK)
        open(self.pathcons,"wb").close()
        for path in self.paths:
            ret.append(open(path+self.TEMP,"wb"))
        return ret
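    def update(self):
        #commit the temp files returned by openwrite; the caller should
        #have closed them before calling this
        os.remove(self.pathcons)
        #while pathcons+TEMP exists, _testfiles treats the temps as good
        open(self.pathcons+self.TEMP,"wb").close()
        #drop the old backups, then demote the current files to backups
        for path in self.paths:
            try:
                os.remove(path+self.BACK)
            except OSError:
                pass
        for path in self.paths:
            try:
                os.rename(path,path+self.BACK)
            except OSError:
                pass
        #from here until the last rename, _testfiles falls back to backups
        os.remove(self.pathcons+self.TEMP)
        for path in self.paths:
            os.rename(path+self.TEMP,path)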

Say, for a moment, you have a program that does some calculations that change its state over time. However, you know that it is possible for the computer to shut down or otherwise terminate your program while it is executing at some unknown instruction. You therefore decide that you want to save your program's state to disk every so often. You may need one or more files in order to do this. But what if your program exits after opening the file but before writing to it? You have now not only destroyed the program's state in memory, but also erased your previous backup from the disk.
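The hazard described above is easy to reproduce. The following is a minimal sketch, not part of the module; the function name `demo_truncation` and the file name `state.dat` are made up for illustration. It shows that opening a file in "wb" mode empties it immediately, before a single byte of new data is written.

```python
import os
import tempfile

def demo_truncation():
    """Return the size of a saved file after a 'crash' right after open()."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "state.dat")  # hypothetical file name
    # A previous, successful save.
    with open(path, "wb") as f:
        f.write(b"previous good state")
    # Simulate a program that dies right after opening the file:
    # the handle goes away before anything is written.
    f = open(path, "wb")
    f.close()
    size = os.path.getsize(path)
    # Clean up the scratch directory.
    os.remove(path)
    os.rmdir(directory)
    return size
```

The returned size is zero: the earlier contents are gone even though no new data was ever written.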

Remember that exception handling only gets you so far: Certain signals in Unix will abort the program immediately without giving it a chance to clean up, and if the computer does a hard reset, your program doesn't have a chance no matter what the platform.

This module solves that problem even if you need to store your program's state in multiple files (for whatever reason you may need to do this). It makes sure that there is always a valid set of files that you can fall back to no matter where in the program you are when it terminates.
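One way the class might be driven is sketched here. The helper names `save_state` and `load_state` are hypothetical and not part of the module; `sync` is assumed to be a `SyncFiles` instance from the listing, built with a marker-file path followed by the data-file paths. The sketch also assumes that the files returned by `openwrite` should be closed before `update` is called, so that the data has been handed to the operating system before the renames happen.

```python
def save_state(sync, blobs):
    """Write one bytes object per managed file, then commit the whole set.

    sync is assumed to be a SyncFiles instance, for example
    SyncFiles("state.cons", "a.dat", "b.dat") (hypothetical file names).
    """
    files = sync.openwrite()
    try:
        for f, blob in zip(files, blobs):
            f.write(blob)
    finally:
        # Close every temp file, even if a write failed.
        for f in files:
            f.close()
    # Reached only if every write succeeded.
    sync.update()

def load_state(sync):
    """Read back the contents of every managed file as a list of bytes."""
    files = sync.openread()
    try:
        return [f.read() for f in files]
    finally:
        for f in files:
            f.close()
```

If a write raises, `update` is never called, so the next `openread` or `openwrite` should still see the marker file and keep using the current set.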

It does not, however, account for cases where certain critical files disappear or become corrupted because of problems other than program termination. In those cases, you would probably have to restore by hand or start your program over from the beginning. Be careful about what files you delete. (Do not allow any idiot to do rm $DIR/*~ where $DIR is the directory with your program's files, because that will surely do you in.)


Although this program will try to restore the most recent set of files, it errs on the side of safety and could possibly go backwards up to two versions if it is not certain that the most recent ones are consistent.
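The cautious choice between file sets can be restated as a small pure function. This is a sketch that mirrors the checks in `_testfiles` from the listing; the name `choose_source` and its parameters are illustrative and do not exist in the module.

```python
def choose_source(marker_exists, temp_marker_exists, temps_present):
    """Return which set a reader would use: 'current', 'temp' or 'backup'.

    marker_exists: the pathcons marker file is on disk.
    temp_marker_exists: the pathcons + "-temp~" marker file is on disk.
    temps_present: one bool per data file, True if its "-temp~" copy exists.
    """
    if marker_exists:
        # openwrite ran but update never started: temps may be incomplete.
        return "current"
    temps_good = temp_marker_exists and all(temps_present)
    something_missing = (not temp_marker_exists) or (not all(temps_present))
    something_present = temp_marker_exists or any(temps_present)
    if temps_good:
        # update was interrupted while every temp file was still in place.
        return "temp"
    if something_missing and something_present:
        # A mix of leftovers: the current files cannot be trusted.
        return "backup"
    # No leftovers at all: the last update finished, or none ever started.
    return "current"
```

Any ambiguous mix of leftover files resolves to the backups, which is where the possible step backwards comes from.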

In the future, I plan to implement the ability to choose a specific directory in which to put the temporary and backup files, instead of putting them in the same directory as the normal files (which, in some cases for which this program accounts, may be corrupt). This should help with the rm $DIR/*~ idiots.
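A rough sketch of how such a feature might build its file names follows. Nothing like this exists in the module yet; the helper name `sidecar_path` and its parameters are assumptions made purely for illustration.

```python
import os

def sidecar_path(path, directory, suffix):
    """Build the temp or backup name for path inside a separate directory.

    Hypothetical helper: keeps only the file name of path, appends the
    suffix (for example "-temp~") and places the result in directory.
    """
    return os.path.join(directory, os.path.basename(path) + suffix)
```

Two caveats apply to this sketch. Data files that share a base name but live in different directories would collide, and `os.rename` may fail when the source and destination are on different filesystems, so the separate directory would need to be on the same filesystem as the data files.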

I chose to implement this as a class so that it would be more extensible. I also factored the shared steps into methods so that code is not repeated and a change to one of those segments needs to be made only once.