A generator that provides a quick way to 'walk' a zip archive. It also handles nested archives, i.e. zip files that contain other zip files. Inspired by the os.walk function.
import os
import tempfile
import zipfile
from io import BytesIO


def zipwalk(zfilename):
    """Zip file tree generator.

    For each file entry in a zip archive, this yields
    a two-tuple of the zip information and the data
    of the file as a BytesIO object:

        zipinfo, filedata

    zipinfo is an instance of the zipfile.ZipInfo class,
    which describes the file contained in the zip
    archive. filedata is a BytesIO instance holding
    the actual file data.

    If the file is itself a zip file, the generator extracts
    it to a temporary file and walks its contents.

    Inspired by os.walk.
    """
    z = zipfile.ZipFile(zfilename, 'r')
    try:
        for info in z.infolist():
            fname = info.filename
            data = z.read(fname)
            extn = os.path.splitext(fname)[1].lower()

            if extn == '.zip':
                # Nested archive: dump it to a unique temp file so it can be reopened by path.
                fd, tmpfpath = tempfile.mkstemp(suffix='.zip')
                try:
                    try:
                        with os.fdopen(fd, 'wb') as tmpf:
                            tmpf.write(data)
                    except (IOError, OSError) as e:
                        print(e)
                    # Recurse only if the entry really is a zip archive.
                    if zipfile.is_zipfile(tmpfpath):
                        for x in zipwalk(tmpfpath):
                            yield x
                finally:
                    # Always clean up the temp file, even if the walk fails.
                    try:
                        os.remove(tmpfpath)
                    except OSError:
                        pass
            else:
                yield (info, BytesIO(data))
    except RuntimeError:
        # zipfile raises RuntimeError, for example, when an entry is
        # encrypted and no password was supplied.
        print('Runtime Error')
    finally:
        z.close()


if __name__ == "__main__":
    import sys

    for i, d in zipwalk(sys.argv[1]):
        print(i.filename)
I have to write Python scripts at work to process zip files that sometimes contain child zip files, and I find this recipe very useful for working with such nested archives. A generator is handy here because it yields entries one at a time, so the caller can process each file as it is read instead of first collecting the contents of every nested archive. Note that the generator still recurses once per level of nesting.
Finally found this. Thanks!