The standard library's os.walk() function takes a topdown parameter that determines whether directories are yielded in top-down or bottom-up order. Sometimes, though, you want each directory yielded twice: once before any of its child directories (and, recursively, their descendants) and once after all of them. The walk2() function below does this by yielding 4-tuples: the first three elements are the same as those yielded by os.walk(), and the fourth is True the first time (top-down) and False the second (bottom-up).
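As a quick illustration of the two orders that os.walk() already offers, here is a minimal sketch that walks a throwaway tree both ways. The helper name visit_order and the a/b layout are made up for this example, and it assumes Python 3 for tempfile.TemporaryDirectory.

```python
import os
import tempfile


def visit_order(root, topdown):
    """Return directory paths (relative to root) in the order os.walk yields them."""
    return [os.path.relpath(dirpath, root)
            for dirpath, dirnames, filenames in os.walk(root, topdown=topdown)]


with tempfile.TemporaryDirectory() as root:
    # A single chain root/a/b, so the sibling order cannot vary.
    os.makedirs(os.path.join(root, "a", "b"))
    top_down = visit_order(root, topdown=True)
    bottom_up = visit_order(root, topdown=False)

print(top_down)   # ['.', 'a', 'a/b'] with POSIX separators
print(bottom_up)  # ['a/b', 'a', '.'] with POSIX separators
```

Each directory appears exactly once in either order, which is why a single os.walk() pass cannot report both "about to enter" and "finished with" for the same directory.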
An example is deleting all .pyc files and empty directories under some root directory while excluding specific directories (e.g. VCS-specific ones). The exclusion check should be done top-down (we don't want to descend into any excluded directory), but the check for empty directories has to be done bottom-up, since a directory containing only .pyc files is non-empty initially but empty once the files are removed.
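To see why the exclusion check cannot simply be moved into a bottom-up os.walk() pass, the sketch below tries to prune a .git directory in both modes. The helper name visited and the .git/objects and pkg directories are illustrative only, and Python 3 is assumed.

```python
import os
import tempfile


def visited(root, topdown, exclude=(".git",)):
    """Walk root, try to prune excluded names, and return the visited relative paths."""
    seen = set()
    for dirpath, dirnames, filenames in os.walk(root, topdown=topdown):
        seen.add(os.path.relpath(dirpath, root))
        # In-place pruning only steers the traversal when topdown=True;
        # in bottom-up mode the subdirectories were already walked.
        dirnames[:] = [d for d in dirnames if d not in exclude]
    return seen


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".git", "objects"))
    os.makedirs(os.path.join(root, "pkg"))
    pruned = visited(root, topdown=True)
    not_pruned = visited(root, topdown=False)

print(sorted(pruned))      # ['.', 'pkg']
print(sorted(not_pruned))  # also contains '.git' and its 'objects' subdirectory
```

Pruning works only top-down, while the emptiness check is only meaningful bottom-up, so the cleanup needs both visits for every directory.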
#!/usr/bin/env python

def cleanpycs(root, exclude='CVS .svn .git .hg .bzr'.split()):
    '''Deletes .pyc files and empty directories under this directory.

    Directories in ``exclude`` are not traversed.
    '''
    from os.path import join
    from os import listdir, remove, rmdir
    exclude = frozenset(exclude)
    for dirpath, subdirs, files, top in walk2(root):
        if top:
            # First (top-down) visit: delete bytecode files, then prune
            # excluded directories so they are never descended into.
            for f in files:
                if f.endswith('.pyc'):
                    remove(join(dirpath, f))
            subdirs[:] = [d for d in subdirs if d not in exclude]
        elif not listdir(dirpath):
            # Second (bottom-up) visit: the directory may be empty by now.
            rmdir(dirpath)


def walk2(top, onerror=None, followlinks=False):
    '''Simultaneous topdown and bottomup version of os.walk.

    For each directory in the directory tree rooted at top (including top
    itself, but excluding '.' and '..'), yields a 4-tuple::

        dirpath, dirnames, filenames, top

    The triples (``dirpath``, ``dirnames``, ``filenames``) are the same as
    those yielded by os.walk; however, each such triple is yielded twice,
    once before the subtree rooted at ``dirpath`` is visited (``top=True``)
    and once after (``top=False``).

    As with os.walk with ``topdown=True``, the caller can modify the dirnames
    list in-place when ``top=True`` (e.g., via del or slice assignment), and
    walk2 will only recurse into the subdirectories whose names remain in
    dirnames. Modifying dirnames when ``top=False`` is ineffective, since the
    directories in dirnames have already been generated.
    '''
    from os import listdir
    from os.path import join, isdir, islink
    try:
        names = listdir(top)
    except OSError as err:
        # Report unreadable directories through the callback, then skip them.
        if onerror is not None:
            onerror(err)
        return

    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    # First yield: before descending; the caller may prune ``dirs`` here.
    yield top, dirs, nondirs, True
    for name in dirs:
        path = join(top, name)
        if followlinks or not islink(path):
            for x in walk2(path, onerror, followlinks):
                yield x
    # Second yield: after the whole subtree has been visited.
    yield top, dirs, nondirs, False


if __name__ == '__main__':
    import sys
    cleanpycs(sys.argv[1] if len(sys.argv) > 1 else '.')