A generator that does what os.path.walk does, without the need for a callback function.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | from __future__ import generators # needed for Python 2.2
import os
import stat
import types
def walktree(top=".", depthfirst=False):
"""
Walk a directory tree, starting from 'top'
This code is based on os.path.walk, with the callback function
replaced by a yield and recursion replaced by iteration
"""
if type(top) != types.StringType:
raise TypeError("top must be a string")
# decide which end of the stack to pop
if depthfirst:
index = -1
else:
index = 0
dirstack = [top]
while dirstack:
top = dirstack.pop(index)
try:
names = os.listdir(top)
except os.error:
return
yield top, names
dirs = []
for name in names:
name = os.path.join(top, name)
try:
st = os.lstat(name)
except os.error:
continue
if stat.S_ISDIR(st.st_mode):
dirs.append(name)
# the depth-first results look 'backwards' to me
# so I'm changing them to fit my expectations
if depthfirst:
dirs.reverse()
dirstack.extend(dirs)
if __name__ == "__main__":
"""
a sample:
look for python scripts starting in the current directory
"""
for top, names in walktree():
print "searching folder %s" % (top,)
for name in names:
if name[-3:] == ".py":
print "I think %s is a python script" % (name,)
|
This is a rewrite of os.path.walk. It's convenient for processing files that inhabit directory trees. Its original purpose was to vist the '.h' files of a large C++ project and wrap them in#ifdef include guards.
Thanks to Tim Peters for pointing out that using an internal stack instead of recursion gives one the choice of depth-first or width-first visits.
Thanks to Raymond Hettinger for pointing out st.st_mode and urging me to get rid of the global search mode definitions.
Thanks to Noah Spurrier for suggesting a sample.
Neat, but give an example! This looks like a neat use of generators, but it needs an example of its use. Maybe give an example of converting all filenames to lowercase or something.
Depth First is broken. I just noticed that Depthfirst is broken. It still walks the tree breadth first. A fix is not easy as it would involve recursion or an additional stack. I'm not sure how recurssion works with generators.
For example, take the following directory tree:
This structure can be created using these commands:
Then run this python code to walk the tree:
The expected output for depth first tree traversal is this:
But it does not produce this. The actual output is below. This is actually still BREADTH first, but following the RIGHTMOST path first.
With depthfirst set to False you still get BREADTH first, but following the LEFTMOST path first:
Yours, Noah
Fix for depth first problem. The depth first problem is important. Consider how would you easily write a directory walker to rename all files AND directories to lower case? The obvious code does not do what one would expect:
That will try to rename the directories in the wrong order. It will try to rename a directory before it has visited the children of that directory. Given the following directory tree:
That code will try to do this:
then later:
But AAA/CCC does not exist anymore. It was previously renamed to AAA/ccc!
The easiest way to fix walktree() is to use recursion. It's interesting to note that generators may be nested recursively. The following code fixes walktree to make it work properly for depth first tree walking.
Yours,
Noah Spurrier
os.walk. Also, see 'os.walk', added in Python 2.3