Uses a recursively called simple generator (with the same arguments as the outer call!) to traverse a tree in breadth first order.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
def breadth_first(tree,children=iter): """Traverse the nodes of a tree in breadth-first order. The first argument should be the tree root; children should be a function taking as argument a tree node and returning an iterator of the node's children. """ yield tree last = tree for node in breadth_first(tree,children): for child in children(node): yield child last = child if last == node: return
This is an example of the self-recursive generators discussed in http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/221457
A little care is needed (the lines containing variable "last") to avoid getting into an infinite recursion for trees that are not themselves infinite. The recursive call is always traversing one level higher than the outer call, so if the two calls ever reach a situation where both have the same most-recently-output node, we must have traversed the whole tree. If the tree has height h, nodes at distance d from the root are traversed by h-d instances of the generator. When the number of nodes grows by at least a constant factor in each level (e.g. complete binary trees) it takes only constant time per tree node on average. Unlike the usual queue-based BFS, the space used is only O(h).
Iterative deepening. It appears that this recipe performs the same sequence of node expansions (that is, calls to children) as Korf's depth-first iterative deepening algorithm:
However, the recursion structure of the recipe is upside down compared to DFID: in DFID, we have a call chain from the root of the tree to each generated node, while here the call chain goes in the opposite direction. This inverted structure makes it possible to output the visited nodes as a simple generator stream, rather than passing them to a callback or otherwise processing them within the recursive visit. Also, I think the code in the recipe is much simpler than that for DFID, especially if T is infinite (as in most DFID applications) which would allow one to omit the recipe lines involving variable last.
What kind of trees? Hi, I'm a python newbie and my head is spinning a bit from this recursive generator, so please forgive my naivety. I have some questions based upon a couple examples I've tried:
Should there be a try/except in the code to catch the case where a terminal element is not iterable?
--- Example 2 ---
why is '2' output twice? Is this by design? e.g. also in the case of "tree = ['0','1']", '0' is output twice.
why aren't the strings '00' and '01' iterated through, since they are iterable? Is this algorithm only for trees of a uniform depth?
To answer the last question first, '2' is output twice because, when you iterate through the items in a string such as '2', you get the characters in that string, and this string has one character in it, namely '2' again. Apparently the code then detects the infinite recursion and stops.
Anyway, re "what kind of tree": the child function needs to be callable on all the nodes of the tree, including any leaves it might have. One somewhat trivial example would be a nested list of lists, without any other kind of object in it: [[]].
Or, if you want to mix lists and other kinds of objects, you could use a child function something like this:
Here's another possible child function, which will do a breadth first traversal of a filesystem hierarchy:
Thanks! Ah, I see... it didn't occur to me to play with the "children" iterator to avoid the exception or redundant output. Now, of course, it seems obvious. Your usage examples helped me a lot. Thanks very much! :)
Not a valid breadth-first search. I believe a bread-first search is normally understood to be non-backtracking, and that this code does not perform a valid breadth-first search.
The problem is that the 'last' variable only prevents backtracking to the last visited node, not to any previously visited node. But if we are searching even a finite network with loops, we must beware of the latter condition. For instance, consider the network of only three points A, B, and C that form a triangle:
Now consider the resulting iterator object:
This is an infinite loop. Another error: the standard queue algorithm has O(h) space complexity (where h is the number of nodes), not something more as is stated above.
Here is an implementation of a breadth-first search, which also returns an iterator object as above, but which uses the standard queue algorithm:
This function returns an iterator object that will terminate:
By the way, since bfs() calls set() without importing sets, it requires Python2.4.