Welcome, guest | Sign In | My Account | Store | Cart

Uses a recursively called simple generator (with the same arguments as the outer call!) to traverse a tree in breadth first order.

Python, 14 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
def breadth_first(tree,children=iter):
    """Traverse the nodes of a tree in breadth-first order.
    The first argument should be the tree root; children
    should be a function taking as argument a tree node and
    returning an iterator of the node's children.
    """
    yield tree
    last = tree
    for node in breadth_first(tree,children):
        for child in children(node):
            yield child
            last = child
        if last == node:
            return

This is an example of the self-recursive generators discussed in http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/221457

A little care is needed (the lines containing variable "last") to avoid getting into an infinite recursion for trees that are not themselves infinite. The recursive call is always traversing one level higher than the outer call, so if the two calls ever reach a situation where both have the same most-recently-output node, we must have traversed the whole tree. If the tree has height h, nodes at distance d from the root are traversed by h-d instances of the generator. When the number of nodes grows by at least a constant factor in each level (e.g. complete binary trees) it takes only constant time per tree node on average. Unlike the usual queue-based BFS, the space used is only O(h).

5 comments

David Eppstein (author) 20 years, 4 months ago  # | flag

Iterative deepening. It appears that this recipe performs the same sequence of node expansions (that is, calls to children) as Korf's depth-first iterative deepening algorithm:

def dfid(T,children,callback):
    def visit(node,i):
        if i == 0:
            callback(node)
        else:
            for c in children(node):
                visit(c,i-1)
    i = 0
    while 1:
        visit(T,i)
        i += 1

However, the recursion structure of the recipe is upside down compared to DFID: in DFID, we have a call chain from the root of the tree to each generated node, while here the call chain goes in the opposite direction. This inverted structure makes it possible to output the visited nodes as a simple generator stream, rather than passing them to a callback or otherwise processing them within the recursive visit. Also, I think the code in the recipe is much simpler than that for DFID, especially if T is infinite (as in most DFID applications) which would allow one to omit the recipe lines involving variable last.

Kevin Edwards 20 years, 4 months ago  # | flag

What kind of trees? Hi, I'm a python newbie and my head is spinning a bit from this recursive generator, so please forgive my naivety. I have some questions based upon a couple examples I've tried:

# --- Example 1 ---
>>> tree = [0]
>>> for x in breadth_first(tree):
...  print x
...
[0]
0
Traceback (most recent call last):
  File "", line 1, in ?
  File "", line 10, in breadth_first
TypeError: iteration over non-sequence
# -----------------
  • Should there be a try/except in the code to catch the case where a terminal element is not iterable?

    --- Example 2 ---

    >>> tree = [['00'],['10'],'2']
    >>> for x in breadth_first(tree):
    ...   print x
    ...
    [['00'], ['10'], '2']
    ['00']
    ['10']
    2
    00
    10
    2
    <h4 id="-">-----------------</h4>
    
  • why is '2' output twice? Is this by design? e.g. also in the case of "tree = ['0','1']", '0' is output twice.

  • why aren't the strings '00' and '01' iterated through, since they are iterable? Is this algorithm only for trees of a uniform depth?

David Eppstein (author) 20 years, 4 months ago  # | flag

To answer the last question first, '2' is output twice because, when you iterate through the items in a string such as '2', you get the characters in that string, and this string has one character in it, namely '2' again. Apparently the code then detects the infinite recursion and stops.

Anyway, re "what kind of tree": the child function needs to be callable on all the nodes of the tree, including any leaves it might have. One somewhat trivial example would be a nested list of lists, without any other kind of object in it: [[[][]][]].

Or, if you want to mix lists and other kinds of objects, you could use a child function something like this:

def children(node):
    if isinstance(node,list):
        return iter(node)
    else:
        return []

Here's another possible child function, which will do a breadth first traversal of a filesystem hierarchy:

def children(path):
    if os.path.isdir(path):
        for filename in os.listdir(path):
            yield os.path.join(path,filename)
Kevin Edwards 20 years, 4 months ago  # | flag

Thanks! Ah, I see... it didn't occur to me to play with the "children" iterator to avoid the exception or redundant output. Now, of course, it seems obvious. Your usage examples helped me a lot. Thanks very much! :)

alexis gallagher 17 years, 8 months ago  # | flag

Not a valid breadth-first search. I believe a bread-first search is normally understood to be non-backtracking, and that this code does not perform a valid breadth-first search.

The problem is that the 'last' variable only prevents backtracking to the last visited node, not to any previously visited node. But if we are searching even a finite network with loops, we must beware of the latter condition. For instance, consider the network of only three points A, B, and C that form a triangle:

def children(ch):
    if ch == 'A': return ['B','C']
    elif ch == 'B': return ['A','C']
    elif ch == 'C': return ['A','B']
    else: return

Now consider the resulting iterator object:

>>> s = breadth_first('A',children)

>>> s.next()
'A'

>>> s.next()
'B'

>>> s.next()
'C'

>>> s.next()
'A'

This is an infinite loop. Another error: the standard queue algorithm has O(h) space complexity (where h is the number of nodes), not something more as is stated above.

Here is an implementation of a breadth-first search, which also returns an iterator object as above, but which uses the standard queue algorithm:

def bfs(root,visitable,children=iter):
    """Iterator traverses a tree in breadth-first order.

    The first argument should be the tree root; visitable should be an
    iterable with all searchable nodes; children should be a function
    which takes a node and return an iterator of the node's children.
    """
    queue = []
    # makes a shallow copy, makes it a collection, removes duplicates
    unvisited = list(set(visitable))

    if root in unvisited:
        unvisited.remove(root)
        queue.append(root)

    while len(queue) > 0:
        node = queue.pop(0)
        yield node

        for child in children(node):
            if child in unvisited:
                unvisited.remove(child)
                queue.append(child)
    return

This function returns an iterator object that will terminate:

>>> s  = bfs('A',['A','B','C'], children)
>>> s.next()
'A'
>>> s.next()
'B'
>>> s.next()
'C'
>>> s.next()
--------------
exceptions.StopIteration

By the way, since bfs() calls set() without importing sets, it requires Python2.4.

Created by David Eppstein on Fri, 31 Oct 2003 (PSF)
Python recipes (4591)
David Eppstein's recipes (14)

Required Modules

  • (none specified)

Other Information and Tasks