Guido illustrated the DFS recursive traversal of a graph (http://www.python.org/doc/essays/graphs.html) However if the graph is too big, recursion error occurs.
Here im pitching in my recipe for an iterative BFS traversal.
Im using Guido's graph representation using dictionary.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 | # a sample graph
graph = {'A': ['B', 'C','E'],
'B': ['A','C', 'D'],
'C': ['D'],
'D': ['C'],
'E': ['F','D'],
'F': ['C']}
class MyQUEUE: # just an implementation of a queue
def __init__(self):
self.holder = []
def enqueue(self,val):
self.holder.append(val)
def dequeue(self):
val = None
try:
val = self.holder[0]
if len(self.holder) == 1:
self.holder = []
else:
self.holder = self.holder[1:]
except:
pass
return val
def IsEmpty(self):
result = False
if len(self.holder) == 0:
result = True
return result
path_queue = MyQUEUE() # now we make a queue
def BFS(graph,start,end,q):
temp_path = [start]
q.enqueue(temp_path)
while q.IsEmpty() == False:
tmp_path = q.dequeue()
last_node = tmp_path[len(tmp_path)-1]
print tmp_path
if last_node == end:
print "VALID_PATH : ",tmp_path
for link_node in graph[last_node]:
if link_node not in tmp_path:
new_path = []
new_path = tmp_path + [link_node]
q.enqueue(new_path)
BFS(graph,"A","D",path_queue)
-------------results-------------------
['A']
['A', 'B']
['A', 'C']
['A', 'E']
['A', 'B', 'C']
['A', 'B', 'D']
VALID_PATH : ['A', 'B', 'D']
['A', 'C', 'D']
VALID_PATH : ['A', 'C', 'D']
['A', 'E', 'F']
['A', 'E', 'D']
VALID_PATH : ['A', 'E', 'D']
['A', 'B', 'C', 'D']
VALID_PATH : ['A', 'B', 'C', 'D']
['A', 'E', 'F', 'C']
['A', 'E', 'F', 'C', 'D']
VALID_PATH : ['A', 'E', 'F', 'C', 'D']
|
I think you should use collections.deque rather than writing your own queue class. Plus, a search algorithm should not visit nodes more than once.
I think the Pythonic way of implementing visits should be a generator. To give accessing methods enough information to do useful things, every time I visit a node I return its parent. Here is my own implementation of BFS and DFS, with a sample implementation of the shortest_path function.
For a complete Python graph library, I advise you to check NetworkX: http://networkx.lanl.gov/ .
More compact implementation of the shortest_path function:
To "Matteo Dell'Amico":
"Plus, a search algorithm should not visit nodes more than once"
You are wrong,- algorithm should not visit nodes more than once in one PATH. So it is allowable to visit node several times in different A-D routes. So mojave kid implementation of BFS is correct.
"More compact implementation of the shortest_path function"
I think this is redundant information for breadth first search algorithm, because it strongly depends on goal - what you want to find out from search. If you want to find just shortest route from A to D,- than OK, your suggestions is good. But what if I want to find ALL routes from A to D ? In that case your suggestion is meaningless and mojave kid implementation - is good for such problem. So I suggest, lets leave concrete implementation of BFS algorithm for the script user...
Agnius: http://mathworld.wolfram.com/Breadth-FirstTraversal.html (and all other references I can find) explain clearly that a BFS shouldn't visit nodes more than once. Of course, the shortest_path function does what its name says: it finds the shortest path. Generating all paths is very different, and can easily be computationally prohibitive.
So has this (all shortest paths between 2 nodes) been included in the networkx's functions?
btw, thank you for this implementation. Works just fine.
This doesn't work for a MultiGraph() consisting of 2 different edges between 2 adjacent nodes. What needs to be done then?