This data structure acts almost like a dictionary, with two modifications: First, D.smallest() returns the value x minimizing D[x]. For this to work correctly, all values D[x] stored in the dictionary must be comparable. Second, iterating "for x in D" finds and removes the items from D in sorted order. Each item is not removed until the next item is requested, so D[x] will still return a useful value until the next iteration of the for-loop.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67
# Priority dictionary using binary heaps # David Eppstein, UC Irvine, 8 Mar 2002 from __future__ import generators class priorityDictionary(dict): def __init__(self): '''Initialize priorityDictionary by creating binary heap of pairs (value,key). Note that changing or removing a dict entry will not remove the old pair from the heap until it is found by smallest() or until the heap is rebuilt.''' self.__heap =  dict.__init__(self) def smallest(self): '''Find smallest item after removing deleted items from heap.''' if len(self) == 0: raise IndexError, "smallest of empty priorityDictionary" heap = self.__heap while heap not in self or self[heap] != heap: lastItem = heap.pop() insertionPoint = 0 while 1: smallChild = 2*insertionPoint+1 if smallChild+1 < len(heap) and \ heap[smallChild] > heap[smallChild+1]: smallChild += 1 if smallChild >= len(heap) or lastItem <= heap[smallChild]: heap[insertionPoint] = lastItem break heap[insertionPoint] = heap[smallChild] insertionPoint = smallChild return heap def __iter__(self): '''Create destructive sorted iterator of priorityDictionary.''' def iterfn(): while len(self) > 0: x = self.smallest() yield x del self[x] return iterfn() def __setitem__(self,key,val): '''Change value stored in dictionary and add corresponding pair to heap. Rebuilds the heap if the number of deleted items grows too large, to avoid memory leakage.''' dict.__setitem__(self,key,val) heap = self.__heap if len(heap) > 2 * len(self): self.__heap = [(v,k) for k,v in self.iteritems()] self.__heap.sort() # builtin sort likely faster than O(n) heapify else: newPair = (val,key) insertionPoint = len(heap) heap.append(None) while insertionPoint > 0 and \ newPair < heap[(insertionPoint-1)//2]: heap[insertionPoint] = heap[(insertionPoint-1)//2] insertionPoint = (insertionPoint-1)//2 heap[insertionPoint] = newPair def setdefault(self,key,val): '''Reimplement setdefault to call our customized __setitem__.''' if key not in self: self[key] = val return self[key]
This type of data structure is normally called a priority queue, and there are a couple of priority queue recipes already in the cookbook, but I feel that a dictionary is a better metaphor for the priority-change operations needed in typical applications of priority queues such as Dijkstra's algorithm.
The implementation here uses binary heaps, taking logarithmic amortized time per operation -- the time is amortized not worst-case because when a heap value becomes invalid, we leave it there to be cleaned up by a later call to smallest(), so a single call can do a lot of cleanup. The other recipes I saw used either sequential search (linear time find-min) or insertion into a sorted list (linear time updates). I tested a simpler variant of this data structure using sequential search instead of the heap, and found that the heap is better for heap sizes of 60 items or more.