Set that remembers original insertion order.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 | import collections
class OrderedSet(collections.MutableSet):
KEY, PREV, NEXT = range(3)
def __init__(self, iterable=None):
self.end = end = []
end += [None, end, end] # sentinel node for doubly linked list
self.map = {} # key --> [key, prev, next]
if iterable is not None:
self |= iterable
def __len__(self):
return len(self.map)
def __contains__(self, key):
return key in self.map
def add(self, key):
if key not in self.map:
end = self.end
curr = end[self.PREV]
curr[self.NEXT] = end[self.PREV] = self.map[key] = [key, curr, end]
def discard(self, key):
if key in self.map:
key, prev, next = self.map.pop(key)
prev[self.NEXT] = next
next[self.PREV] = prev
def __iter__(self):
end = self.end
curr = end[self.NEXT]
while curr is not end:
yield curr[self.KEY]
curr = curr[self.NEXT]
def __reversed__(self):
end = self.end
curr = end[self.PREV]
while curr is not end:
yield curr[self.KEY]
curr = curr[self.PREV]
def pop(self, last=True):
if not self:
raise KeyError('set is empty')
key = next(reversed(self)) if last else next(iter(self))
self.discard(key)
return key
def __repr__(self):
if not self:
return '%s()' % (self.__class__.__name__,)
return '%s(%r)' % (self.__class__.__name__, list(self))
def __eq__(self, other):
if isinstance(other, OrderedSet):
return len(self) == len(other) and list(self) == list(other)
return set(self) == set(other)
def __del__(self):
self.clear() # remove circular references
if __name__ == '__main__':
print(OrderedSet('abracadaba'))
print(OrderedSet('simsalabim'))
x = OrderedSet('abracadaba')
# doesn't raise "Exception TypeError: TypeError('list indices must be integers, not NoneType',) in ignored"
|
Runs on Py2.6 or later (and runs on 3.0 or later without any modifications).
Implementation based on a doubly linked link and an internal dictionary. This design gives OrderedSet the same big-Oh running times as regular sets including O(1) adds, removes, and lookups as well as O(n) iteration.
Edit #1: moved constants inside class to avoid raising TypeError noticed by j vickroy
Thanks for the code.
Adding an assignment statement to the main program:
seems to trigger an exception:
Exception TypeError: TypeError('list indices must be integers, not NoneType',) in ignored
Any idea what my mistake is?
I'm running Python 2.6.4 on a MS Windows XP machine.
Hi j vickroy,
This recipe is a fork of recipe 576694 by Raymond Hettinger, he would know better than I. However, it seems that the error occurs when the module finishes rather than at the point you assign to x. Put a "print 'got here'" at the very end to see what I mean. I tracked the problem down to the "KEY, PREV, NEXT = range(3)" declared outside the OrderedSet class. Something is referencing one of these integers after the module has been destroyed and they have been set to None. Moving them inside the class and using "self.PREV", etc. fixes the problem. The reason you don't see this when importing and using OrderedSet is that the module gets destroyed after any of your modules that import it.
I have updated the recipe according to the previous comment. See diff here
Thanks much Jonathan for your quick and detailed response and for taking the time to further diagnose this. I like your solution which seems cleaner and safer to me.