Welcome, guest | Sign In | My Account | Store | Cart

Set that remembers original insertion order.

Python, 71 lines
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71``` ```import collections class OrderedSet(collections.MutableSet): KEY, PREV, NEXT = range(3) def __init__(self, iterable=None): self.end = end = [] end += [None, end, end] # sentinel node for doubly linked list self.map = {} # key --> [key, prev, next] if iterable is not None: self |= iterable def __len__(self): return len(self.map) def __contains__(self, key): return key in self.map def add(self, key): if key not in self.map: end = self.end curr = end[self.PREV] curr[self.NEXT] = end[self.PREV] = self.map[key] = [key, curr, end] def discard(self, key): if key in self.map: key, prev, next = self.map.pop(key) prev[self.NEXT] = next next[self.PREV] = prev def __iter__(self): end = self.end curr = end[self.NEXT] while curr is not end: yield curr[self.KEY] curr = curr[self.NEXT] def __reversed__(self): end = self.end curr = end[self.PREV] while curr is not end: yield curr[self.KEY] curr = curr[self.PREV] def pop(self, last=True): if not self: raise KeyError('set is empty') key = next(reversed(self)) if last else next(iter(self)) self.discard(key) return key def __repr__(self): if not self: return '%s()' % (self.__class__.__name__,) return '%s(%r)' % (self.__class__.__name__, list(self)) def __eq__(self, other): if isinstance(other, OrderedSet): return len(self) == len(other) and list(self) == list(other) return set(self) == set(other) def __del__(self): self.clear() # remove circular references if __name__ == '__main__': print(OrderedSet('abracadaba')) print(OrderedSet('simsalabim')) x = OrderedSet('abracadaba') # doesn't raise "Exception TypeError: TypeError('list indices must be integers, not NoneType',) in ignored" ```

Runs on Py2.6 or later (and runs on 3.0 or later without any modifications).

Implementation based on a doubly linked link and an internal dictionary. This design gives OrderedSet the same big-Oh running times as regular sets including O(1) adds, removes, and lookups as well as O(n) iteration.

Edit #1: moved constants inside class to avoid raising TypeError noticed by j vickroy

j vickroy 10 years, 5 months ago

Thanks for the code.

Adding an assignment statement to the main program:

``````x = OrderedSet('abracadaba')
``````

seems to trigger an exception:

Exception TypeError: TypeError('list indices must be integers, not NoneType',) in ignored

Any idea what my mistake is?

I'm running Python 2.6.4 on a MS Windows XP machine.

Jonathan Blakes (author) 10 years, 5 months ago

Hi j vickroy,

This recipe is a fork of recipe 576694 by Raymond Hettinger, he would know better than I. However, it seems that the error occurs when the module finishes rather than at the point you assign to x. Put a "print 'got here'" at the very end to see what I mean. I tracked the problem down to the "KEY, PREV, NEXT = range(3)" declared outside the OrderedSet class. Something is referencing one of these integers after the module has been destroyed and they have been set to None. Moving them inside the class and using "self.PREV", etc. fixes the problem. The reason you don't see this when importing and using OrderedSet is that the module gets destroyed after any of your modules that import it.

Jonathan Blakes (author) 10 years, 5 months ago

I have updated the recipe according to the previous comment. See diff here

j vickroy 10 years, 5 months ago

Thanks much Jonathan for your quick and detailed response and for taking the time to further diagnose this. I like your solution which seems cleaner and safer to me.

 Created by Jonathan Blakes on Thu, 24 Mar 2011 (MIT)