Alternate version of OrderedSet (recipe 576694) using weak references instead of __del__.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 | import collections
from weakref import proxy
class Link(object):
__slots__ = 'prev', 'next', 'key', '__weakref__'
class OrderedSet(collections.MutableSet):
'Set the remembers the order elements were added'
# Big-O running times for all methods are the same as for regular sets.
# The internal self.__map dictionary maps keys to links in a doubly linked list.
# The circular doubly linked list starts and ends with a sentinel element.
# The sentinel element never gets deleted (this simplifies the algorithm).
# The prev/next links are weakref proxies (to prevent circular references).
# Individual links are kept alive by the hard reference in self.__map.
# Those hard references disappear when a key is deleted from an OrderedSet.
def __init__(self, iterable=None):
self.__root = root = Link() # sentinel node for doubly linked list
root.prev = root.next = root
self.__map = {} # key --> link
if iterable is not None:
self |= iterable
def __len__(self):
return len(self.__map)
def __contains__(self, key):
return key in self.__map
def add(self, key):
# Store new key in a new link at the end of the linked list
if key not in self.__map:
self.__map[key] = link = Link()
root = self.__root
last = root.prev
link.prev, link.next, link.key = last, root, key
last.next = root.prev = proxy(link)
def discard(self, key):
# Remove an existing item using self.__map to find the link which is
# then removed by updating the links in the predecessor and successors.
if key in self.__map:
link = self.__map.pop(key)
link.prev.next = link.next
link.next.prev = link.prev
def __iter__(self):
# Traverse the linked list in order.
root = self.__root
curr = root.next
while curr is not root:
yield curr.key
curr = curr.next
def __reversed__(self):
# Traverse the linked list in reverse order.
root = self.__root
curr = root.prev
while curr is not root:
yield curr.key
curr = curr.prev
def pop(self, last=True):
if not self:
raise KeyError('set is empty')
key = next(reversed(self)) if last else next(iter(self))
self.discard(key)
return key
def __repr__(self):
if not self:
return '%s()' % (self.__class__.__name__,)
return '%s(%r)' % (self.__class__.__name__, list(self))
def __eq__(self, other):
if isinstance(other, OrderedSet):
return len(self) == len(other) and list(self) == list(other)
return not self.isdisjoint(other)
|
Prevents circular reference issues by using weak references between links.
Works on Python 2.4 or later.
There's a bug in the last line of the __eq__ method. It means that OrderedSet([1, 2, 3]) compares equal to set([1, 2, 3, 4]) and set([1, 2]) because they overlap. One way to fix it is to make the last line
It's a bit long-winded - is there a simpler way to implement the comparison for this case? I was thinking of using
but I'm put off by the thought that it could create a large set as the result of the xor (if the sets are large and very different) before throwing it away. At least the first one only loops through the two sets, stopping as soon as there's a difference. I'm conscious of the fact that the 'in' check will be expensive with a long list, though.
I think
should do the job.
No,
not self.isdisjoint(other)
will only be true if the lists do not share a single element. So with Paul's version, OrderedSet([1, 2, 3]) is not equal to set([1, 2]) any more, but still to set([9, 8, 1]). Certainly not what is intended.xtian's version gets it right for sets, but what about lists? Shouldn't
OrderedSet([3,2,1]) == [1,2,3]
be false, because the ordering matters for both the ordered set and the list?To be safe, I am now using
Thank you Raymond for a great piece of code!
I rewrote the __eq__ method body to the following:
return isinstance(other, OrderedSet) and list(self) == list(other)
This ensures that the relation is symmetric (except for classes that inherits from this class and overrides the __eq__ method, to be protected from that you need a can_equal method). Also, we get rid of the redundant
len(self) == len(other)
check and it fits on one line while still being readable (IMO).I also added some documentation that might be helpful. I'm sorry that the formatting got all messed up...
MutableSet is not part of collections in Python 2.4 (it was added in 2.7?) and updating a set with
|=
will fail if the rhs is a list. So__init__
can use.update(iterable)
instead.For that
__eq__
test how about,This avoids creating the entire lists.