Alternate version of OrderedSet (recipe 576694) using weak references instead of __del__.
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 | import collections
from weakref import proxy
class Link(object):
    __slots__ = 'prev', 'next', 'key', '__weakref__'
class OrderedSet(collections.MutableSet):
    'Set the remembers the order elements were added'
    # Big-O running times for all methods are the same as for regular sets.
    # The internal self.__map dictionary maps keys to links in a doubly linked list.
    # The circular doubly linked list starts and ends with a sentinel element.
    # The sentinel element never gets deleted (this simplifies the algorithm).
    # The prev/next links are weakref proxies (to prevent circular references).
    # Individual links are kept alive by the hard reference in self.__map.
    # Those hard references disappear when a key is deleted from an OrderedSet.
    def __init__(self, iterable=None):
        self.__root = root = Link()         # sentinel node for doubly linked list
        root.prev = root.next = root
        self.__map = {}                     # key --> link
        if iterable is not None:
            self |= iterable
    def __len__(self):
        return len(self.__map)
    def __contains__(self, key):
        return key in self.__map
    def add(self, key):
        # Store new key in a new link at the end of the linked list
        if key not in self.__map:
            self.__map[key] = link = Link()            
            root = self.__root
            last = root.prev
            link.prev, link.next, link.key = last, root, key
            last.next = root.prev = proxy(link)
    def discard(self, key):
        # Remove an existing item using self.__map to find the link which is
        # then removed by updating the links in the predecessor and successors.        
        if key in self.__map:        
            link = self.__map.pop(key)
            link.prev.next = link.next
            link.next.prev = link.prev
    def __iter__(self):
        # Traverse the linked list in order.
        root = self.__root
        curr = root.next
        while curr is not root:
            yield curr.key
            curr = curr.next
    def __reversed__(self):
        # Traverse the linked list in reverse order.
        root = self.__root
        curr = root.prev
        while curr is not root:
            yield curr.key
            curr = curr.prev
    def pop(self, last=True):
        if not self:
            raise KeyError('set is empty')
        key = next(reversed(self)) if last else next(iter(self))
        self.discard(key)
        return key
    def __repr__(self):
        if not self:
            return '%s()' % (self.__class__.__name__,)
        return '%s(%r)' % (self.__class__.__name__, list(self))
    def __eq__(self, other):
        if isinstance(other, OrderedSet):
            return len(self) == len(other) and list(self) == list(other)
        return not self.isdisjoint(other)
 | 
Prevents circular reference issues by using weak references between links.
Works on Python 2.4 or later.

 Download
Download Copy to clipboard
Copy to clipboard
There's a bug in the last line of the __eq__ method. It means that OrderedSet([1, 2, 3]) compares equal to set([1, 2, 3, 4]) and set([1, 2]) because they overlap. One way to fix it is to make the last line
It's a bit long-winded - is there a simpler way to implement the comparison for this case? I was thinking of using
but I'm put off by the thought that it could create a large set as the result of the xor (if the sets are large and very different) before throwing it away. At least the first one only loops through the two sets, stopping as soon as there's a difference. I'm conscious of the fact that the 'in' check will be expensive with a long list, though.
I think
should do the job.
No,
not self.isdisjoint(other)will only be true if the lists do not share a single element. So with Paul's version, OrderedSet([1, 2, 3]) is not equal to set([1, 2]) any more, but still to set([9, 8, 1]). Certainly not what is intended.xtian's version gets it right for sets, but what about lists? Shouldn't
OrderedSet([3,2,1]) == [1,2,3]be false, because the ordering matters for both the ordered set and the list?To be safe, I am now using
Thank you Raymond for a great piece of code!
I rewrote the __eq__ method body to the following:
return isinstance(other, OrderedSet) and list(self) == list(other)This ensures that the relation is symmetric (except for classes that inherits from this class and overrides the __eq__ method, to be protected from that you need a can_equal method). Also, we get rid of the redundant
len(self) == len(other)check and it fits on one line while still being readable (IMO).I also added some documentation that might be helpful. I'm sorry that the formatting got all messed up...
MutableSet is not part of collections in Python 2.4 (it was added in 2.7?) and updating a set with
|=will fail if the rhs is a list. So__init__can use.update(iterable)instead.For that
__eq__test how about,This avoids creating the entire lists.