ActiveState Code

Recipe 52258: a Set class for python


A Set is a collection of items containing no duplicates.

Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
class Set:
    """A Set is a collection of items with no particular ordering,
    containing no dulicates."""
    def __init__(self, *args):
        self._dict = {}
        for arg in args:
            self.add(arg)

    def __repr__(self):
        import string
        return "%s(%s)" % (self.__class__.__name__,
                           string.join(map(repr, self._dict.values()), ', '))

    def extend(self, args):
        """Add several items at once."""
        for arg in args:
            self.add(arg)

    def add(self, item):
        """Add one item to the set."""
        self._dict[item] = item

    def remove(self, item):
        """Remove an item from the set."""
        del self._dict[item]

    def contains(self, item):
        """Check whether the set contains a certain item."""
        return self._dict.has_key(item)

    # Higher performance member-test for python 2.0 and above
    __contains__ = contains

    def __getitem__(self, index):
        """Support the 'for item in set:' protocol."""
        return self._dict.keys()[index]

    def __len__(self):
        """Return the number of items in the set."""
        return len(self._dict)

    def items(self):
        """Return a list containing all items."""
        return self._dict.keys()

# class Set()

Comments

  1. 1. At 6:05 p.m. on 15 mar 2001, Michael Strasser said:

    Typo in __getitem__ definition. You accidentally forgot to close the doc string's triple quote:

    """Support the 'for item in set:' protocol."
    

    instead of:

    """Support the 'for item in set:' protocol."""
    

    Michael Strasser

  2. 2. At 7:56 a.m. on 16 mar 2001, Jay O'Connor said:

    Close to a Bag. Since you are using dictionary keys to maintain uniqueness, you can add a value that would be a count of how many 'occurances' of a given object have been added to the Set, which would make it a Bag Jay O'Connor

  3. 3. At 5:12 a.m. on 18 mar 2001, Vivian De Smedt said:

    union, substraction and intersection. Maybe it could be usefull to overload the operator (+, -, |, &, ^) to performe the classical set operations.

    Here is a naive version of these operators.

    def __add__(self, set):
            """Return the union of the set and an another set."""
            result = Set()
            for i in self:
                result.add(i)
            for i in set:
                result.add(i)
            return result
    
        def __sub__(self, set):
            """Return the difference between the set and an another set."""
            result = Set()
            for i in self:
                result.add(i)
            for i in set:
                if i in result:
                    result.remove(i)
            return result
    
        def __and__(self, set):
            """Return the intersection of the set and an another set."""
            if len(self) < len(set):
                a = self
                b = set
            else:
                a = set
                b = self
            result = Set()
            for i in a:
                if i in b:
                    result.add(i)
            return result
    
        __or__ = __add__
    
        def __xor__(self, set):
            """Return the union of the set and an another set but their intersection."""
            result = Set()
            for i in self:
                result.add(i)
            for i in set:
                if i in result:
                    result.remove(i)
                else:
                    result.add(i)
            return result
    

    Vivian De Smedt

  4. 4. At 11:02 a.m. on 26 mar 2001, Alex Martelli said:

    the contain method MUST use .has_key, NOT the in operator. The contain method must check self.dict.has_key(whatever) [which is O(1)], NOT whatever in self.dict.keys() [which is O(N)...]. Alex Martelli

  5. 5. At 11:42 a.m. on 27 mar 2001, Thomas Heller (the author) said:

    Thanks, Alex, for the tip. Thomas Heller

Sign in to comment