Welcome, guest | Sign In | My Account | Store | Cart

Sets of only a few items already have many ways they can be partitioned into subsets. Therefore it can be useful to generate these partitions by index, like the partition class were some large list where one can just access element "i". Of course one should not compute the whole list in advance but compute the partitions on the fly. This recipe was originally extracted form a book by Kreher and Stinson. Over the years I came back to my code from time to time creating a new and hopefully more pythonic version each time I understood it better. My current take on it is that the algorithm looks a lot like creating a Pascals triangle in order to compute combinations. One just tries to find a way down the triangle to a specific element, each time subtracting the amounts the different positions in the triangle account for. It is also similar to finding indexed permutations of a set with elements occurring more than once. One of these days I will perhaps understand how all of this fits together. Until then I'll post code solving specific situations.

Python, 66 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
from collections import defaultdict

class Partition:

    def __init__(self, S):
        self.data = list(S)
        self.m = len(S)
        self.table = self.rgf_table()

    def __getitem__(self, i):
        #generates set partitions by index
        if i > len(self) - 1:
             raise IndexError
        L =  self.unrank_rgf(i)
        result = self.as_set_partition(L)
        return result
    
    def __len__(self):
        return self.table[self.m,0]

    def as_set_partition(self, L):
        # Transform a restricted growth function into a partition
        n = max(L[1:]+[1])
        m = self.m
        data = self.data
        P = [[] for _ in range(n)]
        for i in range(m):
            P[L[i+1]-1].append(data[i])
        return P

    def rgf_table(self):
        # Compute the table values 
        m = self.m
        D = defaultdict(lambda:1)
        for i in range(1,m+1):
            for j in range(0,m-i+1):
                D[i,j] = j * D[i-1,j] + D[i-1,j+1]
        return D

    def unrank_rgf(self, r):
        # Unrank a restricted growth function
        m = self.m
        L = [1 for _ in range(m+1)]
        j = 1
        D = self.table
        for i in range(2,m+1):
            v = D[m-i,j]
            cr = j*v
            if cr <= r:
                L[i] = j + 1
                r -= cr
                j += 1
            else:
                L[i] = r / v + 1
                r  %= v
        return L

def test():
    S = set(range(4))
    P = Partition(S)
    print len(P)
    for x in P:
         print x

if __name__ == '__main__':
    test() 

A long time ago I posted something like this on comp.lang.python, expecting people to help advance this recipe as it was in a terrible state. I also promised to update it myself but forgot. Then I was attending an IRC session not too long ago where people specifically asked for this recipe but I was too ashamed to offer my version. Also, in all these years there seem to have been no people -- as far as my google fu goes -- to post a set partition by index recipe.

3 comments

Daniel Lepage 13 years, 11 months ago  # | flag

Is this faster that computing partition points directly?

def sep_pts(l, n):
    '''Picks the nth set of partition points for l'''
    return [i+1 for i in range(0,len(l)-1) if n & (1 << i)]

def do_partition(seps, l):
    '''Partitions l based on the partition points in seps'''
    seps = [0] + seps + [None]
    return [l[seps[i]:seps[i+1]] for i in range(len(seps)-1)]

def partition(l, n):
    '''Returns the nth partition of l'''
    return do_partition(sep_pts(l,n), l)

if __name__ == '__main__':
    l = list('abcdefghijklmnop')
    print partition(l,0) # one segment
    print partition(l,1)
    print partition(l,2)
    print partition(l,65535) # each letter is its own segment
    print partition(l,30000)
Anton Vredegoor (author) 13 years, 11 months ago  # | flag

@Daniel Lepage:

the function you posted skips sets of elements that are not consecutive?

chris haulk 12 years, 7 months ago  # | flag

Some obvious comments that someone should make:

The key line in the algorithm is D[i,j] = j * D[i-1,j] + D[i-1,j+1] which identifies D[i,j] as the number of ways to add i new items to a partition that already has j blocks. So e.g. D[1, j] = j+1 because a new item can be added to a partition having j blocks by adding it to any existing block or by having it start its own new block. Thus D[i,0] is the number of ways to partition the set {1, ..., i}.

My current take on it is that the algorithm looks a lot like creating a Pascals triangle in order to compute combinations.

To solve this problem for combinations (i.e. get the i-th subset of a set), just take the binary representation of the index i, right? The places where 1's occur tell you which items to keep -- no need to generate Pascal's triangle.