Welcome, guest | Sign In | My Account | Store | Cart

The function combine takes multiple sequences and creates a list in which each item is constructed from items from each input sequence, and all possible combinations are created. If that description is confusing, look at the example in the docstring. It's a pretty simple transformation. The function xcombine is similar, but returns a generator rather than creating the output all at once.

Python, 34 lines
def combine(*seqin):
    '''returns a list of all combinations of argument sequences.
for example: combine((1,2),(3,4)) returns
[[1, 3], [1, 4], [2, 3], [2, 4]]'''
    def rloop(seqin,listout,comb):
        '''recursive looping function'''
        if seqin:                       # any more sequences to process?
            for item in seqin[0]:
                newcomb=comb+[item]     # add next item to current comb
                # call rloop w/ rem seqs, newcomb
        else:                           # processing last sequence
            listout.append(comb)        # comb finished, add to list
    listout=[]                      # listout initialization
    rloop(seqin,listout,[])         # start recursive process
    return listout

def xcombine(*seqin):
    '''returns a generator which returns combinations of argument sequences
for example xcombine((1,2),(3,4)) returns a generator; calling the next()
method on the generator will return [1,3], [1,4], [2,3], [2,4] and
StopIteration exception.  This will not create the whole list of 
combinations in memory at once.'''
    def rloop(seqin,comb):
        '''recursive looping function'''
        if seqin:                   # any more sequences to process?
            for item in seqin[0]:
                newcomb=comb+[item]     # add next item to current combination
                # call rloop w/ remaining seqs, newcomb
                for item in rloop(seqin[1:],newcomb):   
                    yield item          # seqs and newcomb
        else:                           # processing last sequence
            yield comb                  # comb finished, add to list
    return rloop(seqin,[])

A trivial example would be to find the distribution of the rolls of two die:

range(1,7) nicely lists the possible values of a single roll.

>>> dist={}
>>> for roll in [sum(die_vals) for die_vals in combine(range(1,7),range(1,7))]:
...     dist[roll]=dist.setdefault(roll,0)+1
>>> dist
{2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}

A more substantial usage would be in repeating a calculation or simulation with input parameters varied over tolerance ranges:

>>> def Vdiv(Vin, Rupper, Rlower):
...     '''returns output voltage of a voltage divider'''
...     return Rlower/(Rupper+Rlower)*Vin
>>> Vin=[12*.95, 12*1.05]         # 12V supply with +/- 5% tol
>>> Rlower=[5000*0.99,5000*1.01]  # 5kohm resistor, +/- 1% tol
>>> Rupper=[7000*.99, 7000*1.01]  # 7kohm resistor, +/- 1% tol
>>> data=[Vdiv(V,Ru,Rl) for V,Ru,Rl in combine(Vin,Rlower,Rupper)]
>>> data
[6.6499999999999995, 6.7053244592346086, 6.5944908180300494, 6.6499999999999995,
 7.3500000000000014, 7.4111480865224637, 7.2886477462437407, 7.3500000000000014]
>>> min(data)
>>> max(data)

For problems which need to check over many combinations, this recipe eliminates nested for loops (or list comprehensions with multiple for's) and extends to many variables without changing the structure of the code. Thanks to Python's *args tuple packing way of handling an arbritrary number of arguments and the use of recursion, these functions don't care how many sequences are handed to them. In the dice example, combine makes a list with 36 items (six possible rolls of two dice). In the circuit example, it makes a list of eight items (two choices each in three variables).

I also wrote non-recursive versions but they were harder to understand and slower. If this recipe is used with a very large number of input lists, one would want to think about the recursion limit (see the sys module), but I would expect most uses to involve a modest number of input sequences.

The sequences (other than the first anyway) must allow repeated loops through their values and must give the same values each time in order for the output to be meaningful. If a simple generator is handed in, it will only go through it's values once and you won't get it in full combination with the other sequences. Other sequences related to files or network sources may also not do the same thing every time. For these type of sequences, wrapping it with list(...) will give combine a list that it can loop over multiple times, creating proper results.


Hamish Lawson 12 years, 7 months ago  # | flag

Why not provide just the generator version? Isn't a list-returning version somewhat redundant? If a list is required it can be easily obtained by materialising the output of xcombine: list(xcombine((1,2),(3,4))). (And if you have just this one version, you could rename it to "combine").

David Klaffenbach (author) 12 years, 7 months ago  # | flag

Either way is OK. Thanks for the comment.

I wrote combine first and then wrote xcombine for completeness. I've used combine for a couple of things and haven't had an application that would have been better off using xcombine.

I did a couple of crude tests with large inputs, and in these cases combine is roughly twice as fast as list(xcombine). On the other hand, if you only need to loop over the combinations and never create the whole list, it looks like xcombine is twice as fast as combine. On the third hand, for lots of combining of just a handful of small sequences, combine is faster than xcombine, even when you don't need the list! Of course if the number of combinations is really large, xcombine should save some memory. I'd say if speed matters, test with your inputs on your system.

As for the names, I was just following the range, xrange precedent.

jackdied 12 years, 6 months ago  # | flag

Cartesian Product. This is known as a cartesian product[1]

I wrote a C module to do this quickly a couple years ago which still works with current versions of python[2].

Check comp.lang.python for discussions about combinations and permutations, a thread starts about once a year to find the fastest way to do combinatorics in pure-python.

[1] http://mathworld.wolfram.com/CartesianProduct.html

[2] http://probstat.sourceforge.net/

David Klaffenbach (author) 12 years, 6 months ago  # | flag

So I see ... Now that I know the proper name, I was able to find some interesting versions, including one that is very similar to what I did, one that shrinks it into a few lines using reduce and lambda, and one from Tim Peters that I need to think about a bit more in order to understand!

Andy Elvey 12 years, 6 months ago  # | flag

Very interesting example! Mmmm .... I like this. I can imagine this approach being used in a parser, where you have a sequence/stream of tokens, each of which can be followed by another sequence. ( A finite state machine may be another way of doing this, but this recipe certainly gives me some very useful ideas! Thanks for this, David.)

David Klaffenbach (author) 6 years, 7 months ago  # | flag

In Python 2.6 or later one can just use itertools.product

Add a comment

Sign in to comment

Created by David Klaffenbach on Sun, 29 Aug 2004 (PSF)
Python recipes (4552)
David Klaffenbach's recipes (5)

Required Modules

  • (none specified)

Other Information and Tasks