3

The function combine takes multiple sequences and creates a list in which each item is constructed from items from each input sequence, and all possible combinations are created. If that description is confusing, look at the example in the docstring. It's a pretty simple transformation. The function xcombine is similar, but returns a generator rather than creating the output all at once.
A trivial example would be to find the distribution of the rolls of two die: range(1,7) nicely lists the possible values of a single roll.
A more substantial usage would be in repeating a calculation or simulation with input parameters varied over tolerance ranges:
For problems which need to check over many combinations, this recipe eliminates nested for loops (or list comprehensions with multiple for's) and extends to many variables without changing the structure of the code. Thanks to Python's *args tuple packing way of handling an arbritrary number of arguments and the use of recursion, these functions don't care how many sequences are handed to them. In the dice example, combine makes a list with 36 items (six possible rolls of two dice). In the circuit example, it makes a list of eight items (two choices each in three variables). I also wrote nonrecursive versions but they were harder to understand and slower. If this recipe is used with a very large number of input lists, one would want to think about the recursion limit (see the sys module), but I would expect most uses to involve a modest number of input sequences. The sequences (other than the first anyway) must allow repeated loops through their values and must give the same values each time in order for the output to be meaningful. If a simple generator is handed in, it will only go through it's values once and you won't get it in full combination with the other sequences. Other sequences related to files or network sources may also not do the same thing every time. For these type of sequences, wrapping it with list(...) will give combine a list that it can loop over multiple times, creating proper results.
Tags: algorithms

6 comments
Add a comment
Sign in to comment
Why not provide just the generator version? Isn't a listreturning version somewhat redundant? If a list is required it can be easily obtained by materialising the output of xcombine: list(xcombine((1,2),(3,4))). (And if you have just this one version, you could rename it to "combine").
Either way is OK. Thanks for the comment.
I wrote combine first and then wrote xcombine for completeness. I've used combine for a couple of things and haven't had an application that would have been better off using xcombine.
I did a couple of crude tests with large inputs, and in these cases combine is roughly twice as fast as list(xcombine). On the other hand, if you only need to loop over the combinations and never create the whole list, it looks like xcombine is twice as fast as combine. On the third hand, for lots of combining of just a handful of small sequences, combine is faster than xcombine, even when you don't need the list! Of course if the number of combinations is really large, xcombine should save some memory. I'd say if speed matters, test with your inputs on your system.
As for the names, I was just following the range, xrange precedent.
Cartesian Product. This is known as a cartesian product[1]
I wrote a C module to do this quickly a couple years ago which still works with current versions of python[2].
Check comp.lang.python for discussions about combinations and permutations, a thread starts about once a year to find the fastest way to do combinatorics in purepython.
[1] http://mathworld.wolfram.com/CartesianProduct.html
[2] http://probstat.sourceforge.net/
So I see ... Now that I know the proper name, I was able to find some interesting versions, including one that is very similar to what I did, one that shrinks it into a few lines using reduce and lambda, and one from Tim Peters that I need to think about a bit more in order to understand!
Very interesting example! Mmmm .... I like this. I can imagine this approach being used in a parser, where you have a sequence/stream of tokens, each of which can be followed by another sequence. ( A finite state machine may be another way of doing this, but this recipe certainly gives me some very useful ideas! Thanks for this, David.)
In Python 2.6 or later one can just use itertools.product