Welcome, guest | Sign In | My Account | Store | Cart

This recipe is yet another combination generator, with an extra twist: the generated combinations can be yielded in blocks of a given size. This comes useful (or necessary) when processing arbitrarily long or infinite iterables in bounded-size blocks.

Python, 107 lines
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107``` ```from itertools import islice, repeat, izip, cycle # If PEP 3102 is accepted, the signature would be #def iterCombinations(*iterables, blocksize=1): def iterCombinations(*iterables, **kwds): '''Generates the combinations of the given iterables. @returns: An iterator over all combinations of the C{iterables}. Each yielded combination is a list of size C{len(iterables)} where the i-th element is drawn from the i-th iterable. B{Important note:} This list should not be modified between iterations; if you need to modify it, copy it to a new container. @param iterables: One or more arbitrary iterables. @param kwds: Currently only 'blocksize' allowed. @keyword blocksize: Determines the order of the yielded combinations. By default (C{blocksize=1}), the first iterable corresponds to the most outer loop (slowest change) and the last iterable corresponds to the most inner loop (fastest change). For larger blocksize, each iterable is first partitioned into consecutive blocks of size C{blocksize} (except perhaps for the last block which may be shorter). Then each combination is yielded by first iterating over each block combination C{B := (B1,B2,..Bn)} and then yielding each combination from B. More generally, C{blocksize} can be an iterable so that different C{iterables} can be partitioned by different block size. In this case, C{blocksize} is repeated as many time as necessary to match C{len(iterables)}. For instance:: iterCombinations(range(4),range(6),range(8),'xyz', blocksize=(2,3)) partitions C{range(4)} and C{range(8)} with blocksize=2, while C{range(6)} and 'xyz' are partitioned with blocksize=3. ''' combo = [None] * len(iterables) blocksize = kwds.get('blocksize', 1) if isinstance(blocksize, int): sizes = repeat(blocksize) else: sizes = cycle(iter(blocksize)) block_lists = [list(_iterblocks(it,sz)) for it,sz in izip(iterables,sizes)] for block_combo in _iterCombinations(block_lists, [None] * len(iterables)): for _ in _iterCombinations(block_combo, combo): yield combo def _iterCombinations(groups, combo_list, index=0): # generate recursively all combinations of groups, updating combo_list # *in-place* for each combination. if index < len(groups)-1: for x in groups[index]: combo_list[index] = x for _ in _iterCombinations(groups,combo_list,index+1): yield combo_list else: # optimization to avoid the last level of recursion assert index == len(groups)-1 for x in groups[index]: combo_list[index] = x yield combo_list def _iterblocks(iterable, blocksize, factory=tuple): # split the iterable into blocks of blocksize iterable = iter(iterable) while True: block = factory(islice(iterable,blocksize)) if not block: break yield block if len(block) < blocksize: break # example >>> for c in iterCombinations(range(5), 'abc'): print c ... [0, 'a'] [0, 'b'] [0, 'c'] [1, 'a'] [1, 'b'] [1, 'c'] [2, 'a'] [2, 'b'] [2, 'c'] [3, 'a'] [3, 'b'] [3, 'c'] [4, 'a'] [4, 'b'] [4, 'c'] >>> for c in iterCombinations(range(5), 'abc', blocksize=2): print c ... [0, 'a'] [0, 'b'] [1, 'a'] [1, 'b'] [0, 'c'] [1, 'c'] [2, 'a'] [2, 'b'] [3, 'a'] [3, 'b'] [2, 'c'] [3, 'c'] [4, 'a'] [4, 'b'] [4, 'c'] ```

There are already several recipes on generating combinations and permutations out of iterables. This one is an efficient implementation with an additional feature: being able to break the input iterables into blocks. The combinations are generated in two stages: at the first level, generate all block combinations, and at the second level generate all combinations from the members of a given block combination.

The motivation was a real-world scenario where each iterable to be combined consisted of file chunks. The total size of the files was too large to fit in memory, so the computation had to be performed in bounded-size blocks. Using this generator, the number of blocks that had to be in memory at the same time was bounded and as a result, the memory swapping was reduced significantly compared to the initial naive generation.

 Created by George Sakkis on Tue, 23 Jan 2007 (PSF)

### Required Modules

• (none specified)