Welcome, guest | Sign In | My Account | Store | Cart

This recipe is yet another combination generator, with an extra twist: the generated combinations can be yielded in blocks of a given size. This comes useful (or necessary) when processing arbitrarily long or infinite iterables in bounded-size blocks.

Python, 107 lines
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
from itertools import islice, repeat, izip, cycle

# If PEP 3102 is accepted, the signature would be
#def iterCombinations(*iterables, blocksize=1):
def iterCombinations(*iterables, **kwds):
    '''Generates the combinations of the given iterables.
    
    @returns: An iterator over all combinations of the C{iterables}. Each yielded
        combination is a list of size C{len(iterables)} where the i-th element
        is drawn from the i-th iterable. B{Important note:} This list should not
        be modified between iterations; if you need to modify it, copy it to a
        new container.
    
    @param iterables: One or more arbitrary iterables.
    @param kwds: Currently only 'blocksize' allowed.
    @keyword blocksize: Determines the order of the yielded combinations. By
        default (C{blocksize=1}), the first iterable corresponds to the most outer
        loop (slowest change) and the last iterable corresponds to the most inner
        loop (fastest change).
        
        For larger blocksize, each iterable is first partitioned into consecutive
        blocks of size C{blocksize} (except perhaps for the last block which may
        be shorter). Then each combination is yielded by first iterating over
        each block combination C{B := (B1,B2,..Bn)} and then yielding each
        combination from B.
        
        More generally, C{blocksize} can be an iterable so that different
        C{iterables} can be partitioned by different block size. In this case,
        C{blocksize} is repeated as many time as necessary to match C{len(iterables)}.
        For instance::
            iterCombinations(range(4),range(6),range(8),'xyz', blocksize=(2,3))
        partitions C{range(4)} and C{range(8)} with blocksize=2, while C{range(6)}
        and 'xyz' are partitioned with blocksize=3.
    '''
    combo = [None] * len(iterables)
    blocksize = kwds.get('blocksize', 1)
    if isinstance(blocksize, int):
        sizes = repeat(blocksize)
    else:
        sizes = cycle(iter(blocksize))
    block_lists = [list(_iterblocks(it,sz)) for it,sz in izip(iterables,sizes)]
    for block_combo in _iterCombinations(block_lists, [None] * len(iterables)):
        for _ in _iterCombinations(block_combo, combo):
            yield combo


def _iterCombinations(groups, combo_list, index=0):
    # generate recursively all combinations of groups, updating combo_list
    # *in-place* for each combination.    
    if index < len(groups)-1:
        for x in groups[index]:
            combo_list[index] = x
            for _ in _iterCombinations(groups,combo_list,index+1):
                yield combo_list
    else: # optimization to avoid the last level of recursion
        assert index == len(groups)-1
        for x in groups[index]:
            combo_list[index] = x
            yield combo_list

        
def _iterblocks(iterable, blocksize, factory=tuple):
    # split the iterable into blocks of blocksize
    iterable = iter(iterable)
    while True:
        block = factory(islice(iterable,blocksize))
        if not block: break
        yield block
        if len(block) < blocksize: break


# example
>>> for c in iterCombinations(range(5), 'abc'): print c
...
[0, 'a']
[0, 'b']
[0, 'c']
[1, 'a']
[1, 'b']
[1, 'c']
[2, 'a']
[2, 'b']
[2, 'c']
[3, 'a']
[3, 'b']
[3, 'c']
[4, 'a']
[4, 'b']
[4, 'c']

>>> for c in iterCombinations(range(5), 'abc', blocksize=2): print c
...
[0, 'a']
[0, 'b']
[1, 'a']
[1, 'b']
[0, 'c']
[1, 'c']
[2, 'a']
[2, 'b']
[3, 'a']
[3, 'b']
[2, 'c']
[3, 'c']
[4, 'a']
[4, 'b']
[4, 'c']

There are already several recipes on generating combinations and permutations out of iterables. This one is an efficient implementation with an additional feature: being able to break the input iterables into blocks. The combinations are generated in two stages: at the first level, generate all block combinations, and at the second level generate all combinations from the members of a given block combination.

The motivation was a real-world scenario where each iterable to be combined consisted of file chunks. The total size of the files was too large to fit in memory, so the computation had to be performed in bounded-size blocks. Using this generator, the number of blocks that had to be in memory at the same time was bounded and as a result, the memory swapping was reduced significantly compared to the initial naive generation.

Created by George Sakkis on Tue, 23 Jan 2007 (PSF)
Python recipes (4591)
George Sakkis's recipes (26)

Required Modules

  • (none specified)

Other Information and Tasks