Two approaches to generate all combination of elements from a number of sets. Compares "classic" statically coded algorithms to a pythonic on-the-fly generation of specific algorithm code.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 | ##from __future__ import generators # for Python 2.2
######################################################################
##
## public interface
##
######################################################################
# SetElemCombinations.py
def ByNestedComprehnsionAlogrithm(*sets):
"""Returns a list of all element combinations from the given sets.
A combination is represented as a tuple, with the first tuple
element coming from the first set, the second tuple element
coming from the second set and so on.
A set may be any iterable to which the not operator is applicable.
"""
g = []
for set in sets:
if not set: return []
if g:
g = [ i + (j,) for i in g for j in set ]
else:
g = [(j,) for j in set]
return g
def ByRecursiveGeneratorAlrgorithm(*sets):
"""Returns a generator that yields one tuple per element combination.
A set may be any iterable to which the not operator is applicable.
"""
if not sets: return
def calc(sets):
head, tail = sets[0], sets[1:]
if not tail:
for e in head:
yield (e,)
else:
for e in head:
for t in calc(tail):
yield (e,) + t
return calc(sets)
def ByDynamicallyGeneratedCode(*sets):
if not sets: return []
F = MakeListComprehensionFunction ('F', len(sets))
return F(*sets)
def MakeListComprehensionFunction (name, nsets):
"""Returns a function applicable to exactly <nsets> sets.
The returned function has the signature
F(set0, set1, ..., set<nsets>)
and returns a list of all element combinations as tuples.
A set may be any iterable object.
"""
if nsets <= 0:
source = 'def %s(): return []\n' % name
else:
constructs = [ ('set%d'%i, 'e%d'%i, 'for e%d in set%d'%(i,i))
for i in range(nsets) ]
a, e, f = map(None, *constructs)
##e.reverse() # <- reverse ordering of tuple elements if needed
source = 'def %s%s:\n return [%s %s]\n' % \
(name, _tuplestr(a), _tuplestr(e), ' '.join(f))
scope = {}
exec source in scope
return scope[name]
def _tuplestr(t):
if not t: return '()'
return '(' + ','.join(t) + ',)'
######################################################################
##
## demo
##
######################################################################
ListOfCombinations = ByDynamicallyGeneratedCode
GenerateCombinations = ByRecursiveGeneratorAlrgorithm
if __name__ == '__main__':
print
print 'All possible 3-bit binary numbers:'
all = ListOfCombinations(*[(0,1)]*3)
print '\t', ', '.join([ ''.join(map(str, digits)) for digits in all ])
numbers = [ reduce(lambda n, d: n << 1 | d, digits, 0)
for digits in all ]
print '\t', ', '.join(map(str, numbers))
print 'Some controverse statements:'
all = ListOfCombinations( ('python', 'ruby'),
('is cool', 'is ok', 'sucks') )
for each in all:
print '\t', ' '.join(each).capitalize()
# When you need several combinations from N (possibly different) sets
# increase speed by obtaining a function for N sets ...
print 'Several combinations from 2 sets:'
F = MakeListComprehensionFunction('F', 2)
print '\t', F( (1,2,3), (10,20) )
print '\t', F( '+-', 'x' )
# When you need to iterate through lots of combinations,
# save memory using a generator ...
print 'Some binary operations:'
G = GenerateCombinations( 'ab', '+-*/', 'x' )
for each in G:
print '\t', ' '.join(each)
print 'Some binary operations calculated by ByNestedComprehnsionAlogrithm:'
G = ByNestedComprehnsionAlogrithm( 'ab', '+-*/', 'x' )
for each in G:
print '\t', ' '.join(each)
|
While python is a highly dynamic language and lends itself well toward generic, flexible algorithm implementations, decreased run-time efficiency is often the price for computational generality.
In this special case, if the number of sets was statically known, the fastest implementation would be a nested for loop with a pre-allocated result list (slightly faster then a nested list comprehension). But since it is not, different, more tricky algorithms are required, aren't they?
The recipe shows two "classic" approaches. The first makes use of comprehensions. It creates a list of one-element tuples from the elements of the first set, and then keeps extending the tuples from the elements of the next set. The second one recursively creates a generator and iterates through it to accomplish the same result. Both functions are noticeably slower then a specific nested loop, the second one by more then 40% on Python 2.2.2 and by almost 85% on Python 2.3.4 (on which the other algorithms run much faster, hence the relative drop).
Similar algorithms are provided by the recipes "Generator for permutations, combinations, selections of a sequence" by Ulrich Hoffmann and "Generating combinations of objects from multiple sequences" by David Klaffenbach.
In addition, this recipe also shows an alternative "non-classic" approach only feasible with highly dynamic languages like Python. It dynamically generates a function that implements a nested comprehension with as many nesting levels as the number of sets, and then calls it to actually compute the combinations. This one, of course, runs at a comparable speed as a manually coded nested loop. (Note that generating a generator function is equally simple, but python limits the number of statically nested loops to 20.)
So can we conclude from this that Python resolves the opposition between computational generality and run-time efficiency? Not in general.
Generating functions dynamically comes at a price, too: The practical impossibility to debug (or even read) the generated code. However, when speed is crucial, and the generated code pattern is fairly simple, like in this recipe, the approach may be very well worth a consideration.
Cheers and happy combining!
Interesting viewpoint. Interesting viewpoint. Instead of tricking around with a generic solution, generate the specific one on thy fly. I wonder what implications this might have.