Mon, 8 Mar 2004 23:34:25

[alice] > ..> <image>> light bulb turning on above alice's head> </image>>> <python>> def powset(seq):> if seq:> head, tail = seq[:1], seq[1:]> for smaller in powset(tail):> yield smaller> yield head + smaller> else:> yield []>> for s in powset([1, 2, 3]):> print s> </python>>> This is one of them iterator-thingy-ma-bobs, isn't it? hmmmm..... It's called a generator. It's always easy to turn a generator into a function that materializes the entire result list in one gulp; for example, we could write the function above like so: def powset2(seq): result = [] if seq: head, tail = seq[:1], seq[1:] for smaller in powset(tail): result.append(smaller) result.append(head + smaller) else: result.append([]) return result IOW, everywhere we "yield"ed a result in the generator version, in the giant-result-list version we just append it to a result list instead. The joy of a generator is that it produces intermediate results one at a time. If, for example, len(seq)==60, there's not enough computer memory on Earth to hold the entire powerset at once, but the generator version doesn't need to. There's a lot more info about generators in PEP 255: http://www.python.org/peps/pep-0255.html Once you're used to them, they're almost always exactly what you want for, umm, *generating* a sequence of combinatorial objects (powersets, combinations, permutations, etc). [Tim] >> There isn't really a canonical way -- "whatever works" is fine..... [alice] > I kind of thought that, with respect to algorithms, "canonical" and> "most concise" were the same thing? Surely python programmers, of all> people, are not of the attitude that: "whatever works is fine"! Happy Python programmers generally prefer clarity to conciseness, and the two aren't always the same. If you're generating powersets just for the intellectual challenge, then the definition of "works" is pretty loose. But, for example, if you're generating powersets for "a real reason", perhaps you *need* to generate them in increasing size. Then the definition of "works" gets harder to meet. > In retrospect it does seem like a very natural way of thinking about> the problem. I wonder why I didn't see it myself?> Perhaps because I knew that I _could_ find the subsets in order of> smallest to largest, I felt that somehow I _should_. The *output* is prettier that way. You could try to think about that recursively too: suppose you had the powerset of S, and wanted the powerset of S *plus* a new element x, and wanted to generate the elements in increasing order of size. Well, we could first sort the powerset of S by size. The powerset of T is again the powerset of S, plus the powerset of S fiddled to add x to each element. Adding x to an element increases its size by 1. So we first want to yield the size-0 elements of the powerset of S, then the size-1 elements of the powerset of S, then the size-0 elements of the powerset of S fiddled to add x to each, then the size-2 elements of the powerset of S, then the size-1 elements of the powerset of S fiddled to add x to each, and so on. Here's a program to do that: def powset_bysize(seq): if seq: head, tail = seq[:-1], seq[-1:] segregated_by_size = [[] for dummy in range(len(seq))] for x in powset_bysize(head): segregated_by_size[len(x)].append(x) last_one = [] for sized_list in segregated_by_size: for x in sized_list: yield x for x in last_one: yield x + tail last_one = sized_list for x in last_one: yield x + tail else: yield [] Then powset_bysize([1, 2, 3]) generates in this order: [] [1] [2] [3] [1, 2] [1, 3] [2, 3] [1, 2, 3] It's not nearly as *clear* a program, though, IMO. Maybe you can find a way to make it prettier. It also loses much of the memory advantage of generators, because the topmost invocation ends up materializing the entire powerset of seq[:-1] (which is half the size of the powerset of seq) at once. A more natural approach could follow from the one you started with: take "generate all the subsets of size i" as a problem in its own right (that's called "a combination", btw). Then, assuming you have a generator gen_subsets_of_size(set, size) the powerset generator is just def gen_powset_by_size(set): for size in range(len(set) + 1): for x in gen_subsets_of_size(set, size): yield x That has a different kind of beauty, because gen_subsets_of_size() is likely to be a useful generator on its own, and it's always lovely to reuse code! In the concise department, here's a shorter way to write a giant-result-list version: def powset3(seq): if seq: p = powset3(seq[1:]) return p + [x + seq[:1] for x in p] else: return [[]] If you want to get *really* concise <wink>, here's a tiny non-recursive generator version, using the bit-fiddling trick you discovered earlier: def powset4(seq): pairs = [(2**i, x) for i, x in enumerate(seq)] for i in xrange(2**len(pairs)): yield [x for (mask, x) in pairs if i & mask] Are these getting clearer as they get shorter? I don't think so. > hmmmm...> Anyway, thanks for guiding me from the first apparent solution to the> beautiful one! It's just decades of practice <wink>.

Recent Messages in this Thread | ||
---|---|---|