A simple generator that accepts an iterable L and an integer N and yields a series of sub-generators, each of which in turn yields up to N items from L.

Python, 30 lines
```
class groupcount(object):
    """Accept a (possibly infinite) iterable and yield a succession
    of sub-iterators from it, each of which will yield N values.

    >>> gc = groupcount('abcdefghij', 3)
    >>> for subgroup in gc:
    ...     for item in subgroup:
    ...             print item,
    ...     print
    ...
    a b c
    d e f
    g h i
    j
    """
    def __init__(self, iterable, n=10):
        self.it = iter(iterable)
        self.n = n

    def __iter__(self):
        return self

    def next(self):
        return self._group(self.it.next())

    def _group(self, ondeck):
        yield ondeck
        for i in xrange(1, self.n):
            yield self.it.next()
```
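The recipe above is written for Python 2. For readers on Python 3, a minimal sketch of a port might look like the following. It is an adaptation, not the original author's code: `next` becomes `__next__`, `xrange` becomes `range`, and the `try`/`except` is added on the assumption of Python 3.7 or later, where a `StopIteration` escaping a generator body is turned into a `RuntimeError` (PEP 479).

```python
class groupcount:
    """Python 3 adaptation: yield sub-iterators of up to n items each."""

    def __init__(self, iterable, n=10):
        self.it = iter(iterable)
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        # Fetch the first item of the group eagerly, so an exhausted source
        # raises StopIteration here and ends the outer loop.
        return self._group(next(self.it))

    def _group(self, ondeck):
        yield ondeck
        for _ in range(1, self.n):
            try:
                yield next(self.it)
            except StopIteration:
                # The source ran out mid-group: end this (short) group cleanly.
                return


groups = [list(group) for group in groupcount('abcdefghij', 3)]
print(groups)
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i'], ['j']]
```

As in the original, the sub-iterators share one underlying iterator, so each group should be consumed before moving on to the next.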

The task that prompted this recipe was to write out sitemap files (http://sitemaps.org/protocol.php) for a website containing 100,000 pages. The spec caps the number of URLs in each sitemap file (the published limit is 50,000), so working in batches of 10,000 URLs per file I needed to go through a list of 100,000 items and break it into 10 files.
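To make the task concrete, here is a rough Python 3 sketch of batching URLs into sitemap documents. Everything in it is illustrative rather than taken from the recipe: the function name `sitemap_documents`, the `example.com` URLs, and the use of `itertools.islice` for batching (instead of the recipe's class) are all assumptions, and only the bare `<urlset>`/`<url>`/`<loc>` elements of the sitemap format are produced.

```python
from itertools import islice
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_documents(urls, per_file=10000):
    """Yield one sitemap XML string per batch of at most per_file URLs."""
    it = iter(urls)
    while True:
        batch = list(islice(it, per_file))
        if not batch:
            return
        entries = "".join(
            "  <url><loc>%s</loc></url>\n" % escape(url) for url in batch
        )
        yield (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="%s">\n%s</urlset>\n' % (SITEMAP_NS, entries)
        )


# Hypothetical site with 100,000 pages.
urls = ("http://www.example.com/page/%d" % i for i in range(100000))
docs = list(sitemap_documents(urls))
print(len(docs))  # 10
```

Each string in `docs` could then be written to its own file; file naming and a sitemap index are left out of the sketch.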

The itertools.groupby function provides a general solution by grouping an iterable according to a supplied function. Maybe it was just me, but using groupby to break up the input sequence by count became somewhat baroque:

<pre>
def countby(it, n=10):
    from itertools import groupby, imap
    grouped = groupby(enumerate(it), lambda x: int(x[0]/n))
    counted = imap(lambda x:x[1], grouped)
    return imap(lambda x: imap(lambda y: y[1], x), counted)
</pre>

Referring to http://docs.python.org/lib/itertools-functions.html, I adapted the Python-equivalent code shown for itertools.groupby.
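For comparison, the groupby-based approach can be sketched in Python 3, where `imap` no longer exists and generator expressions do the same job. This is an adaptation under that assumption, not the author's code: items are paired with their index, grouped by `index // n`, and the indexes are then stripped off again.

```python
from itertools import groupby


def countby(it, n=10):
    """Group items by index // n, then strip the indexes back off."""
    grouped = groupby(enumerate(it), key=lambda pair: pair[0] // n)
    # groupby yields (key, group) pairs; each group yields (index, item) pairs.
    return ((item for _, item in group) for _, group in grouped)


result = [list(group) for group in countby('abcdefgh', 3)]
print(result)
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h']]
```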

The 'groupcount' generator delivers the same result in a more understandable (for me anyway) package.

Wade Leftwich 13 years, 8 months ago

We live, we learn. Someone else posted this much simpler way to accomplish the same task; I can't find his or her post now, or I would give proper credit.

```
from itertools import groupby, count

def batcher(seq, n):
    counter = count()
    return (y for (x,y) in groupby(iter(seq), lambda x: counter.next() // n))
```

```
>>> for batch in batcher('abcdefgh', 3):
...     print list(batch)
...
['a', 'b', 'c']
['d', 'e', 'f']
['g', 'h']
```

#### Note that each batch must be consumed before the next is retrieved
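The note above can be illustrated with a small Python 3 sketch. The port of `batcher` below is an adaptation (it uses `next(counter)` in place of `counter.next()`), and the variable names `in_order` and `stale` are made up for the example. Because every batch reads from the same underlying iterator, batch objects that are collected first and read later have already been skipped past.

```python
from itertools import count, groupby


def batcher(seq, n):
    counter = count()
    # The key function advances the counter once per item, so each
    # consecutive run of n items shares the same key.
    return (group for _, group in
            groupby(seq, key=lambda _item: next(counter) // n))


# Consuming each batch before asking for the next one works as expected.
in_order = [list(batch) for batch in batcher('abcdefgh', 3)]
print(in_order)

# Collecting the batch objects first advances the shared iterator past
# them, so the earlier batches have nothing left to yield.
stale = [list(batch) for batch in list(batcher('abcdefgh', 3))]
print(stale)
```

If the batches need to outlive the loop, copying each one into a list or tuple as it arrives, as `in_order` does, avoids the problem.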

 Created by Wade Leftwich on Tue, 19 Feb 2008 (PSF)
