ActiveState Code

Recipe 546526: Page through iterable N items at a time


A simple generator that accepts an iterable L and an integer N and yields a series of sub-generators, each of which in turn yields up to N items from L (the last one may yield fewer).

Python
class groupcount(object):
    """Accept a (possibly infinite) iterable and yield a succession
    of sub-iterators from it, each of which will yield up to N values.

    >>> gc = groupcount('abcdefghij', 3)
    >>> for subgroup in gc:
    ...     for item in subgroup:
    ...         print item,
    ...     print
    ...
    a b c
    d e f
    g h i
    j
    """

    def __init__(self, iterable, n=10):
        self.it = iter(iterable)
        self.n = n

    def __iter__(self):
        return self

    def next(self):
        return self._group(self.it.next())

    def _group(self, ondeck):
        yield ondeck
        for i in xrange(1, self.n):
            yield self.it.next()
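
The listing above targets Python 2 (print statement, xrange, the .next() method). As a hedged sketch rather than the author's code, the same idea might look as follows in Python 3. The name 'groupcount_py3' is made up for illustration, and the explicit StopIteration handling is an addition: under PEP 479, a StopIteration escaping a generator body is turned into a RuntimeError, so the short final group has to be ended explicitly.

```python
class groupcount_py3:
    """Python 3 sketch of the recipe's groupcount class (hypothetical name)."""

    def __init__(self, iterable, n=10):
        self.it = iter(iterable)
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        # Pull one item eagerly: if the source is exhausted, StopIteration
        # is raised here and ends the outer loop instead of producing an
        # empty group.
        return self._group(next(self.it))

    def _group(self, ondeck):
        yield ondeck
        for _ in range(1, self.n):
            try:
                item = next(self.it)
            except StopIteration:
                # Source ran out mid-group: end this (short) group cleanly.
                return
            yield item


groups = [list(group) for group in groupcount_py3('abcdefghij', 3)]
```

As in the original, every sub-iterator reads from the same underlying iterator, so each group needs to be consumed fully before moving on to the next one; otherwise the next group picks up wherever the previous one stopped.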

Discussion

The task that prompted this recipe was to write out sitemap files (http://sitemaps.org/protocol.php) for a website containing 100,000 pages. Since the spec limits each sitemap.xml file to 10,000 URLs, I needed to go through a list of 100,000 items and break it into 10 pages.

The itertools.groupby function provides a general solution by grouping an iterable according to a supplied function. Maybe it was just me, but using groupby to break up the input sequence by count became somewhat baroque:

    def countby(it, n=10):
        from itertools import groupby, imap
        grouped = groupby(enumerate(it), lambda x: int(x[0]/n))
        counted = imap(lambda x:x[1], grouped)
        return imap(lambda x: imap(lambda y: y[1], x), counted)

Referring to http://docs.python.org/lib/itertools-functions.html, I adapted the Python-equivalent code shown for itertools.groupby.
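
For comparison, the groupby idea can be written a little more directly in Python 3. This is a sketch, not the author's code: it keeps the 'countby' name, but replaces imap and the nested lambdas with a generator function, and it uses integer division (//) for the grouping key.

```python
from itertools import groupby


def countby(iterable, n=10):
    """Yield sub-iterators of up to n items each, grouping by index // n."""
    # enumerate() pairs each item with its index; consecutive items whose
    # index // n is equal fall into the same group.
    grouped = groupby(enumerate(iterable), key=lambda pair: pair[0] // n)
    for _, group in grouped:
        # Strip the index again so the caller only sees the original items.
        yield (item for _, item in group)


chunks = [list(chunk) for chunk in countby('abcdefghij', 3)]
```

Each sub-iterator shares state with groupby, so a group has to be consumed before the outer loop advances to the next one.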

The 'groupcount' generator delivers the same result in a more understandable (for me anyway) package.
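
To make the sitemap arithmetic from the discussion concrete, here is a minimal Python 3 sketch that splits 100,000 items into pages of 10,000. It is not the recipe's class: it assumes each page fits in memory and uses itertools.islice to build one list per page. The 'pages' helper and the example.com URLs are invented for illustration.

```python
from itertools import islice


def pages(iterable, size):
    """Yield successive lists of up to `size` items (hypothetical helper)."""
    it = iter(iterable)
    while True:
        # islice takes at most `size` items from the shared iterator.
        page = list(islice(it, size))
        if not page:
            return
        yield page


# Made-up URLs standing in for the site's 100,000 pages.
urls = ("http://example.com/page/%d" % i for i in range(100000))
page_sizes = [len(page) for page in pages(urls, 10000)]
```

Here page_sizes ends up with ten entries of 10,000 each. Unlike 'groupcount', this holds a whole page in memory at once, which is the trade-off for not having to consume sub-iterators in lockstep.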

Comments

  1. At 6:08 a.m. on 19 Feb 2008, lotr py said:

    Here is another way, with reference to Raymond Hettinger at http://groups.google.com/group/comp.lang.python/browse_thread/thread/4696a3b3e1a6d691/:

        from itertools import izip, chain, repeat

        def grouper(n, iterable, padvalue=None):
            "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
            return izip(*[chain(iterable, repeat(padvalue, n-1))]*n)
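
The grouper in the comment above relies on Python 2's izip. A minimal Python 3 sketch of the same trick, assuming only that the built-in zip stands in for izip, could look like this. The key point is that the list holds n references to one shared iterator, so zip draws n consecutive items for every tuple.

```python
from itertools import chain, repeat


def grouper(n, iterable, padvalue=None):
    """grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"""
    # Append n-1 pad values so a short final group can still be completed.
    padded = chain(iterable, repeat(padvalue, n - 1))
    # [padded] * n is n references to the SAME iterator, not n copies.
    return zip(*[padded] * n)


tuples = list(grouper(3, 'abcdefg', 'x'))
```

Unlike 'groupcount', this yields fixed-size tuples and pads the last one, so it suits cases where every group must have exactly n entries.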
