This is a simple recipe for iterating through data as a series of chunks of a given size.

Python, 7 lines
def chunks(thing, chunk_length):
    """Iterate through thing in chunks of size chunk_length.

    Note that the last chunk can be smaller than chunk_length.
    """
    for i in xrange(0, len(thing), chunk_length):
        yield thing[i:i+chunk_length]

I often find that I need to treat data as a series of chunks of a given size. Sometimes I want N lines of a file at a time, sometimes I'm feeding objects to systems that slow way down when they have too many objects loaded at once, etc. I'm sure I'm not the first one to come up with this recipe!
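To illustrate the "N lines of a file at a time" case, here is a minimal sketch. It assumes Python 3 (so `range` stands in for `xrange`), and the in-memory `fake_file` is a hypothetical stand-in for a real file; neither is part of the original recipe.

```python
import io


def chunks(thing, chunk_length):
    """Python 3 port of the recipe: yield slices of thing, chunk_length items each."""
    for i in range(0, len(thing), chunk_length):
        yield thing[i:i + chunk_length]


# Hypothetical stand-in for a real file; any sequence of lines works the same way.
fake_file = io.StringIO("a\nb\nc\nd\ne\n")
lines = fake_file.read().splitlines()

# Group the lines two at a time; the last batch holds the single leftover line.
line_batches = list(chunks(lines, 2))
for batch in line_batches:
    print(batch)
```

Because the recipe relies on `len()` and slicing, the lines have to be read into a list first; it also works unchanged on strings and tuples.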

Note that the last chunk can be smaller than chunk_length. Once or twice, I've wanted to skip such chunks, so I've used something like this:

<pre>
def chunks(thing, chunk_length, strict_sizing=True):
    """Iterate through thing in chunks of size chunk_length.

    Note that the last chunk can be smaller than chunk_length.
    If strict_sizing is True, we won't yield that chunk.

    >>> for i in chunks(range(10),4,strict_sizing=False): print i
    ...
    [0, 1, 2, 3]
    [4, 5, 6, 7]
    [8, 9]
    >>> for i in chunks(range(10),4,strict_sizing=True): print i
    ...
    [0, 1, 2, 3]
    [4, 5, 6, 7]
    >>>
    """
    for i in xrange(0, len(thing), chunk_length):
        chunk = thing[i:i+chunk_length]
        if not strict_sizing:
            yield chunk
        elif len(chunk) == chunk_length:
            yield chunk
        else:
            # Only the last chunk can be short, so end the generator here.
            return

</pre>

The above recipes assume you can index into thing. If you can't, you might want something like this:

<pre>
def chunks(thing, chunk_length, strict_sizing=True):
    """Iterate through thing in chunks of size chunk_length.

    Note that the last chunk can be smaller than chunk_length.
    If strict_sizing is True, we won't yield that chunk.

    >>> for i in chunks(xrange(10),4,strict_sizing=False): print i
    ...
    [0, 1, 2, 3]
    [4, 5, 6, 7]
    [8, 9]
    >>> for i in chunks(xrange(10),4,strict_sizing=True): print i
    ...
    [0, 1, 2, 3]
    [4, 5, 6, 7]
    >>>
    """
    chunk = []
    for i in thing:
        chunk.append(i)
        if len(chunk) == chunk_length:
            yield chunk
            chunk = []
    if chunk and not strict_sizing:
        yield chunk

</pre>
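The non-indexable case can also be sketched with `itertools.islice`, which the comment below mentions. This is only a minimal sketch under stated assumptions: it targets Python 3, the name `chunks_islice` is hypothetical, and it is not the code of the linked recipe.

```python
from itertools import islice


def chunks_islice(iterable, chunk_length, strict_sizing=True):
    """Yield lists of up to chunk_length items taken from any iterable.

    If strict_sizing is True, a short final chunk is dropped.
    """
    if chunk_length < 1:
        raise ValueError("chunk_length must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice pulls at most chunk_length items from the shared iterator.
        chunk = list(islice(iterator, chunk_length))
        if not chunk:
            return  # input exhausted exactly on a chunk boundary
        if len(chunk) < chunk_length and strict_sizing:
            return  # short final chunk, skipped in strict mode
        yield chunk


loose = list(chunks_islice(iter(range(10)), 4, strict_sizing=False))
strict = list(chunks_islice(iter(range(10)), 4, strict_sizing=True))
print(loose)
print(strict)
```

Letting `islice` do the counting avoids the manual `append` and length check, at the cost of building each chunk in one go.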

1 comment

Michael Lerner (author) 19 years, 6 months ago

more ways with itertools.

Thanks, those itertools versions are definitely nice, especially the first one. I also noticed this similar recipe using islice:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/303279

Created by Michael Lerner on Tue, 7 Sep 2004 (PSF)