ActiveState Code

Recipe 66448: Iterator Utilities


A collection of small utility functions for iterators (all functions can also be used with normal sequences). Among other things, the module provides generator ("lazy") versions of the built-in sequence-manipulation functions. The generators can be combined to produce a more specialised iterator.

Python
from __future__ import generators, nested_scopes

def itercat(*iterators):
    """Concatenate several iterators into one."""
    for i in iterators:
        for x in i:
            yield x

def iterwhile(func, iterator):
    """Iterate for as long as func(value) returns true."""
    iterator = iter(iterator)
    while 1:
        next = iterator.next()
        if not func(next):
            raise StopIteration
        yield next

def iterfirst(iterator, count=1):
    """Iterate through 'count' first values."""
    iterator = iter(iterator)
    for i in xrange(count):
        yield iterator.next()

def iterstep(iterator, n):
    """Iterate every nth value."""
    iterator = iter(iterator)
    while 1:
        yield iterator.next()
        # skip n-1 values
        for dummy in range(n-1):
            iterator.next()

def itergroup(iterator, count):
    """Iterate in groups of 'count' values. If there
    aren't enough values, the last result is padded with
    None."""
    iterator = iter(iterator)
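    # A one-element list lets the nested generator update the flag
    # (nested scopes allow reading, but not rebinding, outer names).
    values_left = [1]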
    def values():
        values_left[0] = 0
        for x in range(count):
            try:
                yield iterator.next()
                values_left[0] = 1
            except StopIteration:
                yield None
    while 1:
        value = tuple(values())
        if not values_left[0]:
            raise StopIteration
        yield value
    
def xzip(*iterators):
    """Iterative version of builtin 'zip'."""
    iterators = map(iter, iterators)
    while 1:
        yield tuple([x.next() for x in iterators])

def xmap(func, *iterators):
    """Iterative version of builtin 'map'."""
    iterators = map(iter, iterators)
    values_left = [1]
    def values():
        # Emulate map behaviour, i.e. shorter
        # sequences are padded with None when
        # they run out of values.
        values_left[0] = 0
        for i in range(len(iterators)):
            iterator = iterators[i]
            if iterator is None:
                yield None
            else:
                try:
                    yield iterator.next()
                    values_left[0] = 1
                except StopIteration:
                    iterators[i] = None
                    yield None
    while 1:
        args = tuple(values())
        if not values_left[0]:
            raise StopIteration
        yield func(*args)

def xfilter(func, iterator):
    """Iterative version of builtin 'filter'."""
    iterator = iter(iterator)
    while 1:
        next = iterator.next()
        if func(next):
            yield next

def xreduce(func, iterator, default=None):
    """Iterative version of builtin 'reduce'."""
    iterator = iter(iterator)
    try:
        prev = iterator.next()
    except StopIteration:
        return default
    # 'default' is only the result for an empty iterator. A single
    # value is returned as is, like reduce() without an initializer.
    for next in iterator:
        prev = func(prev, next)
    return prev
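
The listing above targets Python 2.2 and will not run unchanged on Python 3: `iterator.next()` became `next(iterator)`, and since PEP 479 a `StopIteration` raised inside a generator body surfaces as a `RuntimeError`. As a rough sketch only, assuming Python 3.7 or later, three of the helpers might be ported as shown here and then combined into a more specialised iterator. The function names match the recipe, but the port is an illustration, not the author's code.

```python
from itertools import count


def iterwhile(func, iterable):
    """Yield values for as long as func(value) is true."""
    for value in iterable:
        if not func(value):
            return  # ends the generator; replaces 'raise StopIteration'
        yield value


def iterstep(iterable, n):
    """Yield every nth value, starting with the first."""
    for index, value in enumerate(iterable):
        if index % n == 0:
            yield value


def itergroup(iterable, size):
    """Yield tuples of 'size' values, padding the last one with None."""
    iterator = iter(iterable)
    while True:
        group = []
        for value in iterator:
            group.append(value)
            if len(group) == size:
                break
        if not group:
            return  # nothing left, so no all-None group is produced
        yield tuple(group) + (None,) * (size - len(group))


# Combine the generators lazily over an infinite source:
# every third integer below 20, taken in pairs.
pairs = list(itergroup(iterwhile(lambda x: x < 20, iterstep(count(), 3)), 2))
print(pairs)  # [(0, 3), (6, 9), (12, 15), (18, None)]
```

Because every stage is a generator, only as many integers as needed are drawn from `count()`; building intermediate lists from the infinite source would never finish.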

Comments

  1. At 6:37 a.m. on 17 Aug 2001, Martin Sjogren said:

    Redundancy? Surely the builtins map(), filter() and reduce() will use iterators by default in Python 2.2? And what about the for loop? I was under the impression that "for x in seq" would in fact execute "for x in iter(seq)"?

  2. At 6:02 a.m. on 21 Aug 2001, Sami Hangaslammi (the author) said:

    I don't think so. The original builtins cannot be changed, since that would break a lot of existing code that expects lists from those functions.

  3. At 3:40 p.m. on 8 Mar 2002, Jonathan Rogers said:

    These are indeed mostly redundant. I believe Martin is correct; Python internals that iterated over sequences now use the iterator protocol.

    "In 2.2, Python's for statement no longer expects a sequence; it expects something for which iter() will return an iterator. For backward compatibility and convenience, an iterator is automatically constructed for sequences that don't implement __iter__() or a tp_iter slot, so for i in [1,2,3] will still work. Wherever the Python interpreter loops over a sequence, it's been changed to use the iterator protocol."

    --What's New in Python 2.2 (A. M. Kuchling)

    I experimented and found that I could indeed use my own trivial iterator classes (as long as they provide __iter__ which returns self) with for loops, list comprehensions, map, filter, and reduce.
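
The experiment described in the last comment can be sketched as follows. This version assumes Python 3, where the iterator method is `__next__` rather than `next`, `reduce` lives in `functools`, and `map` and `filter` return lazy iterators instead of lists. The `Countdown` class is a hypothetical stand-in for the commenter's own classes.

```python
from functools import reduce


class Countdown:
    """Trivial iterator: yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self  # an iterator returns itself from __iter__

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


# A for loop accepts the object directly.
for_loop = []
for n in Countdown(3):
    for_loop.append(n)

# So do list comprehensions, map, filter and reduce.
comprehension = [n * n for n in Countdown(3)]
mapped = list(map(str, Countdown(3)))
evens = list(filter(lambda n: n % 2 == 0, Countdown(4)))
total = reduce(lambda a, b: a + b, Countdown(4))

print(for_loop, comprehension, mapped, evens, total)
```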
