
There are a few types in Python—most notably, files—that are both iterators and context managers. For trivial cases, these features are easy to use together, but as soon as you need to use the iterator lazily or asynchronously, a with statement won't help. That's where this recipe comes in handy:

send_async(with_iter(open(path, 'r')))

This also allows you to "forward" closing for a wrapped iterator, so closing the outer iterator also closes the inner one:

send_async(line.upper() for line in with_iter(open(path, 'r')))

The recipe itself:

def with_iter(iterable):
    """Wrap an iterable in a with statement, so it closes when consumed.

    >>> uplines = (line.upper() for line in with_iter(open(path, 'r')))
    >>> print('\n'.join(uplines))
    """
    with iterable:
        for item in iterable:
            yield item
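
To check the basic behaviour, here is a minimal sketch that repeats the recipe so it runs standalone. The scratch file and its contents are assumptions made up for the demonstration. It shows that the file stays open until the wrapped iterator has been fully consumed, and is closed right after:

```python
import os
import tempfile


def with_iter(iterable):
    """Same generator as the recipe, repeated so this runs standalone."""
    with iterable:
        for item in iterable:
            yield item


# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as out:
    out.write('spam\neggs\n')

f = open(path, 'r')
uplines = (line.upper() for line in with_iter(f))

print(f.closed)          # False: nothing has been consumed yet
result = list(uplines)   # consuming every line exits the with block
print(result)            # ['SPAM\n', 'EGGS\n']
print(f.closed)          # True

os.remove(path)
```

A name is kept for the file here only so that its closed attribute can be inspected; in real code you would pass open(...) straight to with_iter.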

To see the problem with using context managers and iterators together, look at this very simple example:

with open(path, 'r') as f:
    uplines = (line.upper() for line in f)
print('\n'.join(uplines))

That raises a ValueError in the print line, because you're trying to use a closed file as an iterator.
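
If you want to see the failure for yourself, the following sketch reproduces it against a throwaway file (the temp file is an assumption for the demo, not part of the recipe). The generator expression is created while the file is open, but the first read only happens after the with block has closed it:

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as out:
    out.write('spam\neggs\n')

with open(path, 'r') as f:
    uplines = (line.upper() for line in f)   # lazy: nothing is read yet

error = None
try:
    print('\n'.join(uplines))                # the first read happens here
except ValueError as exc:
    error = exc
    print('ValueError: %s' % exc)

os.remove(path)
```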

In this simple case, of course, you can just move the print under the with block.

But often, you want to store an iterator and use it lazily—especially in asynchronous coding. For example, imagine a send_async method that takes a sequence of lines and sends them on a socket whenever it's ready:

with open(path, 'r') as f:
    uplines = (line.upper() for line in f)
    send_async(uplines)

The file is closed as soon as send_async returns—so when the actual send happens (next time through the event loop, on another thread, whatever), it will raise a ValueError.

This is even worse if your send_async first tries to write synchronously and only falls back to async work when the sink isn't ready or the write is partial. Your code may then seem to work in testing but fail in real life.
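
Both failure modes can be simulated without any real networking. In the sketch below, FakeSender, its ready flag, and its flush method are all hypothetical stand-ins invented for illustration; they are not any real library's API. A sender whose sink is ready consumes the lines immediately and works, while one that defers the work only fails later, when the file has already been closed:

```python
import os
import tempfile


class FakeSender(object):
    """Hypothetical stand-in for an async sender; not a real library API."""

    def __init__(self, ready):
        self.ready = ready      # whether the sink accepts data right now
        self.pending = []       # iterators deferred until the sink is ready
        self.sent = []          # lines that have actually been "sent"

    def send_async(self, lines):
        if self.ready:
            self.sent.extend(lines)      # synchronous fast path
        else:
            self.pending.append(lines)   # consumed later, by flush()

    def flush(self):
        """Simulate the event loop getting around to the deferred work."""
        while self.pending:
            self.sent.extend(self.pending.pop(0))


# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as out:
    out.write('spam\neggs\n')


def send_file(sender):
    with open(path, 'r') as f:
        uplines = (line.upper() for line in f)
        sender.send_async(uplines)


fast = FakeSender(ready=True)
send_file(fast)                  # works: lines are read inside the with block
print(fast.sent)

slow = FakeSender(ready=False)
send_file(slow)                  # returns normally, but the file is now closed
failure = None
try:
    slow.flush()                 # the deferred read hits a closed file
except ValueError as exc:
    failure = exc
    print('ValueError: %s' % exc)

os.remove(path)
```

A test suite that only ever exercises the ready case would pass, which is exactly why this bug tends to show up in production first.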

But you can't just do this:

send_async(open(path, 'r'))

That will leak the file until the GC finds it, which could lead to running out of file handles, unpredictably failing to access the file because of a Windows exclusive lock, etc.

Often people try to write solutions that close the file explicitly at the appropriate point, but this will break as soon as you try to pass the file through a filter:

send_async(line.upper() for line in open(path, 'r'))

send_async(my_filter_function(open(path, 'r')))

The close will appear to work, but it will only close the generator, not the file. You may not notice the problem in CPython, because the generator will typically hold the last reference to the file, so reference counting closes the file right away. On implementations without reference counting, such as PyPy, Jython, or IronPython, the file stays open until the garbage collector gets to it.
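
A short sketch makes the point concrete. It keeps its own reference to the file (something the one-liners above don't do) purely so the file's state can be inspected after the generator is closed; the scratch file is again an assumption for the demo:

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as out:
    out.write('spam\neggs\n')

f = open(path, 'r')
uplines = (line.upper() for line in f)
first = next(uplines)

uplines.close()                        # closes the generator only
closed_after_generator_close = f.closed
print(closed_after_generator_close)    # False: the file is still open

f.close()                              # explicit cleanup for the demo
os.remove(path)
```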

The right solution is verbose, and buries what your code is actually doing under irrelevant mechanics:

def gen():
    with open(path, 'r') as f:
        for line in f:
            yield line
send_async(gen())

The with_iter function solves these problems by wrapping an iterator so that it is closed when it is exhausted, or when the outer iterator is closed or deleted. You can just write:

send_async(with_iter(open(path, 'r')))

send_async(line.upper() for line in with_iter(open(path, 'r')))

send_async(my_filter_function(with_iter(open(path, 'r'))))
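
Here is a hedged sketch of the "forwarding" behaviour, with the recipe repeated so it runs standalone and a made-up scratch file. Note one assumption: closing the outer generator expression releases its reference to the inner with_iter generator, and on CPython that immediately finalizes the inner generator and exits its with block. On an implementation without reference counting, the inner cleanup may be delayed until the garbage collector runs.

```python
import os
import tempfile


def with_iter(iterable):
    """Same generator as the recipe, repeated so this runs standalone."""
    with iterable:
        for item in iterable:
            yield item


# Hypothetical scratch file, created only for this demonstration.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, 'w') as out:
    out.write('spam\neggs\n')

f = open(path, 'r')
uplines = (line.upper() for line in with_iter(f))

first = next(uplines)     # starts both generators; the with block is active
print(f.closed)           # False

uplines.close()           # drops the only reference to the inner generator
print(f.closed)           # True on CPython

os.remove(path)
```

The call to next matters: a with_iter generator that has never been started has not entered its with block yet, so there is nothing for it to close.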

In more complex cases, this doesn't stop you from giving the intermediate result a name, and the fact that you don't have a generator function to name doesn't seem to be a problem:

def upper_file():
    with open(path, 'r') as f:
        for line in f:
            yield line.upper()
send_async(upper_file())

def gen():
    with open(path, 'r') as f:
        for line in f:
            yield line.upper()
upper_file = gen()
send_async(upper_file)

upper_file = (line.upper() for line in with_iter(open(path, 'r')))
send_async(upper_file)

file = with_iter(open(path, 'r'))
send_async(line.upper() for line in file)

file = with_iter(open(path, 'r'))
upper_file = (line.upper() for line in file)
send_async(upper_file)

The first version isn't any more explicit or readable than the last four, but it's more verbose, and easier to get wrong.

If you're only dealing with files, the iopen recipe provides a similar solution that works all the way back to Python 2.2, but it doesn't work in 3.x, and it's much more complicated.

If you can rely on 2.6+, and you only need to use the file as an iterator, you can define iopen as:

def iopen(name, mode='rU', buffering=-1):
    return with_iter(open(name, mode, buffering))