With the introduction of Python 2.5 it became easy to 'unget' a value from a generator, so that the next call to next() returns it again.
from __future__ import with_statement

def readFile(filename):
    with open(filename, 'r') as f:
        for l in f:
            if not l.isspace() and not l.startswith('#'):
                r = yield l.rstrip()
                if r:
                    # The first yield is needed because 'send' also
                    # yields a value
                    yield None
                    yield l.rstrip()

def processFirstPart(lines):
    l = lines.next()
    while l.isdigit():
        print l
        l = lines.next()
    # The last line was not a digit, but that doesn't mean we can
    # simply discard it.
    lines.send(1)

def processSecondPart(lines):
    l = lines.next()
    assert l == 'START'
    try:
        while True:
            print l
            l = lines.next()
    except StopIteration:
        pass

lines = readFile('example.in')
processFirstPart(lines)
print 'Going to the second part.'
processSecondPart(lines)
Although writing your own class with __iter__ and next methods can also solve this problem, I think this is an easier solution for a common issue. Suppose one has to process or parse a simple file line by line. Some sections in the file might be optional, or some might not have an 'end marker' and are implicitly ended because another section begins.
It's not possible to peek at a generator to see which value is next, nor is it possible to 'unget' a value. But since Python 2.5 you can send a value into the generator (note that send() also yields a value!). This is used in the above example to notify the generator that it should output the last line again. (It doesn't matter what is sent, as long as its truth value is True.)
This way one can get a line from the generator, look at it, call send() to push it back, and call the appropriate subroutine to process the remainder. A subroutine can also 'put a line back' if it detects that the line is not meant for that routine anymore.
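To make the remark about send() concrete, here is a minimal sketch of the mechanism on its own. It is written for Python 3, where next(gen) replaces gen.next(), and the generator name numbers is made up for illustration; it is not part of the recipe.

```python
def numbers():
    """Yield 1, 2, 3; re-yield the current value if something truthy is sent in."""
    for n in (1, 2, 3):
        r = yield n
        if r:
            # send() resumes the generator here and returns whatever is
            # yielded next, so give it a dummy value to consume...
            yield None
            # ...and let the following next() see n again.
            yield n


gen = numbers()
first = next(gen)             # the first value
sent_result = gen.send(True)  # the dummy None, consumed by send()
again = next(gen)             # the first value once more
after = next(gen)             # the generator then moves on
```

Without the extra yield None, the call to send() would itself swallow the value that was supposed to be returned again.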
An example file would be:

1
2
3
START
4
5

The START line is first read by processFirstPart, even though it should be read by processSecondPart. processFirstPart can now do a send() to the generator and return, and it will look as if it never read that line.
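The hand-off described here can be traced with a small Python 3 sketch. To keep it self-contained it reads from a list of strings instead of a file and collects lines into lists instead of printing them; the names read_lines, process_first_part and process_second_part are my own adaptations, not the recipe's functions.

```python
def read_lines(lines):
    """Yield non-blank, non-comment lines; a truthy send() re-queues the last one."""
    for line in lines:
        if not line.isspace() and not line.startswith('#'):
            r = yield line.rstrip()
            if r:
                yield None           # consumed by send()
                yield line.rstrip()  # returned by the following next()


def process_first_part(gen):
    """Collect the leading digit lines, then push back the first non-digit line."""
    seen = []
    line = next(gen)
    while line.isdigit():
        seen.append(line)
        line = next(gen)
    gen.send(True)  # 'unget' the line that ended this section
    return seen


def process_second_part(gen):
    """Expect START, then collect everything up to the end of the input."""
    line = next(gen)
    assert line == 'START'
    return [line] + list(gen)


# Same content as the example file, one entry per line.
example = ['1\n', '2\n', '3\n', 'START\n', '4\n', '5\n']
gen = read_lines(example)
first = process_first_part(gen)
second = process_second_part(gen)
```

After this runs, first holds the three digit lines and second starts with START, even though process_first_part was the one that originally read it.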
Cleanup. You should use
so the same value can be pushed back more than once.
The last line of processFirstPart() should be
Re: Cleanup. The 'while' could be useful if you need to peek several times in succession, but I think this is a less common case (I did think about it, though). Using send(l) instead of send(1) doesn't really matter. You could use send(True) or send([1]), as long as the value is truthy. It's merely a sign that you want the last line again. The value you send into the generator isn't used in any specific way.
The line you read and want to push back could even be the empty string ''. In that case you would not want to send that line itself, or you would want to write if r != None.
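For what it's worth, here is one way the two ideas from this thread could be combined: a loop so that the same line can be pushed back more than once, and a comparison against None so that falsy values still count. This is only a sketch in Python 3 under my own assumptions; the snippets from the comment above are not reproduced here, the name pushback_lines is hypothetical, and the blank-line and comment filtering is left out for brevity.

```python
def pushback_lines(lines):
    """Yield stripped lines; any non-None send() makes the current line come back."""
    for line in lines:
        current = line.rstrip()
        r = yield current
        # Compare against None so that falsy values such as '' or 0
        # still count as a push-back request.
        while r is not None:
            yield None          # consumed by send()
            r = yield current   # the following next() sees the line again


gen = pushback_lines(['a\n', 'b\n'])
first = next(gen)
gen.send('')        # push back, using a falsy value
second = next(gen)
gen.send(0)         # push the same line back a second time
third = next(gen)
fourth = next(gen)  # only now does the generator advance
```

With the `is not None` check it no longer matters whether the caller sends 1, True or the line itself, even when that line is empty.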