Terminology: To me and some (most?) others posting here, and, I believe,
in both the docs and common usage, a Python generator is the particular
kind of iterator that produces multiple values on request and that is
created by calling a generator function that you write.

Acceptors (consumers): The concept is known to both CS and Python
developers. According to Tim Peters, Python generators constitute
'semi-coroutines'. Any full coroutine mechanism (Stackless, for instance)
will allow the reverse arrangement, in which the producer pushes data to
an acceptor (reverse, not inverse, which would undo rather than
complement). For various reasons, the Python developers chose that data
chains should be written as consumer code getting data from a generator,
which might in turn get data from another generator.
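
A minimal sketch of such a pull-style chain, just to illustrate the shape.
The names numbers, evens and consume are made up for this example and are
not from your code:

```python
def numbers(limit):
    """Source generator: yields 0, 1, ..., limit - 1 on request."""
    for n in range(limit):
        yield n

def evens(source):
    """Intermediate generator: pulls from another generator and filters."""
    for n in source:
        if n % 2 == 0:
            yield n

def consume(source):
    """Consumer code: drives the whole chain by asking for values."""
    return [n * n for n in source]

result = consume(evens(numbers(10)))
```

Nothing runs in numbers or evens until consume asks for the next value, so
control always sits with the consumer at the end of the chain.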

In your example, opening and reading the lines of the data file could be
done by filterjunk, not the main function, but I can see reasons both ways.

For your specific problem, you could, I believe, use an intermediate
generator (which could also be combined with filterjunk) that combines
lines into complete text records. Something like this (ignoring any fussy
details):

def textrecord(file):
    trlines = []
    for line in filterjunk(file):
        trlines.append(line)
        if complete(trlines):
            yield ''.join(trlines)
            trlines = []  # start collecting the next record

# where 'complete' is code to determine whether all lines of the
# current record have been collected

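
To make that concrete, here is one runnable sketch under assumptions of my
own: that junk lines are blank or start with '#', and that a record is
complete when its last line ends with ';'. The bodies of filterjunk and
complete below are hypothetical stand-ins, not your actual code, and the
sample data is invented:

```python
import io

def filterjunk(file):
    """Hypothetical stand-in: skip blank lines and '#' comment lines."""
    for line in file:
        if line.strip() and not line.startswith('#'):
            yield line

def complete(trlines):
    """Hypothetical rule: a record ends at a line whose text ends in ';'."""
    return trlines[-1].rstrip().endswith(';')

def textrecord(file):
    """Combine filtered lines into complete text records."""
    trlines = []
    for line in filterjunk(file):
        trlines.append(line)
        if complete(trlines):
            yield ''.join(trlines)
            trlines = []  # start collecting the next record

# Invented sample input standing in for the data file.
data = io.StringIO("# header\nalpha\nbeta;\n\ngamma;\n")
records = list(textrecord(data))
```

Any lines left over at end of file without a terminator are silently
dropped here; whether that is right depends on your data.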
Terry J. Reedy