Ever been frustrated at having to read data and then do minor processing on it by calling a method on each item? Do you find yourself writing lambda functions or short generators to do intermediate processing between iterators? If so, you can simplify your programming life by using imapmethod instead. imapmethod calls a named method on each item it iterates through. For example, you can replace [x.rstrip() for x in iterable], which builds the whole list before any downstream processing begins, or the lazier imap(lambda x: x.rstrip(), iterable), with imapmethod("rstrip", iterable) or even the provided irstrip(iterable).
This recipe also illustrates some more brain-twisting uses of itertools.
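To make the eager-versus-lazy difference concrete, here is a minimal sketch. It assumes Python 3, where the builtin map is already lazy and plays the role of itertools.imap, and it uses operator.methodcaller in place of the lambda. The noisy_lines generator and the seen list are hypothetical names that exist only to record how many items have been pulled from the source.

```python
from operator import methodcaller

seen = []  # records every item pulled from the source (illustration only)

def noisy_lines():
    """Hypothetical source: yields padded strings and logs each one it yields."""
    for text in ("a  ", "b  ", "c  "):
        seen.append(text)
        yield text

# Lazy: nothing is read from the source until the first item is requested.
lazy = map(methodcaller("rstrip"), noisy_lines())
consumed_before_first = len(seen)
first = next(lazy)
consumed_after_first = len(seen)

# Eager: the list comprehension drains the whole source up front.
seen.clear()
eager = [x.rstrip() for x in noisy_lines()]
consumed_by_eager = len(seen)
```

The lazy version touches one item per request, which is the behavior imapmethod relies on.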
import itertools

def imapzip(generator, *iterables):
    """
    Call generator once for each tuple of arguments zipped from the
    iterables, then make an iterator that yields tuples zipped from
    the resulting iterators.

    >>> it = imapzip(xrange, (0, 3), (2, 5))
    >>> it.next()
    (0, 3)
    >>> it.next()
    (1, 4)
    """
    return itertools.izip(*tuple(itertools.starmap(generator, zip(*iterables))))

def imapmethod(methodname, *iterables):
    """
    If methodname is a string, make an iterator that calls the method
    named methodname, with no arguments, on each item of each of the
    iterables. Otherwise, make an iterator that calls, on each item of
    each iterable, the method named by the corresponding element in
    methodname.

    >>> a = ["aBrA", "cAdAbRa"]
    >>> b = ["aLa", "kAzAm"]
    >>> it1 = imapmethod("title", a)
    >>> it1.next()
    'Abra'
    >>> it1.next()
    'Cadabra'
    >>> it2 = imapmethod(("upper", "lower"), a, b)
    >>> it2.next()
    ('ABRA', 'ala')
    >>> it2.next()
    ('CADABRA', 'kazam')
    """
    def methodcall(item):
        return getattr(item, methodname)()

    if isinstance(methodname, str):
        if len(iterables) == 1:
            # Single iterable: yield bare results rather than 1-tuples.
            return itertools.imap(methodcall, *iterables)
        else:
            # Several iterables, one method: map each one, then zip the results.
            return imapzip(itertools.imap, [methodcall] * len(iterables), iterables)
    else:
        # One method name per iterable: recurse for each (name, iterable) pair.
        return imapzip(imapmethod, methodname, iterables)

def curryimapmethod(methodname):
    """
    Make a function that will call imapmethod with a preset
    methodname.
    """
    def imapmethod_call(*iterables):
        return imapmethod(methodname, *iterables)
    return imapmethod_call

istrip = curryimapmethod("strip")
irstrip = curryimapmethod("rstrip")
ilstrip = curryimapmethod("lstrip")
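The recipe as written targets Python 2 (itertools.imap, itertools.izip, xrange, it.next()). As a rough sketch only, and assuming Python 3, where the builtin map and zip are already lazy, the two core functions might be ported as follows. This port is an illustration and not part of the original recipe; the sample lists are the ones from the docstrings.

```python
def imapzip(function, *iterables):
    """Call function on each tuple of zipped arguments, then zip the results."""
    # The generator expression is unpacked eagerly, but the iterators it
    # produces are only consumed as the outer zip is iterated.
    return zip(*(function(*args) for args in zip(*iterables)))

def imapmethod(methodname, *iterables):
    """Python 3 sketch of the recipe's imapmethod (assumed port)."""
    def methodcall(item):
        return getattr(item, methodname)()

    if isinstance(methodname, str):
        if len(iterables) == 1:
            # Single iterable: yield bare results rather than 1-tuples.
            return map(methodcall, iterables[0])
        # Several iterables, one method name: map each, then zip the results.
        return imapzip(map, [methodcall] * len(iterables), iterables)
    # One method name per iterable.
    return imapzip(imapmethod, methodname, iterables)

a = ["aBrA", "cAdAbRa"]
b = ["aLa", "kAzAm"]
titles = list(imapmethod("title", a))
mixed = list(imapmethod(("upper", "lower"), a, b))
ranges = list(imapzip(range, (0, 3), (2, 5)))
```

In Python 3 the iterator's next method is spelled next(it), so the docstring examples would need the same adjustment.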
I use imapmethod mostly as a convenience, so that I don't have to strip trailing whitespace from each line in the body of a loop:
>>> for line in irstrip(file(filename)):
...     process(line)
It is indispensable when you need to do some quick processing on the output of one iterator before you feed it to the input of another. For example, the following loop lowercases each line of a CSV file, strips the whitespace from the end of the line, and then hands the columns parsed from that line to each iteration of the loop.
>>> import csv
>>> for columns in csv.reader(irstrip(imapmethod("lower", file(filename)))):
...     process(columns)
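The same chain can be tried without a file on disk. This is a minimal sketch assuming Python 3: the one-argument imapmethod below is a simplified stand-in for the recipe's function, io.StringIO stands in for an open file, and the sample contents are made up.

```python
import csv
import io
from operator import methodcaller

def imapmethod(methodname, iterable):
    """Single-iterable stand-in for the recipe's imapmethod (Python 3 sketch)."""
    return map(methodcaller(methodname), iterable)

# io.StringIO stands in for an open file; the contents are made-up sample data.
source = io.StringIO("Name,City  \nALICE,Paris\t\n")

# Each line is lowercased, then right-stripped, then parsed by csv.reader.
rows = list(csv.reader(imapmethod("rstrip", imapmethod("lower", source))))
```

csv.reader accepts any iterable of strings, which is why the chained iterators can be passed to it directly.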
A fun use of imapmethod is to create one-liners that diff files after some kind of processing. These differs short-circuit as soon as a difference is found, saving processing time. A differ built with list comprehensions would have to read and process both files completely before making the first comparison.
>>> import operator
>>> filelist = ("file1", "file1.upper")
>>> False not in itertools.starmap(operator.eq, imapmethod("upper", *map(file, filelist))) # case-insensitive diff
True
>>> False not in itertools.starmap(operator.eq, irstrip(*map(file, filelist))) # diff without regard to whitespace at the end of the line
False
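The short-circuit can be observed directly. The sketch below assumes Python 3 (builtin map and zip instead of imap and izip) and in-memory io.StringIO objects instead of real files; the logged generator and the pulled list are hypothetical helpers that record which lines were actually read.

```python
import io
import operator
from itertools import starmap

pulled = []  # hypothetical log of lines actually read from the first "file"

def logged(lines):
    """Yield lines unchanged while recording each one in pulled."""
    for line in lines:
        pulled.append(line)
        yield line

left = io.StringIO("same\nDIFFERENT\nnever read\n")
right = io.StringIO("same\nchanged\nnever read\n")

upper = operator.methodcaller("upper")
pairs = zip(map(upper, logged(left)), map(upper, right))

# The membership test stops at the first False, i.e. at the second line.
identical = False not in starmap(operator.eq, pairs)

# zip stops at the shorter input, so a file that is a prefix of the other
# compares as equal with this approach.
short = io.StringIO("a\n")
longer = io.StringIO("a\nb\n")
prefix_equal = False not in starmap(operator.eq, zip(short, longer))
```

Only two lines of the first input are read before the comparison gives up. The second check shows a limitation worth keeping in mind: inputs of different lengths that agree up to the end of the shorter one are reported as identical.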
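The special case where len(iterables) == 1 is what lets imapmethod("rstrip", iterable) yield plain results: the general branch zips its outputs together, so with a single iterable it would wrap each result in a 1-tuple. I also thought that would be the most common case, so it is best to make the execution a little simpler.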
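A quick way to see what the len(iterables) == 1 branch affects is to compare its result with what the general branch would produce for one iterable. The sketch below assumes Python 3 (builtin map and zip standing in for imap and izip) and uses a hypothetical title_case helper in place of the recipe's inner methodcall.

```python
def title_case(item):
    """Hypothetical stand-in for the recipe's inner methodcall."""
    return item.title()

words = ["aBrA", "cAdAbRa"]

# What the single-iterable branch returns: bare results.
special_case = list(map(title_case, words))

# What the general branch would return for the same input: the mapped
# iterators are zipped together, so one iterable yields 1-tuples.
general_branch = list(zip(*[map(title_case, it) for it in (words,)]))
```

Here special_case holds strings while general_branch holds 1-tuples, so code such as csv.reader(irstrip(...)) depends on the single-iterable branch.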
You can use more general curry functions, but curryimapmethod is much less complicated.
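One such general tool is functools.partial from the standard library (available since Python 2.5). As a minimal sketch, assuming Python 3 and a simplified one-iterable imapmethod stand-in rather than the recipe's full function, the preset helpers could be built like this:

```python
from functools import partial
from operator import methodcaller

def imapmethod(methodname, iterable):
    """Single-iterable stand-in for the recipe's imapmethod (Python 3 sketch)."""
    return map(methodcaller(methodname), iterable)

# partial fixes the first positional argument, much like curryimapmethod.
istrip = partial(imapmethod, "strip")
irstrip = partial(imapmethod, "rstrip")

stripped = list(istrip(["  a ", " b"]))
right_stripped = list(irstrip(["a  ", "b\n"]))
```

The trade-off is mostly one of taste: partial saves a helper definition, while curryimapmethod keeps the recipe free of extra imports.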