On the rare occasion that you want to fill the sequences passed to zip() with a padding value, at least use something fast. You can optionally specify a padding value other than None.
"""
>>> list(zip_pad([], [1], [1,2]))
[(None, 1, 1), (None, None, 2)]
>>> list(zip_pad([], [1], [1,2], pad=42))
[(42, 1, 1), (42, 42, 2)]
>>> list(zip_pad([], []))
[]
>>> list(zip_pad([1], [2]))
[(1, 2)]
>>> list(zip_pad([1,2], []))
[(1, None), (2, None)]
>>> list(zip_pad([1,2], [3,4]))
[(1, 3), (2, 4)]
>>> list(zip_pad([1,2], [10,20,30], [100,200,300,400]))
[(1, 10, 100), (2, 20, 200), (None, 30, 300), (None, None, 400)]
"""
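from itertools import izip, chain

def zip_pad(*iterables, **kw):
    # Python 2 has no keyword-only arguments, so "pad" is read from **kw.
    if kw:
        assert len(kw) == 1
        pad = kw["pad"]
    else:
        pad = None
    # Number of iterables that may still run dry before iteration must stop;
    # kept in a list so the nested generator can modify it.
    done = [len(iterables)-1]
    def pad_iter():
        # Entered once per exhausted iterable. The last one to finish ends
        # its chain instead of padding, which makes izip() stop.
        if not done[0]:
            return
        done[0] -= 1
        while 1:
            yield pad
    iterables = [chain(seq, pad_iter()) for seq in iterables]
    return izip(*iterables)

if __name__ == "__main__":
    import doctest
    doctest.testmod()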
The trick here is that the check for whether all iterables are exhausted is performed only once per iterable, whereas a naive implementation would check on every iteration. There are still per-iteration checks, of course, but they are hidden inside chain() and izip() and benefit from the itertools module's fast implementation in C.
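For readers on Python 3, where izip() no longer exists, the same counter trick can be sketched roughly as follows. The name zip_pad_py3, the keyword-only pad argument, and the nonlocal counter are assumptions of this sketch and not part of the recipe above.

```python
from itertools import chain


def zip_pad_py3(*iterables, pad=None):
    """Sketch: zip iterables, padding the shorter ones with `pad`."""
    # How many iterables may still run dry before the result must end.
    remaining = len(iterables) - 1

    def pad_iter():
        nonlocal remaining
        if not remaining:
            # The last iterable just ran dry: end this chain, which stops zip().
            return
        remaining -= 1
        while True:
            yield pad

    # Each input is followed by its own padding generator; the exhaustion
    # check above runs only when an input actually ends.
    return zip(*(chain(it, pad_iter()) for it in iterables))
```

The exhaustion test inside pad_iter() runs once per input, exactly as in the recipe; everything else is driven by chain() and zip().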
This recipe is inspired by code written by Andrew Dalke, as posted on comp.lang.python: http://mail.python.org/pipermail/python-list/2005-July/292146.html.
map is a simpler, though less general, alternative. I'd like to mention that "map(None, *iterables)" already does the job for most cases (padding with None). So, if the result does not have to be a lazy iterator (i.e. I am not dealing with huge amounts of data), I'd probably prefer using map.
I am adding this because you mentioned wanting an alternative to zip that pads values and is fast. "map(None, *iterables)" does both, and is readily available in Python.
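Note that "map(None, *iterables)" is Python 2 behaviour; Python 3's map() does not accept None as the function. As a small sketch of the same padding on Python 3, assuming the standard library's itertools.zip_longest (called izip_longest in Python 2.6 and later) is acceptable:

```python
from itertools import zip_longest

# Default padding value is None, as with map(None, ...) in Python 2.
rows = list(zip_longest([1, 2], [10, 20, 30], [100, 200, 300, 400]))
# rows == [(1, 10, 100), (2, 20, 200), (None, 30, 300), (None, None, 400)]

# A different padding value is given with the fillvalue keyword.
padded = list(zip_longest([], [1], [1, 2], fillvalue=42))
# padded == [(42, 1, 1), (42, 42, 2)]
```

Unlike the Python 2 map, zip_longest returns a lazy iterator, so it also covers the large-data case.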
See also http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/410687
izip_longest(). FWIW, here is my version of a padding zipper:
This may fail... as it relies on a non-unique sentinel:
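The code this comment refers to is not reproduced here. Purely as a hypothetical illustration of the general failure mode, and not the commenter's code, the sketch below shows a padding zip that treats None as its end-of-data marker, next to a variant that uses a unique object() sentinel. Both function names are made up for this example.

```python
def zip_pad_naive(*iterables, pad=None):
    """Hypothetical padding zip that uses None as its end-of-data marker."""
    iterators = [iter(it) for it in iterables]
    while True:
        row = [next(it, None) for it in iterators]
        # Bug: a genuine None in the data looks exactly like exhaustion.
        if all(value is None for value in row):
            return
        yield tuple(pad if value is None else value for value in row)


_MISSING = object()  # unique sentinel: cannot occur in user data


def zip_pad_safe(*iterables, pad=None):
    """Same loop, but exhaustion is marked by the unique _MISSING object."""
    iterators = [iter(it) for it in iterables]
    while True:
        row = [next(it, _MISSING) for it in iterators]
        if all(value is _MISSING for value in row):
            return
        yield tuple(pad if value is _MISSING else value for value in row)


data = ([None, 1], [None, 2])
naive_result = list(zip_pad_naive(*data))  # stops early: []
safe_result = list(zip_pad_safe(*data))    # [(None, None), (1, 2)]
```

The naive version stops at the first row because both inputs legitimately contain None there; the object() sentinel can never collide with real data.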
A C-speed version using itertools.