
On the rare occasion that you want zip() to pad its shorter sequences with a fill value instead of truncating to the shortest one, at least use something fast. The padding value defaults to None; pass pad=... to use a different one.

Python, 50 lines
```python
"""
>>> list(zip_pad([], [1], [1,2]))
[(None, 1, 1), (None, None, 2)]

>>> list(zip_pad([], [1], [1,2], pad=42))
[(42, 1, 1), (42, 42, 2)]

>>> list(zip_pad([], []))
[]

>>> list(zip_pad([1], [2]))
[(1, 2)]

>>> list(zip_pad([1,2], []))
[(1, None), (2, None)]

>>> list(zip_pad([1], [2]))
[(1, 2)]

>>> list(zip_pad([1,2], []))
[(1, None), (2, None)]

>>> list(zip_pad([1,2], [3,4]))
[(1, 3), (2, 4)]

>>> list(zip_pad([1,2], [10,20,30], [100,200,300,400]))
[(1, 10, 100), (2, 20, 200), (None, 30, 300), (None, None, 400)]
"""
from itertools import izip, chain

def zip_pad(*iterables, **kw):
    if kw:
        assert len(kw) == 1
        pad = kw["pad"]
    else:
        pad = None
    done = [len(iterables)-1]
    def pad_iter():
        if not done[0]:
            return
        done[0] -= 1
        while 1:
            yield pad
    iterables = [chain(seq, pad_iter()) for seq in iterables]
    return izip(*iterables)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

The trick here is that the check for whether all iterables are exhausted is performed only once per iterable, whereas a naive implementation would check once per iteration. Of course there are still per-iteration checks, but these are hidden in chain()/izip() and benefit from the itertools module's fast C implementation.
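To make the contrast concrete, here is a minimal sketch that puts a naive padder next to a Python 3 port of the recipe. Both are illustrations rather than the original code: `zip_pad_naive` is a hypothetical name, the keyword-only `pad` argument and the use of the built-in `zip()` in place of `izip()` are Python 3 assumptions, and no timings are claimed.

```python
from itertools import chain

def zip_pad_naive(*iterables, pad=None):
    """Naive variant: tests for exhaustion in Python code on every output row."""
    iterators = [iter(it) for it in iterables]
    missing = object()  # private marker, cannot collide with real data
    while iterators:
        row = [next(it, missing) for it in iterators]
        if all(value is missing for value in row):
            return  # every input has run dry
        yield tuple(pad if value is missing else value for value in row)

def zip_pad(*iterables, pad=None):
    """Python 3 port of the recipe: the countdown runs once per iterable."""
    remaining = [len(iterables) - 1]  # inputs that may still run dry and be padded

    def pad_iter():
        if not remaining[0]:
            return  # the last input is exhausted: end here so zip() stops
        remaining[0] -= 1
        while True:
            yield pad

    return zip(*[chain(seq, pad_iter()) for seq in iterables])

data = ([1, 2], [10, 20, 30], [100, 200, 300, 400])
print(list(zip_pad(*data)))
print(list(zip_pad_naive(*data)) == list(zip_pad(*data)))
```

In the port, `pad_iter()` touches the shared counter only when an input first runs dry; all later padding values come straight out of the `while True` loop, and the row assembly itself is left to `zip()`.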

This recipe is inspired by code written by Andrew Dalke, as posted on comp.lang.python: http://mail.python.org/pipermail/python-list/2005-July/292146.html.

Zoran Isailovski 15 years, 3 months ago

map is a simpler, though less general, alternative. I'd like to mention that "map(None, *iterables)" already does the job for most cases (padding with None). So, if the result does not have to be a lazy iterator (i.e. I am not dealing with huge amounts of data), I'd probably prefer using map.

I am adding this since you mentioned you wanted an alternative to zip that pads values and is fast. "map(None, *iterables)" is both, and is readily available in Python.
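A note for readers on newer interpreters: `map(None, *iterables)` is a Python 2 idiom, and Python 3 no longer accepts None as the mapping function. The closest standard-library replacement there is `itertools.zip_longest`, wrapped in `list()` when a list is wanted. The sample data below is made up for illustration.

```python
from itertools import zip_longest

# Python 3 counterpart of the Python 2 call map(None, [1, 2], [10, 20, 30], [100])
rows = list(zip_longest([1, 2], [10, 20, 30], [100]))
print(rows)

# Unlike map(None, ...), a fill value other than None can be chosen.
rows_zero = list(zip_longest([1, 2], [10, 20, 30], [100], fillvalue=0))
print(rows_zero)
```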

Raymond Hettinger 15 years, 3 months ago

izip_longest(). FWIW, here is my version of a padding zipper:

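```python
from itertools import chain, repeat

def izip_longest(*args, **kwds):
    fillvalue = kwds.get('fillvalue')
    # Each input is followed by an endless run of fillvalue.
    its = [chain(it, repeat(fillvalue)).next for it in args]
    term = [fillvalue] * len(args)
    while 1:
        result = [g() for g in its]
        # A row made up only of fillvalue is taken to mean every input is exhausted.
        if result == term: return
        yield tuple(result)

print list(izip_longest('x', 'abc', 'ABCDEF', '1', fillvalue=999))
```
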
Peter Otten (author) 15 years, 3 months ago

This may fail, as it relies on a non-unique sentinel: if a row of real data happens to consist entirely of the fill value, iteration stops early.

```python
>>> list(izip_longest('x-', 'a-bc', 'A-BCDEF', '1', fillvalue="-"))
[('x', 'a', 'A', '1')]
```
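One way around the collision, sketched here in Python 3 under my own naming (`zip_longest_unique` and `marker` are hypothetical, not from the thread), is to pad internally with a private `object()` that cannot occur in the data and to substitute the caller's fill value only when a row is emitted:

```python
from itertools import chain, repeat

def zip_longest_unique(*iterables, fillvalue=None):
    marker = object()  # unique sentinel, compared by identity
    iterators = [chain(it, repeat(marker)) for it in iterables]
    while True:
        row = [next(it) for it in iterators]
        if all(value is marker for value in row):
            return  # only padding left: every input is exhausted
        yield tuple(fillvalue if value is marker else value for value in row)

result = list(zip_longest_unique('x-', 'a-bc', 'A-BCDEF', '1', fillvalue="-"))
print(len(result))  # one row per character of the longest input
```

This keeps the per-row check in Python code, so it addresses correctness only, not speed.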
Raymond Hettinger 15 years, 1 month ago

A C-speed version using itertools.

```python
from itertools import chain, izip, repeat

def izip_longest(*args, **kwds):
    ''' Alternate version of izip() that fills-in missing values rather than truncating
    to the length of the shortest iterable.  The fillvalue is specified as a keyword
    argument (defaulting to None if not specified).

    >>> list(izip_longest('a', 'def', 'ghi'))
    [('a', 'd', 'g'), (None, 'e', 'h'), (None, 'f', 'i')]
    >>> list(izip_longest('abc', 'def', 'ghi'))
    [('a', 'd', 'g'), ('b', 'e', 'h'), ('c', 'f', 'i')]
    >>> list(izip_longest('a', 'def', 'gh'))
    [('a', 'd', 'g'), (None, 'e', 'h'), (None, 'f', None)]
    '''
    fillvalue = kwds.get('fillvalue')
    def sentinel(counter=[fillvalue]*(len(args)-1)):
        yield counter.pop()     # raises IndexError when count hits zero
    iters = [chain(it, sentinel(), repeat(fillvalue)) for it in args]
    try:
        for tup in izip(*iters):
            yield tup
    except IndexError:
        pass
```
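For anyone trying this on Python 3, the following is a rough port of the same counter-and-IndexError idea, checked against the standard library's `itertools.zip_longest`. The name `zip_longest_sketch`, the keyword-only `fillvalue` and the closure over `counter` are my assumptions for the port, not part of the code above.

```python
from itertools import chain, repeat, zip_longest

def zip_longest_sketch(*args, fillvalue=None):
    # One entry per input that is allowed to run dry before iteration must stop.
    counter = [fillvalue] * (len(args) - 1)

    def sentinel():
        yield counter.pop()  # IndexError once the last input runs dry

    iters = [chain(it, sentinel(), repeat(fillvalue)) for it in args]
    try:
        yield from zip(*iters)
    except IndexError:
        pass

samples = [('a', 'def', 'ghi'), ('abc', 'def', 'ghi'), ('a', 'def', 'gh'), ('abc',)]
for sample in samples:
    assert list(zip_longest_sketch(*sample)) == list(zip_longest(*sample))
```

One caveat of this design: an IndexError raised by one of the input iterables themselves would also be swallowed and treated as the end of the data.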
 Created by Peter Otten on Thu, 31 Aug 2006 (PSF)