Python tuple unpacking works only for fixed-length sequences; that is, one cannot write something like:

    for (x, y, z=0, *rest) in [(1,2,3), (4,5), (6,7,8,9,10)]:
        print x, y, z, rest
This recipe returns a pad function that implements this functionality (albeit with less concise syntax). Using the recipe, the example above can then be written as:

    pad = padfactory(minLength=2, defaults=(0,), extraItems=True)
    for x, y, z, rest in map(pad, [(1,2,3), (4,5), (6,7,8,9,10)]):
        print x, y, z, rest
    from itertools import islice


    def demo():
        for firstname, lastname, age, skills in map(
                # extraItems=True packs any trailing skills into a tuple,
                # which supplies the fourth unpacking target
                padfactory(minLength=1, defaults=["", 0], extraItems=True),
                [("Tony", "Caley", 23),
                 ("Mike", "Balboa"),
                 ["Aristotle"],
                 ("Nick", "Fulton", 31, "python", "C++", "oracle")]):
            print("%s %s (%d): %s" % (firstname, lastname, age,
                                      ' '.join(skills) or None))


    def padfactory(minLength=0, defaults=(), extraItems=False):
        '''Return a function that can pad variable-length iterables.

        The returned function f:iterable -> iterable attempts to return an
        iterable of the same type as the original; if this fails it returns
        a list.

        @param minLength: The minimum required length of the iterable.
        @param defaults: A sequence of default values to pad shorter iterables.
        @param extraItems: Controls what to do with iterables longer than
            minLength + len(defaults).
            - If extraItems is None, the extra items are ignored.
            - else if bool(extraItems) is True, the extra items are packed in
              a tuple which is appended to the returned iterable. An empty
              tuple is appended if there are no extra items.
            - else a ValueError is raised (longer iterables are not acceptable).
        '''
        # maximum sequence length (without considering extraItems)
        maxLength = minLength + len(defaults)

        def closure(iterable):
            iterator = iter(iterable)  # make sure you can handle any iterable
            padded = list(islice(iterator, maxLength))
            if len(padded) < minLength:
                # checked before padding: islice rejects the negative start
                # index that would otherwise be computed below
                raise ValueError("unpack iterable of smaller size")
            if len(padded) < maxLength:
                # extend by the slice of the missing defaults
                padded.extend(islice(defaults, len(padded) - minLength,
                                     maxLength - minLength))
            if extraItems:  # put the remaining elements in a tuple
                padded.append(tuple(iterator))
            elif extraItems is None:  # silently ignore the rest of the iterable
                pass
            else:  # should not have more elements
                try:
                    next(iterator)
                except StopIteration:
                    pass
                else:
                    raise ValueError("unpack iterable of larger size")
            # try to return the same type as the original iterable
            itype = type(iterable)
            if itype != list:
                try:
                    padded = itype(padded)
                except TypeError:
                    pass
            return padded
        return closure
Examples of usage could be:
- 3D code that also handles 2D objects by using a fixed value for the third dimension.
- Normalizing records and messages of varying verbosity.
- And others I might never even think of.
simpler variable-length tuple unpacking. How about this simpler solution:
even simpler variable-length tuple unpacking. I can reduce that function even more:
One of the examples in my first posting has a typo. It should be:
Greg Jorgensen PDXperts LLC Portland, Oregon USA
Not quite the same. There are several simplifications and differences in your solutions:
- Is this a bug or a feature?
(1, 2, 'e', None, None)
I would expect the answer to be (1,2,'c','d','e'). Essentially, you assume that all the elements of t are optional, which corresponds to setting minLength=0 in my solution. However, if len(defaults) < n, you pad the missing defaults with None.
- Extra items (i.e., if len(t) > n) are always ignored; this corresponds to calling mine with extraItems=None, but the other options (extraItems=True or extraItems=False) are also useful in practice.
- It cannot be used with arbitrary iterables, since it uses len(t).
- It's probably slower, as it creates longer-than-required lists and then truncates them.
George
Re: Not quite the same.
> There are several simplifications and differences in your solutions:
Yes. I intended to show a general technique only.
Feature. My function always returns a tuple of n elements. If the tuple passed to it (t) is longer than n, the first n elements are returned. If the tuple t is shorter than n, the missing elements are filled first with the corresponding elements from default, and then with None if there are no defaults.
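The function being described did not survive in this copy of the thread. A minimal sketch that matches the behavior described here, and the (1, 2, 'e', None, None) result questioned earlier, might look like the following. The name `pad_positional`, its signature, and the Python 3 syntax are assumptions, not the original posting; `t` and `default` are assumed to be tuples.

```python
def pad_positional(t, n, default=()):
    """Return exactly n items: t, then positional defaults, then None."""
    # default[i] only ever fills position i, so defaults that sit "under"
    # the supplied items are skipped rather than shifted to the right.
    filled = t + default[len(t):]
    # Over-pad with None, then truncate to n items.
    return (filled + (None,) * n)[:n]


result_short = pad_positional((1, 2), 5, ('c', 'd', 'e'))
result_long = pad_positional((1, 2, 3, 4), 3)
print(result_short)  # (1, 2, 'e', None, None)
print(result_long)   # (1, 2, 3)
```

Because it relies on len(t) and tuple concatenation, this form only accepts tuples, which is the limitation raised in the earlier comment.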
Obviously it would be easy to change my function to behave differently if that's what's wanted:
would have the effect you describe.
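The variant referred to here is also missing from this copy. One possibility, offered as a guess rather than the original code, is to append the defaults after the supplied items instead of aligning them by position. `pad_appended` is a hypothetical name and the syntax is Python 3.

```python
def pad_appended(t, n, default=()):
    """Return exactly n items: t, then all of default, then None."""
    # The defaults follow the supplied items, so none of them are skipped.
    return (tuple(t) + tuple(default) + (None,) * n)[:n]


result = pad_appended((1, 2), 5, ('c', 'd', 'e'))
print(result)  # (1, 2, 'c', 'd', 'e')
```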
I've actually had a need for that kind of thing several times. A similar case is the pad/justify a string problem in languages that don't have functions to do it. Some programmers write loops to add the pad character. Another way is to just prepend/append the maximum padding to the string and truncate the result.
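For illustration only, the over-pad-and-truncate approach to string justification could be sketched like this in Python 3. Python already has str.ljust and str.rjust, so the helper names below are hypothetical and exist just to show the technique; `fill` is assumed to be a single character.

```python
def ljust_by_truncation(s, width, fill=" "):
    """Left-justify s in a field of exactly `width` characters."""
    # Append the maximum padding that could be needed, then cut to width.
    return (s + fill * width)[:width]


def rjust_by_truncation(s, width, fill=" "):
    """Right-justify s in a field of exactly `width` characters."""
    if width <= 0:
        return ""  # guard: a slice of [-0:] would return the whole string
    # Prepend the maximum padding, then keep the last `width` characters.
    return (fill * width + s)[-width:]


left = ljust_by_truncation("ab", 5, ".")
right = rjust_by_truncation("ab", 5, ".")
print(left)   # ab...
print(right)  # ...ab
```

Unlike the built-in methods, both helpers also truncate strings that are longer than the field, which mirrors how the tuple version drops extra items.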
The problem I thought we were solving was generating fixed-length tuples from a list (or iterable) of variable-length tuples, so the tuples could be unpacked.
If the extra items are needed my function would need some additional functionality. The extra items are still there in list(t)[n:].
It could if it converted t to a list first:
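The snippet is missing here as well. A sketch of the idea, with the hypothetical name `pad_any` and written for Python 3: materializing the iterable once makes len() and slicing available, and keeps the extra items reachable.

```python
def pad_any(iterable, n, default=()):
    """Return (first n items padded to length n, tuple of extra items)."""
    items = list(iterable)  # materialize once; works for generators too
    # Positional defaults, then None, then truncate to n items.
    padded = (items + list(default)[len(items):] + [None] * n)[:n]
    extras = tuple(items[n:])
    return tuple(padded), extras


padded, extras = pad_any((c for c in "abcde"), 3)
print(padded, extras)  # ('a', 'b', 'c') ('d', 'e')

padded_short, extras_short = pad_any(iter([1]), 3, ("x", "y", "z"))
print(padded_short, extras_short)  # (1, 'y', 'z') ()
```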
I guess that might be relevant for some applications, but I would be surprised if one line of code that uses only simple operations on built-in types is slower than a function that includes multiple conditional tests, a loop, exceptions, and a closure.
Again, the point of posting my recipe is to show a technique, not a solution to every possible problem. In some circumstances a more general solution like yours would be appropriate.
Greg Jorgensen
Yes, the output is a generator of fixed-length tuples, but with the possibility of constraining the set of valid inputs, instead of accepting all variable-length iterables. More specifically, the problem was to simulate the argument passing rule used for callables, as in the following syntactically incorrect examples:
In the first example, only iterables of length 2 or 3 would be accepted, so a ValueError would be raised when trying to unpack (6,7,8,9,10). In the second, iterables of length greater than or equal to 2 are valid, and the output would be length-4 tuples, with the last element also being a tuple.
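The two examples themselves are missing from this copy. The rule they imitate can be seen by unpacking each iterable into an ordinary function call, as in this Python 3 illustration. It only demonstrates the semantics and is not the recipe's implementation; the function names are hypothetical, and a call raises TypeError on a length mismatch where the recipe raises ValueError.

```python
def two_or_three(x, y, z=0):
    # Accepts 2 or 3 positional arguments, like (x, y, z=0).
    return (x, y, z)


def two_or_more(x, y, z=0, *rest):
    # Accepts 2 or more arguments; the extras are collected in a tuple.
    return (x, y, z, rest)


data = [(1, 2, 3), (4, 5), (6, 7, 8, 9, 10)]

padded = [two_or_more(*item) for item in data]
print(padded)  # [(1, 2, 3, ()), (4, 5, 0, ()), (6, 7, 8, (9, 10))]

try:
    [two_or_three(*item) for item in data]
    too_long_rejected = False
except TypeError:
    # (6, 7, 8, 9, 10) has more than three items and is rejected.
    too_long_rejected = True
print(too_long_rejected)  # True
```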
One more argument for the last point ;-)
The closure is created only once for a given binding of the parameters; it can then be used for iterating over different loops as a normal function. Of course, it's trivial to have a function pad(iterable,minLength,defaults,extraItems) instead of a padfactory(minLength,defaults,extraItems) if this is preferable.
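As a rough sketch of that alternative, a plain `pad` function can have its parameters bound once with functools.partial, which gives the same reuse as the factory's closure. The body below is a simplified stand-in for the recipe's logic (it always returns a tuple and expects `defaults` to be a sequence), written for Python 3.

```python
from functools import partial
from itertools import islice


def pad(iterable, minLength=0, defaults=(), extraItems=False):
    """Simplified stand-in for the recipe; always returns a tuple."""
    iterator = iter(iterable)
    maxLength = minLength + len(defaults)
    padded = list(islice(iterator, maxLength))
    if len(padded) < minLength:
        raise ValueError("unpack iterable of smaller size")
    # Fill the positions that were not supplied from the defaults.
    padded.extend(defaults[len(padded) - minLength:])
    if extraItems:
        padded.append(tuple(iterator))  # pack the remaining items
    elif extraItems is not None:
        sentinel = object()
        if next(iterator, sentinel) is not sentinel:
            raise ValueError("unpack iterable of larger size")
    return tuple(padded)


# Bind the parameters once, then reuse the result in any number of loops.
pad3 = partial(pad, minLength=2, defaults=(0,), extraItems=True)
rows = [pad3(item) for item in [(1, 2, 3), (4, 5), (6, 7, 8, 9, 10)]]
print(rows)  # [(1, 2, 3, ()), (4, 5, 0, ()), (6, 7, 8, (9, 10))]
```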
The exceptions are there for enforcing the constraints I mentioned above. Besides, the first try/except block is entered only if extraItems is False, and the second attempts to return an instance of the same type as the input instead of always returning a tuple (this may or may not be desirable).
Perhaps you're right in that in most practical cases with small input sequences, the one-liner would be as fast or even faster than the larger function. I was commenting more on the algorithmic performance, especially as len(t) >> n, rather than the performance expected in real cases.
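To make the algorithmic point concrete, here is a small Python 3 sketch that counts how many items each approach pulls from its input. It measures consumption, not wall-clock time, and the helper `counting` is a hypothetical name.

```python
from itertools import islice


def counting(iterable, counter):
    """Yield items from iterable, recording how many were pulled."""
    for item in iterable:
        counter[0] += 1
        yield item


n = 3
big = range(100000)

# islice stops after n items, so the rest of the input is never touched.
sliced_count = [0]
first_lazy = tuple(islice(counting(big, sliced_count), n))

# Building the whole tuple and truncating it walks the entire input.
full_count = [0]
first_eager = tuple(counting(big, full_count))[:n]

print(first_lazy == first_eager)        # True
print(sliced_count[0], full_count[0])   # 3 100000
```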
simple unpack idioms. Since the number of variables on the left side of the assignment is known at compile time, typical cases can be handled without much fuss.
Unpacking from a list or tuple that may be too long:
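The code for this idiom is missing in this copy. A likely form, shown here as an assumption in Python 3 syntax, slices the sequence down to the number of targets:

```python
t = (1, 2, 3, 4, 5)

# Slice to exactly as many items as there are names on the left.
a, b, c = t[:3]
print(a, b, c)  # 1 2 3
```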
Unpacking and catching all excess elements into a tuple:
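The original snippet is not preserved here either. One way to write it, assuming `t` is a tuple and using Python 3 syntax, is to wrap the tail in a one-element tuple before unpacking:

```python
t = (1, 2, 3, 4, 5)

# Two fixed names, then everything else packed into a single tuple.
a, b, rest = t[:2] + (t[2:],)
print(a, b, rest)  # 1 2 (3, 4, 5)

# With no excess elements, rest is simply an empty tuple.
c, d, empty = (1, 2)[:2] + ((1, 2)[2:],)
print(empty)  # ()
```

In Python 3, `a, b, *rest = t` gives the same split directly, with rest as a list.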
Unpacking from a list/tuple that may be too short:
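Again the snippet is missing. A plausible version, assuming a tuple and None as the filler, pads first and then slices (Python 3 syntax):

```python
t = (1,)

# Over-pad with None, then slice back to the number of names.
a, b, c = (t + (None,) * 3)[:3]
print(a, b, c)  # 1 None None
```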
Etc. There are plenty of variations on this, but the key point is that the number of items you need for assignment is known at compile time; it doesn't vary at runtime.
There are also some functions in itertools for doing things like this, and they work on any iterable:
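The examples that followed are not preserved in this copy. One combination that fits the description uses islice, chain and repeat; `first_n` is a hypothetical helper name, and the code is written for Python 3.

```python
from itertools import chain, islice, repeat


def first_n(iterable, n, fill=None):
    """Return the first n items, padded with fill if the input is short."""
    # chain(..., repeat(fill)) never runs out; islice cuts it at n items.
    return tuple(islice(chain(iterable, repeat(fill)), n))


a, b, c = first_n(iter([1, 2]), 3)
print(a, b, c)  # 1 2 None

x, y = first_n((i * i for i in range(10)), 2)
print(x, y)  # 0 1
```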