
Python tuple unpacking works only for fixed-length sequences; that is, one cannot write something like:

for (x,y,z=0,*rest) in [(1,2,3), (4,5), (6,7,8,9,10)]:
    print x,y,z,rest

This recipe returns a pad function that implements this functionality (albeit with less concise syntax). Using the recipe, the example above can then be written as:

pad = padfactory(minLength=2,defaults=(0,),extraItems=True)
for x,y,z,rest in map(pad, [(1,2,3), (4,5), (6,7,8,9,10)]):
    print x,y,z,rest

Python, 54 lines
def demo():
    for firstname,lastname,age,skills in map(
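                # extraItems must be true here: the loop unpacks four names,
                # and the last record carries extra items (the skills)
                padfactory(minLength=1, defaults=["",0], extraItems=True),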
                [("Tony", "Caley", 23),
                 ("Mike", "Balboa"),
                 ["Aristotle"],
                 ("Nick", "Fulton", 31, "python", "C++", "oracle"),]):
        print "%s %s (%d): %s" % (firstname, lastname, age,
                                  ' '.join(skills) or None)

def padfactory(minLength=0, defaults=(), extraItems=False):
    '''Return a function that can pad variable-length iterables.

    The returned function f:iterable -> iterable attempts to return an
    iterable of the same type as the original; if this fails it returns a list.

    @param minLength: The minimum required length of the iterable.
    @param defaults: A sequence of default values to pad shorter iterables.
    @param extraItems: Controls what to do with iterables longer than
        minLength + len(defaults).
            - If extraItems is None, the extra items are ignored.
            - else if bool(extraItems) is True, the extra items are packed in
              a tuple which is appended to the returned iterable. An empty
              tuple is appended if there are no extra items.
            - else a ValueError is raised (longer iterables are not acceptable).
    '''
    # maximum sequence length (without considering extraItems)
    maxLength = minLength + len(defaults)
    from itertools import islice
    def closure(iterable):
        iterator = iter(iterable)   # make sure you can handle any iterable
        padded = list(islice(iterator,maxLength))
        # reject short iterables first; otherwise the islice below would be
        # called with a negative start index
        if len(padded) < minLength:
            raise ValueError("unpack iterable of smaller size")
        if len(padded) < maxLength:
            # extend by the slice of the missing defaults
            for default in islice(defaults, len(padded) - minLength,
                                  maxLength - minLength):
                padded.append(default)
        if extraItems:               # put the rest elements in a tuple
            padded.append(tuple(iterator))
        elif extraItems is None:     # silently ignore the rest of the iterable
            pass
        else:                        # should not have more elements
            try: iterator.next()
            except StopIteration: pass
            else: raise ValueError("unpack iterable of larger size")
        # try to return the same type as the original iterable;
        itype = type(iterable)
        if itype != list:
            try: padded = itype(padded)
            except TypeError: pass
        return padded
    return closure

Examples of usage could be:

- 3D code that also handles 2D objects by using a fixed value for the third dimension.
- Normalizing records and messages of varying verbosity.
- Others I might never even think of.
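As a rough illustration of the first use case, here is a minimal, self-contained sketch. It does not use padfactory; pad_point and default_z are hypothetical names introduced only for this example, and the behaviour is roughly what padfactory(minLength=2, defaults=(0.0,), extraItems=False) would give for tuples.

```python
def pad_point(point, default_z=0.0):
    """Return (x, y, z), filling in z when only (x, y) is supplied.

    Hypothetical helper for illustration; not part of the recipe.
    """
    coords = tuple(point)
    if len(coords) == 2:
        # 2D input: append the fixed third coordinate
        return coords + (default_z,)
    if len(coords) == 3:
        # 3D input: pass through unchanged
        return coords
    raise ValueError("expected 2 or 3 coordinates, got %d" % len(coords))


points = [(1.0, 2.0), (3.0, 4.0, 5.0)]
normalized = [pad_point(p) for p in points]
# normalized == [(1.0, 2.0, 0.0), (3.0, 4.0, 5.0)]
```

Every element of normalized can then be unpacked as x, y, z regardless of whether the input was 2D or 3D.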

6 comments

Greg Jorgensen 16 years, 8 months ago

Simpler variable-length tuple unpacking.

How about this simpler solution:

def vunpack(t, n, defaults=[]):
    ''' variable-length tuple unpacker

        by Greg Jorgensen, gregj@pdxperts.com, 3/31/2005

        return a tuple of the first n items from iterable t
        missing elements are populated from defaults if supplied,
        otherwise missing elements are None

        tests:
          vunpack((1,2,3,4), 3) => (1,2,3)
          vunpack((1,), 3) => (1,None,None)
          vunpack((1,2), 5, ('a','b','c','d','e')) => (1,2,'c','d','e')

        examples:
          (x,y) = vunpack((1,2,3,4), 2, (43,44)) => x=1, y=2
          (x,y) = vunpack((,), 2, (43,44)) => x=43, y=44
          for (x,y,z) in [vunpack(t, 3, (0,0,0)) for t in [(1,2,3),(4,5),(6,7,8,9,10)]]:
              print x, y, z
    '''

    result = list(defaults) + [None] * n
    result[0: len(t)] = list(t)
    return tuple(result[0: n])

Greg Jorgensen 16 years, 8 months ago

Even simpler variable-length tuple unpacking.

I can reduce that function even more:

def vunpack(t, n, defaults=[]):
    return tuple( (list(t) + list(defaults)[len(t):] + [None] * n)[0:n] )

One of the examples in my first posting has a typo. It should be:

# x,y = vunpack((), 2, (43,44)) => x=43, y=44

Greg Jorgensen PDXperts LLC Portland, Oregon USA

George Sakkis (author) 16 years, 8 months ago

Not quite the same.

There are several simplifications and differences in your solutions:

  • Is this a bug or a feature?

>>> vunpack((1,2), 5, ('c','d','e'))

(1, 2, 'e', None, None)

I would expect the answer to be (1,2,'c','d','e'). Essentially you assume that all the elements of t are optional, which corresponds to setting minLength=0 in my solution. However if len(defaults) < n, you pad the missing defaults with None.

  • Extra items (i.e. if len(t) > n) are always ignored; this corresponds to calling mine with extraItems=None, but the other options (extraItems=True or extraItems=False) are also useful in practice.

  • It cannot be used with arbitrary iterables since it uses len(t).

  • It's probably slower, as it creates longer lists than required and then truncates them.

George

Greg Jorgensen 16 years, 8 months ago

Re: Not quite the same.

> There are several simplifications and differences in your solutions:

Yes. I intended to show a general technique only.

> Is this a bug or a feature?
>
> >>> vunpack((1,2), 5, ('c','d','e'))
> (1, 2, 'e', None, None)

Feature. My function always returns a tuple of n elements. If the tuple passed to it (t) is longer than n, the first n elements are returned. If the tuple t is shorter than n, the missing elements are filled first with the corresponding elements from defaults, and then with None if there are no defaults left.

Obviously it would be easy to change my function to behave differently if that's what's wanted:

return tuple((list(t) + list(defaults) + [None] * n)[0:n])

would have the effect you describe.
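To make the difference between the two behaviours concrete, here is a small side-by-side sketch. The names vunpack_positional and vunpack_appended are made up for this comparison; the bodies are the two one-liners from this thread.

```python
def vunpack_positional(t, n, defaults=()):
    # defaults are aligned by position: defaults[i] can only fill slot i
    return tuple((list(t) + list(defaults)[len(t):] + [None] * n)[:n])


def vunpack_appended(t, n, defaults=()):
    # defaults are appended directly after the items that were supplied
    return tuple((list(t) + list(defaults) + [None] * n)[:n])


positional = vunpack_positional((1, 2), 5, ('c', 'd', 'e'))
appended = vunpack_appended((1, 2), 5, ('c', 'd', 'e'))
# positional == (1, 2, 'e', None, None)
# appended   == (1, 2, 'c', 'd', 'e')
```

In the positional variant, 'c' and 'd' are shadowed by the two supplied items, so only 'e' survives and the rest is None.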

I've actually had a need for that kind of thing several times. A similar case is the pad/justify a string problem in languages that don't have functions to do it. Some programmers write loops to add the pad character. Another way is to just prepend/append the maximum padding to the string and truncate the result.

> Extra items (i.e. if len(t) > n) are always ignored; this corresponds to calling mine with extraItems=None, but the other options (extraItems=True or extraItems=False) are also useful in practice.

The problem I thought we were solving was generating fixed-length tuples from a list (or iterable) of variable-length tuples, so the tuples could be unpacked.

If the extra items are needed my function would need some additional functionality. The extra items are still there in list(t)[n:].

> It cannot be used with arbitrary iterables since it uses len(t).

It could if it converted t to a list first:

a = list(t)
return tuple((a + list(defaults)[len(a):] + [None] * n)[0:n])
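Wrapped in a function, the two lines above might look like the sketch below. The name vunpack_iter is hypothetical; the point is only that materializing the input once makes len() usable for generators.

```python
def vunpack_iter(t, n, defaults=()):
    # materialize once so that len() also works for generators,
    # at the cost of consuming the whole iterable
    a = list(t)
    return tuple((a + list(defaults)[len(a):] + [None] * n)[:n])


squares = (i * i for i in range(2))      # a generator has no len()
result = vunpack_iter(squares, 3, (7, 8, 9))
# result == (0, 1, 9)
```

Note that list(t) reads the entire input, so this would not terminate on an infinite iterator.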

> It's probably slower as it creates longer than required lists and then truncates them.

I guess that might be relevant for some applications, but I would be surprised if one line of code that uses only simple operations on built-in types is slower than a function that includes multiple conditional tests, a loop, exceptions, and a closure.

Again, the point of posting my recipe is to show a technique, not a solution to every possible problem. In some circumstances a more general solution like yours would be appropriate.

Greg Jorgensen

George Sakkis (author) 16 years, 8 months ago  # | flag

> > Extra items (i.e. if len(t) > n) are always ignored; this corresponds to calling mine with extraItems=None, but the other options (extraItems=True or extraItems=False) are also useful in practice.
>
> The problem I thought we were solving was generating fixed-length tuples from a list (or iterable) of variable-length tuples, so the tuples could be unpacked.

Yes, the output is a generator of fixed-length tuples, but with the possibility of constraining the set of valid inputs, instead of accepting all variable-length iterables. More specifically, the problem was to simulate the argument passing rule used for callables, as in the following syntactically incorrect examples:

for (x,y,z=0) in [(1,2,3), (4,5), (6,7,8,9,10)]:
    print x,y,z
for (x,y,z=0,*rest) in [(1,2,3), (4,5), (6,7,8,9,10)]:
    print x,y,z,rest

In the first example, only iterables of length 2 or 3 would be accepted, so a ValueError would be raised when trying to unpack (6,7,8,9,10). In the second, iterables of length greater than or equal to 2 are valid, and the output would be length-4 tuples, with the last element itself being a tuple.
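The argument-passing rule being simulated can be observed directly by routing each tuple through a real function call. This is only an illustration of the semantics, not how the recipe is implemented; unpack3 and unpack3_rest are hypothetical names, and a real call raises TypeError for a bad argument count rather than the ValueError the recipe uses.

```python
def unpack3(x, y, z=0):
    # mirrors the hypothetical "for (x,y,z=0) in ..." form
    return x, y, z


def unpack3_rest(x, y, z=0, *rest):
    # mirrors the hypothetical "for (x,y,z=0,*rest) in ..." form
    return x, y, z, rest


rows = [(1, 2, 3), (4, 5), (6, 7, 8, 9, 10)]

with_rest = [unpack3_rest(*row) for row in rows]
# with_rest == [(1, 2, 3, ()), (4, 5, 0, ()), (6, 7, 8, (9, 10))]

# Without *rest, the five-item row is rejected by the call itself.
too_long_rejected = False
try:
    unpack3(*rows[2])
except TypeError:
    too_long_rejected = True
```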

> > It cannot be used with arbitrary iterables since it uses len(t).
>
> It could if it converted t to a list first:
>
> a = list(t)
> return tuple((a + list(defaults)[len(a):] + [None] * n)[0:n])

One more argument for the last point ;-)

> > It's probably slower as it creates longer than required lists and then truncates them.
>
> I guess that might be relevant for some applications, but I would be surprised if one line of code that uses only simple operations on built-in types is slower than a function that includes multiple conditional tests, a loop, exceptions, and a closure.

  • The closure is created only once for a given binding of the parameters; it can then be used for iterating over different loops as a normal function. Of course, it's trivial to have a function pad(iterable,minLength,defaults,extraItems) instead of a padfactory(minLength,defaults,extraItems) if this is preferable.

  • The exceptions are there for enforcing the constraints I mentioned above. Besides, the first try/except block is entered only if extraItems is False and the second attempts to return an instance of the same type as the input instead of returning always a tuple (this may or may not be desirable).

  • Perhaps you're right that in most practical cases with small input sequences, the one-liner would be as fast as or even faster than the larger function. I was commenting more on the algorithmic performance, especially when len(t) >> n, rather than the performance expected in real cases.
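The algorithmic point can be sketched with a lazy variant that reads at most n items from the input, so its cost does not grow with len(t) and it even works on infinite iterators. This is a simplified illustration: take_padded is a hypothetical name, and it pads with a single fill value rather than positional defaults.

```python
from itertools import chain, count, islice, repeat


def take_padded(iterable, n, fill=None):
    # chain the input with an endless supply of fill values, then cut at n;
    # islice stops after n items, so the rest of the input is never read
    return tuple(islice(chain(iterable, repeat(fill)), n))


from_infinite = take_padded(count(), 3)   # count() never ends
from_short = take_padded([1], 3)
# from_infinite == (0, 1, 2)
# from_short    == (1, None, None)
```

A list(t)-based approach would never return for count(), which is the extreme case of len(t) >> n.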

Greg Jorgensen 16 years, 8 months ago

Simple unpack idioms.

Since the number of variables on the left side of the assignment is known at compile time, typical cases can be handled without much fuss.

Unpacking from a list or tuple that may be too long:

>>> t = range(10)
>>> a,b,c = t[:3]
>>> a, b, c
(0, 1, 2)

Unpacking and catching all excess elements into a list:

>>> t = range(10)
>>> a,b,c,z = list(t[:3]) + [t[3:]]
>>> a,b,c,z
(0, 1, 2, [3, 4, 5, 6, 7, 8, 9])

Unpacking from a list/tuple that may be too short:

>>> t = range(2)
>>> a,b,c,d,e = (t + [99] * 5)[:5]
>>> a,b,c,d,e
(0, 1, 99, 99, 99)

Etc. There are plenty of variations on this, but the key point is that the number of items you need for assignment is known at compile time; it doesn't vary at runtime.

There are also some functions in itertools for doing things like this, and they work on any iterable:

>>> import itertools
>>> t = range(10)
>>> a,b,c,d,e = itertools.islice(t, 5)
>>> a,b,c,d,e
(0, 1, 2, 3, 4)

Created by George Sakkis on Thu, 24 Mar 2005 (PSF)