
Fast factory function for Python 2.5's defaultdict that makes simple use of itertools. Equivalent to lambda: some_constant.

Python, 20 lines
>>> from collections import defaultdict
>>> from itertools import repeat
      
>>> d = defaultdict(repeat('').next)  # default to an empty string
>>> d['abc'] += 'more text'
>>> d['abc']
'more text'


>>> d = defaultdict(repeat('<missing>').next)  # default to '<missing>'
>>> d.update(name='John', action='ran')
>>> '%(name)s %(action)s to %(object)s' % d
'John ran to <missing>'


>>> d = defaultdict(repeat(0).next)  # default to zero
>>> for char in 'abracadabra':
...     d[char] += 1
...
>>> d.items()
[('a', 5), ('r', 2), ('b', 2), ('c', 1), ('d', 1)]

Using itertools.repeat(const).next is a pure C, fast version of lambda:const. This takes full advantage of the speed and convenience benefits of the new collections.defaultdict().
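
The recipe above is written for Python 2, where iterators expose a bound next method. As a rough sketch only, and assuming Python 3 (where that method is spelled __next__), the same idea might look like the following. The helper name constant_factory is an illustrative choice and not part of the original recipe.

```python
from collections import defaultdict
from itertools import repeat


def constant_factory(value):
    """Return a zero-argument callable that always returns `value`.

    Hypothetical helper name; assumes Python 3, where the iterator
    method is `__next__` instead of Python 2's `next`.
    """
    return repeat(value).__next__


# Count characters, defaulting each missing key to zero.
counts = defaultdict(constant_factory(0))
for char in 'abracadabra':
    counts[char] += 1

# Fill a format string, defaulting missing keys to a placeholder.
template = defaultdict(constant_factory('<missing>'))
template.update(name='John', action='ran')
sentence = '%(name)s %(action)s to %(object)s' % template
```

As in the original, the default value is shared between keys, so this pattern suits immutable constants such as numbers and strings.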

2 comments

Calvin Spealman 17 years, 1 month ago

just wrap it up.

>>> timeit.Timer('a()', 'a = lambda:"foobarbaz"').timeit()
0.75941300392150879

>>> timeit.Timer('a()', 'import itertools;a = itertools.repeat("foobarbaz").next').timeit()
0.50646805763244629
>>> timeit.Timer('a()', 'import itertools;a = (lambda s:itertools.repeat(s).next)("foobarbaz")').timeit()
0.40992307662963867

Surprisingly, these results are all consistent, even the last two! I really can't explain why, but creating a function that you pass the constant to, rather than calling itertools.repeat().next yourself, is actually faster on the calls themselves!

Raymond Hettinger (author) 17 years, 1 month ago

Meaningful timing results. The timeit module can be tricky to use. Try specifying exactly how many repetitions you want, and run the test multiple times in a single script. Try to get your system as quiet as possible (switch off other processes and network activity). Then you will have a fighting chance of getting results that make sense:

from timeit import Timer

setup = '''
import itertools
a = lambda: 'foobarbaz'
b = itertools.repeat('foobarbaz').next
c = (lambda s:itertools.repeat(s).next)("foobarbaz")
'''

for i in (1,2,3,4,5):
    for stmt in 'a() b() c()'.split():
        print stmt, min(Timer(stmt, setup).repeat(7, 1000000))
    print
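
For readers on Python 3, a comparable harness could be sketched as below. This is an adaptation under stated assumptions and not the script that produced any timings quoted here: it assumes Python 3 (print as a function, __next__ in place of next), and the helper name best_times and the reduced loop count in the sample call are illustrative choices.

```python
from timeit import Timer

# Same three callables as above, adapted for Python 3 (`__next__`).
SETUP = '''
import itertools
a = lambda: 'foobarbaz'
b = itertools.repeat('foobarbaz').__next__
c = (lambda s: itertools.repeat(s).__next__)('foobarbaz')
'''


def best_times(statements, setup=SETUP, repeat=7, number=1000000):
    """Map each statement to its fastest total time over `repeat` runs.

    Hypothetical helper: taking the minimum filters out runs that were
    slowed down by other activity on the machine.
    """
    return {
        stmt: min(Timer(stmt, setup).repeat(repeat, number))
        for stmt in statements
    }


# Light sample call; raise `number` for more stable measurements.
results = best_times(['a()', 'b()', 'c()'], repeat=3, number=100000)
for stmt, seconds in results.items():
    print(stmt, seconds)
```

Calling best_times several times in one script, as suggested above, shows whether the ordering of the three statements is stable from run to run.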
Created by Raymond Hettinger on Mon, 12 Feb 2007 (PSF)