ActiveState Code

Recipe 389639: Default Dictionary


A dictionary with a default value (such as 0 or [], or whatever you want) for unassigned keys

Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
import copy

class DefaultDict(dict):
    """Dictionary with a default value for unknown keys."""
    def __init__(self, default):
        self.default = default

    def __getitem__(self, key):
        if key in self: 
            return self.get(key)
        else:
            ## Need copy in case self.default is something like []
            return self.setdefault(key, copy.deepcopy(self.default))

    def __copy__(self):
        copy = DefaultDict(self.default)
        copy.update(self)
        return copy

Discussion

If you assign d = DefaultDict(0) rather than d = {}, you are free to do d[key] += 1, without worrying if d[key] exists yet.

Similarly, after d = DefaultDict([]), you can do d[key].append(item), and it all works out. It wouldn't do to share the [] between keys, so __getitem__ needs to copy the default value.

Comments

  1. 1. At 3:32 a.m. on 28 feb 2005, Andreas Kloss said:

    Should __init__ call dict.__init__? I might be wrong, but I would assume that a __init__ function should by default call the parent class' __init__.

  2. 2. At 1:32 a.m. on 1 mar 2005, Andreas Kloss said:

    Another minor improvement: Simplify your code with kwargs. Using keyword arguments in the constructor, one could simplify the code even more. So, including my first advice, here's the code:

    import copy
    
    class DefaultDict(dict):
        """Dictionary with a default value for unknown keys."""
        def __init__(self, default, **items):
            dict.__init__(self, **items)
            self.default = default
    
        def __getitem__(self, key):
            if key in self:
                return self.get(key)
            else:
                ## Need copy in case self.default is something like []
                return self.setdefault(key, copy.deepcopy(self.default))
    
        def __copy__(self):
            return DefaultDict(self.default, **self)
    
  3. 3. At 6:46 p.m. on 8 mar 2005, Bogdano Arendartchuk said:

    What about "try .. except" instead of "if key in self"?

  4. 4. At 2:35 p.m. on 10 mar 2005, Lloyd Kvam said:

    *kwds has a trap. *kwds is a dict where all of the keys are legal python identifiers. Since we can not count on that in the general case, we can't use that trick here in the __copy__ method. However, the dict constructor accepts an argument which can be a dict or a list of pairs. We do not need *self in the __copy__ method. If the __init__ constructor omits the * before the items parameter, I think everything would work. Or better yet, support both approaches.

    class DefaultDict(dict):
        def __init__(self,default,list_or_dict=None,**kwds):
            super(DefaultDict, self).__init__(list_or_dict)
            self.update(kwds)
            self.default = default
    
        def __getitem__(self, key):
            return self.get(key, self.default)
    
        def __copy__(self):
            return DefaultDict(self.default, self)
    

    I think this works OK.

  5. 5. At 2:46 p.m. on 10 mar 2005, Lloyd Kvam said:

    *the change in __getitem__.* That was accidental. For a quick test I was not using a mutable default and did not bother with saving keys back into self. I did not mean to remove that feature. I was focused on the ** parameter issue.

  6. 6. At 2:26 a.m. on 15 mar 2005, Andreas Kloss said:

    try/except vs. if/then. You are right - try/except would be better in this case, because it conveys more clearly that "usually" we just return dict[key], and use the other way out only as an afterthought.

    Also it is faster in the "usual" case, since setting up the trap is faster than the double lookup to see if the key is in the dict before fetching it.

  7. 7. At 12:44 p.m. on 13 jun 2005, Tomi Silander said:

    Good Recipe. Thanks, just what I need. I think this could be a standard feature of the dictionary, since it is needed for constructing sparse "functions", which, I guess, is one of the key motivations for dictionaries. To be a useful abstraction, the function should know all its values. Currently, using get-method, the caller(!) has to know the default value and this is not good for modularity. Lately, I have been using:

    def sparse_func(d, val):
        return lambda k: d.get(k, val)
    

    but then I loose all the dictionary-operations. I wonder if this "sparse function" viewpoint would be better addressed by adding a __call__ method to the dictionary.

  8. 8. At 4:53 a.m. on 8 dec 2005, Andreas Eisele said:

    Make it faster for simple cases. This recipe is extremely useful, but suffers from the efficiency/generality tradeoff: in many important applications, the default value is immutable (0, '', or None), but calling "in", "setdefault", and "deepcopy" for new entries looks rather wasteful, given that one call to "setdefault" would suffice.

    The following code tries to remedy this (at the expense of some elegance), and also allows additional constructor arguments in the style of the dict() function in 2.3 and later.

    Insertion of defaults w/o copying is about 7 times faster than in the original version.

    Left out the copy method, as I am not sure a simple update is sufficient (wouldn't we have to copy the values recursively?)

    import copy
    
    class DefaultDictBase(dict):
        """Dictionary with a default value for unknown keys."""
        def __init__(self, default=None, *args, **kwargs):
            self.default = default
            self.update(dict(*args, **kwargs))
    
    class DefaultDictCopy(DefaultDictBase):
        """DefaultDict that copies its default value."""
        def __getitem__(self, key):
            if key in self: return self.get(key)
            return self.setdefault(key, copy.deepcopy(self.default))
    
    class DefaultDictNoCopy(DefaultDictBase):
        """DefaultDict that uses its default value as is."""
        def __getitem__(self, key):
            return self.setdefault(key, self.default)
    
    def defaultDict(default=None,*args,**kwargs):
        "create a DefaultDict suitable for the given default"
        if id(default)==id(copy.deepcopy(default)):
            return DefaultDictNoCopy(default,*args,**kwargs)
        return DefaultDictCopy(default,*args,**kwargs)
    
  9. 9. At 3:07 p.m. on 9 jun 2007, Andrew Dalke said:

    use __missing__ in Python 2.5. Python 2.5 added the __missing__ hook to the standard dictionary type "for letting subclasses provide a default value when a key isn't contained in the dictionary. When a key isn't found, the dictionary's __missing__(key) method will be called. "

Sign in to comment