A dictionary with a default value (such as 0 or [], or whatever you want) for unassigned keys
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | import copy
class DefaultDict(dict):
"""Dictionary with a default value for unknown keys."""
def __init__(self, default):
self.default = default
def __getitem__(self, key):
if key in self:
return self.get(key)
else:
## Need copy in case self.default is something like []
return self.setdefault(key, copy.deepcopy(self.default))
def __copy__(self):
copy = DefaultDict(self.default)
copy.update(self)
return copy
|
If you assign d = DefaultDict(0) rather than d = {}, you are free to do d[key] += 1, without worrying if d[key] exists yet.
Similarly, after d = DefaultDict([]), you can do d[key].append(item), and it all works out. It wouldn't do to share the [] between keys, so __getitem__ needs to copy the default value.
Tags: shortcuts
Should __init__ call dict.__init__? I might be wrong, but I would assume that a __init__ function should by default call the parent class' __init__.
Another minor improvement: Simplify your code with kwargs. Using keyword arguments in the constructor, one could simplify the code even more. So, including my first advice, here's the code:
What about "try .. except" instead of "if key in self"?
*kwds has a trap. *kwds is a dict where all of the keys are legal python identifiers. Since we can not count on that in the general case, we can't use that trick here in the __copy__ method. However, the dict constructor accepts an argument which can be a dict or a list of pairs. We do not need *self in the __copy__ method. If the __init__ constructor omits the * before the items parameter, I think everything would work. Or better yet, support both approaches.
I think this works OK.
*the change in __getitem__.* That was accidental. For a quick test I was not using a mutable default and did not bother with saving keys back into self. I did not mean to remove that feature. I was focused on the ** parameter issue.
try/except vs. if/then. You are right - try/except would be better in this case, because it conveys more clearly that "usually" we just return dict[key], and use the other way out only as an afterthought.
Also it is faster in the "usual" case, since setting up the trap is faster than the double lookup to see if the key is in the dict before fetching it.
Good Recipe. Thanks, just what I need. I think this could be a standard feature of the dictionary, since it is needed for constructing sparse "functions", which, I guess, is one of the key motivations for dictionaries. To be a useful abstraction, the function should know all its values. Currently, using get-method, the caller(!) has to know the default value and this is not good for modularity. Lately, I have been using:
but then I loose all the dictionary-operations. I wonder if this "sparse function" viewpoint would be better addressed by adding a __call__ method to the dictionary.
Make it faster for simple cases. This recipe is extremely useful, but suffers from the efficiency/generality tradeoff: in many important applications, the default value is immutable (0, '', or None), but calling "in", "setdefault", and "deepcopy" for new entries looks rather wasteful, given that one call to "setdefault" would suffice.
The following code tries to remedy this (at the expense of some elegance), and also allows additional constructor arguments in the style of the dict() function in 2.3 and later.
Insertion of defaults w/o copying is about 7 times faster than in the original version.
Left out the copy method, as I am not sure a simple update is sufficient (wouldn't we have to copy the values recursively?)
use __missing__ in Python 2.5. Python 2.5 added the __missing__ hook to the standard dictionary type "for letting subclasses provide a default value when a key isn't contained in the dictionary. When a key isn't found, the dictionary's __missing__(key) method will be called. "