Dictionary that remembers insertion order. Useful for editing XML files and config files where there are (key, value) pairs but the original order should be preserved. Similar to ordered dicts in PHP.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 | from collections import MutableMapping
class OrderedDict(dict, MutableMapping):
# Methods with direct access to underlying attributes
def __init__(self, *args, **kwds):
if len(args) > 1:
raise TypeError('expected at 1 argument, got %d', len(args))
if not hasattr(self, '_keys'):
self._keys = []
self.update(*args, **kwds)
def clear(self):
del self._keys[:]
dict.clear(self)
def __setitem__(self, key, value):
if key not in self:
self._keys.append(key)
dict.__setitem__(self, key, value)
def __delitem__(self, key):
dict.__delitem__(self, key)
self._keys.remove(key)
def __iter__(self):
return iter(self._keys)
def __reversed__(self):
return reversed(self._keys)
def popitem(self):
if not self:
raise KeyError
key = self._keys.pop()
value = dict.pop(self, key)
return key, value
def __reduce__(self):
items = [[k, self[k]] for k in self]
inst_dict = vars(self).copy()
inst_dict.pop('_keys', None)
return (self.__class__, (items,), inst_dict)
# Methods with indirect access via the above methods
setdefault = MutableMapping.setdefault
update = MutableMapping.update
pop = MutableMapping.pop
keys = MutableMapping.keys
values = MutableMapping.values
items = MutableMapping.items
def __repr__(self):
pairs = ', '.join(map('%r: %r'.__mod__, self.items()))
return '%s({%s})' % (self.__class__.__name__, pairs)
def copy(self):
return self.__class__(self)
@classmethod
def fromkeys(cls, iterable, value=None):
d = cls()
for key in iterable:
d[key] = value
return d
|
Overview
Proposed addition to Py2.7 and Py3.1. The code also runs under 2.6 and 3.0. See the PEP at: http://www.python.org/dev/peps/pep-0372/
Implements only the standard dict API (like that in UserDict) with no additional new methods. So, it has a zero learning curve and no burden on human memory. It is simply a dictionary that remembers insertion order.
Approach
An ordered list of added keys is maintained as the dictionary grows (credit this to Armin Ronacher; my original approach involved a cached sort).
The __reduce__ method is necessary because pickle's protocol 2 and 3 will try to make calls to __setitem__ before the OrderedDict has been initialized (this happens for any dict subclass). Also, the method is necessary so that redundant parts of the instance dictionary don't get saved; only the user's extra attributes and methods are of interest. The method also supports deepcopying.
The __eq__ method is left the same as regular dictionaries instead of trying to special case alternate behaviors depending on the input type. Any such special casing would likely be unintuitive to some users and could introduce Liskov violatons. Better to document that the normal order-insensitive equality test is being used just like regular dictionaries. If someone wants to explicitly make an order-based comparison, it is trivially easy to do with: list(od1)==list(od2).
Performance
Several methods like __contains__() and __getitem__() are not overridden, so their performance is just as fast as a regular dictionary.
Methods like __setitem__ and __delitem__ are overridden but have a fast path depending on whether or not the key already exists.
The pure python version is plenty fast for typical use cases, but some additional speed could be gained from a C version if needed (like we did for the heapq module).
Integration with Other Tools
The above code has been successfully tested for round-tripping with PyYAML:
>>> items = [('one', 1), ('two', 2), ('three',3), ('four',4), ('five',5)]
>>> ytext = yaml.dump(OrderedDict(items))
>>> print ytext
!!python/object/apply:collections.OrderedDict
- - [one, 1]
- [two, 2]
- [three, 3]
- [four, 4]
- [five, 5]
>>> yaml.load(ytext)
OrderedDict({'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5})
For json, the good news is that json's encoder respects odict's iteration order:
>>> items = [('one', 1), ('two', 2), ('three',3), ('four',4), ('five',5)]
>>> json.dumps(OrderedDict(items))
'{"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}'
The bad news is that the object_hook for json decoders will pass in an already built dictionary so that the order is lost before the object hook sees it:
>>> jtext = '{"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}'
>>> json.loads(jtext, object_hook=OrderedDict)
OrderedDict({u'four': 4, u'three': 3, u'five': 5, u'two': 2, u'one': 1})
The title says this implementation is for Py2.7 and Py3.1. What prevents it from working in Py2.6?