Welcome, guest | Sign In | My Account | Store | Cart

Dictionary that remembers insertion order. Useful for editing XML files and config files where there are (key, value) pairs but the original order should be preserved. Similar to ordered dicts in PHP.

Python, 67 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
from collections import MutableMapping

class OrderedDict(dict, MutableMapping):

    # Methods with direct access to underlying attributes

    def __init__(self, *args, **kwds):
        if len(args) > 1:
            raise TypeError('expected at 1 argument, got %d', len(args))
        if not hasattr(self, '_keys'):
            self._keys = []
        self.update(*args, **kwds)

    def clear(self):
        del self._keys[:]
        dict.clear(self)

    def __setitem__(self, key, value):
        if key not in self:
            self._keys.append(key)
        dict.__setitem__(self, key, value)

    def __delitem__(self, key):
        dict.__delitem__(self, key)
        self._keys.remove(key)

    def __iter__(self):
        return iter(self._keys)

    def __reversed__(self):
        return reversed(self._keys)

    def popitem(self):
        if not self:
            raise KeyError
        key = self._keys.pop()
        value = dict.pop(self, key)
        return key, value

    def __reduce__(self):
        items = [[k, self[k]] for k in self]
        inst_dict = vars(self).copy()
        inst_dict.pop('_keys', None)
        return (self.__class__, (items,), inst_dict)

    # Methods with indirect access via the above methods

    setdefault = MutableMapping.setdefault
    update = MutableMapping.update
    pop = MutableMapping.pop
    keys = MutableMapping.keys
    values = MutableMapping.values
    items = MutableMapping.items

    def __repr__(self):
        pairs = ', '.join(map('%r: %r'.__mod__, self.items()))
        return '%s({%s})' % (self.__class__.__name__, pairs)

    def copy(self):
        return self.__class__(self)

    @classmethod
    def fromkeys(cls, iterable, value=None):
        d = cls()
        for key in iterable:
            d[key] = value
        return d

Overview

Proposed addition to Py2.7 and Py3.1. The code also runs under 2.6 and 3.0. See the PEP at: http://www.python.org/dev/peps/pep-0372/

Implements only the standard dict API (like that in UserDict) with no additional new methods. So, it has a zero learning curve and no burden on human memory. It is simply a dictionary that remembers insertion order.

Approach

An ordered list of added keys is maintained as the dictionary grows (credit this to Armin Ronacher; my original approach involved a cached sort).

The __reduce__ method is necessary because pickle's protocol 2 and 3 will try to make calls to __setitem__ before the OrderedDict has been initialized (this happens for any dict subclass). Also, the method is necessary so that redundant parts of the instance dictionary don't get saved; only the user's extra attributes and methods are of interest. The method also supports deepcopying.

The __eq__ method is left the same as regular dictionaries instead of trying to special case alternate behaviors depending on the input type. Any such special casing would likely be unintuitive to some users and could introduce Liskov violatons. Better to document that the normal order-insensitive equality test is being used just like regular dictionaries. If someone wants to explicitly make an order-based comparison, it is trivially easy to do with: list(od1)==list(od2).

Performance

Several methods like __contains__() and __getitem__() are not overridden, so their performance is just as fast as a regular dictionary.

Methods like __setitem__ and __delitem__ are overridden but have a fast path depending on whether or not the key already exists.

The pure python version is plenty fast for typical use cases, but some additional speed could be gained from a C version if needed (like we did for the heapq module).

Integration with Other Tools

The above code has been successfully tested for round-tripping with PyYAML:

>>> items = [('one', 1), ('two', 2), ('three',3), ('four',4), ('five',5)]
>>> ytext = yaml.dump(OrderedDict(items))
>>> print ytext
!!python/object/apply:collections.OrderedDict
- - [one, 1]
  - [two, 2]
  - [three, 3]
  - [four, 4]
  - [five, 5]

>>> yaml.load(ytext)
OrderedDict({'one': 1, 'two': 2, 'three': 3, 'four': 4, 'five': 5})

For json, the good news is that json's encoder respects odict's iteration order:

>>> items = [('one', 1), ('two', 2), ('three',3), ('four',4), ('five',5)]
>>> json.dumps(OrderedDict(items))
'{"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}'

The bad news is that the object_hook for json decoders will pass in an already built dictionary so that the order is lost before the object hook sees it:

>>> jtext = '{"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}'
>>> json.loads(jtext, object_hook=OrderedDict)
OrderedDict({u'four': 4, u'three': 3, u'five': 5, u'two': 2, u'one': 1})

1 comment

Joshua Bronson 14 years, 4 months ago  # | flag

The title says this implementation is for Py2.7 and Py3.1. What prevents it from working in Py2.6?

Created by Raymond Hettinger on Wed, 25 Feb 2009 (MIT)
Python recipes (4591)
Raymond Hettinger's recipes (97)

Required Modules

Other Information and Tasks