ActiveState Code

Recipe 66315: Case-insensitive Dictionary


A dictionary that has case-insensitive keys.

Python
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
class KeyInsensitiveDict:
    """Dictionary, that has case-insensitive keys.
    
    Keys are retained in their original form
    when queried with .keys() or .items().

    Implementation: An internal dictionary maps lowercase
    keys to (key,value) pairs. All key lookups are done
    against the lowercase keys, but all methods that expose
    keys to the user retrieve the original keys."""
    
    def __init__(self, dict=None):
        """Create an empty dictionary, or update from 'dict'."""
        self._dict = {}
        if dict:
            self.update(dict)

    def __getitem__(self, key):
        """Retrieve the value associated with 'key' (in any case)."""
        k = key.lower()
        return self._dict[k][1]

    def __setitem__(self, key, value):
        """Associate 'value' with 'key'. If 'key' already exists, but
        in different case, it will be replaced."""
        k = key.lower()
        self._dict[k] = (key, value)

    def has_key(self, key):
        """Case insensitive test wether 'key' exists."""
        k = key.lower()
        return self._dict.has_key(k)

    def keys(self):
        """List of keys in their original case."""
        return [v[0] for v in self._dict.values()]

    def values(self):
        """List of values."""
        return [v[1] for v in self._dict.values()]

    def items(self):
        """List of (key,value) pairs."""
        return self._dict.values()

    def get(self, key, default=None):
        """Retrieve value associated with 'key' or return default value
        if 'key' doesn't exist."""
        try:
            return self[key]
        except KeyError:
            return default

    def setdefault(self, key, default):
        """If 'key' doesn't exists, associate it with the 'default' value.
        Return value associated with 'key'."""
        if not self.has_key(key):
            self[key] = default
        return self[key]

    def update(self, dict):
        """Copy (key,value) pairs from 'dict'."""
        for k,v in dict.items():
            self[k] = v

    def __repr__(self):
        """String representation of the dictionary."""
        items = ", ".join([("%r: %r" % (k,v)) for k,v in self.items()])
        return "{%s}" % items

    def __str__(self):
        """String representation of the dictionary."""
        return repr(self)

Discussion

This class is useful when associating data with an environment that is case-insensitive, but retains case (such as the Windows filesystem).

The class could be expanded by allowing the user to define a 'translator'-function (defaults to string.lower) that is used to normalize the keys. One concrete example where such feature would be desirable is the IRC protocol, where "[]\~".lower() == "{}|^".

Comments

  1. 1. At 3:30 a.m. on 10 jul 2003, Richie Hindle said:

    Missing methods. You're missing a couple of dictionary methods:

    def __len__(self):
        """Returns the number of (key, value) pairs."""
        return len(self._dict)
    
    def clear(self):
        """Clears the dictionary."""
        self._dict.clear()
    
  2. 2. At 12:17 a.m. on 4 dec 2003, Benjamin Allfree said:

    One more. For testing 'key in dict'

    def __contains__(self, key):
        """Case insensitive test where key is in dict"""
        k = key.lower()
        return k in self._dict
    
  3. 3. At 6:59 p.m. on 14 jul 2005, Chris Hobbs said:

    Class Doesn't Respond as a Dictionary. This class doesn't respond to iteration as a normal dictionary. Here is an example of it failing:

    ===firstly with a normal dictionary

    >>> x = {}
    
    
    
    >>> x['fred'] = 76
    
    
    
    >>> x['joe'] = 98
    
    
    
    >>> for key in x:
    

    ... print key

    ...

    joe

    fred

    =============and then with the one proposed here

    >>> x = KeyInsensitiveDict()
    
    
    
    >>> x['fred'] = 76
    
    
    
    >>> x['joe'] = 98
    
    
    
    >>> for key in x:
    

    ... print key

    Traceback (most recent call last):

    File "", line 1, in ?

    File "CIMCommon.py", line 231, in __getitem__

    k = key.lower()
    

    AttributeError: 'int' object has no attribute 'lower'

    This would seem to be an error with the __getitem__ function.

  4. 4. At 6:46 a.m. on 16 jul 2005, Chris Hobbs said:

    Iterator Improvement. The problem described above of failing with the statement

    for key in dictionary:
    

    can be solved by adding simple __iter__() and next() methods. However, there is still an issue:

    >>> x = KeyInsensitiveDict()
    >>> x['fred'] = 47
    >>> print 'fred' in x
    True
    >>> print 'FRED' in x
    False
    
  5. 5. At 2:18 p.m. on 16 jul 2005, Chris Hobbs said:

    Reworked Version of Algorithm. In order to satisfy the requirements of supporting

    for key in dictionary:
    

    and

    if 'fred' in dictionary:
    

    I have augmented the code of this algorithm as follows. This seems to satisfy at least all of the tests which I have thrown at it:

    # ********************************************
    # class     caselessDict
    # purpose   emulate a normal Python dictionary
    #           but with keys which can accept the
    #           lower() method (typically strings).
    #           Accesses to the dictionary are
    #           case-insensitive but keys returned
    #           from the dictionary are always in
    #           the original case.
    # ********************************************
    
    class caselessDict:
        def __init__(self,inDict=None):
            """Constructor: takes conventional dictionary
               as input (or nothing)"""
            self.dict = {}
            if inDict != None:
                for key in inDict:
                    k = key.lower()
                    self.dict[k] = (key, inDict[key])
            self.keyList = self.dict.keys()
            return
    
        def __iter__(self):
            self.iterPosition = 0
            return(self)
    
        def next(self):
            if self.iterPosition >= len(self.keyList):
                raise StopIteration
            x = self.dict[self.keyList[self.iterPosition]][0]
            self.iterPosition += 1
            return x
    
        def __getitem__(self, key):
            k = key.lower()
            return self.dict[k][1]
    
        def __setitem__(self, key, value):
            k = key.lower()
            self.dict[k] = (key, value)
            self.keyList = self.dict.keys()
    
        def has_key(self, key):
            k = key.lower()
            return k in self.keyList
    
        def __len__(self):
            return len(self.dict)
    
        def keys(self):
            return [v[0] for v in self.dict.values()]
    
        def values(self):
            return [v[1] for v in self.dict.values()]
    
        def items(self):
            return self.dict.values()
    
        def __contains__(self, item):
            return self.dict.has_key(item.lower())
    
        def __repr__(self):
            items = ", ".join([("%r: %r" % (k,v)) for k,v in self.items()])
            return "{%s}" % items
    
        def __str__(self):
            return repr(self)
    
    # *****************************************
    # Test Code
    # *****************************************
    
    if __name__ == '__main__':
        foundError = False
    
        # firstly create an empty caselessDict
    

    (comment continued...)

  6. 6. At 2:18 p.m. on 16 jul 2005, Chris Hobbs said:

    (...continued from previous comment)

        x = caselessDict()
        x['frEd'] = 76
        x['jOe'] = 92
        x['bERT'] = 54
        x['Bert'] = 53
        if x['bert'] != 53:
            print "Error 1"
            foundError = True
        shouldBe = [ 'Bert', 'jOe', 'frEd' ]
        for key in x:
            if not key in shouldBe:
                print "Error 2"
                foundError = True
            else:
                shouldBe.remove(key)
        if len(shouldBe) != 0:
            print "Error 2a"
            foundError = True
        if not 'frEd' in x:
            print "Error 3"
            foundError = True
        if not 'fRed' in x:
            print "Error 4"
            foundError = True
        if 'fReda' in x:
            print "Error 5"
            foundError = True
        y = x.keys()
        if len(y) != 3:
            print "Error 6"
            foundError = True
        for yy in y:
            if (yy != 'Bert') and (yy != 'jOe') and (yy != 'frEd'):
                print "Error 7"
                foundError = True
        if x['FRED'] != 76:
            print "Error 8"
            foundError = True
        if x['joe'] != 92:
            print "Error 9"
            foundError = True
    
        # then create a caselessDict from an existing dictionary
    
        y = { 'frEd' : 76, 'jOe' : 92, 'Bert' : 53 }
        x = caselessDict(y)
        if x['bert'] != 53:
            print "Error 10"
            foundError = True
        shouldBe = [ 'Bert', 'jOe', 'frEd' ]
        for key in x:
            if not key in shouldBe:
                print "Error 11"
                foundError = True
            else:
                shouldBe.remove(key)
        if len(shouldBe) != 0:
            print "Error 11a"
            foundError = True
        if not 'frEd' in x:
            print "Error 3"
            foundError = True
        if not 'fRed' in x:
            print "Error 12"
            foundError = True
        y = x.keys()
        if len(y) != 3:
            print "Error 13"
            foundError = True
        for yy in y:
            if (yy != 'Bert') and (yy != 'jOe') and (yy != 'frEd'):
                print "Error 14"
                foundError = True
        if x['FRED'] != 76:
            print "Error 15"
            foundError = True
        if x['joe'] != 92:
            print "Error 16"
            foundError = True
        if foundError == False:
            print "No errors found"
    
  7. 7. At 10:59 a.m. on 5 sep 2005, Wolf Siepen said:

    Get method missing. To round off things:

    def get(self, key, alt):
        if self.has_key(key):
            return self.__getitem__(key)
        return alt
    
  8. 8. At 12:16 p.m. on 23 nov 2005, Jeff Donner said:

    Another approach, with 2.4. This relies on the fact that in 2.4 you can derive from 'dict' like any other class. Also, note that this /changes/ the key, to your canonical (here, lowercase) form; it doesn't leave it in the dictionary as given while continuing to match it case-insensitively as the previous examples do:

    class CaselessDict(dict):
        def __init__(self, other=None):
            if other:
                # Doesn't do keyword args
                if isinstance(other, dict):
                    for k,v in other.items():
                        dict.__setitem__(self, k.lower(), v)
                else:
                    for k,v in other:
                        dict.__setitem__(self, k.lower(), v)
    
        def __getitem__(self, key):
            return dict.__getitem__(self, key.lower())
    
        def __setitem__(self, key, value):
            dict.__setitem__(self, key.lower(), value)
    
        def __contains__(self, key):
            return dict.__contains__(self, key.lower())
    
        def has_key(self, key):
            return dict.has_key(self, key.lower())
    
        def get(self, key, def_val=None):
            return dict.get(self, key.lower(), def_val)
    
        def setdefault(self, key, def_val=None):
            return dict.setdefault(self, key.lower(), def_val)
    
        def update(self, other):
            for k,v in other.items():
                dict.__setitem__(self, k.lower(), v)
    
        def fromkeys(self, iterable, value=None):
            d = CaselessDict()
            for k in iterable:
                dict.__setitem__(d, k.lower(), value)
            return d
    
        def pop(self, key, def_val=None):
            return dict.pop(self, key.lower(), def_val)
    

    The idea is to override just what you must and leave as much as possible to the base class. As noted, it doesn't do keyword initialization. Other than that it should be pretty complete, and hopefully no slower than it must be.

Sign in to comment