Welcome, guest | Sign In | My Account | Store | Cart

A dictionary that has case-insensitive keys.

Python, 73 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
class KeyInsensitiveDict:
    """Dictionary, that has case-insensitive keys.
    
    Keys are retained in their original form
    when queried with .keys() or .items().

    Implementation: An internal dictionary maps lowercase
    keys to (key,value) pairs. All key lookups are done
    against the lowercase keys, but all methods that expose
    keys to the user retrieve the original keys."""
    
    def __init__(self, dict=None):
        """Create an empty dictionary, or update from 'dict'."""
        self._dict = {}
        if dict:
            self.update(dict)

    def __getitem__(self, key):
        """Retrieve the value associated with 'key' (in any case)."""
        k = key.lower()
        return self._dict[k][1]

    def __setitem__(self, key, value):
        """Associate 'value' with 'key'. If 'key' already exists, but
        in different case, it will be replaced."""
        k = key.lower()
        self._dict[k] = (key, value)

    def has_key(self, key):
        """Case insensitive test wether 'key' exists."""
        k = key.lower()
        return self._dict.has_key(k)

    def keys(self):
        """List of keys in their original case."""
        return [v[0] for v in self._dict.values()]

    def values(self):
        """List of values."""
        return [v[1] for v in self._dict.values()]

    def items(self):
        """List of (key,value) pairs."""
        return self._dict.values()

    def get(self, key, default=None):
        """Retrieve value associated with 'key' or return default value
        if 'key' doesn't exist."""
        try:
            return self[key]
        except KeyError:
            return default

    def setdefault(self, key, default):
        """If 'key' doesn't exists, associate it with the 'default' value.
        Return value associated with 'key'."""
        if not self.has_key(key):
            self[key] = default
        return self[key]

    def update(self, dict):
        """Copy (key,value) pairs from 'dict'."""
        for k,v in dict.items():
            self[k] = v

    def __repr__(self):
        """String representation of the dictionary."""
        items = ", ".join([("%r: %r" % (k,v)) for k,v in self.items()])
        return "{%s}" % items

    def __str__(self):
        """String representation of the dictionary."""
        return repr(self)

This class is useful when associating data with an environment that is case-insensitive, but retains case (such as the Windows filesystem).

The class could be expanded by allowing the user to define a 'translator'-function (defaults to string.lower) that is used to normalize the keys. One concrete example where such feature would be desirable is the IRC protocol, where "[]\~".lower() == "{}|^".

9 comments

Richie Hindle 18 years, 10 months ago  # | flag

Missing methods. You're missing a couple of dictionary methods:

def __len__(self):
    """Returns the number of (key, value) pairs."""
    return len(self._dict)

def clear(self):
    """Clears the dictionary."""
    self._dict.clear()
Benjamin Allfree 18 years, 5 months ago  # | flag

One more. For testing 'key in dict'

def __contains__(self, key):
    """Case insensitive test where key is in dict"""
    k = key.lower()
    return k in self._dict
Chris Hobbs 16 years, 10 months ago  # | flag

Class Doesn't Respond as a Dictionary. This class doesn't respond to iteration as a normal dictionary. Here is an example of it failing:

===firstly with a normal dictionary

>>> x = {}



>>> x['fred'] = 76



>>> x['joe'] = 98



>>> for key in x:

... print key

...

joe

fred

=============and then with the one proposed here

>>> x = KeyInsensitiveDict()



>>> x['fred'] = 76



>>> x['joe'] = 98



>>> for key in x:

... print key

Traceback (most recent call last):

File "", line 1, in ?

File "CIMCommon.py", line 231, in __getitem__

k = key.lower()

AttributeError: 'int' object has no attribute 'lower'

This would seem to be an error with the __getitem__ function.

Chris Hobbs 16 years, 10 months ago  # | flag

Iterator Improvement. The problem described above of failing with the statement

for key in dictionary:

can be solved by adding simple __iter__() and next() methods. However, there is still an issue:

>>> x = KeyInsensitiveDict()
>>> x['fred'] = 47
>>> print 'fred' in x
True
>>> print 'FRED' in x
False
Chris Hobbs 16 years, 10 months ago  # | flag

Reworked Version of Algorithm. In order to satisfy the requirements of supporting

for key in dictionary:

and

if 'fred' in dictionary:

I have augmented the code of this algorithm as follows. This seems to satisfy at least all of the tests which I have thrown at it:

# ********************************************
# class     caselessDict
# purpose   emulate a normal Python dictionary
#           but with keys which can accept the
#           lower() method (typically strings).
#           Accesses to the dictionary are
#           case-insensitive but keys returned
#           from the dictionary are always in
#           the original case.
# ********************************************

class caselessDict:
    def __init__(self,inDict=None):
        """Constructor: takes conventional dictionary
           as input (or nothing)"""
        self.dict = {}
        if inDict != None:
            for key in inDict:
                k = key.lower()
                self.dict[k] = (key, inDict[key])
        self.keyList = self.dict.keys()
        return

    def __iter__(self):
        self.iterPosition = 0
        return(self)

    def next(self):
        if self.iterPosition >= len(self.keyList):
            raise StopIteration
        x = self.dict[self.keyList[self.iterPosition]][0]
        self.iterPosition += 1
        return x

    def __getitem__(self, key):
        k = key.lower()
        return self.dict[k][1]

    def __setitem__(self, key, value):
        k = key.lower()
        self.dict[k] = (key, value)
        self.keyList = self.dict.keys()

    def has_key(self, key):
        k = key.lower()
        return k in self.keyList

    def __len__(self):
        return len(self.dict)

    def keys(self):
        return [v[0] for v in self.dict.values()]

    def values(self):
        return [v[1] for v in self.dict.values()]

    def items(self):
        return self.dict.values()

    def __contains__(self, item):
        return self.dict.has_key(item.lower())

    def __repr__(self):
        items = ", ".join([("%r: %r" % (k,v)) for k,v in self.items()])
        return "{%s}" % items

    def __str__(self):
        return repr(self)

# *****************************************
# Test Code
# *****************************************

if __name__ == '__main__':
    foundError = False

    # firstly create an empty caselessDict

(comment continued...)

Chris Hobbs 16 years, 10 months ago  # | flag

(...continued from previous comment)

    x = caselessDict()
    x['frEd'] = 76
    x['jOe'] = 92
    x['bERT'] = 54
    x['Bert'] = 53
    if x['bert'] != 53:
        print "Error 1"
        foundError = True
    shouldBe = [ 'Bert', 'jOe', 'frEd' ]
    for key in x:
        if not key in shouldBe:
            print "Error 2"
            foundError = True
        else:
            shouldBe.remove(key)
    if len(shouldBe) != 0:
        print "Error 2a"
        foundError = True
    if not 'frEd' in x:
        print "Error 3"
        foundError = True
    if not 'fRed' in x:
        print "Error 4"
        foundError = True
    if 'fReda' in x:
        print "Error 5"
        foundError = True
    y = x.keys()
    if len(y) != 3:
        print "Error 6"
        foundError = True
    for yy in y:
        if (yy != 'Bert') and (yy != 'jOe') and (yy != 'frEd'):
            print "Error 7"
            foundError = True
    if x['FRED'] != 76:
        print "Error 8"
        foundError = True
    if x['joe'] != 92:
        print "Error 9"
        foundError = True

    # then create a caselessDict from an existing dictionary

    y = { 'frEd' : 76, 'jOe' : 92, 'Bert' : 53 }
    x = caselessDict(y)
    if x['bert'] != 53:
        print "Error 10"
        foundError = True
    shouldBe = [ 'Bert', 'jOe', 'frEd' ]
    for key in x:
        if not key in shouldBe:
            print "Error 11"
            foundError = True
        else:
            shouldBe.remove(key)
    if len(shouldBe) != 0:
        print "Error 11a"
        foundError = True
    if not 'frEd' in x:
        print "Error 3"
        foundError = True
    if not 'fRed' in x:
        print "Error 12"
        foundError = True
    y = x.keys()
    if len(y) != 3:
        print "Error 13"
        foundError = True
    for yy in y:
        if (yy != 'Bert') and (yy != 'jOe') and (yy != 'frEd'):
            print "Error 14"
            foundError = True
    if x['FRED'] != 76:
        print "Error 15"
        foundError = True
    if x['joe'] != 92:
        print "Error 16"
        foundError = True
    if foundError == False:
        print "No errors found"
Wolf Siepen 16 years, 8 months ago  # | flag

Get method missing. To round off things:

def get(self, key, alt):
    if self.has_key(key):
        return self.__getitem__(key)
    return alt
Jeff Donner 16 years, 6 months ago  # | flag

Another approach, with 2.4. This relies on the fact that in 2.4 you can derive from 'dict' like any other class. Also, note that this /changes/ the key, to your canonical (here, lowercase) form; it doesn't leave it in the dictionary as given while continuing to match it case-insensitively as the previous examples do:

class CaselessDict(dict):
    def __init__(self, other=None):
        if other:
            # Doesn't do keyword args
            if isinstance(other, dict):
                for k,v in other.items():
                    dict.__setitem__(self, k.lower(), v)
            else:
                for k,v in other:
                    dict.__setitem__(self, k.lower(), v)

    def __getitem__(self, key):
        return dict.__getitem__(self, key.lower())

    def __setitem__(self, key, value):
        dict.__setitem__(self, key.lower(), value)

    def __contains__(self, key):
        return dict.__contains__(self, key.lower())

    def has_key(self, key):
        return dict.has_key(self, key.lower())

    def get(self, key, def_val=None):
        return dict.get(self, key.lower(), def_val)

    def setdefault(self, key, def_val=None):
        return dict.setdefault(self, key.lower(), def_val)

    def update(self, other):
        for k,v in other.items():
            dict.__setitem__(self, k.lower(), v)

    def fromkeys(self, iterable, value=None):
        d = CaselessDict()
        for k in iterable:
            dict.__setitem__(d, k.lower(), value)
        return d

    def pop(self, key, def_val=None):
        return dict.pop(self, key.lower(), def_val)

The idea is to override just what you must and leave as much as possible to the base class. As noted, it doesn't do keyword initialization. Other than that it should be pretty complete, and hopefully no slower than it must be.

Matti Uusitalo 11 years, 4 months ago  # | flag

Don't make a case insensitive dictionary. Implement a case insensitive version of a string instead. In that way, you can add case insensitive functionality with any collection.

Created by Sami Hangaslammi on Mon, 23 Jul 2001 (PSF)
Python recipes (4591)
Sami Hangaslammi's recipes (7)

Required Modules

  • (none specified)

Other Information and Tasks