ActiveState Code

Recipe 576644: Diff Two Dictionaries


Diff two dictionaries, returning just the differences. If a key is not found in one of the dictionaries, its value is represented by the string "<KEYNOTFOUND>". If there is a better way, please share. :)

Python
KEYNOTFOUND = '<KEYNOTFOUND>'       # Placeholder for a key missing from one dict

def dict_diff(first, second):
    """ Return a dict of keys that differ between two config dicts.  If a key is
        not found in one of the dicts, its value is represented by KEYNOTFOUND.
        @param first:   First dictionary to diff.
        @param second:  Second dictionary to diff.
        @return diff:   Dict of Key => (first.val, second.val)
    """
    diff = {}
    # Check all keys in first dict
    for key in first:
        # "key in dict" replaces dict.has_key(), which was removed in Python 3
        if key not in second:
            diff[key] = (first[key], KEYNOTFOUND)
        elif first[key] != second[key]:
            diff[key] = (first[key], second[key])
    # Check all keys in second dict to find missing
    for key in second:
        if key not in first:
            diff[key] = (KEYNOTFOUND, second[key])
    return diff
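
A quick usage sketch, assuming two small config-style dicts (`old_config` and `new_config` are made-up names for illustration). The function body is repeated here only so the snippet runs on its own.

```python
KEYNOTFOUND = '<KEYNOTFOUND>'

def dict_diff(first, second):
    """Same logic as the recipe, repeated so this example is self-contained."""
    diff = {}
    for key in first:
        if key not in second:
            diff[key] = (first[key], KEYNOTFOUND)
        elif first[key] != second[key]:
            diff[key] = (first[key], second[key])
    for key in second:
        if key not in first:
            diff[key] = (KEYNOTFOUND, second[key])
    return diff

# Hypothetical sample data, not from the recipe.
old_config = {'host': 'localhost', 'port': 8080, 'debug': True}
new_config = {'host': 'localhost', 'port': 9090, 'verbose': False}

result = dict_diff(old_config, new_config)
# 'host' is equal in both, so it is left out of the result:
# result == {'port': (8080, 9090),
#            'debug': (True, '<KEYNOTFOUND>'),
#            'verbose': ('<KEYNOTFOUND>', False)}
print(result)
```

One thing to watch: because the placeholder is an ordinary string, a dict that really stores the value '<KEYNOTFOUND>' cannot be told apart from a missing key in the output.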

Comments

  1. At 10:43 a.m. on 5 Feb 2009, Raymond Hettinger said:

    Try using set.symmetric_difference():

    >>> first = dict(a=1, b=2)
    >>> second = dict(b=2, c=3)
    >>> set(first) ^ set(second)
    set(['a', 'c'])
    
  2. At 5:16 p.m. on 8 Feb 2009, David Lambert said:

    Since dictionaries also have values, you'd also have to check that the values are equal for every key in the intersection of the two sets of dictionary keys. If every key in a "config object" always maps to the same known value, then sets, not dictionaries, were probably a better data structure in the first place.
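
The two comments above can be combined into one pass: take the symmetric difference of the keys, then compare values over the intersection. The following is only a sketch, assuming Python 3, where `dict.keys()` returns a set-like view that supports `^` and `&` directly. The name `dict_diff_views` and its `missing` parameter are made up for illustration.

```python
def dict_diff_views(first, second, missing='<KEYNOTFOUND>'):
    """Hypothetical helper: diff two dicts with set operations on their key views."""
    diff = {}
    # Keys present in exactly one dict (the symmetric difference).
    for key in first.keys() ^ second.keys():
        if key in first:
            diff[key] = (first[key], missing)
        else:
            diff[key] = (missing, second[key])
    # Keys present in both dicts: report only those whose values differ.
    for key in first.keys() & second.keys():
        if first[key] != second[key]:
            diff[key] = (first[key], second[key])
    return diff

first = dict(a=1, b=2, d=4)
second = dict(b=2, c=3, d=5)
changes = dict_diff_views(first, second)
# changes == {'a': (1, '<KEYNOTFOUND>'), 'c': ('<KEYNOTFOUND>', 3), 'd': (4, 5)}
print(changes)
```

Sets are unordered, so the order of keys in the result is not guaranteed; compare with `==` rather than relying on the printed order.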

  3. At 11:29 a.m. on 9 Feb 2009, Michael Shepanski (the author) said:

    Good comments, thanks guys. :)

  4. At 3:42 a.m. on 19 Feb 2009, Tijmen said:

    I did not think of this myself but definitely wanted to share:

    def not_in_list(list1, list2):
        dict1 = dict(zip(list1, list1))
        returnList2 = [x for x in list2 if x not in dict1]

        return returnList2
    
  5. At 9:51 a.m. on 6 Mar 2009, Radu Brumariu said:

    diff = dict()

    # list() keeps the concatenation working on Python 3 as well
    for key in list(first) + list(second):
        try:
            x = first[key]
        except KeyError:
            diff[key] = (None, second[key])
        try:
            x = second[key]
        except KeyError:
            diff[key] = (first[key], None)

    print(diff)

  6. At 7:05 a.m. on 14 Jul 2009, Lautaro Pecile said:

    A little more verbose version.

    KEYNOTFOUNDIN1 = '<KEYNOTFOUNDIN1>'       # Key missing from the first dict
    KEYNOTFOUNDIN2 = '<KEYNOTFOUNDIN2>'       # Key missing from the second dict

    def dict_diff(first, second):
        """ Return a dict of keys that differ between two config dicts.  If a key is
            not found in one of the dicts, it maps to KEYNOTFOUNDIN1 or KEYNOTFOUNDIN2.
            @param first:   First dictionary to diff.
            @param second:  Second dictionary to diff.
            @return diff:   Dict of Key => (first.val, second.val) for changed values,
                            or Key => KEYNOTFOUNDIN1/KEYNOTFOUNDIN2 for missing keys
        """
        diff = {}
        sd1 = set(first)
        sd2 = set(second)
        # Keys missing in the second dict
        for key in sd1.difference(sd2):
            diff[key] = KEYNOTFOUNDIN2
        # Keys missing in the first dict
        for key in sd2.difference(sd1):
            diff[key] = KEYNOTFOUNDIN1
        # Check for differences
        for key in sd1.intersection(sd2):
            if first[key] != second[key]:
                diff[key] = (first[key], second[key])
        return diff
    
  7. At 6:16 a.m. on 18 Jul 2009, Hugh Brown said:

    Here's my code for this:

    class DictDiffer(object):
      """
      Calculate the difference between two dictionaries as:
      (1) items added
      (2) items removed
      (3) keys same in both but changed values
      (4) keys same in both and unchanged values
      """
      def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)
      def added(self):
        return self.set_current - self.intersect
      def removed(self):
        return self.set_past - self.intersect
      def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])
      def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])
    

    Not quite the same interface for output, but the algorithm is pretty clean.
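
To show how the four categories map back to the recipe's output format, here is a rough sketch. It recomputes the same key sets with a hypothetical function, `classify_keys`, instead of the class above so that it runs on its own. `as_recipe_diff` is also a made-up name, and treating `past` as the recipe's `first` and `current` as its `second` is an assumption.

```python
KEYNOTFOUND = '<KEYNOTFOUND>'

def classify_keys(current, past):
    """Hypothetical stand-in for DictDiffer: return (added, removed, changed, unchanged)."""
    current_keys, past_keys = set(current), set(past)
    shared = current_keys & past_keys
    added = current_keys - shared
    removed = past_keys - shared
    changed = set(k for k in shared if past[k] != current[k])
    unchanged = set(k for k in shared if past[k] == current[k])
    return added, removed, changed, unchanged

def as_recipe_diff(current, past):
    """Build the recipe-style dict of key => (past value, current value)."""
    added, removed, changed, _ = classify_keys(current, past)
    diff = {}
    for key in removed:
        diff[key] = (past[key], KEYNOTFOUND)
    for key in added:
        diff[key] = (KEYNOTFOUND, current[key])
    for key in changed:
        diff[key] = (past[key], current[key])
    return diff

# Hypothetical sample data.
past = {'a': 1, 'b': 2, 'c': 3}
current = {'b': 2, 'c': 30, 'd': 4}

added, removed, changed, unchanged = classify_keys(current, past)
# added == {'d'}, removed == {'a'}, changed == {'c'}, unchanged == {'b'}
recipe_style = as_recipe_diff(current, past)
# recipe_style == {'a': (1, '<KEYNOTFOUND>'), 'd': ('<KEYNOTFOUND>', 4), 'c': (3, 30)}
print(recipe_style)
```

Keeping the four sets separate, as the class does, makes it easy to ask only for additions or only for removals; the dict form is handier when the old and new values are needed side by side.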
