
Compare two images using their root-mean-square (RMS) difference. A result close to 0 means a good match.

Python, 26 lines
# Original function

import ImageChops
import math, operator

def rmsdiff_1997(im1, im2):
    "Calculate the root-mean-square difference between two images"

    h = ImageChops.difference(im1, im2).histogram()

    # calculate rms
    return math.sqrt(reduce(operator.add,
        map(lambda h, i: h*(i**2), h, range(256))
    ) / (float(im1.size[0]) * im1.size[1]))

# The 2011 version using more recent Python idioms

def rmsdiff_2011(im1, im2):
    "Calculate the root-mean-square difference between two images"
    diff = ImageChops.difference(im1, im2)
    h = diff.histogram()
    sq = (value*(idx**2) for idx, value in enumerate(h))
    sum_of_squares = sum(sq)
    rms = math.sqrt(sum_of_squares/float(im1.size[0] * im1.size[1]))
    return rms

Image analysis is a science in itself, as visual perception is very complicated, but sometimes it is possible to do things simply. The general use case seems to be to look for and highlight differences. For this it's difficult to beat the compare suite of ImageMagick. Of course, you can roll your own equivalent with Python and PIL.

However, I wanted a measure of "closeness" between two images: I am comparing a host of websites and checking that the right logo is more or less in the right place. This is incredibly easy to do with the naked eye but surprisingly difficult programmatically; at least I've found it so. Of course, effbot has already provided a basic comparison implementation based on the root mean square: http://effbot.org/zone/pil-comparing-images.htm. As I couldn't initially get this to work on my system, I worked through it and was reminded of something Guido said at PyCon 2011 about accepting map, filter and reduce too easily into the language. Taken on their own these functions are just about readable, but throw in some lambdas and I, at least, am lost. If anyone else is going to work with your code, think about adding one or two lines for readability.

Fortunately, the introduction of list comprehensions and some aggregate functions (sum, max, etc.) has made map, filter and reduce more or less optional for general code, with reduce being moved to the functools module for specific cases. Stepping through the 2011 version, which unpacks the nested calls of the original:

sq is a generator expression that works through the histogram of the difference image. enumerate handily gives us the index that we used to have to generate from the length of the list or a counter. The left-to-right ordering in the expression makes it easier to see that we are working on the values from the for loop, avoiding the need for a lambda. We use a generator expression because this is only an intermediate result.

sum_of_squares simply adds all the items in the expression and is directly equivalent to reduce(operator.add, sq). Big win in readability!

rms gives the square root of the sum of squares divided by the number of pixels. It would be easy enough to plug sum(sq) in directly for brevity while maintaining clarity, but it is spelled out here for didactic purposes. Having to use float() to avoid integer division (this is written for Python 2.x) is probably the biggest wart in this line: as long as any of the numbers in the calculation is a float, the result will also be a float.
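
To make the equivalence between the two styles concrete, here is a minimal sketch that runs both on a hand-made single-band histogram instead of a real image, so PIL is not needed. It assumes Python 3 (hence the functools import and no float() call); the names rms_reduce and rms_sum and the sample histogram are invented for illustration and are not part of the recipe.

```python
import math
import operator
from functools import reduce  # reduce is not a builtin in Python 3


def rms_reduce(hist, num_pixels):
    """1997 style: map + lambda + reduce over a 256-entry histogram."""
    total = reduce(operator.add,
                   map(lambda h, i: h * (i ** 2), hist, range(256)))
    return math.sqrt(total / num_pixels)


def rms_sum(hist, num_pixels):
    """2011 style: generator expression + sum."""
    sum_of_squares = sum(value * (idx ** 2) for idx, value in enumerate(hist))
    return math.sqrt(sum_of_squares / num_pixels)


# Hypothetical histogram of a 10 x 10 single-band difference image:
# 90 pixels are identical (difference 0), 10 pixels differ by 20.
hist = [0] * 256
hist[0] = 90
hist[20] = 10

print(rms_reduce(hist, 100))
print(rms_sum(hist, 100))
```

Both calls print the same value, the square root of 40 (about 6.32), since the only non-zero contribution is 10 * 20**2 = 4000 spread over 100 pixels.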

The code isn't perfect for what I want but is okay for my current needs.

I'm including both implementations of the code for easy comparison. The Python Imaging Library from http://www.pythonware.com/ is required for this.


Sunjay Varma 13 years ago

Can you post an example of how to use the results of the functions?


I cannot really, because that depends very much on your use case. For my current project, comparing images of 70 x 70 pixels, I have found 70 to be about the upper limit for two images to be visually fairly close, but I would expect that to vary wildly in a different environment, and you might want to tweak the comparison by, say, looking only at the dominant colour band.


images_are_the_same = rmsdiff_2011(im1, im2) < 70

Mark Krautheim 13 years ago

I'm sure this works for you as you say, but it seems like a slightly more 'correct' version (assuming the images are RGB) would have this line:

sq = (value*(idx**2) for idx, value in enumerate(h))

modified thusly:

sq = (value*((idx%256)**2) for idx, value in enumerate(h))

because according to the PIL documentation:

"The histogram is returned as a list of pixel counts, one for each pixel value in the source image. If the image has more than one band, the histograms for all bands are concatenated (for example, the histogram for an "RGB" image contains 768 values)."

As originally written, I think more weight would be given to blue pixels than to green, and more to green than to red, right?

Thanks for posting this, I was in need of such a function.
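
The weighting effect raised in this comment can be checked without PIL. The following is a rough sketch on synthetic histograms, assuming the 768-entry concatenated layout described in the quoted documentation. The helper rgb_histogram and the names rms_plain and rms_per_band are hypothetical and exist only for this illustration; Python 3 is assumed.

```python
import math


def rms_plain(hist, num_pixels):
    """Index used as-is, so later bands are offset by 256, 512, ..."""
    return math.sqrt(sum(v * (i ** 2) for i, v in enumerate(hist)) / num_pixels)


def rms_per_band(hist, num_pixels):
    """Index wrapped with % 256, so every band uses the same 0-255 scale."""
    return math.sqrt(
        sum(v * ((i % 256) ** 2) for i, v in enumerate(hist)) / num_pixels)


def rgb_histogram(band, diff, num_pixels):
    """Hypothetical helper: concatenated R, G, B histogram of a difference
    image in which every pixel differs by `diff` in one band (0=R, 1=G, 2=B)
    and by 0 in the other two."""
    hist = [0] * 768
    for b in range(3):
        value = diff if b == band else 0
        hist[b * 256 + value] += num_pixels
    return hist


red_only = rgb_histogram(0, 10, 100)   # difference of 10 in red only
blue_only = rgb_histogram(2, 10, 100)  # difference of 10 in blue only
identical = rgb_histogram(0, 0, 100)   # no difference in any band

print(rms_per_band(red_only, 100), rms_per_band(blue_only, 100))
print(rms_plain(red_only, 100), rms_plain(blue_only, 100))
print(rms_per_band(identical, 100), rms_plain(identical, 100))
```

Under these assumptions the wrapped version returns the same value for the red-only and blue-only cases, while the unwrapped version does not. The unwrapped version also does not return 0 for identical RGB histograms, because the zero-difference bins of the second and third bands sit at indices 256 and 512.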