
The introduction of Python 3.0 brought incompatibility issues. This recipe presents a version of the helpful "md5sum" utility that is compatible with the new release. It was written from scratch with version 3.0 in mind and aims to be simple, readable, and useful to anyone who needs a cross-platform utility for calculating checksums with Python. There is no separate documentation, but the program should be easy to understand for anyone who knows even a little about one of our favorite languages.

Python, 19 lines
import os
import sys
import hashlib

def main():
    try:
        md5 = hashlib.md5()
        # Read in 1 MiB chunks so that large files need not fit in memory.
        with open(sys.argv[1], 'rb') as file:
            buffer = file.read(2 ** 20)
            while buffer:
                md5.update(buffer)
                buffer = file.read(2 ** 20)
        print(md5.hexdigest())
    except (IndexError, OSError):  # missing argument or unreadable file
        print('Usage: {0} <filename>'.format(os.path.basename(sys.argv[0])))

if __name__ == '__main__':
    main()
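
The chunked-read loop in the recipe can also be packaged as a function for use from other code. The following is only a sketch of that idea, not part of the original recipe: the name md5_of_file and its chunk_size parameter are hypothetical, and the 1 MiB default simply mirrors the 2 ** 20 used above.

```python
import hashlib

def md5_of_file(path, chunk_size=2 ** 20):
    """Return the hex MD5 digest of the file at path (hypothetical helper)."""
    md5 = hashlib.md5()
    with open(path, 'rb') as file:
        # iter() with a sentinel keeps calling file.read(chunk_size)
        # until it returns b'', which signals end of file.
        for chunk in iter(lambda: file.read(chunk_size), b''):
            md5.update(chunk)
    return md5.hexdigest()
```

Because the digest is updated incrementally, the result should not depend on the chunk size chosen; a smaller value only trades memory for more read calls.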

2 comments

Collin Stocks 15 years, 3 months ago

I suggest using 1<<20 instead of 2**20, but that is just my personal preference. It won't improve the speed of your program measurably.

Stephen Chappell (author) 12 years, 7 months ago

Thank you for suggesting the change! Since you reminded me of the shift operators, I have been using that idiom in my programs ever since your comment. However, if you look at how 2 ** 20 and 1 << 20 are compiled inside a function (using the dis module on example functions that you create), you will find that both expressions are evaluated at compile time and the same constant is stored. While the choice may alter the time it takes to compile the function or method definition, execution time should be exactly the same.
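
The check described above can be sketched as follows. The function names with_power and with_shift are made up for illustration, and the constant folding shown is a CPython implementation detail, so other interpreters or versions may behave differently.

```python
import dis

def with_power():
    return 2 ** 20

def with_shift():
    return 1 << 20

if __name__ == '__main__':
    for func in (with_power, with_shift):
        # On CPython the expression is folded when the function is compiled,
        # so the precomputed value should appear among the code constants.
        print(func.__name__, func.__code__.co_consts)
        dis.dis(func)
```

If the folding happens, the disassembly of each function should show the value being loaded directly, with no power or shift operation left to perform at call time.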