Displays the total number of lines of code in a single source file, or in all files of the same type in a directory. The user provides either a filename including its extension, or just the extension preceded by a wildcard character.
#!/usr/local/bin/python
import re
import sys
import glob

# Regex for lines that begin with a comment marker: /*, //, # or a leading *
# (the continuation line of a C-style block comment).
expression = re.compile(r'^\s*(?:/\*|//|#|\*)')

def parse(sourcefile):
    # Return (line count, comment line count) for a single file.
    lcount, ccount = 0, 0
    try:
        source = open(sourcefile, 'r')
    except IOError:
        sys.exit('cannot open %s' % sourcefile)
    for line in source:
        lcount += 1
        if expression.match(line):
            ccount += 1
    source.close()
    return lcount, ccount

def main():
    # total line count, total comment count
    tlc = tcc = 0
    if not len(sys.argv) > 1:
        print('Provide filename or extension')
    else:
        for name in glob.glob(sys.argv[1]):
            lc, cc = parse(name)
            print('processing file: %s %s' % (name, lc))
            tlc += lc
            tcc += cc
        print('total lines = %s\ntotal comments = %s' % (tlc, tcc))

if __name__ == "__main__":
    main()
I wanted a quick and dirty way to determine the total number of lines in a Java project I was writing, and Python was the perfect choice for cooking up a script to do so. The tool is somewhat limited, as it only works for Java, C++, Python, and Perl files, but these are the languages students will be most familiar with.
Anyone care to handle docstrings? I wanted this the other day, but I need my (extensive :) docstrings to be counted as comments in order to get an accurate result. The above solution can be had in a single line of shell script using egrep. I'm looking for something that involves the Python parser, perhaps.