This code is part of MoinMoin (http://moin.sourceforge.net/) and converts Python source code to HTML markup, rendering comments, keywords, operators, numeric and string literals in different colors.
It shows how to use the built-in keyword, token and tokenize modules to scan Python source code and re-emit it with no changes to its original formatting (which is the hard part).
The test code at the bottom of the module formats itself and launches a browser with the result.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 | """
MoinMoin - Python Source Parser
"""
# Imports
import cgi, string, sys, cStringIO
import keyword, token, tokenize
#############################################################################
### Python Source Parser (does Hilighting)
#############################################################################
_KEYWORD = token.NT_OFFSET + 1
_TEXT = token.NT_OFFSET + 2
_colors = {
token.NUMBER: '#0080C0',
token.OP: '#0000C0',
token.STRING: '#004080',
tokenize.COMMENT: '#008000',
token.NAME: '#000000',
token.ERRORTOKEN: '#FF8080',
_KEYWORD: '#C00000',
_TEXT: '#000000',
}
class Parser:
""" Send colored python source.
"""
def __init__(self, raw, out = sys.stdout):
""" Store the source text.
"""
self.raw = string.strip(string.expandtabs(raw))
self.out = out
def format(self, formatter, form):
""" Parse and send the colored source.
"""
# store line offsets in self.lines
self.lines = [0, 0]
pos = 0
while 1:
pos = string.find(self.raw, '\n', pos) + 1
if not pos: break
self.lines.append(pos)
self.lines.append(len(self.raw))
# parse the source and write it
self.pos = 0
text = cStringIO.StringIO(self.raw)
self.out.write('<pre><font face="Lucida,Courier New">')
try:
tokenize.tokenize(text.readline, self)
except tokenize.TokenError, ex:
msg = ex[0]
line = ex[1][0]
self.out.write("<h3>ERROR: %s</h3>%s\n" % (
msg, self.raw[self.lines[line]:]))
self.out.write('</font></pre>')
def __call__(self, toktype, toktext, (srow,scol), (erow,ecol), line):
""" Token handler.
"""
if 0:
print "type", toktype, token.tok_name[toktype], "text", toktext,
print "start", srow,scol, "end", erow,ecol, "<br>"
# calculate new positions
oldpos = self.pos
newpos = self.lines[srow] + scol
self.pos = newpos + len(toktext)
# handle newlines
if toktype in [token.NEWLINE, tokenize.NL]:
self.out.write('\n')
return
# send the original whitespace, if needed
if newpos > oldpos:
self.out.write(self.raw[oldpos:newpos])
# skip indenting tokens
if toktype in [token.INDENT, token.DEDENT]:
self.pos = newpos
return
# map token type to a color group
if token.LPAR <= toktype and toktype <= token.OP:
toktype = token.OP
elif toktype == token.NAME and keyword.iskeyword(toktext):
toktype = _KEYWORD
color = _colors.get(toktype, _colors[_TEXT])
style = ''
if toktype == token.ERRORTOKEN:
style = ' style="border: solid 1.5pt #FF0000;"'
# send text
self.out.write('<font color="%s"%s>' % (color, style))
self.out.write(cgi.escape(toktext))
self.out.write('</font>')
if __name__ == "__main__":
import os, sys
print "Formatting..."
# open own source
source = open('python.py').read()
# write colorized version to "python.html"
Parser(source, open('python.html', 'wt')).format(None, None)
# load HTML page into browser
if os.name == "nt":
os.system("explorer python.html")
else:
os.system("netscape python.html &")
|
Thanks. We are using this recipe to colorize this very cookbook, thanks. I made a slight change though so that the script uses css and span to allow easy colour manipulation.
. Doesnt handle continued lines, using the \ operator.
\ works, at some places. In its original environment, the code works, see http://purl.net/wiki/python/MoinMoinColorizer. Can't say what causes it not to work in the Cookbook.
Easy to turn this into an Apache handler - serve up colorized .py files! Some small changes make it possible to invoke this as (a) a CGI script that uses the PATH_TRANSLATED CGI environment variable to know what file to colorize; (b) a command-line tool that takes the filename from the first argument; or (c) a filter that colorizes whatever it gets from stdin. See http://skew.org/~mike/colorize.py for my version.
To finish set it up as a handler in Apache, so that when you request a .py file, the file is served up as colorized HTML, you will need to save the script as colorize.cgi (not .py, lest it get confused), and add this to your .htaccess or httpd.conf:
Also make sure you have the Action module enabled in your httpd.conf.
... and also as a module! Based on your version, I made some additional enhancements:
make script usable as a module
use <class> tags and style sheet instead of <style> tags
when called as a script, add HTML header and footer
This version can be found here:
http://chrisarndt.de/en/software/python/colorize.html