Python Minifier: Reduces the size of Python code for use on embedded platforms. Performs the following:
- Removes docstrings.
- Removes comments.
- Minimizes code indentation.
- Joins multiline pairs of parentheses, braces, and brackets (and removes extraneous whitespace within).
- Preserves shebangs and encoding info (e.g. "# -*- coding: utf-8 -*-").
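The indentation step in the list above can be illustrated with a small Python 3 sketch. It is not the recipe's own `dedent` function: the name `shrink_indentation` is hypothetical, and the sketch assumes the source is indented with spaces only and contains no multi-line strings or continuation lines whose leading whitespace matters.

```python
def shrink_indentation(source):
    """Map each distinct indentation width to the smallest width that keeps nesting."""
    lines = source.split("\n")
    # Every indentation width that occurs on a non-blank line, smallest first.
    widths = sorted({len(l) - len(l.lstrip(" ")) for l in lines if l.strip()})
    # The n-th smallest width becomes n spaces (0, 1, 2, ...).
    rank = {width: index for index, width in enumerate(widths)}
    result = []
    for line in lines:
        if not line.strip():
            result.append("")  # blank lines carry no indentation
            continue
        width = len(line) - len(line.lstrip(" "))
        result.append(" " * rank[width] + line.lstrip(" "))
    return "\n".join(result)
```

Because the mapping from old widths to new widths is strictly increasing, lines that were nested deeper stay nested deeper, which is all Python's block structure needs.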
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# pyminifier.py
#
# Copyright 2009 Dan McDougall <YouKnowWho@YouKnowWhat.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; Version 3 of the License
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, the license can be downloaded here:
#
# http://www.gnu.org/licenses/gpl.html
# Meta
__version__ = '1.1'
__license__ = "GNU General Public License (GPL) Version 3"
__version_info__ = (1, 1)
__author__ = 'Dan McDougall <YouKnowWho@YouKnowWhat.com>'
import os, sys, re
from pyparsing import QuotedString, Suppress, Keyword, Optional, Word, Literal, ZeroOrMore, alphanums, restOfLine, replaceWith, pythonStyleComment, printables
"""
Python Minifier: Reduces the size of Python code for use on embedded platforms.
Performs the following:
1) Removes docstrings.
2) Removes comments.
3) Minimizes code indentation.
4) Joins multiline pairs of parentheses, braces, and brackets (and removes extraneous whitespace within).
5) Preserves shebangs and encoding info (e.g. "# -*- coding: utf-8 -*-").
"""
# Compile our regular expressions for speed
multiline_quoted_string_regex = re.compile(r'(\'\'\'|\"\"\")')
not_quoted_string_regex = re.compile(r'(\".*\'\'\'.*\"|\'.*\"\"\".*\')')
double_quoted_string_regex = re.compile(r'((?<!\\)".*?(?<!\\)")')
single_quoted_string_regex = re.compile(r"((?<!\\)'.*?(?<!\\)')")
whitespace = re.compile('\s*')
trailing_newlines = re.compile(r'\n\n')
shebang = re.compile('^#\!.*$')
encoding = re.compile(".*coding[:=]\s*([-\w.]+)")
comment = re.compile("(?!(\'|\")*#.*(\'|\"))\s*#.*")
blank_lines = re.compile("\n\s*\n")
#parens = re.compile("\((?P<parens>[^()]|\(\))*\)", re.MULTILINE|re.DOTALL)
multiline_indicator = re.compile('\\\\(\s*#.*)?\n') # Also removes trailing comments: "test = 'blah \ # comment here"
# Operators (for future use)
#commas = re.compile("(?!\'.*\')\s*\,\s*\n*\s*") # To be replaced with ","
#plus_signs = re.compile("(?!\'.*\')\s*\+\s*\n*\s*") # To be replaced with "+"
#minus_signs = re.compile("(?!\'.*\')\s*\-\s*\n*\s*") # To be replaced with "-"
#multiply_signs = re.compile("(?!\'.*\')\s*\*\s*\n*\s*") # To be replaced with "*"
#divide_signs = re.compile("(?!\'.*\')\s*\/\s*\n*\s*") # To be replaced with "/"
#less_signs = re.compile("(?!\'.*\')\s*\<\s*\n*\s*") # To be replaced with "<"
#greater_signs = re.compile("(?!\'.*\')\s*\>\s*\n*\s*") # To be replaced with ">"
#equal_signs = re.compile("(?!\'.*\')\s*\s*\=\s*\n*\s*") # To be replaced with "="
#equals_signs = re.compile("(?!\'.*\')\s*\=\=\s*\n*\s*") # To be replaced with "=="
#not_equal_signs = re.compile("(?!\'.*\')\s*\!\=\s*\n*\s*") # To be replaced with "!="
#add_assign = re.compile("(?!\'.*\')\s*\+\=\s*\n*\s*") # To be replaced with "+="
#sub_assign = re.compile("(?!\'.*\')\s*\-\=\s*\n*\s*") # To be replaced with "-="
#modulus_assign = re.compile("(?!\'.*\')\s*\%\=\s*\n*\s*") # To be replaced with "%="
#multiply_assign = re.compile("(?!\'.*\')\s*\*\=\s*\n*\s*") # To be replaced with "*="
#powers_assign = re.compile("(?!\'.*\')\s*\*\*\=\s*\n*\s*") # To be replaced with "**="
#divide_assign = re.compile("(?!\'.*\')\s*\/\=\s*\n*\s*") # To be replaced with "/="
#truncate_divide_assign = re.compile("(?!\'.*\')\s*\/\/\=\s*\n*\s*") # To be replaced with "//="
#truncated_divide_signs = re.compile("(?!\'.*\')\s*\/\/\s*\n*\s*") # To be replaced with "//"
#powers_signs = re.compile("(?!\'.*\')\s*\*\*\s*\n*\s*") # To be replaced with "**"
#left_shift_signs = re.compile("(?!\'.*\')\s*\<\<\s*\n*\s*") # To be replaced with "<<"
#right_shift_signs = re.compile("(?!\'.*\')\s*\>\>\s*\n*\s*") # To be replaced with ">>"
#modulos_signs = re.compile("(?!\'.*\')\s*\%\s*\n*\s*") # To be replaced with "%"
#and_signs = re.compile("(?!\'.*\')\s*\&\s*\n*\s*") # To be replaced with "&"
#or_signs = re.compile("(?!\'.*\')\s*\|\s*\n*\s*") # To be replaced with "|"
#xor_signs = re.compile("(?!\'.*\')\s*\^\s*\n*\s*") # To be replaced with "^"
#negation_signs = re.compile("(?!\'.*\')\s*\~\s*\n*\s*") # To be replaced with "~"
def substitute_matches(matchlist, opener_regex, closer_regex, opener_sub, closer_sub):
    """Replaces 'opener' and 'closer' in 'matchlist' with 'opener_sub' and 'closer_sub'"""
    preoutput = ""
    for item in matchlist:
        if item:
            if item[0] == '"':
                # Sub out all the matching pairs with something so they don't match later on (we'll change them back at the end)
                item = opener_regex.sub('%s' % opener_sub, item)
                item = closer_regex.sub('%s' % closer_sub, item)
                preoutput += item
            else:
                preoutput += item
    line = "".join(preoutput)
    return line
def join_multiline_pairs(text, pair="()"):
    """Finds and removes newlines in multiline matching pairs of characters in 'text'.
    For example, "(.*\n.*), {.*\n.*}, or [.*\n.*]").
    By default it joins parens () but it will join any two characters it is passed in the 'pair' variable.
    """
    # Readability variables
    opener = pair[0]
    closer = pair[1]
    # Tracking variables
    inside_pair = False
    inside_quotes = False
    inside_double_quotes = False
    inside_single_quotes = False
    quoted_string = False
    openers = 0
    closers = 0
    linecount = 0
    # Static variables
    opener_sub = '###OPENER###'
    closer_sub = '###CLOSER###'
    # Regular expressions
    opener_regex = re.compile('\%s' % opener)
    closer_regex = re.compile('\%s' % closer)
    opener_sub_regex = re.compile('(?!(\'|\"))%s(?!(\'|\"))' % opener_sub)
    closer_sub_regex = re.compile('(?!(\'|\"))%s(?!(\'|\"))' % closer_sub)
    output = ""
    for line in text.split('\n'):
        escaped = False
        multline_match = multiline_quoted_string_regex.search(line)
        not_quoted_string_match = not_quoted_string_regex.search(line)
        if multline_match and not not_quoted_string_match and not quoted_string:
            output += line + '\n'
            quoted_string = True
        elif quoted_string and multiline_quoted_string_regex.search(line) and not quoted_string:
            output += line + '\n'
            quoted_string = False
        elif opener_regex.search(line) or closer_regex.search(line) or inside_pair:
            for character in line:
                if character == opener:
                    if not escaped:
                        openers += 1
                        inside_pair = True
                        output += character
                    else:
                        escaped = False
                        output += character
                elif character == closer:
                    if not escaped:
                        if openers == (closers + 1) and openers != 0:
                            closers = 0
                            openers = 0
                            inside_pair = False
                            output += character
                        else:
                            closers += 1
                            output += character
                    else:
                        escaped = False
                        output += character
                elif character == '\\':
                    if escaped:
                        escaped = False
                        output += character
                    else:
                        escaped = True
                        output += character
                elif character == '"' and escaped:
                    output += character
                    escaped = False
                elif character == "'" and escaped:
                    output += character
                    escaped = False
                elif character == '"' and inside_quotes:
                    if inside_single_quotes:
                        output += character
                    else:
                        inside_quotes = False
                        inside_double_quotes = False
                        output += character
                elif character == "'" and inside_quotes:
                    if inside_double_quotes:
                        output += character
                    else:
                        inside_quotes = False
                        inside_single_quotes = False
                        output += character
                elif character == '"' and not inside_quotes:
                    inside_quotes = True
                    inside_double_quotes = True
                    output += character
                elif character == "'" and not inside_quotes:
                    inside_quotes = True
                    inside_single_quotes = True
                    output += character
                elif character == ' ' and inside_pair and not inside_quotes:
                    pass
                else:
                    if escaped:
                        escaped = False
                    output += character
            if inside_pair == False:
                output += '\n'
        else:
            output += line + '\n'
    # Clean up
    output = opener_sub_regex.sub('%s' % opener, output)
    output = closer_sub_regex.sub('%s' % closer, output)
    output = trailing_newlines.sub('\n', output)
    return output
def dedent(source):
    """Minimizes indentation to save precious bytes"""
    indentation_list = []
    output = ""
    # First find all the levels of indentation
    for line in source.split('\n'):
        indentation_level = len(line.rstrip()) - len(line.strip())
        if indentation_level not in indentation_list:
            indentation_list.append(indentation_level)
    # Now we can reduce each line's indentation to the minimal value
    for line in source.split('\n'):
        indentation_level = len(line.rstrip()) - len(line.strip())
        for i,v in enumerate(indentation_list):
            if indentation_level == v:
                output += " " * i + line.lstrip() + "\n"
    return output
#def reduce_operators(source):
#"""Removes spaces and newlines between operators"""
#source = multiline_indicator.sub('', source)
# The following is meant to remove space between operators but it currently has issues (working on it).
#source = commas.sub(',', source)
#source = plus_signs.sub('+', source)
#source = minus_signs.sub('-', source)
#source = multiply_signs.sub('*', source)
#source = divide_signs.sub('/', source)
#source = less_signs.sub('<', source)
#source = greater_signs.sub('>', source)
#source = equal_signs.sub('=', source)
#source = equals_signs.sub('==', source)
#source = not_equal_signs.sub('!=', source)
#source = add_assign.sub('+=', source)
#source = sub_assign.sub('-=', source)
#source = modulus_assign.sub('%=', source)
#source = multiply_assign.sub('*=', source)
#source = powers_assign.sub('**=', source)
#source = divide_assign.sub('/=', source)
#source = truncate_divide_assign.sub('//=', source)
#source = truncated_divide_signs.sub('//', source)
#source = powers_signs.sub('**', source)
#source = left_shift_signs.sub('<<', source)
#source = right_shift_signs.sub('>>', source)
#source = modulos_signs.sub('%', source)
#source = and_signs.sub('&', source)
#source = or_signs.sub('|', source)
#source = xor_signs.sub('^', source)
#source = negation_signs.sub('~', source)
#return source
def empty_method():
    """Just a test method. This should be replaced with 'def empty_method(): pass'"""
def fix_empty_methods(source):
    """Appends 'pass' to empty methods/functions (i.e. where there was nothing but a docstring before we removed docstrings =)"""
    def_indentation_level = 0
    output = ""
    just_matched = False
    previous_line = None
    method = re.compile(r'^\s*def\s*.*\(.*\):.*$')
    for line in source.split('\n'):
        if len(line.strip()) > 0: # Don't look at blank lines
            if just_matched == True:
                this_indentation_level = len(line.rstrip()) - len(line.strip())
                if def_indentation_level == this_indentation_level:
                    # This method is empty, insert a 'pass' statement
                    output += "%s pass\n%s\n" % (previous_line, line)
                else:
                    output += "%s\n%s\n" % (previous_line, line)
                just_matched = False
            elif method.match(line):
                def_indentation_level = len(line) - len(line.strip())
                just_matched = True
                previous_line = line
            else:
                output += "%s\n" % line
        else:
            output += "\n"
    return output
def remove_docstrings(source):
    """Removes docstrings from the source"""
    method = (
        Suppress(Keyword("def") +
        Word(alphanums+"_") +
        '(' + ZeroOrMore(Word(alphanums+"_")) + ')' + ":")
    )
    doc = Keyword("__doc__")
    # This removes multiline docstrings
    string = (
        (QuotedString(quoteChar='\"\"\"', escChar='\\', multiline=True) | \
        QuotedString(quoteChar="\'\'\'", escChar='\\', multiline=True))
    )
    multiLineDocstring = (Optional(doc + Literal('=') + Optional('\\')) + string)
    multiLineDocstring.setParseAction(replaceWith(""))
    source = multiLineDocstring.transformString(source)
    # This removes single line docstrings
    singleLineDocstring = (
        Suppress(method) +
        (QuotedString(quoteChar='"', escChar='\\', multiline=False) | \
        QuotedString(quoteChar="'", escChar='\\', multiline=False))
    )
    singleLineDocstring.setParseAction(replaceWith(""))
    return singleLineDocstring.transformString(source)
def minify(source):
    """Remove all docstrings, comments, blank lines, and minimize code indentation from 'source' (string)."""
    preserved_shebang = None
    preserved_encoding = None
    source = remove_docstrings(source)
    # This loop is for things that must be preserved precisely
    for line in source.split('\n')[0:2]:
        # Save the first comment line if it starts with a shebang (#!) so we can re-add it later
        if shebang.match(line):
            preserved_shebang = line
        # Save the encoding string so we can re-add it later
        if encoding.match(line):
            preserved_encoding = line
    # Remove comments
    source = comment.sub('', source)
    # TODO: This currently isn't working for some reason
    # probably due to escape character detection in join_multiline_pairs()
    # Remove multilines (e.g. lines that end with '\' followed by a newline)
    source = multiline_indicator.sub('', source)
    # Join multiline pairs of parens, brackets, and braces
    source = join_multiline_pairs(source)
    #source = join_multiline_pairs(source, '[]')
    #source = join_multiline_pairs(source, '{}')
    # Re-add preserved items
    if preserved_encoding:
        source = preserved_encoding + "\n" + source
    if preserved_shebang:
        source = preserved_shebang + "\n" + source
    # Minimize indentation
    source = dedent(source)
    # Add 'pass' to empty (i.e. single line) methods/functions
    source = fix_empty_methods(source)
    # Remove blank lines
    source = blank_lines.sub('\n', source)
    return source
def main():
    if len(sys.argv) > 1:
        source = open(sys.argv[1]).read()
        print minify(source)
    else:
        print "Usage: pyminifier.py <python source file>"
if __name__ == "__main__":
    main()
Discussion
I wrote this so I could minimize the size of Python code being run on embedded platforms (e.g. OpenWRT). Minified and zipped modules can save a lot of space when applied to a large number of files. Here's an example of the output from minifying itself (Note: For whatever reason this website doesn't display the indentation properly):
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__version__ = '1.1'
__license__ = "GNU General Public License (GPL) Version 3"
__version_info__ = (1,1)
__author__ = 'Dan McDougall <YouKnowWho@YouKnowWhat.com>'
import os, sys, re
from pyparsing import QuotedString, Suppress, Keyword, Optional, Word, Literal, ZeroOrMore, alphanums, restOfLine, replaceWith, pythonStyleComment, printables
multiline_quoted_string_regex = re.compile(r'(\'\'\'|\"\"\")')
not_quoted_string_regex = re.compile(r'(\".*\'\'\'.*\"|\'.*\"\"\".*\')')
double_quoted_string_regex = re.compile(r'((?<!\\)".*?(?<!\\)")')
single_quoted_string_regex = re.compile(r"((?<!\\)'.*?(?<!\\)')")
whitespace = re.compile('\s*')
trailing_newlines = re.compile(r'\n\n')
shebang = re.compile('^#\!.*$')
encoding = re.compile(".*coding[:=]\s*([-\w.]+)")
comment = re.compile("(?!(\'|\")*#.*(\'|\"))\s*#.*")
blank_lines = re.compile("\n\s*\n")
multiline_indicator = re.compile('\\\\(\s*#.*)?\n')
def substitute_matches(matchlist,opener_regex,closer_regex,opener_sub,closer_sub):
preoutput = ""
for item in matchlist:
if item:
if item[0] == '"':
item = opener_regex.sub('%s'%opener_sub,item)
item = closer_regex.sub('%s'%closer_sub,item)
preoutput += item
else:
preoutput += item
line = "".join(preoutput)
return line
def join_multiline_pairs(text,pair="()"):
opener = pair[0]
closer = pair[1]
inside_pair = False
inside_quotes = False
inside_double_quotes = False
inside_single_quotes = False
quoted_string = False
openers = 0
closers = 0
linecount = 0
opener_sub = '###OPENER###'
closer_sub = '###CLOSER###'
opener_regex = re.compile('\%s'%opener)
closer_regex = re.compile('\%s'%closer)
opener_sub_regex = re.compile('(?!(\'|\"))%s(?!(\'|\"))'%opener_sub)
closer_sub_regex = re.compile('(?!(\'|\"))%s(?!(\'|\"))'%closer_sub)
output = ""
for line in text.split('\n'):
escaped = False
multline_match = multiline_quoted_string_regex.search(line)
not_quoted_string_match = not_quoted_string_regex.search(line)
if multline_match and not not_quoted_string_match and not quoted_string:
output += line + '\n'
quoted_string = True
elif quoted_string and multiline_quoted_string_regex.search(line) and not quoted_string:
output += line + '\n'
quoted_string = False
elif opener_regex.search(line) or closer_regex.search(line) or inside_pair:
for character in line:
if character == opener:
if not escaped:
openers += 1
inside_pair = True
output += character
else:
escaped = False
output += character
elif character == closer:
if not escaped:
if openers == (closers+1) and openers != 0:
closers = 0
openers = 0
inside_pair = False
output += character
else:
closers += 1
output += character
else:
escaped = False
output += character
elif character == '\\':
if escaped:
escaped = False
output += character
else:
escaped = True
output += character
elif character == '"' and escaped:
output += character
escaped = False
elif character == "'" and escaped:
output += character
escaped = False
elif character == '"' and inside_quotes:
if inside_single_quotes:
output += character
else:
inside_quotes = False
inside_double_quotes = False
output += character
elif character == "'" and inside_quotes:
if inside_double_quotes:
output += character
else:
inside_quotes = False
inside_single_quotes = False
output += character
elif character == '"' and not inside_quotes:
inside_quotes = True
inside_double_quotes = True
output += character
elif character == "'" and not inside_quotes:
inside_quotes = True
inside_single_quotes = True
output += character
elif character == ' ' and inside_pair and not inside_quotes:
pass
else:
if escaped:
escaped = False
output += character
if inside_pair == False:
output += '\n'
else:
output += line + '\n'
output = opener_sub_regex.sub('%s'%opener,output)
output = closer_sub_regex.sub('%s'%closer,output)
output = trailing_newlines.sub('\n',output)
return output
def dedent(source):
indentation_list = []
output = ""
for line in source.split('\n'):
indentation_level = len(line.rstrip()) - len(line.strip())
if indentation_level not in indentation_list:
indentation_list.append(indentation_level)
for line in source.split('\n'):
indentation_level = len(line.rstrip()) - len(line.strip())
for i,v in enumerate(indentation_list):
if indentation_level == v:
output += " " * i + line.lstrip() + "\n"
return output
source = multiline_indicator.sub('',source)
def empty_method(): pass
def fix_empty_methods(source):
def_indentation_level = 0
output = ""
just_matched = False
previous_line = None
method = re.compile(r'^\s*def\s*.*\(.*\):.*$')
for line in source.split('\n'):
if len(line.strip()) > 0:
if just_matched == True:
this_indentation_level = len(line.rstrip()) - len(line.strip())
if def_indentation_level == this_indentation_level:
output += "%s pass\n%s\n" % (previous_line,line)
else:
output += "%s\n%s\n" % (previous_line,line)
just_matched = False
elif method.match(line):
def_indentation_level = len(line) - len(line.strip())
just_matched = True
previous_line = line
else:
output += "%s\n" % line
else:
output += "\n"
return output
def remove_docstrings(source):
method = (Suppress(Keyword("def")+Word(alphanums+"_")+'('+ZeroOrMore(Word(alphanums+"_"))+')'+":"))
doc = Keyword("__doc__")
string = ((QuotedString(quoteChar='\"\"\"',escChar='\\',multiline=True)|QuotedString(quoteChar="\'\'\'",escChar='\\',multiline=True)))
multiLineDocstring = (Optional(doc+Literal('=')+Optional('\\'))+string)
multiLineDocstring.setParseAction(replaceWith(""))
source = multiLineDocstring.transformString(source)
singleLineDocstring = (Suppress(method)+(QuotedString(quoteChar='"',escChar='\\',multiline=False)|QuotedString(quoteChar="'",escChar='\\',multiline=False)))
singleLineDocstring.setParseAction(replaceWith(""))
return singleLineDocstring.transformString(source)
def minify(source):
preserved_shebang = None
preserved_encoding = None
source = remove_docstrings(source)
for line in source.split('\n')[0:2]:
if shebang.match(line):
preserved_shebang = line
if encoding.match(line):
preserved_encoding = line
source = comment.sub('',source)
source = multiline_indicator.sub('',source)
source = join_multiline_pairs(source)
if preserved_encoding:
source = preserved_encoding + "\n" + source
if preserved_shebang:
source = preserved_shebang + "\n" + source
source = dedent(source)
source = fix_empty_methods(source)
source = blank_lines.sub('\n',source)
return source
def main():
if len(sys.argv) > 1:
source = open(sys.argv[1]).read()
print minify(source)
else:
print "Usage: pyminifier.py <python source file>"
if __name__ == "__main__":
main()
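To check how much space minifying and compressing actually saves for a given module, a sketch along these lines may help. It is written for Python 3 and is not part of the recipe; `size_report` is a hypothetical helper name, and zlib is used here only as a stand-in for whatever compression the target platform really uses.

```python
import zlib


def size_report(original, minified):
    """Return raw and zlib-compressed byte counts for both versions of a module."""
    report = {}
    for label, text in (("original", original), ("minified", minified)):
        data = text.encode("utf-8")
        report[label] = {
            "raw": len(data),                      # size on disk, uncompressed
            "zlib": len(zlib.compress(data, 9)),   # size after maximum compression
        }
    return report
```

Comparing the "zlib" figures rather than the "raw" ones gives a more honest picture when the modules are shipped zipped, since compression already removes much of the redundancy that minification targets.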


Comments
Discarding all comments also removes the encoding declaration. If the target script uses, e.g., UTF-8 with non-ASCII characters, the minified output will raise a syntax error.
Good catch. I'll update the script to fix that (shouldn't be too hard).
Note: It also messes up quoted # signs like this:
...results in
Comment = "
Still trying to figure out how to fix that.
Good shot. You could save a few more bytes by:
- transforming multiline instructions into single-line instructions.
- removing spaces around operators.
e.g.
singleLineDocstring = (
    method +
    (QuotedString(quoteChar='"', escChar='\\', multiline=False) | \
    QuotedString(quoteChar="'", escChar='\\', multiline=False))
)
can be transformed into:
singleLineDocstring=(method+(QuotedString(quoteChar='"',escChar='\\',multiline=False)|QuotedString(quoteChar="'",escChar='\\',multiline=False)))
You can achieve further size reduction by obfuscating the code.
I didn't find any solid estimate of the size reduction that obfuscation brings to Python code, but the Yahoo Developer Network claims, for CSS and JavaScript:
"In a survey of ten top U.S. web sites, minification achieved a 21% size reduction versus 25% for obfuscation." (http://developer.yahoo.com/performance/rules.html)
I guess the advantage of obfuscation over minification is even more significant for Python than for CSS and JavaScript, due to the respective syntax of these languages.
That said, obfuscators are more complex than minifiers, and obfuscated code is a pain if you have to debug it.
Yeah, I thought about adding some obfuscating stuff to this but I changed my mind because it makes debugging nearly impossible, and when you're working on embedded platforms you often can't debug your code anywhere else but the device itself.
Also, I had originally planned to reduce spaces between operators but I never got around to it. It is in the TODO list =)
UPDATE: It took FOREVER (and a lot of code) to work it all out but I've finally got this joining multiline pairs of parens, braces, and brackets (and removing unnecessary whitespace inside them). So something like this:
Becomes...
I googled and googled but I never did find another example of code that does that. Not even in another language. The hardest part was getting it to ignore things in triple double/single quotes and dealing with escaped characters (especially code that literally has something like "'\'").
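For comparison, here is a rough Python 3 sketch of the same joining idea built on the standard library's `tokenize` module, which already knows about triple-quoted strings and escaped quotes. It is not the recipe's approach; `join_bracketed_lines` is a hypothetical name, and the sketch assumes comments have already been removed (as `minify` does before joining), since a comment inside the brackets would otherwise swallow the joined text.

```python
import io
import tokenize

OPENERS = ("(", "[", "{")
CLOSERS = (")", "]", "}")


def join_bracketed_lines(source):
    """Join physical lines that are split inside (), [] or {}."""
    lines = source.splitlines(keepends=True)
    depth = 0
    join_rows = set()  # 1-based rows whose trailing newline sits inside a bracket
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.OP and tok.string in OPENERS:
            depth += 1
        elif tok.type == tokenize.OP and tok.string in CLOSERS:
            depth -= 1
        elif tok.type == tokenize.NL and depth > 0:
            # NL (unlike NEWLINE) marks a line break that does not end a statement.
            join_rows.add(tok.start[0])
    joined = []
    for row, line in enumerate(lines, start=1):
        if row - 1 in join_rows:
            line = line.lstrip()        # drop the continuation indentation
        if row in join_rows:
            line = line.rstrip() + " "  # keep one space so names cannot merge
        joined.append(line)
    return "".join(joined)
```

Brackets that appear inside string literals are never counted, because the tokenizer reports each literal as a single STRING token. The sketch keeps one space at each join rather than removing all whitespace, trading a few bytes for safety.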
Almost forgot: I've also got it preserving shebangs (#!/usr/bin/env python) and encoding strings now (per Daniel Lepage's comment).
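A minimal Python 3 sketch of that preservation step is shown below. It is an illustration rather than the recipe's code: `preserved_header` is a hypothetical name, and the encoding pattern is the one given in PEP 263, which is stricter than the recipe's `encoding` regex because it requires the declaration to be a comment.

```python
import re

SHEBANG = re.compile(r"^#!")
# Pattern from PEP 263: the declaration must be a comment on line 1 or 2.
ENCODING = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def preserved_header(source):
    """Return the shebang and encoding lines (if any) found in the first two lines."""
    kept = []
    for index, line in enumerate(source.split("\n")[:2]):
        if index == 0 and SHEBANG.match(line):
            kept.append(line)      # a shebang only counts on the very first line
        elif ENCODING.match(line):
            kept.append(line)
    return kept
```

The returned lines can be put back at the top of the output after the comments have been stripped.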
Update (rev 10): Fixed some bugs where it wasn't working properly in certain (odd) situations. It also joins multi-lines (i.e. that end in '\') properly again.
It doesn't work when the original code includes lines like the following one:
print "this is an example # of line which fails"
pyminifier treats everything after the # character as a comment, even inside a string, so the rest of the code is not processed correctly.
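One way around this class of bug is to let Python's own tokenizer decide what is a comment instead of a regular expression. The following Python 3 sketch is an assumption about how that could look, not a patch to the recipe; `strip_comments` is a hypothetical name. Note that it also strips the shebang and encoding lines, so those would still need to be saved and re-added separately.

```python
import io
import tokenize


def strip_comments(source):
    """Return 'source' without comments, leaving '#' inside strings alone."""
    lines = source.splitlines(keepends=True)
    cuts = {}  # 1-based row -> column where the comment starts
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start
            cuts[row] = col
    for row, col in cuts.items():
        line = lines[row - 1]
        newline = "\n" if line.endswith("\n") else ""
        # Cut the comment and any whitespace in front of it.
        lines[row - 1] = line[:col].rstrip() + newline
    return "".join(lines)
```

The tokenizer only splits text into tokens and does not check grammar, so it accepts the Python 2 `print` statement from the example above as well.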
If I give it a line with redundant brackets:
this gets converted to
i.e. with no space between the 'not' and the variable
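The underlying cause is that every space inside a pair is dropped unless it is inside quotes. A safer rule is to drop a space only when the characters on either side cannot merge into a different token. The Python 3 sketch below illustrates that rule with the `tokenize` module. It is a hypothetical helper (`squeeze_line`), not the fix proposed in this thread, and it assumes one complete logical line with balanced brackets and no f-strings (newer Python versions split f-strings into several tokens, which this sketch does not handle). Leading indentation is not preserved.

```python
import io
import tokenize

SKIPPED = (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
           tokenize.INDENT, tokenize.DEDENT)


def _wordlike(char):
    return char.isalnum() or char == "_"


def _needs_space(prev, tok):
    if _wordlike(prev.string[-1]) and _wordlike(tok.string[0]):
        return True   # e.g. "not x" or "1 if" would otherwise merge
    if prev.type == tokenize.STRING and tok.type == tokenize.STRING:
        return True   # adjacent literals such as "" "x"
    if prev.type == tokenize.NUMBER and tok.string.startswith("."):
        return True   # "1 .real" must not become "1.real"
    return False


def squeeze_line(line):
    """Rebuild one logical line, keeping only the spaces that are required."""
    pieces = []
    prev = None
    for tok in tokenize.generate_tokens(io.StringIO(line).readline):
        if tok.type in SKIPPED or not tok.string:
            continue
        if prev is not None and _needs_space(prev, tok):
            pieces.append(" ")
        pieces.append(tok.string)
        prev = tok
    return "".join(pieces)
```

With this rule the space after `not` survives, while spaces around brackets and operators are still removed.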
The problem mentioned by Robert Marshall can be fixed by changing
to
or even better:
jcaballero.hep mentions a problem that I haven't been able to fix and that is a showstopper for me. It is some sort of error in the comment RegExp. I would be very glad if anyone could solve it.
Also, I didn't try this, but there will be trouble on """ ''' asd ''' """
(which could very reasonably appear, e.g., in doctests)
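Nested quotes like these stop being a problem if docstrings are removed from the parsed syntax tree rather than from the raw text. The sketch below shows that idea for Python 3.9 or later (it relies on `ast.unparse`). It is an alternative technique, not the recipe's pyparsing approach, and `strip_docstrings` is a hypothetical name. It only accepts Python 3 source, and the round trip through the tree also drops comments and normalises formatting.

```python
import ast

DOCSTRING_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def strip_docstrings(source):
    """Remove module, class and function docstrings; add 'pass' to emptied bodies."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if not isinstance(node, DOCSTRING_OWNERS):
            continue
        body = node.body
        first_is_docstring = (
            body
            and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)
        )
        if first_is_docstring:
            del body[0]
            if not body and not isinstance(node, ast.Module):
                body.append(ast.Pass())  # keep the now-empty block valid
    return ast.unparse(tree)
```

This covers both the docstring removal and the "append `pass` to empty functions" step in one pass, because the parser has already worked out where each string literal starts and ends.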