Python code minifier « Python recipes

Update 05/25/2014: Pyminifier 2.0 has been released and now lives on Github: https://github.com/liftoff/pyminifier (docs are here: http://liftoff.github.io/pyminifier/). The code below is very out-of-date but will be left alone for historical purposes.

Python Minifier: Reduces the size of Python code for use on embedded platforms. Performs the following:

Removes docstrings.
Removes comments.
Removes blank lines.
Minimizes code indentation.
Joins multiline pairs of parentheses, braces, and brackets (and removes extraneous whitespace within).
Preserves shebangs and encoding info (e.g. "# -- coding: utf-8 --")
NEW: Optionally, produces a bzip2 or gzip-compressed self-extracting python script containing the minified source for ultimate minification.

Update 09/23/2010: Version 1.4.1: Fixed an indentation bug when operators such as @ and open parens started a line.

Update 09/18/2010: Version 1.4:

Added some command line options to save the result to an output file.
Added the ability to save the result as a bzip2 or gzip-compressed self-extracting python script (which is kinda neat--try it!).
Updated some of the docstrings to provide more examples of what each function does.

Update 06/02/2010: Version 1.3: Rewrote several functions to use Python's built-in tokenizer module (which I just discovered despite being in Python since version 2.2). This negated the requirement for pyparsing and improved performance by an order of magnitude. It also fixed some pretty serious bugs with dedent() and reduce_operators().

PLEASE POST A COMMENT IF YOU ENCOUNTER A BUG!

      #!/usr/bin/env python
# -*- coding: utf-8 -*-
#
#       pyminifier.py
#
#       Copyright 2009 Dan McDougall <YouKnowWho@YouKnowWhat.com>
#
#       This program is free software; you can redistribute it and/or modify
#       it under the terms of the GNU General Public License as published by
#       the Free Software Foundation; Version 3 of the License
#
#       This program is distributed in the hope that it will be useful,
#       but WITHOUT ANY WARRANTY; without even the implied warranty of
#       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#       GNU General Public License for more details.
#
#       You should have received a copy of the GNU General Public License
#       along with this program; if not, the license can be downloaded here:
#
#       http://www.gnu.org/licenses/gpl.html

# Meta
__version__ = '1.4.1'
__license__ = "GNU General Public License (GPL) Version 3"
__version_info__ = (1, 4, 1)
__author__ = 'Dan McDougall <YouKnowWho@YouKnowWhat.com>'

"""
**Python Minifier:**  Reduces the size of (minifies) Python code for use on
embedded platforms.

Performs the following:
     - Removes docstrings.
     - Removes comments.
     - Minimizes code indentation.
     - Joins multiline pairs of parentheses, braces, and brackets (and removes extraneous whitespace within).
     - Preserves shebangs and encoding info (e.g. "# -- coding: utf-8 --").

Various examples and edge cases are sprinkled throughout the pyminifier code so
that it can be tested by minifying itself.  The way to test is thus:

.. code-block:: bash

    $ python pyminifier.py pyminifier.py > minified_pyminifier.py
    $ python minified_pyminifier.py pyminifier.py > this_should_be_identical.py
    $ diff minified_pyminifier.py this_should_be_identical.py
    $

If you get an error executing minified_pyminifier.py or
'this_should_be_identical.py' isn't identical to minified_pyminifier.py then
something is broken.
"""

import sys, re, cStringIO, tokenize
from optparse import OptionParser

# Compile our regular expressions for speed
multiline_quoted_string = re.compile(r'(\'\'\'|\"\"\")')
not_quoted_string = re.compile(r'(\".*\'\'\'.*\"|\'.*\"\"\".*\')')
trailing_newlines = re.compile(r'\n\n')
shebang = re.compile('^#\!.*$')
encoding = re.compile(".*coding[:=]\s*([-\w.]+)")
multiline_indicator = re.compile('\\\\(\s*#.*)?\n')
# The above also removes trailing comments: "test = 'blah \ # comment here"

# These aren't used but they're a pretty good reference:
double_quoted_string = re.compile(r'((?<!\\)".*?(?<!\\)")')
single_quoted_string = re.compile(r"((?<!\\)'.*?(?<!\\)')")
single_line_single_quoted_string = re.compile(r"((?<!\\)'''.*?(?<!\\)''')")
single_line_double_quoted_string = re.compile(r"((?<!\\)'''.*?(?<!\\)''')")

def remove_comments_and_docstrings(source):
    """
    Returns 'source' minus comments and docstrings.

    **Note**: Uses Python's built-in tokenize module to great effect.

    Example:

    .. code-block:: python

        def noop(): # This is a comment
            '''
            Does nothing.
            '''
            pass # Don't do anything

    Will become:

    .. code-block:: python

        def noop():
            pass
    """
    io_obj = cStringIO.StringIO(source)
    out = ""
    prev_toktype = tokenize.INDENT
    last_lineno = -1
    last_col = 0
    for tok in tokenize.generate_tokens(io_obj.readline):
        token_type = tok[0]
        token_string = tok[1]
        start_line, start_col = tok[2]
        end_line, end_col = tok[3]
        ltext = tok[4]
        # The following two conditionals preserve indentation.
        # This is necessary because we're not using tokenize.untokenize()
        # (because it spits out code with copious amounts of oddly-placed
        # whitespace).
        if start_line > last_lineno:
            last_col = 0
        if start_col > last_col:
            out += (" " * (start_col - last_col))
        # Remove comments:
        if token_type == tokenize.COMMENT:
            pass
        # This series of conditionals removes docstrings:
        elif token_type == tokenize.STRING:
            if prev_toktype != tokenize.INDENT:
        # This is likely a docstring; double-check we're not inside an operator:
                if prev_toktype != tokenize.NEWLINE:
                    # Note regarding NEWLINE vs NL: The tokenize module
                    # differentiates between newlines that start a new statement
                    # and newlines inside of operators such as parens, brackes,
                    # and curly braces.  Newlines inside of operators are
                    # NEWLINE and newlines that start new code are NL.
                    # Catch whole-module docstrings:
                    if start_col > 0:
                        # Unlabelled indentation means we're inside an operator
                        out += token_string
                    # Note regarding the INDENT token: The tokenize module does
                    # not label indentation inside of an operator (parens,
                    # brackets, and curly braces) as actual indentation.
                    # For example:
                    # def foo():
                    #     "The spaces before this docstring are tokenize.INDENT"
                    #     test = [
                    #         "The spaces before this string do not get a token"
                    #     ]
        else:
            out += token_string
        prev_toktype = token_type
        last_col = end_col
        last_lineno = end_line
    return out

def reduce_operators(source):
    """
    Remove spaces between operators in 'source' and returns the result.

    Example:

    .. code-block:: python

        def foo(foo, bar, blah):
            test = "This is a %s" % foo

    Will become:

    .. code-block:: python

        def foo(foo,bar,blah):
            test="This is a %s"%foo
    """
    io_obj = cStringIO.StringIO(source)
    remove_columns = []
    out = ""
    out_line = ""
    prev_toktype = tokenize.INDENT
    prev_tok = None
    last_lineno = -1
    last_col = 0
    lshift = 1
    for tok in tokenize.generate_tokens(io_obj.readline):
        token_type = tok[0]
        token_string = tok[1]
        start_line, start_col = tok[2]
        end_line, end_col = tok[3]
        ltext = tok[4]
        if start_line > last_lineno:
            last_col = 0
        if start_col > last_col:
            out_line += (" " * (start_col - last_col))
        if token_type == tokenize.OP:
            # Operators that begin a line such as @ or open parens should be
            # left alone
            start_of_line_types = [ # These indicate we're starting a new line
                tokenize.NEWLINE, tokenize.DEDENT, tokenize.INDENT]
            if prev_toktype not in start_of_line_types:
                # This is just a regular operator; remove spaces
                remove_columns.append(start_col) # Before OP
                remove_columns.append(end_col+1) # After OP
        if token_string.endswith('\n'):
            out_line += token_string
            if remove_columns:
                for col in remove_columns:
                    col = col - lshift
                    try:
            # This was really handy for debugging (looks nice, worth saving):
                        #print out_line + (" " * col) + "^"
                        # The above points to the character we're looking at
                        if out_line[col] == " ": # Only if it is a space
                            out_line = out_line[:col] + out_line[col+1:]
                            lshift += 1 # To re-align future changes on this line
                    except IndexError: # Reached and end of line, no biggie
                        pass
            out += out_line
            remove_columns = []
            out_line = ""
            lshift = 1
        else:
            out_line += token_string
        prev_toktype = token_type
        prev_token = tok
        last_col = end_col
        last_lineno = end_line
    # This makes sure to capture the last line if it doesn't end in a newline:
    out += out_line
    # The tokenize module doesn't recognize @ sign before a decorator
    return out

# NOTE: This isn't used anymore...  Just here for reference in case someone
# searches the internet looking for a way to remove similarly-styled end-of-line
# comments from non-python code.  It also acts as an edge case of sorts with
# that raw triple quoted string inside the "quoted_string" assignment.
def remove_comment(single_line):
    """
    Removes the comment at the end of the line (if any) and returns the result.
    """
    quoted_string = re.compile(
        r'''((?<!\\)".*?(?<!\\)")|((?<!\\)'.*?(?<!\\)')'''
    )
    # This divides the line up into sections:
    #   Those inside single quotes and those that are not
    split_line = quoted_string.split(single_line)
    # Remove empty items:
    split_line = [a for a in split_line if a]
    out_line = ""
    for section in split_line:
        if section.startswith("'") or section.startswith('"'):
            # This is a quoted string; leave it alone
            out_line += section
        elif '#' in section: # A '#' not in quotes?  There's a comment here!
            # Get rid of everything after the # including the # itself:
            out_line += section.split('#')[0]
            break # No reason to bother the rest--it's all comments
        else:
            # This isn't a quoted string OR a comment; leave it as-is
            out_line += section
    return out_line.rstrip() # Strip trailing whitespace before returning

def join_multiline_pairs(text, pair="()"):
    """
    Finds and removes newlines in multiline matching pairs of characters in
    'text'.  For example, "(.*\n.*), {.*\n.*}, or [.*\n.*]".

    By default it joins parens () but it will join any two characters given via
    the 'pair' variable.

    **Note:** Doesn't remove extraneous whitespace that ends up between the pair.
    Use reduce_operators() for that.

    Example:

    .. code-block:: python

        test = (
            "This is inside a multi-line pair of parentheses"
        )

    Will become:

    .. code-block:: python

        test = (            "This is inside a multi-line pair of parentheses"        )
    """
    # Readability variables
    opener = pair[0]
    closer = pair[1]

    # Tracking variables
    inside_pair = False
    inside_quotes = False
    inside_double_quotes = False
    inside_single_quotes = False
    quoted_string = False
    openers = 0
    closers = 0
    linecount = 0

    # Regular expressions
    opener_regex = re.compile('\%s' % opener)
    closer_regex = re.compile('\%s' % closer)

    output = ""

    for line in text.split('\n'):
        escaped = False
        # First we rule out multi-line strings
        multline_match = multiline_quoted_string.search(line)
        not_quoted_string_match = not_quoted_string.search(line)
        if multline_match and not not_quoted_string_match and not quoted_string:
            if len(line.split('"""')) > 1 or len(line.split("'''")):
                # This is a single line that uses the triple quotes twice
                # Treat it as if it were just a regular line:
                output += line + '\n'
                quoted_string = False
            else:
                output += line + '\n'
                quoted_string = True
        elif quoted_string and multiline_quoted_string.search(line):
            output += line + '\n'
            quoted_string = False
        # Now let's focus on the lines containing our opener and/or closer:
        elif not quoted_string:
            if opener_regex.search(line) or closer_regex.search(line) or inside_pair:
                for character in line:
                    if character == opener:
                        if not escaped and not inside_quotes:
                            openers += 1
                            inside_pair = True
                            output += character
                        else:
                            escaped = False
                            output += character
                    elif character == closer:
                        if not escaped and not inside_quotes:
                            if openers and openers == (closers + 1):
                                closers = 0
                                openers = 0
                                inside_pair = False
                                output += character
                            else:
                                closers += 1
                                output += character
                        else:
                            escaped = False
                            output += character
                    elif character == '\\':
                        if escaped:
                            escaped = False
                            output += character
                        else:
                            escaped = True
                            output += character
                    elif character == '"' and escaped:
                        output += character
                        escaped = False
                    elif character == "'" and escaped:
                        output += character
                        escaped = False
                    elif character == '"' and inside_quotes:
                        if inside_single_quotes:
                            output += character
                        else:
                            inside_quotes = False
                            inside_double_quotes = False
                            output += character
                    elif character == "'" and inside_quotes:
                        if inside_double_quotes:
                            output += character
                        else:
                            inside_quotes = False
                            inside_single_quotes = False
                            output += character
                    elif character == '"' and not inside_quotes:
                        inside_quotes = True
                        inside_double_quotes = True
                        output += character
                    elif character == "'" and not inside_quotes:
                        inside_quotes = True
                        inside_single_quotes = True
                        output += character
                    elif character == ' ' and inside_pair and not inside_quotes:
                        if not output[-1] in [' ', opener]:
                            output += ' '
                    else:
                        if escaped:
                            escaped = False
                        output += character
                if inside_pair == False:
                    output += '\n'
            else:
                output += line + '\n'
        else:
            output += line + '\n'

    # Clean up
    output = trailing_newlines.sub('\n', output)

    return output

def dedent(source):
    """
    Minimizes indentation to save precious bytes

    Example:

    .. code-block:: python

        def foo(bar):
            test = "This is a test"

    Will become:

    .. code-block:: python

        def foo(bar):
         test = "This is a test"
    """
    io_obj = cStringIO.StringIO(source)
    out = ""
    last_lineno = -1
    last_col = 0
    prev_start_line = 0
    indentation = ""
    indentation_level = 0
    for i,tok in enumerate(tokenize.generate_tokens(io_obj.readline)):
        token_type = tok[0]
        token_string = tok[1]
        start_line, start_col = tok[2]
        end_line, end_col = tok[3]
        if start_line > last_lineno:
            last_col = 0
        if token_type == tokenize.INDENT:
            indentation_level += 1
            continue
        if token_type == tokenize.DEDENT:
            indentation_level -= 1
            continue
        indentation = " " * indentation_level
        if start_line > prev_start_line:
            out += indentation + token_string
        elif start_col > last_col:
            out += " " + token_string
        else:
            out += token_string
        prev_start_line = start_line
        last_col = end_col
        last_lineno = end_line
    return out

def fix_empty_methods(source):
    """
    Appends 'pass' to empty methods/functions (i.e. where there was nothing but
    a docstring before we removed it =).

    Example:

    .. code-block:: python

        # Note: This triple-single-quote inside a triple-double-quote is also a
        # pyminifier self-test
        def myfunc():
            '''This is just a placeholder function.'''

    Will become:

    .. code-block:: python

        def myfunc(): pass
    """
    def_indentation_level = 0
    output = ""
    just_matched = False
    previous_line = None
    method = re.compile(r'^\s*def\s*.*\(.*\):.*$')
    for line in source.split('\n'):
        if len(line.strip()) > 0: # Don't look at blank lines
            if just_matched == True:
                this_indentation_level = len(line.rstrip()) - len(line.strip())
                if def_indentation_level == this_indentation_level:
                    # This method is empty, insert a 'pass' statement
                    output += "%s pass\n%s\n" % (previous_line, line)
                else:
                    output += "%s\n%s\n" % (previous_line, line)
                just_matched = False
            elif method.match(line):
                def_indentation_level = len(line) - len(line.strip()) # A commment
                just_matched = True
                previous_line = line
            else:
                output += "%s\n" % line # Another self-test
        else:
            output += "\n"
    return output

def remove_blank_lines(source):
    """
    Removes blank lines from 'source' and returns the result.

    Example:

    .. code-block:: python

        test = "foo"

        test2 = "bar"

    Will become:

    .. code-block:: python

        test = "foo"
        test2 = "bar"
    """
    io_obj = cStringIO.StringIO(source)
    source = [a for a in io_obj.readlines() if a.strip()]
    return "".join(source)

def minify(source):
    """
    Remove all docstrings, comments, blank lines, and minimize code
    indentation from 'source' then prints the result.
    """
    preserved_shebang = None
    preserved_encoding = None

    # This is for things like shebangs that must be precisely preserved
    for line in source.split('\n')[0:2]:
        # Save the first comment line if it starts with a shebang
        # (e.g. '#!/usr/bin/env python') <--also a self test!
        if shebang.match(line): # Must be first line
            preserved_shebang = line
            continue
        # Save the encoding string (must be first or second line in file)
        if encoding.match(line):
            preserved_encoding = line

    # Remove multilines (e.g. lines that end with '\' followed by a newline)
    source = multiline_indicator.sub('', source)

    # Remove docstrings (Note: Must run before fix_empty_methods())
    source = remove_comments_and_docstrings(source)

    # Remove empty (i.e. single line) methods/functions
    source = fix_empty_methods(source)

    # Join multiline pairs of parens, brackets, and braces
    source = join_multiline_pairs(source)
    source = join_multiline_pairs(source, '[]')
    source = join_multiline_pairs(source, '{}')

    # Remove whitespace between operators:
    source = reduce_operators(source)

    # Minimize indentation
    source = dedent(source)

    # Re-add preseved items
    if preserved_encoding:
        source = preserved_encoding + "\n" + source
    if preserved_shebang:
        source = preserved_shebang + "\n" + source

    # Remove blank lines
    source = remove_blank_lines(source).rstrip('\n') # Stubborn last newline

    return source

def bz2_pack(source):
    "Returns 'source' as a bzip2-compressed, self-extracting python script."
    import bz2, base64
    out = ""
    compressed_source = bz2.compress(source)
    out += 'import bz2, base64\n'
    out += "exec bz2.decompress(base64.b64decode('"
    out += base64.b64encode((compressed_source))
    out += "'))\n"
    return out

def gz_pack(source):
    "Returns 'source' as a gzip-compressed, self-extracting python script."
    import zlib, base64
    out = ""
    compressed_source = zlib.compress(source)
    out += 'import zlib, base64\n'
    out += "exec zlib.decompress(base64.b64decode('"
    out += base64.b64encode((compressed_source))
    out += "'))\n"
    return out

# The test.+() functions below are for testing pyminifer...
def test_decorator(f):
    """Decorator that does nothing"""
    return f

def test_reduce_operators():
    """Test the case where an operator such as an open paren starts a line"""
    (a, b) = 1, 2 # The indentation level should be preserved
    pass

def test_empty_functions():
    """
    This is a test method.
    This should be replaced with 'def empty_method: pass'
    """

class test_class(object):
    "Testing indented decorators"

    @test_decorator
    def foo(self):
        pass

def test_function():
    """
    This function encapsulates the edge cases to prevent them from invading the
    global namespace.
    """
    foo = ("The # character in this string should " # This comment
           "not result in a syntax error") # ...and this one should go away
    test_multi_line_list = [
        'item1',
        'item2',
        'item3'
    ]
    test_multi_line_dict = {
        'item1': 1,
        'item2': 2,
        'item3': 3
    }
    # It may seem strange but the code below tests our docstring removal code.
    test_string_inside_operators = imaginary_function(
        "This string was indented but the tokenizer won't see it that way."
    ) # To understand how this could mess up docstring removal code see the
      # remove_comments_and_docstrings() function starting at this line:
      #     "elif token_type == tokenize.STRING:"
    # This tests remove_extraneous_spaces():
    this_line_has_leading_indentation    = '''<--That extraneous space should be
                                              removed''' # But not these spaces

def main():
    usage = '%prog [options] "<input file>"'
    parser = OptionParser(usage=usage, version=__version__)
    parser.disable_interspersed_args()
    parser.add_option(
        "-o", "--outfile",
        dest="outfile",
        default=None,
        help="Save output to the given file.",
        metavar="<file path>"
    )
    parser.add_option(
        "--bzip2",
        action="store_true",
        dest="bzip2",
        default=False,
        help="bzip2-compress the result into a self-executing python script."
    )
    parser.add_option(
        "--gzip",
        action="store_true",
        dest="gzip",
        default=False,
        help="gzip-compress the result into a self-executing python script."
    )
    options, args = parser.parse_args()
    try:
        source = open(args[0]).read()
    except Exception, e:
        print e
        parser.print_help()
        sys.exit(2)
    # Minify our input script
    result = minify(source)
    # Compress it if we were asked to do so
    if options.bzip2:
        result = bz2_pack(result)
    elif options.gzip:
        result = gz_pack(result)
    # Either save the result to the output file or print it to stdout
    if options.outfile:
        f = open(options.outfile, 'w')
        f.write(result)
        f.close()
    else:
        print result

if __name__ == "__main__":
    main()

      

I wrote this so I could minimize the size of python code being run on embedded platforms (e.g. OpenWRT). minified + zipped modules can save a lot of space when applied to a large number of files. Here's an example of the ouput minifying itself:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
__version__='1.4.1'
__license__="GNU General Public License (GPL) Version 3"
__version_info__=(1,4,1)
__author__='Dan McDougall <YouKnowWho@YouKnowWhat.com>'
import sys,re,cStringIO,tokenize
from optparse import OptionParser
multiline_quoted_string=re.compile(r'(\'\'\'|\"\"\")')
not_quoted_string=re.compile(r'(\".*\'\'\'.*\"|\'.*\"\"\".*\')')
trailing_newlines=re.compile(r'\n\n')
shebang=re.compile('^#\!.*$')
encoding=re.compile(".*coding[:=]\s*([-\w.]+)")
multiline_indicator=re.compile('\\\\(\s*#.*)?\n')
double_quoted_string=re.compile(r'((?<!\\)".*?(?<!\\)")')
single_quoted_string=re.compile(r"((?<!\\)'.*?(?<!\\)')")
single_line_single_quoted_string=re.compile(r"((?<!\\)'''.*?(?<!\\)''')")
single_line_double_quoted_string=re.compile(r"((?<!\\)'''.*?(?<!\\)''')")
def remove_comments_and_docstrings(source):
 io_obj=cStringIO.StringIO(source)
 out=""
 prev_toktype=tokenize.INDENT
 last_lineno=-1
 last_col=0
 for tok in tokenize.generate_tokens(io_obj.readline):
  token_type=tok[0]
  token_string=tok[1]
  start_line,start_col=tok[2]
  end_line,end_col=tok[3]
  ltext=tok[4]
  if start_line>last_lineno:
   last_col=0
  if start_col>last_col:
   out+=(" "*(start_col-last_col))
  if token_type==tokenize.COMMENT:
   pass
  elif token_type==tokenize.STRING:
   if prev_toktype!=tokenize.INDENT:
    if prev_toktype!=tokenize.NEWLINE:
     if start_col>0:
      out+=token_string
  else:
   out+=token_string
  prev_toktype=token_type
  last_col=end_col
  last_lineno=end_line
 return out
def reduce_operators(source):
 io_obj=cStringIO.StringIO(source)
 remove_columns=[]
 out=""
 out_line=""
 prev_toktype=tokenize.INDENT
 prev_tok=None
 last_lineno=-1
 last_col=0
 lshift=1
 for tok in tokenize.generate_tokens(io_obj.readline):
  token_type=tok[0]
  token_string=tok[1]
  start_line,start_col=tok[2]
  end_line,end_col=tok[3]
  ltext=tok[4]
  if start_line>last_lineno:
   last_col=0
  if start_col>last_col:
   out_line+=(" "*(start_col-last_col))
  if token_type==tokenize.OP:
   start_of_line_types=[tokenize.NEWLINE,tokenize.DEDENT,tokenize.INDENT]
   if prev_toktype not in start_of_line_types:
    remove_columns.append(start_col)
    remove_columns.append(end_col+1)
  if token_string.endswith('\n'):
   out_line+=token_string
   if remove_columns:
    for col in remove_columns:
     col=col-lshift
     try:
      if out_line[col]==" ":
       out_line=out_line[:col]+out_line[col+1:]
       lshift+=1
     except IndexError:
      pass
   out+=out_line
   remove_columns=[]
   out_line=""
   lshift=1
  else:
   out_line+=token_string
  prev_toktype=token_type
  prev_token=tok
  last_col=end_col
  last_lineno=end_line
 out+=out_line
 return out
def remove_comment(single_line):
 quoted_string=re.compile( r'''((?<!\\)".*?(?<!\\)")|((?<!\\)'.*?(?<!\\)')'''
 )
 split_line=quoted_string.split(single_line)
 split_line=[a for a in split_line if a]
 out_line=""
 for section in split_line:
  if section.startswith("'")or section.startswith('"'):
   out_line+=section
  elif '#' in section:
   out_line+=section.split('#')[0]
   break
  else:
   out_line+=section
 return out_line.rstrip()
def join_multiline_pairs(text,pair="()"):
 opener=pair[0]
 closer=pair[1]
 inside_pair=False
 inside_quotes=False
 inside_double_quotes=False
 inside_single_quotes=False
 quoted_string=False
 openers=0
 closers=0
 linecount=0
 opener_regex=re.compile('\%s'%opener)
 closer_regex=re.compile('\%s'%closer)
 output=""
 for line in text.split('\n'):
  escaped=False
  multline_match=multiline_quoted_string.search(line)
  not_quoted_string_match=not_quoted_string.search(line)
  if multline_match and not not_quoted_string_match and not quoted_string:
   if len(line.split('"""'))>1 or len(line.split("'''")):
    output+=line+'\n'
    quoted_string=False
   else:
    output+=line+'\n'
    quoted_string=True
  elif quoted_string and multiline_quoted_string.search(line):
   output+=line+'\n'
   quoted_string=False
  elif not quoted_string:
   if opener_regex.search(line)or closer_regex.search(line)or inside_pair:
    for character in line:
     if character==opener:
      if not escaped and not inside_quotes:
       openers+=1
       inside_pair=True
       output+=character
      else:
       escaped=False
       output+=character
     elif character==closer:
      if not escaped and not inside_quotes:
       if openers and openers==(closers+1):
        closers=0
        openers=0
        inside_pair=False
        output+=character
       else:
        closers+=1
        output+=character
      else:
       escaped=False
       output+=character
     elif character=='\\':
      if escaped:
       escaped=False
       output+=character
      else:
       escaped=True
       output+=character
     elif character=='"' and escaped:
      output+=character
      escaped=False
     elif character=="'" and escaped:
      output+=character
      escaped=False
     elif character=='"' and inside_quotes:
      if inside_single_quotes:
       output+=character
      else:
       inside_quotes=False
       inside_double_quotes=False
       output+=character
     elif character=="'" and inside_quotes:
      if inside_double_quotes:
       output+=character
      else:
       inside_quotes=False
       inside_single_quotes=False
       output+=character
     elif character=='"' and not inside_quotes:
      inside_quotes=True
      inside_double_quotes=True
      output+=character
     elif character=="'" and not inside_quotes:
      inside_quotes=True
      inside_single_quotes=True
      output+=character
     elif character==' ' and inside_pair and not inside_quotes:
      if not output[-1]in[' ',opener]:
       output+=' '
     else:
      if escaped:
       escaped=False
      output+=character
    if inside_pair==False:
     output+='\n'
   else:
    output+=line+'\n'
  else:
   output+=line+'\n'
 output=trailing_newlines.sub('\n',output)
 return output
def dedent(source):
 io_obj=cStringIO.StringIO(source)
 out=""
 last_lineno=-1
 last_col=0
 prev_start_line=0
 indentation=""
 indentation_level=0
 for i,tok in enumerate(tokenize.generate_tokens(io_obj.readline)):
  token_type=tok[0]
  token_string=tok[1]
  start_line,start_col=tok[2]
  end_line,end_col=tok[3]
  if start_line>last_lineno:
   last_col=0
  if token_type==tokenize.INDENT:
   indentation_level+=1
   continue
  if token_type==tokenize.DEDENT:
   indentation_level-=1
   continue
  indentation=" "*indentation_level
  if start_line>prev_start_line:
   out+=indentation+token_string
  elif start_col>last_col:
   out+=" "+token_string
  else:
   out+=token_string
  prev_start_line=start_line
  last_col=end_col
  last_lineno=end_line
 return out
def fix_empty_methods(source):
 def_indentation_level=0
 output=""
 just_matched=False
 previous_line=None
 method=re.compile(r'^\s*def\s*.*\(.*\):.*$')
 for line in source.split('\n'):
  if len(line.strip())>0:
   if just_matched==True:
    this_indentation_level=len(line.rstrip())-len(line.strip())
    if def_indentation_level==this_indentation_level:
     output+="%s pass\n%s\n"%(previous_line,line)
    else:
     output+="%s\n%s\n"%(previous_line,line)
    just_matched=False
   elif method.match(line):
    def_indentation_level=len(line)-len(line.strip())
    just_matched=True
    previous_line=line
   else:
    output+="%s\n"%line
  else:
   output+="\n"
 return output
def remove_blank_lines(source):
 io_obj=cStringIO.StringIO(source)
 source=[a for a in io_obj.readlines()if a.strip()]
 return "".join(source)
def minify(source):
 preserved_shebang=None
 preserved_encoding=None
 for line in source.split('\n')[0:2]:
  if shebang.match(line):
   preserved_shebang=line
   continue
  if encoding.match(line):
   preserved_encoding=line
 source=multiline_indicator.sub('',source)
 source=remove_comments_and_docstrings(source)
 source=fix_empty_methods(source)
 source=join_multiline_pairs(source)
 source=join_multiline_pairs(source,'[]')
 source=join_multiline_pairs(source,'{}')
 source=reduce_operators(source)
 source=dedent(source)
 if preserved_encoding:
  source=preserved_encoding+"\n"+source
 if preserved_shebang:
  source=preserved_shebang+"\n"+source
 source=remove_blank_lines(source).rstrip('\n')
 return source
def bz2_pack(source):
 import bz2,base64
 out=""
 compressed_source=bz2.compress(source)
 out+='import bz2, base64\n'
 out+="exec bz2.decompress(base64.b64decode('"
 out+=base64.b64encode((compressed_source))
 out+="'))\n"
 return out
def gz_pack(source):
 import zlib,base64
 out=""
 compressed_source=zlib.compress(source)
 out+='import zlib, base64\n'
 out+="exec zlib.decompress(base64.b64decode('"
 out+=base64.b64encode((compressed_source))
 out+="'))\n"
 return out
def test_decorator(f):
 return f
def test_reduce_operators():
 (a,b)=1,2
 pass
def test_empty_functions():pass
class test_class(object):
 @test_decorator
 def foo(self):
  pass
def test_function():
 foo=("The # character in this string should " "not result in a syntax error")
 test_multi_line_list=['item1','item2','item3']
 test_multi_line_dict={'item1':1,'item2':2,'item3':3}
 test_string_inside_operators=imaginary_function("This string was indented but the tokenizer won't see it that way.")
 this_line_has_leading_indentation ='''<--That extraneous space should be
                                              removed'''
def main():
 usage='%prog [options] "<input file>"'
 parser=OptionParser(usage=usage,version=__version__)
 parser.disable_interspersed_args()
 parser.add_option("-o","--outfile",dest="outfile",default=None,help="Save output to the given file.",metavar="<file path>")
 parser.add_option("--bzip2",action="store_true",dest="bzip2",default=False,help="bzip2-compress the result into a self-executing python script.")
 parser.add_option("--gzip",action="store_true",dest="gzip",default=False,help="gzip-compress the result into a self-executing python script.")
 options,args=parser.parse_args()
 try:
  source=open(args[0]).read()
 except Exception,e:
  print e
  parser.print_help()
  sys.exit(2)
 result=minify(source)
 if options.bzip2:
  result=bz2_pack(result)
 elif options.gzip:
  result=gz_pack(result)
 if options.outfile:
  f=open(options.outfile,'w')
  f.write(result)
  f.close()
 else:
  print result
if __name__=="__main__":
 main()

The following is an example of the new self-extracting, compressed python script feature. It is the result of pyminifier compressing itself with the --gzip option:

import zlib, base64
exec zlib.decompress(base64.b64decode('eJzlGmuTm8jxu37FmI0L2EXYWruuUiqzvqp443LFXrtiJ1cpSUchGK04IyAM7MPn++/pngfMANI+crkvscsWmn4/pnum0dGTZw2rnq3T/BnNr0h5W2+LfHJEpsdTEhdJml/OSVNvpn/GlUkYXtGKpUUehoE981/6MxvWsjSmOaOwZr29+Ad5S3NaRRn51KwBQt4LKHHefnrvkn8KevLC0pil+aYAamfmvfRmLgCiBtSoUMibKCcf4jdFcxllGXn1r6L5W15c/7Qtfmwfo9qPi92ZPUl3ZVHVhN0yr6Je/LmuQP13H726+Erz9BudbKpiR4qyLqMKFJLoH8sadPiES9Vk12R1mqU5Df/dFDVNQsaZBBVFGWWaUaeynaWNf78vLfzr2u4kL+rDBJZ/LIjg0/ouPvAvLiODuopQ7GWY02sUz0wGy3yZAxbb0nVkMrd/Plo+8Y//BFCai4DpYBAgFhfzYLVkx85iurz2Vyeu5Wq2pnmSxlFdVAbnJfxxgObIP3Zfc/lJASE97Brn9asny6ULcl+rR7SPAeZBSktR2h2ljVpKSq7nA7jYOh97wOlOSw5ySuiGVHRXXNEQCHY0r1kY5QlwjQUr5rCiqWLqzickLcJi/UvQpqOvHhTOhBRNHVjWhJQVvQohW+vbkgYqa/13F2/OL75MSBaxmmufF8F0Jr/HRRY8n5BNUREgIGlOWrpLvg9rGvIV5ghF/IpGCXJB3QRyqOQtnq/aNekTXJ3hKqujSoj3xCNKRugpQilYz2H4oCAvEJLV9KbmX1/i13SjcTrTTEJtDJM6VFg4UxCOBv46CRyLWMdOizFVGK4raDXTOl/+5eOHD+BMzqWMGEPVs33In7/8/d3FW44LKHpwnvSjw5EOYF2c//T+3cW5QDMtey4XhVW687lyjHYm94DDdOEWTDQ/ynCoJZk9KloTyOK6qXLkLpM6aWIaFiUmTlE9MI3bLZE1u5wFi1WX2fDJJd4jzRU0uChQw0NZn7FtuqmD2f9h/nOqR26Cj584F0FVbERFRByIWT9j2+bpvznH+Hi9eK1GdgeBdoiRGBEgct1MFD8qS3BbZ4V7AEn692RmGCiC5QOQXaf1FnoXtKueq3qbB4lNEUI3zCRYQP3HwAgLuKd57om1urpVmxi4KpELQFsFAURIAbtt0OLMEelEJzmZzVcKXwg5CWZErNCbmJY1eZcn9Oa8qopKcZa1TJQJxW0y8CLfk8TcjUTbRka9Gffb/qKjIDTH9YdUoZ7Wg6Kkd1pHa+MY4r0tnFTQrUfPI99HzxqAPSGQVazMUukeg7fPAYZ4A3kR8dyJeOa3y5gQ0apXABGP0RjPnSb2XG57AfP5jhAZbdmW21HpENsapLrEUs3NPrK5GLE6jiutA1RXVEGyhkL5lYznRCugCxWH+BX6qnTEEemXIs3D7pRZRil0FCyHHj4GlgOxAMbQbaBQB7jGRcdZwdR3LL1pztJEkAd/jUCZdonHh/UW9aNdH6YfIFuYmUFyUSjFsBALffgj2hEXTV7jF4ESVvSS3pgH56fMfiqgriLfhyag4gxYymaJ6SFyB7oZuEvFRhU1yuKopIlSlaCLuYd3UR1vgz13GJ/RqIq3jsxcMrizSPLBep8QMsqUSODsy2v+HpYt3ICpc1VGc85aWWlZkNDu2YygF0yYBXvUckW2S4edBDwj0Td8dTSYWg7fi+xL1VC1eQwIt+Q+/lX7ZShqXEEuaq+L9EwzxGCv0tKrD9N2jtbctlEVxTVFMFFFR8hpQUEgRGpdDZWTidfG09iGXZMTW+eENxRBru1g6Vuix6KVKwFdtMgw2w9RcjdqVgjnPMqK1u2MI6p6EDiyHMAJpMXVS4TpBG1lpIzd4QbTD0qI5tf/vQPt5dLW3CdZPYr1uFL3SIeBTpbNQ9JTZq/YoY59jtBcf2eOSsfR3ALUsX40f5ADR1ugARpthA/ytPLLHVYYgn5vK0Zb9qPyZe9eN5XQMnLUkRr8gW58tAKmDx6ugE2MbMQSdIdColIK9ovpbJXmC2Diiaq2GsQYYEpyF9x71otxG7rs4gVTUEg+rVTZUg83d/342gfKQ9dg+uqzZs3PW57AcPWTbinvJQlN+H3kMZO+Q1MNfo/qhgS4lOYoKsIzNyfXvocZvaLtDDD15BSE5s2OTz+ce89D/qCByMMmIKMDDG3mNvCEbI9xkddpzjfKPi5ioDHOZTrkooeAWMcDkoFpvTh2YzyN9GQw77tj6gmihzR3zQi1ZOoeH3JB793IN+lNSHdlfRvuaL0tEn1OCPBwND21C84vDQjhV4OuFqCiadEwoaaY/Qnu5huGn5fsGGTA//7x0oF/7ly8/jCuTUKf/sXJuG2Ii6or568AMrTiVVYUlXqbshGTWk7qzutOB8xVKRt3SjDOuVflrKeMj3eW+VP4Zz11DE956lZm1F6N9k6ysWDIRBT+9zlQu9fsMUdZv88PhqC2iZlxV2OrYUm3hBk8wGNV3QLoWJWW06N1FuVfuYwHDrXFgzHd6RVO5rg45FHGrlotLMvHEUjLC9XZpXm6udVUAPvhVH+Fdz75dk/kfrfevtcTgMNpvng+P12pMZJgOIjfUKRyu1k1leADDFrdBAfprJE3i6Kj2l7fr/d7jdai7608Lcbo0OkBSJ69WNn3xPz1N1u3ZPzdSYtgnhcmcmrecyT6V+IPgSeY5CcC3KOXkRwllzCT2vT/yPZQdY2nVZvTkhxTef3tFPwRf9X3k3ilDhBvHTH6w8vuxIMVHDRiqI8QDVi+WjVOSHCy0xgRwUmd12Cn0xsaI8hPaEsvkPz1Dy9xMaGObUn0DsLdSB1noIqr5OLkqVdHuKmX3/ZY+i1L1/cwFdHuspWz2mMsp//DrIWLQB0iY57IzgbtlSibDmGQ74jmRN7aDWbeKdRo/k6iRRe7dtPkfHKM2Bwew4GDCQz+6EBlpXGNvH409eBHC6h+hcNoxnXqSVC8uSKAFzjWly0lR+bEC3sukdM8ti2aLCFwpsLbDjgJ9jniRITdQm+7IRTfsVjgLy6AlwHxMitLWR0s7LSmu5nt8c9T+fnCXg3xoQrWwa8Sfz5TFPNTRTN/8ZukkpNTeftp3Ruku+gyzaOqcyLa1xlzHTF5SKUJWTc1WErbl6EVuS5yuyaMQt9AUFQDwa3PbcNTCNdyG8EDdDUhv+3vJLBt+9V0+gWp6A3clnIKHZsw2BNUOXHdzbPu90dUngRYE9EZo1SErmHRJQ3sp2VVXJJFwX+ew1bEepXm0NTh8JnRM8vG/MKf7AT673ccQcv/9+SviwLtR0uuovKTlEV4qU/BXRUDJ+P+iCpoOR1OlCShEO9Y08LyrOkUtgjKt7wEQhVY2tdNBNHmLdrb0qwMrM/RFZUHEQgDj8ZlekVzboBvedDAoquoCqxXuAAy6+2ZtUf4dP0tLU8tL4rF/YNBRsA9Ds5QShOJoPTg5zipCAdNVRngirSpDopBssN+mmKdaWpMJPEbMMJiqP61v1elS2B7SCMBH1MIIf+VPjInPIxXIJXjH20E5RtZWX5xbOEgCO6zLj+3IY58mXrOP4Cfxw+UJWwmyHJeXQRjXAhRcyTCX5f59AaOW6e8KaLegXmom4jZMVfR585HvhK1bZriO6qRaejoGg1b9Z0WWUOVuYfYG2FhD+LZ19i4yca/rqDIdExghc+SuRPkOVqYLVAmICUM82iHP+kLrDDErRmG+BpbbNL/AKD1gyo='))

Here's the difference in file size:

$ ls -lh *pyminifier.py
-rw-r--r-- 1 riskable riskable 3.8K 2010-09-18 14:55 gzipped_minified_pyminifier.py
-rw-r--r-- 1 riskable riskable  11K 2010-09-18 15:02 minified_pyminifier.py

Tags: bz2, bzip2, comments, compression, docstring, embedded, gzip, minify, pack, regex, zlib

20 comments

Daniel Lepage 15 years ago # | flag

Discarding all comments removes the encoding. If the target script uses e.g. utf-8 with non-ascii characters the output will generate a syntax error.

Dan McDougall (author) 15 years ago # | flag

Good catch. I'll update the script to fix that (shouldn't be too hard).

Note: It also messes up quoted # signs like this:

Comment = "#"

...results in Comment = ". Still trying to figure out how to fix that.

Akira Fora 15 years ago # | flag

Good shot. You could save a few more bytes by : -transforming multiline instructions into single line instrustions. -removing spaces around operators. e.g. singleLineDocstring = ( method + (QuotedString(quoteChar='"', escChar='\', multiline=False) | \ QuotedString(quoteChar="'", escChar='\', multiline=False)) ) can be transformed into: singleLineDocstring=(method+(QuotedString(quoteChar='"',escChar='\',multiline=False)|QuotedString(quoteChar="'",escChar='\',multiline=False)))

Akira Fora 15 years ago # | flag

You can achieve further size reduction by obfuscating the code.
I didn't find any solid estimation of size reduction brought by obfuscation of Python code but Yahoo dev networks claims that, applied to CSS and Javascrip:
"In a survey of ten top U.S. web sites, minification achieved a 21% size reduction versus 25% for obfuscation." (http://developer.yahoo.com/performance/rules.html)
I guess the advantage of obfuscation over minification is even more significant applied to Python than applied to CSS and Javascrip, due to the respective syntax of these languages.
Though, obfuscors are more complex than minifiers, and obfuscated code is a pain if you have to debug it.

Dan McDougall (author) 15 years ago # | flag

Yeah, I thought about adding some obfuscating stuff to this but I changed my mind because it makes debugging nary impossible and when you're working on embedded platforms you often can't debug your code anywhere else but the device itself.

Also, I had originally planned to reduce spaces between operators but I never got around to it. It is in the TODO list =)

Dan McDougall (author) 15 years ago # | flag

UPDATE: It took FOREVER (and a lot of code) to work it all out but I've finally got this joining multiline pairs of parens, braces, and brackets (and removing unnecessary whitespace inside them). So something like this:

myvar = ('test',
    'test2',
)

Becomes...

myvar = ('test','test2',)

I googled and googled but I never did find another example of code that does that. Not even in another language. The hardest part was getting it to ignore things in triple double/single quotes and dealing with escaped characters (especially code that literally has something like "'\'").

Dan McDougall (author) 15 years ago # | flag

Almost forgot: I've also got it preserving shebangs (#!/usr/bin/env python) and encoding strings now (per Daniel Lepage's comment).

Dan McDougall (author) 15 years ago # | flag

Update (rev 10): Fixed some bugs where it wasn't working properly in certain (odd) situations. It also joins multi-lines (i.e. that end in '\') properly again.

jcaballero.hep 14 years, 9 months ago # | flag

It doesn't work when the original code includes lines like the following one:

print "this is an example # of line which fails"

pyminifier interprets everything after the char # as a comment and, therefore, the rest of the code is not well processed.

robert marshall 14 years, 7 months ago # | flag

If I give it a line with redundant brackets:

if (not result):

this gets converted to

if (notresult):

i.e. with no space between the 'not' and the variable

etatS evictA 14 years, 5 months ago # | flag

The problem mentioned by Robert Marshall can be fixed by changing

            elif character == ' ' and inside_pair and not inside_quotes:
                pass

            elif character == ' ' and inside_pair and not inside_quotes:
                if output[-1] != ' ':
                    output += ' '

or even better:

            elif character == ' ' and inside_pair and not inside_quotes:
                if not output[-1] in [' ', opener]:
                    output += ' '

jcaballero.hep mentions a problem that I haven't been able to fix and is a showstopper for me. It is some sort of error in the comment RegExp. I will be very glad if anyone would solve it.

Also, I didn't try this, but there will be trouble on """ ''' asd ''' """

(which could very reasonable appear e.g. in doctests)

Dan McDougall (author) 13 years, 11 months ago # | flag

Updated the code to fix all of the conditions in the comments. Now I need to work on fixing the mult-line joins.

Dan McDougall (author) 13 years, 11 months ago # | flag

Forget it, I must be crazy: Multi-line joins work fine they just look weird because the whitespace isn't being removed. I'll fix that next =)

Benoît Ryder 13 years, 8 months ago # | flag

When minifying the the following code:

def f():
  pass
  (a, b) = 1, 2
  pass

reduce_operators() will return:

def f():
  pass
 (a,b)=1,2
  pass

This will lead to an IdentationError in dedent() due to the space removed before (a,b).

Dan McDougall (author) 13 years, 8 months ago # | flag

@Benoit: Interesting edge case... When I run your (perfectly valid) example on my (python 2.6) install I get a syntax error from the tokenize module:

./pyminifier.py /tmp/test.py 
Traceback (most recent call last):
  File "./pyminifier.py", line 557, in <module>
    main()
  File "./pyminifier.py", line 550, in main
    print minify(source)
  File "./pyminifier.py", line 490, in minify
    source = dedent(source)
  File "./pyminifier.py", line 375, in dedent
    for i,tok in enumerate(tokenize.generate_tokens(io_obj.readline)):
  File "/usr/lib/python2.6/tokenize.py", line 346, in generate_tokens
    ("<tokenize>", lnum, pos, line))
  File "<tokenize>", line 3
    (a,b)=1,2
    ^
IndentationError: unindent does not match any outer indentation level

So this actually might be a bug in the tokenize module. I'll have to investigate further. Regardless, you can still get around this issue by using this syntax instead:

def f():
     pass
     a, b = 1, 2
     pass

...which will minify quite well.

Dan McDougall (author) 13 years, 7 months ago # | flag

@Benoit: I fixed that "indentation error when operators start a new line" but in version 1.4.1.

Gareth Rees 12 years, 9 months ago # | flag

I found four bugs in version 1.4.1.

Docstring recognition has false positives. Input:

def foo(): 'a' if True else 'd'

Output has a syntax error:

def foo():
 if True else 'd'

Under some circumstances, a function can end up being removed completely if it's the last function in the file. Input:

def bar(): 'docstring'

Output: nothing! (Off-by-one error in line handling?)

Blank lines get deleted in triple-quoted strings. Input:

s = """

Multi-line string with blank lines

"""

Output:

s="""
Multi-line string with blank lines
"""

Erroneous "pass" appended when function body is on same line as declaration. Input:

def baz(): return 1

Output:

def baz():return 1 pass

Dan McDougall (author) 12 years, 9 months ago # | flag

Thanks for reporting these bugs! Now, on to fixing them...

Steve Isaacson 11 years, 12 months ago # | flag

I found a bug in version 1.4.1 in the "fix_empty_methods()" function. Running the following example file though pyminifier:

# Add 2 numbers
def add(a,b): return a + b
# Subtract 2 numbers
def sub(a,b): return a - b
# Multiply 2 numbers
def mult(a,b): return a * b
# Divide 2 numbers
def div(a,b): return a / b
# Test functions
def test(a,b):
    ''' This is a placeholder function '''
def test2(a,b):
    """ This is another placeholder function """

produces:

def add(a,b):return a+b pass
def sub(a,b):return a-b
def mult(a,b):return a*b pass
def div(a,b):return a/b
def test(a,b):pass
def test2(a,b):

Here is a fix for the "fix_empty_methods()" function:

def fix_empty_methods(source):
    def_indentation_level = 0
    output = ""
    just_matched = False
    previous_line = None
    method = re.compile(r'^\s*def\s*.*\(.*\):$')
    for line in source.split('\n'):
        line = line.rstrip()
        if len(line.strip()) > 0: # Don't look at blank lines
            if just_matched == True:
                this_indentation_level = len(line.rstrip()) - len(line.strip())
                if def_indentation_level == this_indentation_level:
                    # This method is empty, insert a 'pass' statement
                    if previous_line:
                        output += "%s pass\n%s" % (previous_line, line)
                    else:
                        output += " pass\n%s" % line
                else:
                    if previous_line:
                        output += "%s\n%s" % (previous_line, line)
                    else:
                        output += "\n%s" % line
                just_matched = False
                # Check to see if the current line matches too
                if method.match(line):
                    def_indentation_level = len(line) - len(line.strip()) 
                    just_matched = True
                    # We already printed this line, so don't print again
                    previous_line = None
                else:
                    output += "\n"
            elif method.match(line):
                def_indentation_level = len(line) - len(line.strip()) # A commment
                just_matched = True
                previous_line = line
            else:
                output += "%s\n" % line # Another self-test
        else:
            if not (just_matched and not previous_line):
                output += "\n"
    return output

Produces:

def add(a,b):return a+b
def sub(a,b):return a-b
def mult(a,b):return a*b
def div(a,b):return a/b
def test(a,b):pass
def test2(a,b):pass

Piyush 9 years, 7 months ago # | flag

Hi,

I obfuscated my module, and the obfuscation is successful, great job there.

But I cannot use my module, once obfuscated.

I don't know if I'm making any sense or not.

But my requirement is to obfuscate my python module and deliver it to the customer, and the code should work without de-obfuscation.

Can you please comment on this?

Thanks in advance.

◄	Python recipes (4591)	►
◄	Dan McDougall's recipes (5)	►

Python code minifier (Python recipe) by Dan McDougall
ActiveState Code (http://code.activestate.com/recipes/576704/)

20 comments

Tags

Required Modules

Other Information and Tasks

Accounts

Code Recipes

Feedback & Information

ActiveState

Python code minifier (Python recipe) by Dan McDougall ActiveState Code (http://code.activestate.com/recipes/576704/)

20 comments

Tags

Required Modules

Other Information and Tasks

Accounts

Code Recipes

Feedback & Information

ActiveState

Python code minifier (Python recipe) by Dan McDougall
ActiveState Code (http://code.activestate.com/recipes/576704/)