Welcome, guest | Sign In | My Account | Store | Cart
31

Decorator for automatic code optimization. If a global is known at compile time, replace it with a constant. Fold tuples of constants into a single constant. Fold constant attribute lookups into a single constant.

Python, 180 lines
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
from opcode import opmap, HAVE_ARGUMENT, EXTENDED_ARG
globals().update(opmap)

def _make_constants(f, builtin_only=False, stoplist=[], verbose=False):
    try:
        co = f.func_code
    except AttributeError:
        return f        # Jython doesn't have a func_code attribute.
    newcode = map(ord, co.co_code)
    newconsts = list(co.co_consts)
    names = co.co_names
    codelen = len(newcode)

    import __builtin__
    env = vars(__builtin__).copy()
    if builtin_only:
        stoplist = dict.fromkeys(stoplist)
        stoplist.update(f.func_globals)
    else:
        env.update(f.func_globals)

    # First pass converts global lookups into constants
    i = 0
    while i < codelen:
        opcode = newcode[i]
        if opcode in (EXTENDED_ARG, STORE_GLOBAL):
            return f    # for simplicity, only optimize common cases
        if opcode == LOAD_GLOBAL:
            oparg = newcode[i+1] + (newcode[i+2] << 8)
            name = co.co_names[oparg]
            if name in env and name not in stoplist:
                value = env[name]
                for pos, v in enumerate(newconsts):
                    if v is value:
                        break
                else:
                    pos = len(newconsts)
                    newconsts.append(value)
                newcode[i] = LOAD_CONST
                newcode[i+1] = pos & 0xFF
                newcode[i+2] = pos >> 8
                if verbose:
                    print name, '-->', value
        i += 1
        if opcode >= HAVE_ARGUMENT:
            i += 2

    # Second pass folds tuples of constants and constant attribute lookups
    i = 0
    while i < codelen:

        newtuple = []
        while newcode[i] == LOAD_CONST:
            oparg = newcode[i+1] + (newcode[i+2] << 8)
            newtuple.append(newconsts[oparg])
            i += 3

        opcode = newcode[i]
        if not newtuple:
            i += 1
            if opcode >= HAVE_ARGUMENT:
                i += 2
            continue

        if opcode == LOAD_ATTR:
            obj = newtuple[-1]
            oparg = newcode[i+1] + (newcode[i+2] << 8)
            name = names[oparg]
            try:
                value = getattr(obj, name)
            except AttributeError:
                continue
            deletions = 1

        elif opcode == BUILD_TUPLE:
            oparg = newcode[i+1] + (newcode[i+2] << 8)
            if oparg != len(newtuple):
                continue
            deletions = len(newtuple)
            value = tuple(newtuple)

        else:
            continue

        reljump = deletions * 3
        newcode[i-reljump] = JUMP_FORWARD
        newcode[i-reljump+1] = (reljump-3) & 0xFF
        newcode[i-reljump+2] = (reljump-3) >> 8

        n = len(newconsts)
        newconsts.append(value)
        newcode[i] = LOAD_CONST
        newcode[i+1] = n & 0xFF
        newcode[i+2] = n >> 8
        i += 3
        if verbose:
            print "new folded constant:", value

    codestr = ''.join(map(chr, newcode))
    codeobj = type(co)(co.co_argcount, co.co_nlocals, co.co_stacksize,
                    co.co_flags, codestr, tuple(newconsts), co.co_names,
                    co.co_varnames, co.co_filename, co.co_name,
                    co.co_firstlineno, co.co_lnotab, co.co_freevars,
                    co.co_cellvars)
    return type(f)(codeobj, f.func_globals, f.func_name, f.func_defaults,
                    f.func_closure)

_make_constants = _make_constants(_make_constants) # optimize thyself!

def bind_all(mc, builtin_only=False, stoplist=[],  verbose=False):
    """Recursively apply constant binding to functions in a module or class.

    Use as the last line of the module (after everything is defined, but
    before test code).  In modules that need modifiable globals, set
    builtin_only to True.

    """
    try:
        d = vars(mc)
    except TypeError:
        return
    for k, v in d.items():
        if type(v) is FunctionType:
            newv = _make_constants(v, builtin_only, stoplist,  verbose)
            setattr(mc, k, newv)
        elif type(v) in (type, ClassType):
            bind_all(v, builtin_only, stoplist, verbose)

@_make_constants
def make_constants(builtin_only=False, stoplist=[], verbose=False):
    """ Return a decorator for optimizing global references.

    Replaces global references with their currently defined values.
    If not defined, the dynamic (runtime) global lookup is left undisturbed.
    If builtin_only is True, then only builtins are optimized.
    Variable names in the stoplist are also left undisturbed.
    Also, folds constant attr lookups and tuples of constants.
    If verbose is True, prints each substitution as is occurs

    """
    if type(builtin_only) == type(make_constants):
        raise ValueError("The bind_constants decorator must have arguments.")
    return lambda f: _make_constants(f, builtin_only, stoplist, verbose)

## --------- Example call -----------------------------------------

import random

@make_constants(verbose=True)
def sample(population, k):
    "Choose k unique random elements from a population sequence."
    if not isinstance(population, (list, tuple, str)):
        raise TypeError('Cannot handle type', type(population))
    n = len(population)
    if not 0 <= k <= n:
        raise ValueError, "sample larger than population"
    result = [None] * k
    pool = list(population)
    for i in xrange(k):         # invariant:  non-selected at [0,n-i)
        j = int(random.random() * (n-i))
        result[i] = pool[j]
        pool[j] = pool[n-i-1]   # move non-selected item into vacancy
    return result

""" Output from the example call:

list --> <type 'list'>
tuple --> <type 'tuple'>
str --> <type 'str'>
TypeError --> exceptions.TypeError
type --> <type 'type'>
len --> <built-in function len>
ValueError --> exceptions.ValueError
list --> <type 'list'>
xrange --> <type 'xrange'>
int --> <type 'int'>
random --> <module 'random' from 'C:\PYTHON24\lib\random.pyc'>
new folded constant: (<type 'list'>, <type 'tuple'>, <type 'str'>)
new folded constant: <built-in method random of Random object at 0x00A281E8>
"""

When hand optimizing functions, it is typical to include lines like _len=len for looking-up the len function at compile time and binding to a local constant. The speed-up is substantial (trading one or more dictionary lookups for a single C-speed array lookup).

make_constants() performs those optimizations automatically; leaving the original code free of clutter like _len=len.

In the example, observe replacements of "random" from the global namespace and "len", "ValueError", "list", "xrange", and "int" from the builtin namespace. All of those are recoded as constants. If "random" had not already been imported at binding time, then it would have been left as a runtime global lookup.

Constant binding is especially useful for:

  • Functions with many references to global constants (sre_compile for example).
  • Functions relying on builtins (such as len(), True, and False).
  • Functions that raise or catch exceptions (such as IndexError).
  • Recursive functions (to optimize the function's reference to itself).
  • Functions that call other functions defined in the same module.
  • Functions accessing names imported into the global namespace (i.e. from random import random).

After binding globals to constants, the decorator makes a second pass and folds constant attribute lookups and tuples of constants. The first frequently occurs with module references (such as string.hexdigits). The second commonly occurs in for-loops and conditionals using the "in" operator such as: for x in ('a','b','c').

Binding should be applied selectively to those functions where speed is important and dynamic updates are not desired (i.e. the globals do not change). In more dynamic environments, a more conservative approach is to set builtin_only to True so that only the builtins get optimized (this includes functions like len(), exceptions like IndexError, and constants like True or False). Alternatively, individual variable names can be added to a stoplist so that the function knows to leave them unchanged.

Note, make_constants() works like other function decorators such as classmethod() and staticmethod(), so it can be used inside a class definition to optimize methods.

For versions of Python before Py2.4, the @decorator syntax was not available. To invoke the decoration, do so after the function is defined with:

 sample = _make_constants(sample, verbose=True)

Also, due to the way decorators work, the latter form must be used to optimize recursive function calls.

Instead of decorating every function and method in a module, the bind_all() utility function is provided. Call bind_all(my_class) after the class definition or put bind_all(sys.modules[__name__]) as the last line in a module to be optimized.

12 comments

skip 12 years, 3 months ago  # | flag

Very cool ... One suggestion. This is a very cool decorator Raymond. I do have one small suggestion. Instead of hard-coding the various bytecode numbers, I'd import symbols from the opcode module just to future-proof it a little:

from opcode import opmap, HAVE_ARGUMENT, EXTENDED_ARG
LOAD_GLOBAL = opmap['LOAD_GLOBAL']
LOAD_CONST = opmap['LOAD_CONST']

Skip

Raymond Hettinger (author) 12 years, 3 months ago  # | flag

Done. Incorporated the suggested change. Thanks!

S W 12 years, 3 months ago  # | flag

Exceedingly Cool. I have a voxel rendering class, with several nested loops. It needs a very tight inner loop, and as such I have 17 _len = len style optimisations to avoid excessive dict lookups.

Thanks Raymond, this is brilliant stuff, and has tidied up my code very nicely indeed!

Thomas Heller 11 years, 10 months ago  # | flag

compile time? Since the constants are bound when the module is imported, the recipe should be named "BindingConstants at import time" or something like that, imo.

Stefan Behnel 11 years, 7 months ago  # | flag

Metaclass makes things more handy. Here's a metaclass that calls make_constants for all methods of a class.

class OptimizingMetaclass(type):
    def __init__(cls, name, bases, dict):
        super(OptimizingMetaclass, cls).__init__(name, bases, dict)

        import types
        for name, attribute in dict.items():
            if type(attribute) is types.FunctionType:
                dict[name] = _make_constants(attribute)

I know, decorators make you think about the right place to put them - but if it doesn't break the code, there should be nothig bad about optimizing everywhere.

BTW, if you still want to keep it configurable, try:

def build_optimizing_metaclass(builtin_only=False, stoplist=[], verbose=False):
    from types import FunctionType
    class _OptimizingMetaclass(type):
        def __init__(cls, name, bases, dict):
            super(_OptimizingMetaclass, cls).__init__(name, bases, dict)

            for name, attribute in dict.items():
                if type(attribute) is FunctionType:
                    dict[name] = _make_constants(attribute, builtin_only, stoplist, verbose)

    return _OptimizingMetaclass

__metaclass__ = build_optimizing_metaclass(verbose=True)
Stefan Behnel 11 years, 7 months ago  # | flag

bind_all misses imports. bind_all misses

from types import FunctionType, ClassType

and, I guess, bind_all simplifies my last comment to

class OptimizingMetaclass(type):
    def __init__(cls, name, bases, dict):
        super(OptimizingMetaclass, cls).__init__(name, bases, dict)
        bind_all(cls)

def build_optimizing_metaclass(builtin_only=False, stoplist=[], verbose=False):
    class _OptimizingMetaclass(type):
        def __init__(cls, name, bases, dict):
            super(_OptimizingMetaclass, cls).__init__(name, bases, dict)
            bind_all(cls, builtin_only, stoplist, verbose)

    return _OptimizingMetaclass
paul cannon 10 years, 10 months ago  # | flag

getting into closures as well. I use closures fairly frequently, so I've found it useful to add code to this to also prebind constants in code objects in func_code.co_consts and functions under closures.

The code objects part is pretty simple; just split out the part of _make_constants that deals only with the code object into a separate function (say, code_make_constants) which takes the code object co, the environment dict env, stoplist, and verbose as arguments and do

newconsts = [(isintance(c, CodeType)
               and code_make_constants(c, env, stoplist, verbose)
               or c)
             for c in co.co_consts]

somewhere up at the top.

Binding the constants on functions in closures is a little more tricky, but it can be done using the get_cell_value and change_cell_value functions from some recipes I've posted.

I have this code block in _make_constants, before the function object is created:

for cell in (f.func_closure or []):
    value = get_cell_value(cell)
    if isinstance(value, FunctionType):
        change_cell_value(
            cell,
            _make_constants(value, builtin_only, stoplist, verbose)
        )

For those that wonder why I would do such a thing: Optimizing code objects in .co_consts is useful when you have subfunctions, i.e.

def foo(n):
    def bar():
        return n
    return bar

bar will show up as a code object in foo.func_code.co_consts. Optimizing functions in a func_closure tuple is useful when you use wrapper functions in decorators, i.e.:

def the_decorator(f):
    def wrapper(*args, **kwargs):
        result = f(*args, **kwargs)
        return do_something_with(result)
    return wrapper

@the_decorator
def baz():
    return do_stuff()

With this pattern, the original baz code will not be accessible except in a cell in baz.func_closure.

Christos Georgiou 10 years, 10 months ago  # | flag

Another side effect. I used to wrap all the functions and classes in a "main" function, which returned a dictionary of names to be exported for two reasons:

  1. speed (LOAD_DEREF was faster than LOAD_GLOBAL)

  2. less namespace clutter (ie all the imported modules did not show up in the module's globals)

The first part is handled by the recipe. The second part can easily be dealt with by deleting the imported modules after running the bind_all function (assuming that all LOAD_GLOBALS have been replaced with LOAD_CONST, of course :)

Miki Tebeka 7 years, 1 month ago  # | flag

There's a typo in sample function, ininstance should be isinstance

Miguel Angel Rasero 7 years, 1 month ago  # | flag

I think i have not get the stoplist usage, i have tried this:

@binding_constants.make_constants(verbose=True, stoplist=[util,])

but still i get util module optimized, so i have introduced in the decorator this print for debug.

if verbose: print name, stoplist, name in stoplist

and i get this:

util [<module 'util' from 'util.pyc'>] False

util --> <module 'util' from 'util.pyc'>

why is the util module still optimized?

Thanks.

Gabriel Genellina 6 years, 10 months ago  # | flag

See recipe 576904 for a Python 3 compatible version.

David Cortesi 7 months ago  # | flag

I am bothered by this code:

    if opcode in (EXTENDED_ARG, STORE_GLOBAL):
        return f    # for simplicity, only optimize common cases

If the function being optimized contains a STORE_GLOBAL, the process bails and does nothing to that function. But the implication of finding a STORE_GLOBAL go beyond the one function. Say the module has a statement SOME_GLOB = 0 and we apply build_all(). In any function in the module that lacks a STORE_GLOBAL opcode, any reference to SOME_GLOB will be optimized to a reference to constant 0 instead. Then this function contains global SOME_GLOB; SOME_GLOB = 99. So this function is not optimized -- but all the others are! In effect, this function's assignment to SOME_GLOB is nullified. "I know I'm setting the global but the other functions act like it was never set..."

It seems to me that if any function in a module contains STORE_GLOBAL, then that global name needs to never be optimized. What am I not getting?

Add a comment

Sign in to comment