
This code extends the pickle module to enable pickling of functions and classes defined interactively at the command prompt. You can save the interpreter state by pickling the __main__ module and restore it later.

Python, 94 lines
"""
Extend pickle module to allow pickling of interpreter state
including any interactively defined functions and classes.

This module is not required for unpickling such pickle files.

>>> import savestate, pickle, __main__
>>> pickle.dump(__main__, open('savestate.pickle', 'wb'), 2)
"""

import pickle, new

def save_code(self, obj):
    """ Save a code object by value """
    args = (
        obj.co_argcount, obj.co_nlocals, obj.co_stacksize, obj.co_flags, obj.co_code,
        obj.co_consts, obj.co_names, obj.co_varnames, obj.co_filename, obj.co_name,
        obj.co_firstlineno, obj.co_lnotab, obj.co_freevars, obj.co_cellvars
    )
    self.save_reduce(new.code, args, obj=obj)
pickle.Pickler.dispatch[new.code] = save_code

def save_function(self, obj):
    """ Save functions by value if they are defined interactively """
    if obj.__module__ == '__main__' or obj.func_name == '<lambda>':
        args = (obj.func_code, obj.func_globals, obj.func_name, obj.func_defaults, obj.func_closure)
        self.save_reduce(new.function, args, obj=obj)
    else:
        pickle.Pickler.save_global(self, obj)
pickle.Pickler.dispatch[new.function] = save_function

def save_global_byname(self, obj, modname, objname):
    """ Save obj as a global reference. Used for objects that pickle does not find correctly. """
    self.write('%s%s\n%s\n' % (pickle.GLOBAL, modname, objname))
    self.memoize(obj)

def save_module_dict(self, obj, main_dict=vars(__import__('__main__'))):
    """ Special-case __main__.__dict__. Useful for a function's func_globals member. """
    if obj is main_dict:
        save_global_byname(self, obj, '__main__', '__dict__')
    else:
        return pickle.Pickler.save_dict(self, obj)      # fallback to original 
pickle.Pickler.dispatch[dict] = save_module_dict

def save_classobj(self, obj):
    """ Save an interactively defined classic class object by value """
    if obj.__module__ == '__main__':
        args = (obj.__name__, obj.__bases__, obj.__dict__)
        self.save_reduce(new.classobj, args, obj=obj)
    else:
        pickle.Pickler.save_global(self, obj)
pickle.Pickler.dispatch[new.classobj] = save_classobj

def save_instancemethod(self, obj):
    """ Save an instancemethod object """
    # Instancemethods are re-created each time they are accessed so this will not be memoized
    args = (obj.im_func, obj.im_self, obj.im_class)
    self.save_reduce(new.instancemethod, args)
pickle.Pickler.dispatch[new.instancemethod] = save_instancemethod

def save_module(self, obj):
    """ Save modules by reference, except __main__ which also gets its contents saved by value """
    if obj.__name__ == '__main__':
        self.save_reduce(__import__, (obj.__name__,), obj=obj, state=vars(obj).copy())
    elif obj.__name__.count('.') == 0:
        self.save_reduce(__import__, (obj.__name__,), obj=obj)
    else:
        save_global_byname(self, obj, *obj.__name__.rsplit('.', 1))
pickle.Pickler.dispatch[new.module] = save_module

def save_type(self, obj):
    """ Save new-style classes: by value if defined in __main__, otherwise by reference """
    if getattr(new, obj.__name__, None) is obj:
        # Types in 'new' module claim their module is '__builtin__' but are not actually there
        save_global_byname(self, obj, 'new', obj.__name__)
    elif obj.__module__ == '__main__':
        # Types in __main__ are saved by value

        # Make sure we have a reference to type.__new__        
        if id(type.__new__) not in self.memo:
            self.save_reduce(getattr, (type, '__new__'), obj=type.__new__)
            self.write(pickle.POP)

        # Copy dictproxy to real dict
        d = dict(obj.__dict__)
        # Clean up unpickleable descriptors added by Python
        d.pop('__dict__', None)
        d.pop('__weakref__', None)
        
        args = (type(obj), obj.__name__, obj.__bases__, d)
        self.save_reduce(type.__new__, args, obj=obj)
    else:
        # Fallback to default behavior: save by reference
        pickle.Pickler.save_global(self, obj)
pickle.Pickler.dispatch[type] = save_type

Functions and classes are normally pickled by reference, using the module name and the name of the object within that module. This cannot work for functions and classes defined interactively at the Python prompt, because no importable module contains them. This code allows such functions and classes to be pickled by value: the Python bytecode and the other internal structures of the objects are saved into the pickle stream. For this to work, several other previously unpicklable types had to be handled as well: code objects, modules, instance methods and the __main__ namespace dictionary.
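The same by-value idea can be sketched for current Python versions. This is not the recipe's code: it assumes Python 3.8 or later (for Pickler.reducer_override), uses marshal for the bytecode instead of rebuilding the code object field by field, and the names ByValuePickler and dumps_by_value are made up for illustration. Like the recipe, it has to special-case the function type itself, which claims to live in builtins but cannot be found there.

```python
import builtins
import io
import marshal
import pickle
import types


class ByValuePickler(pickle.Pickler):
    """Hypothetical Pickler subclass that saves interactive functions by value."""

    def reducer_override(self, obj):
        if obj is types.FunctionType:
            # The function type reports module 'builtins' but is not stored
            # there, so fetch it from the types module when unpickling.
            return getattr, (types, "FunctionType")
        if isinstance(obj, types.ModuleType) and "." not in obj.__name__:
            # Top-level modules are saved by reference (re-imported on load).
            return __import__, (obj.__name__,)
        if isinstance(obj, types.CodeType):
            # marshal output is only valid for the same Python version.
            return marshal.loads, (marshal.dumps(obj),)
        if (isinstance(obj, types.FunctionType)
                and obj.__closure__ is None
                and (obj.__module__ == "__main__" or obj.__name__ == "<lambda>")):
            # Simplification: the rebuilt function sees only the builtins,
            # not the globals of the module that defined it.
            namespace = {"__builtins__": builtins}
            args = (obj.__code__, namespace, obj.__name__, obj.__defaults__)
            return types.FunctionType, args
        # Everything else gets pickle's default treatment.
        return NotImplemented


def dumps_by_value(obj, protocol=pickle.HIGHEST_PROTOCOL):
    """Pickle obj to bytes using ByValuePickler."""
    buf = io.BytesIO()
    ByValuePickler(buf, protocol=protocol).dump(obj)
    return buf.getvalue()
```

A lambda that plain pickle.dumps rejects can be passed to dumps_by_value and restored with the ordinary pickle.loads. As with the recipe, nothing from this sketch is needed at load time, because the stream only refers to getattr, __import__, marshal.loads and the function type.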

The Interactive Shell sample of the Google App Engine (deployed at http://shell.appspot.com) tries to solve the same problem using a different technique: any statements producing unpickleable results are re-executed. However, this can result in unwanted side effects.

Limitations: Nested functions returned from within another function cannot be pickled.
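The limitation comes from closures: a nested function keeps its free variables in cell objects, and the recipe has no way to save or recreate them. On Python 3.8 and later, cells can be built directly with types.CellType, so the idea can be sketched as follows. This is an illustration under that version assumption, not part of the recipe, and the helper names snapshot_closure and rebuild_closure are hypothetical.

```python
import marshal
import types


def make_adder(n):
    """Return a nested function that closes over n."""
    def add(x):
        return x + n
    return add


def snapshot_closure(func):
    """Collect what is needed to rebuild func: bytecode, name, defaults, cell values."""
    cells = func.__closure__ or ()
    return {
        "code": marshal.dumps(func.__code__),
        "name": func.__name__,
        "defaults": func.__defaults__,
        # cell_contents raises ValueError for an empty cell; not handled here.
        "cell_values": tuple(cell.cell_contents for cell in cells),
    }


def rebuild_closure(snap, global_ns):
    """Recreate the function, wrapping each saved value in a fresh cell."""
    code = marshal.loads(snap["code"])
    closure = tuple(types.CellType(value) for value in snap["cell_values"])
    return types.FunctionType(
        code, global_ns, snap["name"], snap["defaults"], closure or None
    )
```

The snapshot is a plain dict, so it can go through the standard pickle module as long as the captured values are picklable. Cells shared between several closures are not preserved by this sketch; each rebuilt function gets its own copies.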

6 comments

Oren Tirosh (author) 15 years, 11 months ago

Monkeypatching. Sorry for doing this as a monkeypatch... It should be trivial to convert to a Pickler subclass, though.

Oren Tirosh (author) 15 years, 11 months ago

Pickling cell variables. Armin Ronacher has a version which works around the nested function limitations using ctypes to create cell variables:

http://dev.pocoo.org/hg/sandbox/file/tip/pshell.py

Oren Tirosh (author) 15 years, 11 months ago

Edit. Modified from originally posted version. Shorter, simpler and with less abuse of the pickle REDUCE operator.

Steven Bethard 15 years, 11 months ago

Why use "pickle.Pickler.dispatch" instead of the copy_reg module as suggested in the documentation (http://docs.python.org/lib/node320.html)?

Oren Tirosh (author) 15 years, 11 months ago

Why not copy_reg. The copy_reg dispatch table is checked after Pickler's internal dispatch table. You can't override the treatment of dicts, for example. I guess I could have used copy_reg for everything else but I prefer not to mix two different methods.
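The lookup order described here still holds in Python 3, where copy_reg is named copyreg. A minimal sketch, assuming Python 3 and using a per-pickler dispatch_table rather than the global copyreg table so that nothing is registered process-wide (the helper names are made up for illustration): a reducer registered for dict is never consulted, while one registered for complex is.

```python
import io
import pickle

calls = []  # records which reducers the pickler actually invoked


def reduce_complex(c):
    calls.append("complex")
    return complex, (c.real, c.imag)


def reduce_dict(d):
    # Never reached: dicts are handled by the pickler's built-in dispatch.
    calls.append("dict")
    return dict, (list(d.items()),)


def dump_with_table(obj):
    """Pickle obj with a private dispatch_table covering complex and dict."""
    buf = io.BytesIO()
    pickler = pickle.Pickler(buf)
    pickler.dispatch_table = {complex: reduce_complex, dict: reduce_dict}
    pickler.dump(obj)
    return buf.getvalue()
```

After dump_with_table({"z": 1 + 2j}), the calls list contains only "complex". Overriding how dicts are saved therefore needs a hook into the pickler's internal dispatch, which is what the recipe does.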

Eugen Duf 14 years, 8 months ago

Hello - I'm getting an error: 'NoneType' object has no attribute 'update' from line 1224 in pickle.pyc: inst.__dict__.update(state) when I try to load the pickled state using

>>> pickle.load(fp)

I saved the state exactly as specified in the doc. I'm using Python 2.5. Am I doing something obviously wrong? This should be very useful for interactive work if I can get it working. Thanks

Eugene