The following code loads arbitrary pickles (well, probably not all of them: pickles whose object states aren't dicts, for example, may not load). It loads their data into totally inert objects, which you can then traverse and manipulate however you like. It's pretty similar to processing a DOM tree.
import sys, pickle

def makeFakeClass(module, name):
    # Build an empty stand-in class that carries the original class's name.
    class FakeThing(object):
        pass
    FakeThing.__name__ = name
    FakeThing.__module__ = '(fake)' + module
    return FakeThing

class PickleUpgrader(pickle.Unpickler):
    def find_class(self, module, cname):
        # Pickle tries to load a couple things like copy_reg and
        # __builtin__.object even though a pickle file doesn't
        # explicitly reference them (afaict): allow them to be loaded
        # normally.
        if module in ('copy_reg', '__builtin__'):
            thing = pickle.Unpickler.find_class(self, module, cname)
            return thing
        return makeFakeClass(module, cname)

# The pickle file is the first command-line argument; open it in binary mode.
root = PickleUpgrader(open(sys.argv[1], 'rb')).load()
# Do whatever to 'root' here.
While changing an app from using pickle as its data format to a home-made XML dialect, I of course wanted to upgrade my old pickle data to the new format. The problem was that I had drastically changed the classes in my code, so the old pickles wouldn't come close to loading. Instead of writing special faked-out classes on a class-by-class basis for my old objects, each of which would instantiate the new objects with the appropriate data, I decided to take a simpler, more central approach: I use this code to traverse the pickle and write out the new-style XML.
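As a rough illustration of the traversal step, here is a minimal sketch written for Python 3. It is not the converter described above: the `oldapp.Person` class, the `walk` and `load_inert` helpers, and the path notation are all hypothetical names made up for this example, and the sample pickle is built from a throwaway module that is deleted again before loading, to mimic a class that no longer exists. Unlike the recipe, this variant fakes every class, since the sample pickle references nothing else.

```python
import io
import pickle
import sys
import types


def make_fake_class(module, name):
    # Same idea as makeFakeClass in the recipe: an empty placeholder class.
    return type(name, (object,), {"__module__": "(fake)" + module})


class FakeUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Never import anything; every class reference becomes a fake.
        return make_fake_class(module, name)


def load_inert(data):
    """Load pickle bytes into inert placeholder objects."""
    return FakeUnpickler(io.BytesIO(data)).load()


def walk(node, path="root", seen=None):
    """Yield (path, leaf_value) pairs, depth first.

    Containers and objects already visited are skipped, which guards
    against cycles but also means shared sub-objects are reported once.
    """
    seen = set() if seen is None else seen
    if isinstance(node, (str, bytes, int, float, type(None))):
        yield path, node
        return
    if id(node) in seen:
        return
    seen.add(id(node))
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, "%s[%r]" % (path, key), seen)
    elif isinstance(node, (list, tuple)):
        for index, item in enumerate(node):
            yield from walk(item, "%s[%d]" % (path, index), seen)
    elif hasattr(node, "__dict__"):
        # A fake instance: its attributes live in an ordinary dict.
        for attr, value in sorted(vars(node).items()):
            yield from walk(value, "%s.%s" % (path, attr), seen)
    else:
        yield path, node


def build_sample_pickle():
    """Pickle an instance of a class that exists only while pickling."""
    module = types.ModuleType("oldapp")  # hypothetical old module

    class Person(object):
        pass

    Person.__module__ = "oldapp"
    Person.__qualname__ = "Person"
    module.Person = Person
    sys.modules["oldapp"] = module
    try:
        person = Person()
        person.name = "Ann"
        person.tags = ["a", "b"]
        return pickle.dumps(person)
    finally:
        del sys.modules["oldapp"]  # the class is now gone, as in the story


if __name__ == "__main__":
    root = load_inert(build_sample_pickle())
    for path, value in walk(root):
        print(path, repr(value))
```

Each `(path, value)` pair could then be handed to whatever writes the new format; `type(node).__name__` still carries the original class name, which is usually what decides the output element.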