Welcome, guest | Sign In | My Account | Store | Cart

This recipe helps find cyclical references in Python code to a] optimize so the garbage collector doesn't have to work as hard, and b] deal with uncollectable objects, such as those with a __del__ method, or extension objects that don't participate in garbage collection.

Python, 61 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
import gc
from types import FrameType

def print_cycles(objects, outstream=sys.stdout, show_progress=False):
    """
    objects:       A list of objects to find cycles in.  It is often useful
                   to pass in gc.garbage to find the cycles that are
                   preventing some objects from being garbage collected.
    outstream:     The stream for output.
    show_progress: If True, print the number of objects reached as they are
                   found.
    """
    def print_path(path):
        for i, step in enumerate(path):
            # next "wraps around"
            next = path[(i + 1) % len(path)]

            outstream.write("   %s -- " % str(type(step)))
            if isinstance(step, dict):
                for key, val in step.items():
                    if val is next:
                        outstream.write("[%s]" % repr(key))
                        break
                    if key is next:
                        outstream.write("[key] = %s" % repr(val))
                        break
            elif isinstance(step, list):
                outstream.write("[%d]" % step.index(next))
            elif isinstance(step, tuple):
                outstream.write("[%d]" % list(step).index(next))
            else:
                outstream.write(repr(step))
            outstream.write(" ->\n")
        outstream.write("\n")
    
    def recurse(obj, start, all, current_path):
        if show_progress:
            outstream.write("%d\r" % len(all))

        all[id(obj)] = None

        referents = gc.get_referents(obj)
        for referent in referents:
            # If we've found our way back to the start, this is
            # a cycle, so print it out
            if referent is start:
                print_path(current_path)

            # Don't go back through the original list of objects, or
            # through temporary references to the object, since those
            # are just an artifact of the cycle detector itself.
            elif referent is objects or isinstance(referent, FrameType): 
                continue

            # We haven't seen this object before, so recurse
            elif id(referent) not in all:
                recurse(referent, start, all, current_path + [obj])

    for obj in objects:
        outstream.write("Examining: %r\n" % obj)
        recurse(obj, obj, { }, [])

Even with Python's garbage collector, it is still possible to leak memory by creating cyclical references, if those objects have __del__ methods or are extension objects that don't participate in garbage collection. (Tkinter, for example, includes objects with __del__ methods, and prompted the creation of this recipe). These cycles can be very hard to find in large bodies of code.

Example usage:

>>> ... do something that leaks objects ...
>>> print_cycles(gc.garbage)
Examining:
    --  ->
    -- ['_master'] ->
    --  ->

This shows that there is a Tkinter.StringVar object that through its _master member references a NavigationToolbar2TkAgg object, which in turn references the Tkinter.StringVar object. Since Tkinter.StringVar has a __del__ method, the garbage collector could not collect this cycle. This code must be modified so one or both of these references is manually destroyed.

1 comment

a 11 years, 2 months ago  # | flag

all should be a set not a dict, since you are not using the values at all