Closures are powerful. Closures are beautiful. But: Closures are TRICKY!
This is an anti-recipe: a caveat about some obscure pitfalls with closures - or rather, with the way they are implemented in Python.
And now for the caveats: Two no-no's...
Don't create more than one instance of the same closure per normal function!
Don't create more than one instance of the same closure per generation cycle in a generator function!
Here is why...
<pre>
''' According to my expectations, the three functions in this code
below should all produce the same result. I've been stunned to
discover that this is not the case. '''

def test1():
    for i in range(5):
        def call():
            return i
        yield call

def test2():
    all = []
    for i in range(5):
        def call():
            return i
        all.append(call)
    return all

def test3():
    def MakeCall(i):
        def call():
            return i
        return call
    all = []
    for i in range(5):
        all.append(MakeCall(i))
    return all

print()
for test in [test1, test2, test3]:
    print(test.__name__, ':', [f() for f in test()])

expected_output = '''
test1 : [0, 1, 2, 3, 4]
test2 : [0, 1, 2, 3, 4]
test3 : [0, 1, 2, 3, 4]
'''

actual_output = '''
test1 : [0, 1, 2, 3, 4]
test2 : [4, 4, 4, 4, 4]   # <= this is the stunning thing !!!
test3 : [0, 1, 2, 3, 4]
'''
</pre>
I've been using closures in all kinds of situations, and until recently, they always behaved in line with my expectations. The above result really stunned me... What the heck was going on here??? It seems that all closure instances constructed within test2 close over the value of i as it is at the end of the loop, regardless of their construction time and context. Now THAT was definitely NOT what I expected...
According to my understanding of closures, I would expect that they are bound to a kind of snapshot of their lexical environment at the moment they are defined. I.e., for the code below I would expect f(3) to return a function (a closure) that always returns 7. <pre> def f(x): x2 = 2 * x + 1 def g(): return x2 return g </pre> This seems quite natural to me, as this is in congruence with my general understanding of how interpreters work:
Each time the interpreter executes f and encounters def g, it creates a new function object from the compiled body of g and binds it to the local name 'g'. At the same time, since the body of g references x2, the value of x2 is not discarded by the garbage collector even after the scope of f is left. It remains intact as long as g - the only instance to reference it - lives. So far, closures seem a natural consequence of how code is interpreted and compiled.
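This much is in fact observable in CPython: the closure carries its free variable in a cell object on the function's __closure__ attribute. A small sketch of the f/g example above (the variable h is my own name for the returned closure):
<pre>
def f(x):
    x2 = 2 * x + 1
    def g():
        return x2
    return g

h = f(3)
print(h())  # 7 -- x2 is still alive after f has returned
# CPython exposes the captured variable as a cell object:
print(h.__closure__[0].cell_contents)  # 7
</pre>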
However, it turned out - and I learned it the hard way - that things are not that simple.
As it turns out, a closure is not bound to a snapshot of its lexical environment at all. It closes over the variables themselves, not over their values: a free variable like i is looked up at the moment the closure is called, not at the moment it is defined. All the closures created inside test2 therefore share one and the same variable i, which holds 4 by the time any of them runs. test1 only appears to behave differently because each yielded closure is called before the loop advances i again. Wrapping up the construction of the closure in a "closure factory function", as exemplified in test3, gives each closure its own variable i - the parameter of MakeCall - and thus restores the expected outcome.
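That closures share variables rather than copy values can be shown without any loop at all. This sketch (the names make_counter, bump and current are my own, not from the test code above) creates two closures over the same local variable; updates made through one are visible through the other:
<pre>
def make_counter():
    n = 0
    def bump():
        nonlocal n  # both closures refer to the very same variable
        n += 1
    def current():
        return n
    return bump, current

bump, current = make_counter()
bump()
bump()
print(current())  # 2 -- current() sees updates made through bump()
</pre>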
As far as I can tell, this behavior is by design rather than an over-optimization: Python's scoping rules define late binding for free variables. But be it intentional or not, the matter is worth rethinking, because one thing is certain:
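Besides the factory function of test3, a widely used idiom freezes the loop variable with a default argument, since default values are evaluated once, at definition time. A sketch mirroring test2 (test2_fixed is my own name):
<pre>
def test2_fixed():
    fns = []
    for i in range(5):
        def call(i=i):  # default is evaluated now, capturing the current i
            return i
        fns.append(call)
    return fns

print([f() for f in test2_fixed()])  # [0, 1, 2, 3, 4]
</pre>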
If wrapping up a piece of code into a function - a broadly used refactoring known as "Extract Method" - modifies the behavior of the code, then that refactoring is no longer behavior-preserving. That is, it is not a refactoring at all...
Well, I don't know about you, but I don't like the taste of that.
Cheers and happy pitfall avoiding - to all of us.