
Function to enable attribute access for generator instances. Simplifies data sharing for advanced uses of generators and provides much of the functionality sought by PEP 288.

Most uses of generators have no need for data sharing. This recipe is for the few tough cases which can be written more elegantly when attribute access is enabled.

Python, 56 lines
#### The recipe

from __future__ import generators
from inspect import getargspec, formatargspec

_redefinition = """
_redef_tmp = %(name)s
def %(name)s%(oldargs)s:
    wrapped = type('_GeneratorWrapper', (object,), %(name)s._realgen.__dict__)()
    wrapped.__iter__ = lambda self: self
    wrapped.next = %(name)s._realgen%(newargs)s.next
    return wrapped
%(name)s.__doc__ = _redef_tmp.__doc__
%(name)s._realgen = _redef_tmp
del _redef_tmp
"""

def enableAttributes(genfunc):
    """Wrapper for generators to enable classlike attribute access.

    The generator definition should specify 'self' as the first parameter.
    Calls to a wrapped generator should ignore the self parameter.
    """
    old = getargspec(genfunc)
    old[0].pop(0)
    new = getargspec(genfunc)
    new[0][0] = 'wrapped'
    specs = {'name': genfunc.func_name,
             'oldargs': formatargspec(*old),
             'newargs': formatargspec(*new)}
    exec(_redefinition % specs, genfunc.func_globals)


#### A minimal, complete example

def outputCaps(self, logfile):
    """Convert to uppercase and emit to stdout and logfile"""
    self.lineno = 1
    while True:
        logfile.write(self.line.upper())
        print self.prefix, self.line.upper(),
        yield None
        self.lineno += 1
outputCaps.prefix = 'Capitalized:'     # Make a class var style default
enableAttributes(outputCaps)           # Wrap the generator in a class

g = outputCaps(open('destfil.txt','w'))
for line in open('sourcefil.txt'):
    g.line = line.rstrip()  # Data can be passed into the generator
    g.next()
    print g.lineno          # Generators can also update the attributes

print dir(g)                # Still has __iter__() and next()
print outputCaps.__doc__    # Docstrings survive wrapping
print g.prefix              # Gen attributes carry through to instances
help(outputCaps)            # PyDoc produces an accurate help screen

Generators simplify writing iterators which produce data as needed rather than all at once. The yield keyword freezes execution state, eliminating the need for instance variables and progress flags. As well, generators automatically create the __iter__() and next() methods for the iterator interface.
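
A minimal Python 3 illustration of those points (in Python 3 spelling, where next() became __next__() and print is a function):

```python
def countdown(n):
    # Execution freezes at each yield; n carries the state with no
    # instance variables or progress flags.
    while n > 0:
        yield n
        n -= 1

g = countdown(3)
print(iter(g) is g)   # True: the iterator interface comes for free
print(list(g))        # [3, 2, 1]
```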

Generators would also be useful for writing complex data consumer routines. Again, the ability to freeze and restart eliminates the code to explicitly save and restore the execution state between calls. However, it is difficult to cleanly pass data into consumers written using generators.
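
This recipe predates PEP 342; from Python 2.5 on, generators grew a send() method that covers the same consumer use case. A small Python 3 sketch (the names here are illustrative, not from the recipe):

```python
def capitalizer(sink):
    # A consumer: execution freezes at 'yield' until send() delivers a line.
    while True:
        line = yield
        sink.append(line.upper())

out = []
c = capitalizer(out)
next(c)                        # prime the consumer to its first yield
for line in ["alpha", "beta"]:
    c.send(line)
print(out)                     # ['ALPHA', 'BETA']
```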

This recipe provides a simple way for generator instances to read and write attributes just like their class-based counterparts. To use the recipe, add a “self” parameter to the beginning of a generator definition. Then, call enableAttributes() to turn on attribute sharing.

After that, use attributes the same way you would with classes and instances. Writing complex consumers becomes trivially easy. Likewise, it simplifies implementation and increases the capabilities of other cookbook recipes using generators to create cooperative multitasking, co-routines, tasklets, continuations, finite state machines, and other wonders.

IMPLEMENTATION NOTES

When called, a generator definition creates a generator instance without attribute access. The enableAttributes routine wraps and replaces the generator definition with a new factory function that creates an attribute enabled class instance.

enableAttributes goes to great lengths to make sure the new factory function resembles the original generator (same name, same doc string, preserved function attributes, and an internal structure allowing autocompleters and pydoc to find the correct arguments).

When run, the factory function creates a generator instance using the original generator definition. The instance is embedded in a shell class instance whose function is to store attributes and to forward .next() calls to the generator instance.
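
In modern Python the same shell idea can be written directly as a small wrapper class. This is a sketch of the mechanism only, not the recipe's exec-based implementation, and the names are invented:

```python
class AttributeGenerator:
    """Shell instance: stores attributes and forwards iteration to the
    embedded generator, passing itself in as 'self'."""
    def __init__(self, genfunc, *args, **kwargs):
        self._it = genfunc(self, *args, **kwargs)
    def __iter__(self):
        return self
    def __next__(self):
        return next(self._it)

def numbered(self, items):
    self.count = 0
    for item in items:
        self.count += 1
        yield (self.count, item)

g = AttributeGenerator(numbered, ["a", "b"])
print(list(g))    # [(1, 'a'), (2, 'b')]
print(g.count)    # 2 -- attributes written inside the generator are visible
```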

UNDERSTANDING THE DETAILED WORKINGS OF THE EXAMPLE

The key to understanding enableAttributes() is to study what is passed to exec() in the code sample:

_redef_tmp = outputCaps                                                      #1
def outputCaps(logfile):                                                     #2
    wrapped = type('_GeneratorWrapper', (object,), outputCaps._realgen.__dict__)()  #3
    wrapped.__iter__ = lambda self: self                                     #4
    wrapped.next = outputCaps._realgen(wrapped, logfile).next                #5
    return wrapped
outputCaps.__doc__ = _redef_tmp.__doc__                                      #6
outputCaps._realgen = _redef_tmp                                             #7
del _redef_tmp                                                               #8

1 the original generator is saved.

2 a new factory function is defined in its place (note, the parameter list excludes self so that pydoc will show the generator correctly).

3 'wrapped' is an instance of an empty class that shares the same function dictionary as the original generator (this enables behavior like class variables).

4 to support the iterator protocol and match generator behavior, the instance is given an __iter__ method which returns self.

5 the original generator is called and the resulting generator-iterator is used to handle calls to the next() method.

6 the new factory function gets a docstring from the original generator definition.

7 the reference to the original generator definition is stored in the new factory function's attribute dictionary. this completes the wrapping.

8 the temporary variable is cleaned from the workspace. this is important because exec() runs the redefinition code with the globals from the workspace of the original generator definition. that workspace is left unchanged except for repointing the generator reference to the new factory function.

6 comments

Alex Naanou 21 years, 2 months ago

a class version with almost the same semantics..... but quite a bit more flexible!!!

# would it not be easier to do the following:

class Generator(object):
    '''
    this is a generator class.
    '''
    def __call__(self):
        self.counter = 0
        while self.counter < 10:
            self.counter += 1
            yield self.counter

gen = Generator()

for i in gen():
    print i

# yes this adds the extra step of class instantiation but it removes
# the need for outside-class parameter definition. and the advantage
# is that this method allows for transparent way to create not only
# parameters but special methods to manipulate internal generator
# state.


# Imagine a singleton generator:

class SingletonGenerator(object):
    '''
    this generator singleton class counts to ten and then resets.
    '''
    __state = 0
    def __call__(self):
        while self.__state < 10:
            self.__state += 1
            yield self.__state
        self.__state = 0
    def reset(self):
        self.__state = 0
    def __getattr__(self, name):
        return getattr(self.__class__, name)
    def __setattr__(self, name, val):
        return setattr(self.__class__, name, val)

# usage:

gen_a = SingletonGenerator()
gen_b = SingletonGenerator()

for i, j in zip(gen_a(), gen_b()):
    print i, j,

## output:
# 1 2 3 4 5 6 7 8 9 10
Alex Naanou 21 years, 2 months ago

Some might call what was done in the previous comment "the Borg" pattern. I see them (Borg and singleton) as one and the same; they differ only in the side you choose to look from...
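
For readers unfamiliar with it, the Borg pattern shares state by sharing a single __dict__ rather than a single instance; a common minimal sketch (not from the comment above):

```python
class Borg:
    _shared_state = {}             # one dict backs every instance
    def __init__(self):
        self.__dict__ = self._shared_state

a, b = Borg(), Borg()
a.x = 1
print(b.x)        # 1: state is shared across instances
print(a is b)     # False: identities differ, unlike a true singleton
```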

Christopher Dunn 18 years, 10 months ago

A comparative example is found in 'Hackers and Painters'. Anyone interested in this recipe might also like to read a recent book, "Hackers and Painters: Big Ideas from the Computer Age", by Paul Graham (2004).

Though the book is mostly a non-technical discussion of programming principles and practices (generally lionizing Lisp) there is an appendix (pp. 195-99) which compares the __call__ version above to even simpler solutions in other languages.

Though his example is just a generated function with maintained state, rather than a "generator" (ie a function with a "yield" call), it explains exactly why some people appreciate other, more powerful languages.

I disagree with him: The Python example is only slightly more complicated, and it is vastly easier for a non-expert user to comprehend. The implicitly lexical variables do not exactly aid readability.

However, Graham is not unkind to Python, and this book is definitely worth reading for anyone with an interest in the utility of programming languages.

I'm pointing that out because this recipe has caused some big ideas to gel in my mind. After studying this complicated recipe, I suddenly "get" what Python is doing with functions. Yes, the version in the comment is definitely the way I would code it in practice, but I'm glad to see the complex one.

In case you don't have the book, here are his examples. Python first:

class foo:
  def __init__(self, n): self.n = n
  def __call__(self, i): self.n += i; return self.n

Obviously, this could yield self.n instead.

The Java solution is a pain, but in Javascript:

function foo(n) {
  return function(i) {
    return n += i }}

Smalltalk has strange syntax, but at least has the transparency of the Python version:

foo: n
  |s|
  s := n.
  ^[:i| s := s+i. ]

Perl has an implicit return statement:

sub foo {
  my ($n) = @_;
  sub {$n += shift}
}

Ruby is the clearest, but with strange delimiters:

def foo (n)
  lambda {|i| n += i} end

And finally, Lisp is easily the least cluttered:

(defun foo(n)
  (lambda (i) (incf n i)))

Graham claims that this cannot be done in C++, but of course C++ has its own analogue of __call__, operator():

struct foo {
  int n;
  foo(int N): n(N) {}
  int operator()(int i) {
    return n += i;
  }
};

// BUT BEWARE:
#include <iostream>
int main() {
  foo f(7);
  foo g(30);
  std::cerr << f(2)  << ' ' << g(2) << ' ' << f(2)
            << ' ' << g(2)  << " hike!" << std::endl;
}
// Prints: 11 34 9 32 hike!
// The stream-operator associates left-to-right,
// but its operands were evaluated right to left!

and this demonstrates the inherent dangers of this sort of programming.

(comment continued...)

Christopher Dunn 18 years, 10 months ago

(...continued from previous comment)

The author also lists 2 other Python solutions which work, and 2 which do not work:

def foo(n):
  class accumulator:
    def __init__(self, s):
      self.s = s
    def inc(self, i):
      self.s += i
      return self.s
  return accumulator(n).inc

# and

def foo(n):
  s = [n] # The list holds a copy of n
  def bar(i):
    s[0] += i
    return s[0]
  return bar

And these 2 do not work:

def foo(n):
  return lambda i: return n += i

# nor

def foo(n):
  lambda i: n += i

The author guesses that Python will one day allow something like these, but only the Python class-based version allows easy access to n. So in fact, what to some is a flaw in Python can be a feature to others.
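
Graham's guess came true: Python 2.1's nested scopes were read-only, but Python 3.0 added the nonlocal statement (PEP 3104), which makes the closure version work directly:

```python
def foo(n):
    def bar(i):
        nonlocal n    # rebinding the enclosing n is now legal
        n += i
        return n
    return bar

acc = foo(7)
print(acc(2), acc(2))   # 9 11
```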

I hope somebody learns as much from these examples as I have. Functional programming suddenly makes much more sense to me.

Christopher Dunn 18 years, 10 months ago

Graham's examples are not 'statically typed'. The C++ and Java versions work only on integers, which is what Graham meant to point out. But a template is possible. Anyway, the point of the present recipe is to have access into the function, and the Python version is quite nice that way.

Flávio Codeço Coelho 18 years, 10 months ago

no comments. very nice recipe!