Generator with lookahead « Python recipes

Python generators are a great way of reducing memory usage due to the lazy (on demand) generation of values. However, when the process-time of generating such a value is relatively high, we can improve performance even more by obtaining the next n values of the generator in a separate thread in the background. Hence, the BackgroundGenerator.

      import Queue
import threading

class BackgroundGenerator(threading.Thread):
    def __init__(self, generator, lookahead=10):
        threading.Thread.__init__(self)
        self.queue = Queue.Queue(lookahead)
        self.generator = generator
        self.daemon = True
        self.start()

    def __iter__(self):
        return self

    def run(self):
        for item in self.generator:
            self.queue.put(item)
        self.queue.put(None)

    def next(self):
            next_item = self.queue.get()
            if next_item is None:
                 raise StopIteration
            return next_item



def test():
    import time

    def processing_releasing_GIL(process_time):
        # simulating processing that releases the Global Interpreter Lock
        # also see: https://wiki.python.org/moin/GlobalInterpreterLock
        time.sleep(process_time)

    def processing_no_release_GIL(n):
        i = 0
        while i < n:
           i += 1

    def generator_count_up_release_GIL(count, process_time):
        x = 1
        while x <= count:
            yield x
            processing_releasing_GIL(process_time)
            x += 1

    def generator_count_up_not_releasing_GIL(count, n):
        x = 1
        while x <= count:
            yield x
            processing_no_release_GIL(n)
            x += 1

    print "Testing runtime of a generator that releases the GIL."
    n = 10
    # test runtime without backgroundGenerator
    start = time.time()
    gen1 = generator_count_up_release_GIL(n, 0.1)
    counter = 0
    for i in gen1:
        counter = i
        processing_releasing_GIL(0.1)
    print "- no BackgroundGenerator:  ", counter, time.time() - start

    # test runtime WITH backgroundGenerator
    start = time.time()
    gen2 = BackgroundGenerator(generator_count_up_release_GIL(n, 0.1))
    counter = 0
    for i in gen2:
        counter = i
        processing_releasing_GIL(0.1)
    print "- with BackgroundGenerator:", counter, time.time() - start


    print "Testing runtime of a generator that does not release the GIL."
    iterations = 2000000
    # test runtime without backgroundGenerator
    start = time.time()
    gen1 = generator_count_up_not_releasing_GIL(n, iterations)
    counter = 0
    for i in gen1:
        counter = i
        processing_no_release_GIL(iterations)
    print "- no BackgroundGenerator:  ", counter, time.time() - start

    # test runtime WITH backgroundGenerator
    start = time.time()
    gen2 = BackgroundGenerator(generator_count_up_not_releasing_GIL(n, iterations))
    counter = 0
    for i in gen2:
        counter = i
        processing_no_release_GIL(iterations)
    print "- with BackgroundGenerator:", counter, time.time() - start


if __name__=="__main__":
    test()

      

The BackgroundGenerator produces a new generator from an existing generator by calling: newGenerator = BackgroundGenerator(oldGenerator)

In effect it prefetches the next 10 values (by default) produced by the generator in a separate thread. Note that this only improves performance if the generator itself releases the GIL. A typical example is a generator that opens a file, extracts some content, closes the file and yields the content. Since I/O releases the GIL, huge performance boost can be achieved.

When the GIL is not released in the wrapped generator, performance may actually decrease as demonstrated by http://www.dabeaz.com/python/UnderstandingGIL.pdf presentation.

Both scenario's are demonstrated in the test() method.

Tags: background, generator, generators, lookahead, performance, python

◄	Python recipes (4591)	►
◄	Rutger Saalmink's recipes (4)	►

Generator with lookahead (Python recipe) by Rutger Saalmink
ActiveState Code (http://code.activestate.com/recipes/578671/)

Tags

Required Modules

Other Information and Tasks

Accounts

Code Recipes

Feedback & Information

ActiveState

Generator with lookahead (Python recipe) by Rutger Saalmink ActiveState Code (http://code.activestate.com/recipes/578671/)