simple readlines in reverse w/deque (Python recipe) by John Nielsen
ActiveState Code (http://code.activestate.com/recipes/496941/)

This is a very simple implementation of readlines in reverse. It is not optimized for performance (which often doesn't matter). A second version is faster because it loads blocks of data into memory instead of reading the file character by character; of course, the code then almost doubles in size. Finally, a third version is the fastest, using split.

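import collections
import io


def rev_readlines(arg):
    """Yield the lines of file `arg` last-to-first, reading one byte at a time.

    Lines are yielded as bytes with surrounding whitespace stripped.
    A trailing newline in the file produces an empty first item.
    """
    with open(arg, 'rb') as f:
        f.seek(0, 2)  # go to the end
        line = collections.deque()
        while f.tell():
            f.seek(-1, 1)  # step back one byte
            c = f.read(1)  # read it (this moves forward again)
            f.seek(-1, 1)  # step back once more
            line.appendleft(c)
            if c == b'\n':
                yield b''.join(line).strip()
                line.clear()  # clear for next line
        yield b''.join(line).strip()


# A bit of optimization: load groups of bytes from disk into memory
def rev_readlines2(arg, bufsize=8192):
    with open(arg, 'rb') as f1:
        f1.seek(0, 2)  # go to the end
        leftover = b''
        while f1.tell():
            if f1.tell() < bufsize:
                bufsize = f1.tell()
            f1.seek(-bufsize, 1)
            in_memory = f1.read(bufsize) + leftover
            f1.seek(-bufsize, 1)
            buf = io.BytesIO(in_memory)
            buf.seek(0, 2)  # go to the end
            line = collections.deque()
            while buf.tell():
                buf.seek(-1, 1)
                c = buf.read(1)
                buf.seek(-1, 1)
                line.appendleft(c)
                if c == b'\n':
                    yield b''.join(line).strip()
                    line.clear()
            # Do not strip here: the rest of this line is still in the
            # previous block, so whitespace at the cut must be kept.
            leftover = b''.join(line)
        yield leftover.strip()


# A different approach, and much faster
def rev_readlines3(arg, bufsize=8192):
    with open(arg, 'rb') as f1:
        f1.seek(0, 2)  # go to the end
        leftover = b''
        while f1.tell():
            if f1.tell() < bufsize:
                bufsize = f1.tell()
            f1.seek(-bufsize, 1)
            in_memory = f1.read(bufsize) + leftover
            f1.seek(-bufsize, 1)
            lines = in_memory.split(b'\n')
            # lines[0] may be incomplete, so carry it over to the next block
            for i in reversed(lines[1:]):
                yield i
            leftover = lines[0]
        yield leftover


filename = 'data.txt'  # hypothetical path; replace with your own file
for i in rev_readlines(filename):
    print(i.decode())  # lines are bytes because the file is opened in binary mode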

      

The first two methods go through the data in reverse, character by character, and use a deque to rebuild each line until a newline is found. Using a deque is clearer (appendleft) and also performs better than inserting at the front of a list, because appendleft takes constant time.
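
To illustrate the deque idea in isolation, here is a minimal sketch that rebuilds a string from characters arriving last-to-first. The helper name `rebuild_reversed` is made up for this example and is not part of the recipe.

```python
import collections


def rebuild_reversed(chars):
    """Rebuild a string from characters that arrive last-to-first."""
    line = collections.deque()
    for c in chars:
        # appendleft is O(1); list.insert(0, c) would shift every element
        line.appendleft(c)
    return ''.join(line)


print(rebuild_reversed(reversed('hello')))  # prints hello
```

Each character read while walking backwards goes on the left, so the deque always holds the text in its original order.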

The third method simply splits each buffer on newlines as it moves backwards through the file, carrying the incomplete first line over to the next buffer.
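
The split-and-carry step can be sketched without any file I/O. The function below is an illustration only: `rev_lines_from_blocks` is a hypothetical name, and it assumes the blocks are already in memory and listed from the end of the text towards the start.

```python
def rev_lines_from_blocks(blocks):
    """Yield lines last-to-first from text blocks given last-to-first."""
    leftover = ''
    for block in blocks:
        lines = (block + leftover).split('\n')
        # lines[0] may be only the tail of a line that starts in an
        # earlier block, so hold it back until that block is joined on
        for line in reversed(lines[1:]):
            yield line
        leftover = lines[0]
    yield leftover


# The text 'one\ntwo\nthree' cut into three blocks, listed from the end
blocks = ['hree', 'wo\nt', 'one\nt']
print(list(rev_lines_from_blocks(blocks)))  # prints ['three', 'two', 'one']
```

Only the first piece of each split is uncertain, which is why everything after it can be yielded immediately.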

Tags: text

Created by John Nielsen on Fri, 4 Aug 2006 (PSF)

Licensed under the PSF License
Revision 5 (updated 17 years ago)
