
Yet another way to read a file line by line, starting at the end.

Python, 54 lines
#!/usr/bin/env python
# -*-mode: python; coding: iso-8859-1 -*-
#
# Copyright (c) Peter Astrand <astrand@cendio.se>

import os
import string

class BackwardsReader:
    """Read a file line by line, backwards"""
    BLKSIZE = 4096

    def readline(self):
        while 1:
            newline_pos = string.rfind(self.buf, "\n")
            pos = self.file.tell()
            if newline_pos != -1:
                # Found a newline
                line = self.buf[newline_pos+1:]
                self.buf = self.buf[:newline_pos]
                if pos != 0 or newline_pos != 0 or self.trailing_newline:
                    line += "\n"
                return line
            else:
                if pos == 0:
                    # Start-of-file
                    return ""
                else:
                    # Need to fill buffer
                    toread = min(self.BLKSIZE, pos)
                    self.file.seek(-toread, 1)
                    self.buf = self.file.read(toread) + self.buf
                    self.file.seek(-toread, 1)
                    if pos - toread == 0:
                        self.buf = "\n" + self.buf

    def __init__(self, file):
        self.file = file
        self.buf = ""
        self.file.seek(-1, 2)
        self.trailing_newline = 0
        lastchar = self.file.read(1)
        if lastchar == "\n":
            self.trailing_newline = 1
            self.file.seek(-1, 2)

# Example usage
br = BackwardsReader(open('bar'))

while 1:
    line = br.readline()
    if not line:
        break
    print repr(line)

I know there are several recipes already, but I didn't like them, so I wrote my own implementation.
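
For anyone on Python 3, the same block-buffer idea can be sketched against a file opened in binary mode. This is a minimal sketch, not the author's code: the name `read_lines_backwards` and the `blksize` parameter are made up for illustration, and it assumes a seekable binary file object with `\n` line endings.

```python
import io


def read_lines_backwards(f, blksize=4096):
    """Yield the lines of binary file object `f`, last line first.

    Hypothetical helper for illustration only. Each line keeps the
    b"\n" that terminated it in the file, if it had one.
    """
    f.seek(0, io.SEEK_END)
    pos = f.tell()          # offset of the first byte already buffered
    buf = b""
    while True:
        # Search for a newline that is not the buffer's final byte: a final
        # newline terminates the line we are about to yield.
        cut = buf.rfind(b"\n", 0, len(buf) - 1) if buf else -1
        if cut != -1:
            # Everything after that newline is one complete line.
            yield buf[cut + 1:]
            buf = buf[:cut + 1]
        elif pos > 0:
            # No complete line buffered yet: prepend the preceding block.
            toread = min(blksize, pos)
            pos -= toread
            f.seek(pos)
            buf = f.read(toread) + buf
        else:
            # Start of file: whatever is left is the first line.
            if buf:
                yield buf
            return
```

With this sketch, a final line that has no newline in the file comes back without one, and an empty file yields nothing.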

3 comments

Raymond Hettinger 15 years, 11 months ago

Simplifying code transformations.

  • Converted to a generator

  • Use string methods instead of string module

  • trailing_newline set with a single test

  • Nested if-statements collapsed to if/elif/else

  • Replace var!=0 comparison with simple boolean test

  • Import of os module was unused

    def BackwardsReader(file, BLKSIZE = 4096):
        """Read a file line by line, backwards"""

        buf = ""
        file.seek(-1, 2)
        lastchar = file.read(1)
        trailing_newline = (lastchar == "\n")
        if trailing_newline:
            # Step back over the final newline, as the original __init__
            # does; otherwise a spurious "\n" line is yielded first
            file.seek(-1, 2)

        while 1:
            newline_pos = buf.rfind("\n")
            pos = file.tell()
            if newline_pos != -1:
                # Found a newline
                line = buf[newline_pos+1:]
                buf = buf[:newline_pos]
                if pos or newline_pos or trailing_newline:
                    line += "\n"
                yield line
            elif pos:
                # Need to fill buffer
                toread = min(BLKSIZE, pos)
                file.seek(-toread, 1)
                buf = file.read(toread) + buf
                file.seek(-toread, 1)
                if pos == toread:
                    buf = "\n" + buf
            else:
                # Start-of-file
                return

    # Example usage
    for line in BackwardsReader(open('brent.txt')):
        print repr(line)

Kevin German 11 years, 1 month ago

Minor change to Hettinger's post to get around limitations with io.seek in Python v3.1. The above fails with:

IOError: can't do nonzero end-relative seeks at
5: file.seek(-1, 2)

and

IOError: can't do nonzero cur-relative seeks at
34: file.seek(-toread, 1)
36: file.seek(-toread, 1)

<code>

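def reverseReadFile(file, BLKSIZE = 4096):
    """Read a file line by line, backwards"""
    buf = ""

    if not file.seekable():
        return

    # Text files reject nonzero end-relative and cur-relative seeks, so
    # find the end offset and use absolute seeks from here on.
    file.seek(0, 2)
    end = file.tell()
    if not end:
        # Empty file
        return
    # Reading at the end offset itself returns "", so step back one first
    file.seek(end - 1, 0)
    lastchar = file.read(1)
    trailing_newline = (lastchar == "\n")
    if trailing_newline:
        # Leave the final newline out of the buffer
        file.seek(end - 1, 0)

    while 1:
        newline_pos = buf.rfind("\n")
        pos = file.tell()
        if newline_pos != -1:
            # Found a newline
            line = buf[newline_pos+1:]
            buf = buf[:newline_pos]
            if pos or newline_pos or trailing_newline:
                line += "\n"
            yield line
        elif pos:
            # Need to fill buffer
            toread = min(BLKSIZE, pos)
            file.seek(pos-toread, 0)
            buf = file.read(toread) + buf
            file.seek(pos-toread, 0)
            if pos == toread:
                buf = "\n" + buf
        else:
            # Start-of-file
            return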

</code>
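
The errors quoted above come from text-mode files, which in Python 3 only accept seeks relative to the start of the file (plus zero-offset seeks from the current position or the end). A small sketch of the difference, assuming an ASCII sample file written to a temporary path; the variable names are illustrative only:

```python
import io
import os
import tempfile

# Write a small ASCII sample file to experiment with (made-up content).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w", newline="") as f:
    f.write("first\nsecond\n")

# Text mode: a nonzero end-relative seek is rejected.
with open(path, "r") as f:
    try:
        f.seek(-1, os.SEEK_END)
        text_seek_ok = True
    except io.UnsupportedOperation:
        text_seek_ok = False
    # A zero-offset seek to the end is allowed, and so are absolute seeks.
    # tell() values are opaque in text mode; doing arithmetic on them is an
    # assumption that holds for this ASCII sample.
    f.seek(0, os.SEEK_END)
    end = f.tell()
    f.seek(end - 1)
    last_text_char = f.read(1)

# Binary mode: end-relative seeks work as they did in Python 2.
with open(path, "rb") as f:
    f.seek(-1, os.SEEK_END)
    last_byte = f.read(1)

os.remove(path)
```

Opening the file in binary mode sidesteps the restriction, at the cost of yielding bytes rather than str.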

Elcimar Leandro 10 years, 8 months ago

Thank you Peter, Raymond and Kevin!

Created by Peter Astrand on Thu, 11 Aug 2005 (PSF)