
This is a basic re-implementation of fileinput using generators. It supports the core methods of the library module (nextfile(), lineno(), filelineno(), close(), and filename()) and adds an __iter__() method that is a generator.

Python, 119 lines
"""Provides a class called LineIterator that iterates over every line in a list of files.

Performs the same basic function as the fileinput module in the standard
library (which mimics Perl's <> operator): a simple and methodical way to
iterate over every line in a list of files.  If no files are specified at
instance creation, then sys.argv[1:] is used.  If that is empty, then
sys.stdin is used.  sys.stdin can be specified in the list of files by
listing '-' as a file.

Compared to the fileinput module, in-place editing and the accompanying
backup files are not supported.  The module-level functions that fileinput
provides are left out here for space.  readline() has also been left out
since the generator can just have its .next() method called.

Dedicated to my grandmother, Mary Alice Renshaw (1911/12/13-2002/01/13).

"""
# The docstring comes first so that it is picked up as __doc__; a __future__
# import may only be preceded by the docstring, comments and blank lines.
from __future__ import generators
import sys


__author__ = 'Brett Cannon'
__email__ = 'drifty@bigfoot.com'
__version__ = '1.0'


class LineIterator(object):
    """Basic reimplementation of fileinput.FileInput using generators.

    Passed-in files are iterated over (file by file, line by line), and
    each line is returned by the iterator.  Use of sys.stdin is specified
    by '-'.  If no files are specified, sys.argv[1:] is used.  If that is
    empty, sys.stdin is used.

    Flags values (| values together; set by various methods):
    1 :: Close current file
    2 :: End generator (thus 3 closes the current file and ends the
         generator)

    """

    def __init__(self, file_list=None):
        """Set all instance variables.

        If no files are specified (either passed in or from sys.argv[1:]),
        then sys.stdin is used by making the only file '-'.

        """
        # Look up sys.argv when the instance is created rather than
        # freezing its value at the time the class is defined.
        if file_list is None:
            file_list = sys.argv[1:]
        if file_list:
            if isinstance(file_list, str):
                # A single file name was passed instead of a list.
                self.__file_list = [file_list]
            else:
                self.__file_list = file_list
        else:
            self.__file_list = ['-']
        self.__current_file = ''
        self.__relative_cnt = 0
        self.__absolute_cnt = 0
        self.__flags = 0

    def nextfile(self):
        """Starts looping over next file."""
        # OR the bit in so that an earlier close() request is not lost.
        self.__flags |= 1

    def close(self):
        """Ends the generator.

        Does this by setting __flags to both close the current file and
        end the generator.

        """
        self.__flags = 3

    def filename(self):
        """Returns name of current file."""
        if self.__current_file == '-':
            return 'sys.stdin'
        else:
            return self.__current_file

    def lineno(self):
        """Returns cumulative line total thus far."""
        return self.__absolute_cnt

    def filelineno(self):
        """Returns line total for the current file thus far."""
        return self.__relative_cnt

    def __iter__(self):
        """Generator for looping over every line of every file in the file list.

        If __flags has bit 2 set, end the generator.  Otherwise clear any
        pending bit 1, check whether '-' (read: sys.stdin) is the next
        file, set __relative_cnt to 0 and start iterating.

        If __flags has bit 1 set, close the current file and break the
        loop.  Otherwise update the current file name if it has changed,
        increment both __relative_cnt and __absolute_cnt, and yield the line.

        """
        for file_location in self.__file_list:
            if self.__flags & 2:
                return
            # Clear any pending "close current file" request so that a
            # nextfile() call skips only the file it was issued for.
            self.__flags &= ~1
            if file_location == '-':
                FILE = sys.stdin
            else:
                FILE = open(file_location, 'r')
            self.__relative_cnt = 0
            for line in FILE:
                if self.__flags & 1:
                    FILE.close()
                    break
                if file_location != self.__current_file:
                    self.__current_file = file_location
                self.__relative_cnt += 1
                self.__absolute_cnt += 1
                yield line
            else:
                # The file was read to the end without interruption.
                FILE.close()

fileinput itself simulates Perl's <>. This object simplifies line-by-line text processing over a list of files (usually taken from sys.argv[1:]) or over sys.stdin.
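For comparison, the same line-by-line pattern with the standard library's fileinput module might look like the sketch below. The helper name numbered_lines and the two throwaway files are assumptions made up for illustration; they are not part of the recipe.

```python
import fileinput
import os
import tempfile


def numbered_lines(paths):
    """Return (file name, line number in file, overall line number, text).

    numbered_lines is a hypothetical helper used only for this example.
    """
    rows = []
    with fileinput.input(files=paths) as stream:
        for line in stream:
            rows.append((os.path.basename(stream.filename()),
                         stream.filelineno(),
                         stream.lineno(),
                         line.rstrip("\n")))
    return rows


# Build two small throwaway files so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
paths = []
for name, text in (("a.txt", "one\ntwo\n"), ("b.txt", "three\n")):
    path = os.path.join(tmp_dir, name)
    with open(path, "w") as handle:
        handle.write(text)
    paths.append(path)

rows = numbered_lines(paths)
for row in rows:
    print(row)
```

filelineno() restarts at 1 for each file while lineno() keeps counting across files, which is the same distinction the recipe tracks with its relative and absolute counters.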

There is likely to be a performance difference between this implementation and the one in the standard library. It is also a nice way of showing how to use generators when you need to modify their output between calls to next().
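The idea of steering a generator between calls to next() can be shown in isolation. The following is a minimal sketch, not the recipe's code: the class SkippableGroups and its skip_group() method are hypothetical names, and plain lists stand in for files.

```python
class SkippableGroups(object):
    """Iterate over the items of several lists, one list after another.

    Hypothetical class for illustration: skip_group() plays the role
    that nextfile() plays in the recipe.
    """

    def __init__(self, groups):
        self._groups = groups
        self._skip = False

    def skip_group(self):
        """Ask the generator to drop the rest of the current group."""
        self._skip = True

    def __iter__(self):
        for group in self._groups:
            # A skip request applies only to the group it was issued for.
            self._skip = False
            for item in group:
                if self._skip:
                    break
                yield item


groups = SkippableGroups([[1, 2, 3], [10, 20], [100]])
seen = []
for value in groups:
    seen.append(value)
    if value == 1:
        groups.skip_group()  # takes effect when the generator resumes
print(seen)
```

The generator only notices the request when it resumes after the yield, so the flag has to live on the object that both the caller and the generator can reach.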

1 comment

Éric Araujo 14 years, 1 month ago

Hello

How would this recipe benefit from Python’s new io module?

Regards