
The grep() function is inspired by UNIX grep, but is not limited to string patterns and files or streams.

Python, 20 lines
def grep(*matches):
    r"""Returns a generator function that operates on an iterable:
        yields only the items in the iterable that match any of the patterns.

    matches: callables, each returning a true value if it matches the item

    >>> import re
    >>> lines = ["alpha\n", "beta\n", "gamma\n", "delta\n"]
    >>> list(grep(re.compile('b').match)(lines))
    ['beta\n']
    """
    def _do_grep_wrapper(*matches):
        def _do_grep(lines):
            for line in lines:
                for match in matches:
                    if match(line):
                        yield line
                        break
        return _do_grep
    return _do_grep_wrapper(*matches)

grep(*matches) returns a generator function, which can be called with any iterable, such as a list or a file object. Each match is a callable that takes an item and returns a true value if the item matches. An item is included in the generator output if at least one of the matches returns a true value for it; each matching item is yielded once, in input order. The input items do not have to be strings.
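
As a rough illustration of the non-string case, the sketch below filters integers with two predicates. The grep() definition is restated in compact form only so the snippet runs on its own; is_even and is_big are hypothetical names introduced for this example.

```python
def grep(*matches):
    # Compact restatement of the recipe's grep(), so this snippet is self-contained.
    def _do_grep(items):
        for item in items:
            # Yield the item once if any matcher returns a true value.
            if any(match(item) for match in matches):
                yield item
    return _do_grep

# Hypothetical predicates; any callable taking one item will do.
def is_even(n):
    return n % 2 == 0

def is_big(n):
    return n > 7

numbers = [1, 2, 3, 8, 9, 10]
result = list(grep(is_even, is_big)(numbers))
print(result)  # [2, 8, 9, 10]
```

Note that 8 and 10 satisfy both predicates but appear only once in the output.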

To pick lines containing the string "foo" out of an input file:

f = open(...)
grepper = grep(lambda line: "foo" in line) # a function
matched = grepper(f) # a generator object

Match functions can be arbitrarily complex.
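
For instance, a match function can be a bound method of a compiled regular expression, or an ordinary function that parses the item before deciding. The sketch below combines both. It is only an illustration: is_slow_request is a hypothetical helper, the log lines are made-up sample data, and grep() is restated in compact form so the snippet runs standalone.

```python
import re

def grep(*matches):
    # Compact restatement of the recipe's grep(), so this snippet is self-contained.
    def _do_grep(items):
        for item in items:
            if any(match(item) for match in matches):
                yield item
    return _do_grep

def is_slow_request(line):
    # Hypothetical matcher: true when the line ends in a duration above 500 ms.
    found = re.search(r"(\d+)ms$", line.strip())
    return found is not None and int(found.group(1)) > 500

# Made-up sample input; a file object would work the same way.
lines = [
    "GET /index 120ms\n",
    "GET /report 950ms\n",
    "ERROR disk full\n",
    "POST /login 80ms\n",
]

# Keep lines that start with "ERROR" or that report a slow request.
matched = list(grep(re.compile(r"ERROR").match, is_slow_request)(lines))
print(matched)  # ['GET /report 950ms\n', 'ERROR disk full\n']
```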

3 comments

John 10 years, 1 month ago

I can't make this work. How could I see the line from the file? I saved your file as mygrep.py.

import mygrep,sys,os,re

f=['family asdf a asdf \n asdfas asdf asd asdf asdf asd \n asdfasfd'];

grepper=mygrep.grep(lambda line: "family" in line)
matched=grepper(f)

Andy Dustman (author) 10 years, 1 month ago

matched is a generator/iterator, so loop over it or do something like:

print(list(matched))

Additionally, your example only has one item (it's a list with a single string in it) which would be matched. Try something like:

f = 'family asdf a asdf \n asdfas asdf asd asdf asdf asd \n asdfasfd'.split('\n')

It's also possible to do things like this:

sys.stdout.writelines(grep(lambda line: "family" in line)(sys.stdin))

John 10 years, 1 month ago

Thanks Andy.

Created by Andy Dustman on Wed, 5 Mar 2014 (MIT)