
A wrapper class for (a small part of) the 're' module, that enables you to do re.match() or re.search() in an 'if' test or 'elif' test and use the result of the match after the test.

Python, 134 lines
#!/usr/bin/env python
# -*- coding: iso-8859-1 -*-
'''
Using re.match, re.search, and re.group in if ... elif ... elif ... else ...
'''

__author__ = 'Peter Kleiweg'
__version__ = '1.4'
__date__ = '2005/11/16'

import re

class RE:
    '''
    Using re.match, re.search, and re.group in if ... elif ... elif ... else ...
    This is NOT thread safe

    Instance data:

        _pattern : pattern compiled by __init__()

    Global data:

        _match : match object saved by last match() or search()

    Example:

        rePat1 = RE(pattern1)
        rePat2 = RE(pattern2)
        for line in lines:
            if rePat1.search(line):
                grp1 = RE.group(1)
                grpA = RE.group('A')
            elif rePat2.search(line):
                grp2 = RE.group(2)
                grpB = RE.group('B')
    '''

    def __init__(self, pattern, flags=0):
        'do and save re.compile(pattern, flags)'
        self._pattern = re.compile(pattern, flags)

    def match(self, string, pos=0):
        'do, save, and return pattern.match(string, pos); flags are set in __init__()'
        RE._match = self._pattern.match(string, pos)
        return RE._match

    def search(self, string, pos=0):
        'do, save, and return pattern.search(string, pos); flags are set in __init__()'
        RE._match = self._pattern.search(string, pos)
        return RE._match

    @staticmethod
    def group(grp=0):
        'return match_object.group(grp)'
        return RE._match.group(grp)


class SR:
    '''
    Save and return value in bitwise or test
    This is thread safe

    Instance data:

        _ : value saved by __or__()

    Example:

        rePat1 = re.compile(pattern1)
        rePat2 = re.compile(pattern2)
        m = SR()
        for line in lines:
            if m|rePat1.search(line):
                grp1 = m.group(1)
                grpA = m.group('A')
            elif m|rePat2.search(line):
                grp2 = m.group(2)
                grpB = m.group('B')
    '''

    def __or__(self, value):
        'save value as _ and return value'
        self._ = value
        return value

    def group(self, grp=0):
        'return _.group(grp)'
        return self._.group(grp)



if __name__ == '__main__':

    lines = []
    lines.append(' 1     one   ')
    lines.append(' two   2     ')
    lines.append(' three three ')
    lines.append(' 4     4     ')

    reIntStr = RE(r'^\s*(?P<Int>\d+)\s+(?P<Str>\S.*?)\s*$')
    reStrInt = RE(r'^\s*(?P<Str>\S.*?)\s+(?P<Int>\d+)\s*$')
    for line in lines:
        print('>>>', line)
        if reIntStr.search(line):
            print('Int:', RE.group('Int'))
            print('Str:', RE.group('Str'))
            print()
        elif reStrInt.search(line):
            print('Str:', RE.group('Str'))
            print('Int:', RE.group('Int'))
            print()
        else:
            print('*** UNMATCHED ***')
            print()

    print("The same as above, now in a thread safe manner\n")

    reIntStr = re.compile(r'^\s*(?P<Int>\d+)\s+(?P<Str>\S.*?)\s*$')
    reStrInt = re.compile(r'^\s*(?P<Str>\S.*?)\s+(?P<Int>\d+)\s*$')
    m = SR()
    for line in lines:
        print('>>>', line)
        if m|reIntStr.search(line):
            print('Int:', m.group('Int'))
            print('Str:', m.group('Str'))
            print()
        elif m|reStrInt.search(line):
            print('Str:', m.group('Str'))
            print('Int:', m.group('Int'))
            print()
        else:
            print('*** UNMATCHED ***')
            print()

With the 're' module, you can't do this:

    for line in lines:
        if re.search(patternA, line):
            do_something_with_match_data()
        elif re.search(patternB, line):
            do_something_with_match_data()

You can with 'RE'.

'RE' defines methods for only a small set of functions from the 're' module. You can add methods for more functions if you need them.
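
As a rough illustration of adding more methods, groups() and groupdict() can be exposed the same way as group(). This is only a sketch: it repeats a cut-down copy of 'RE' so that it runs by itself, and the two extra static methods (and the `_match = None` default) are additions made for this example, not part of the recipe.

```python
import re


class RE:
    """Cut-down copy of the recipe's wrapper, repeated so the sketch is self-contained."""

    _match = None  # last match object, shared by all instances (assumed default)

    def __init__(self, pattern, flags=0):
        self._pattern = re.compile(pattern, flags)

    def search(self, string):
        RE._match = self._pattern.search(string)
        return RE._match

    @staticmethod
    def group(grp=0):
        return RE._match.group(grp)

    # The two methods below are illustrative additions, not in the original recipe.
    @staticmethod
    def groups():
        """Return all groups of the last match as a tuple."""
        return RE._match.groups()

    @staticmethod
    def groupdict():
        """Return the named groups of the last match as a dict."""
        return RE._match.groupdict()


reIntStr = RE(r'^\s*(?P<Int>\d+)\s+(?P<Str>\S.*?)\s*$')
if reIntStr.search(' 1     one   '):
    print(RE.groups())     # ('1', 'one')
    print(RE.groupdict())  # {'Int': '1', 'Str': 'one'}
```

Any other method of the match object (start(), end(), span(), ...) could be forwarded in the same style.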

Why doesn't the 're' module save the last match object and make it available? That would make this recipe unnecessary. Probably because it wouldn't be thread safe. As written, 'RE' isn't thread safe either. You could make it so by dropping the static method and the shared, class-level match object, but I used both on purpose. This way, it is always clear that you are using the match object of the last search or match. If you had to access the match object through a class instance, you might accidentally use the wrong one, a programming error that is easily overlooked:

    if reIntStr.search(line):
        print('Int:', reIntStr.group('Int'))
    elif reStrInt.search(line):
        print('Str:', reIntStr.group('Str'))  # oops, wrong match object!
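
One possible way to keep the "always the last match" convention and still be thread safe is to store the match in thread-local storage instead of a plain class attribute. The following is a minimal sketch under that assumption, not something the recipe provides; the class name `TRE` and the `worker` helper are made up for illustration.

```python
import re
import threading


class TRE:
    """Like the recipe's RE, but the last match is kept per thread (hypothetical variant)."""

    _local = threading.local()  # each thread sees its own attributes on this object

    def __init__(self, pattern, flags=0):
        self._pattern = re.compile(pattern, flags)

    def search(self, string):
        TRE._local.match = self._pattern.search(string)
        return TRE._local.match

    @staticmethod
    def group(grp=0):
        return TRE._local.match.group(grp)


results = {}
barrier = threading.Barrier(2)


def worker(name, text):
    pat = TRE(r'(\d+)')
    if pat.search(text):
        barrier.wait()  # wait until both threads have done their search
        results[name] = TRE.group(1)  # still this thread's own match


threads = [threading.Thread(target=worker, args=(name, text))
           for name, text in [('a', 'x 11'), ('b', 'y 22')]]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results.items()))  # [('a', '11'), ('b', '22')]
```

With a plain class attribute, the second thread's search could overwrite the first thread's match before it is read; with threading.local() each thread only ever sees the match it stored itself.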


New in version 1.2:

Class 'SR' lets you do the same in a thread-safe manner, as long as each thread uses its own 'SR' instance.

2 comments

Peter Kleiweg (author) 18 years, 5 months ago

Why not assignment in test? Of course, it would all be much simpler if Python syntax would allow this:

if m = re.search(patternA, line):
    match = m.group('A')
elif m = re.search(patternB, line):
    match = m.group('B')

Perhaps there should be a special assignment operator for use in if-tests?

Ian Bicking 18 years, 5 months ago

No = in expressions. In Python it isn't allowed and probably will never be allowed to use assignment in expressions.

There are some other fiddly ways you can do it, maybe like:

class Value(object):
    def __init__(self):
        self.__dict__['_v'] = None
    def set(self, value):
        # store via __dict__ so the __setattr__ below is bypassed,
        # and return the value so set() can be used in an 'if' test
        self.__dict__['_v'] = value
        return value
    def __getattr__(self, attr):
        return getattr(self._v, attr)
    def __setattr__(self, attr, value):
        setattr(self._v, attr, value)

v = Value()
for line in lines:
    if v.set(regex.search(line)):
        print(v.group(1))
        # or if Value is less magic, and more explicit:
        print(v._v.group(1))

These tricks are of dubious value, though; they introduce new idioms that are hard to justify in the long term.
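
For readers on Python 3.8 or later: the language has since gained assignment expressions (the `:=` operator, PEP 572), which cover this use case without a helper class. A minimal sketch, reusing the two patterns from the recipe's demo; the names `pattern_a`, `pattern_b` and `classify` are made up for illustration.

```python
import re

# Same two patterns as in the recipe's demo section.
pattern_a = re.compile(r'^\s*(?P<Int>\d+)\s+(?P<Str>\S.*?)\s*$')
pattern_b = re.compile(r'^\s*(?P<Str>\S.*?)\s+(?P<Int>\d+)\s*$')


def classify(line):
    """Return a tuple describing which pattern matched, or None (hypothetical helper)."""
    if m := pattern_a.search(line):
        return ('int-str', m.group('Int'), m.group('Str'))
    elif m := pattern_b.search(line):
        return ('str-int', m.group('Str'), m.group('Int'))
    return None


for line in [' 1     one   ', ' two   2     ', ' three three ', ' 4     4     ']:
    print(repr(line), '->', classify(line))
```

Because `m` is an ordinary local variable here, this form is also free of the shared-state problem discussed above.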

Created by Peter Kleiweg on Tue, 15 Nov 2005 (PSF)