A wrapper class for (a small part of) the 're' module that lets you call re.match() or re.search() in an 'if' or 'elif' test and still use the result of the match after the test.
#!/usr/bin/env python
# -*- coding: iso-8859-1 -*-
'''
Using re.match, re.search, and re.group in if ... elif ... elif ... else ...
'''
__author__ = 'Peter Kleiweg'
__version__ = '1.4'
__date__ = '2005/11/16'
import re
class RE:
    '''
    Using re.match, re.search, and re.group in if ... elif ... elif ... else ...

    This is NOT thread safe

    Instance data:
        _pattern : pattern compiled by __init__()

    Global data:
        _match : match object saved by last match() or search()

    Example:
        rePat1 = RE(pattern1)
        rePat2 = RE(pattern2)
        for line in lines:
            if rePat1.search(line):
                grp1 = RE.group(1)
                grpA = RE.group('A')
            elif rePat2.search(line):
                grp2 = RE.group(2)
                grpB = RE.group('B')
    '''
    def __init__(self, pattern, flags=0):
        'do and save re.compile(pattern, flags)'
        self._pattern = re.compile(pattern, flags)

    # Note: the second argument of a compiled pattern's match() and search()
    # is a start position, not flags; flags are given once, to __init__().
    def match(self, string, pos=0):
        'do, save, and return pattern.match(string, pos)'
        RE._match = self._pattern.match(string, pos)
        return RE._match

    def search(self, string, pos=0):
        'do, save, and return pattern.search(string, pos)'
        RE._match = self._pattern.search(string, pos)
        return RE._match

    def group(grp=0):
        'return match_object.group(grp)'
        return RE._match.group(grp)
    group = staticmethod(group)

class SR:
    '''
    Save and return value in bitwise or test

    This is thread safe

    Instance data:
        _ : value saved by __or__()

    Example:
        rePat1 = re.compile(pattern1)
        rePat2 = re.compile(pattern2)
        m = SR()
        for line in lines:
            if m|rePat1.search(line):
                grp1 = m.group(1)
                grpA = m.group('A')
            elif m|rePat2.search(line):
                grp2 = m.group(2)
                grpB = m.group('B')
    '''
    def __or__(self, value):
        'save value as _ and return value'
        self._ = value
        return value

    def group(self, grp=0):
        'return _.group(grp)'
        return self._.group(grp)

if __name__ == '__main__':

    lines = []
    lines.append(' 1 one ')
    lines.append(' two 2 ')
    lines.append(' three three ')
    lines.append(' 4 4 ')

    reIntStr = RE(r'^\s*(?P<Int>\d+)\s+(?P<Str>\S.*?)\s*$')
    reStrInt = RE(r'^\s*(?P<Str>\S.*?)\s+(?P<Int>\d+)\s*$')
    for line in lines:
        print '>>>', line
        if reIntStr.search(line):
            print 'Int:', RE.group('Int')
            print 'Str:', RE.group('Str')
            print
        elif reStrInt.search(line):
            print 'Str:', RE.group('Str')
            print 'Int:', RE.group('Int')
            print
        else:
            print '*** UNMATCHED ***'
            print

    print "The same as above, now in a thread safe manner\n"

    reIntStr = re.compile(r'^\s*(?P<Int>\d+)\s+(?P<Str>\S.*?)\s*$')
    reStrInt = re.compile(r'^\s*(?P<Str>\S.*?)\s+(?P<Int>\d+)\s*$')
    m = SR()
    for line in lines:
        print '>>>', line
        if m|reIntStr.search(line):
            print 'Int:', m.group('Int')
            print 'Str:', m.group('Str')
            print
        elif m|reStrInt.search(line):
            print 'Str:', m.group('Str')
            print 'Int:', m.group('Int')
            print
        else:
            print '*** UNMATCHED ***'
            print
With the 're' module, you can't do this:
<pre>
for line in lines:
    if re.search(patternA, line):
        do_something_with_match_data()
    elif re.search(patternB, line):
        do_something_with_match_data()
</pre>
You can with 'RE'.
'RE' defines methods for only a small set of functions from the 're' module. You can add methods for more functions if you need them.
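As a rough illustration of adding methods, the sketch below forwards three more match-object methods in the same static style. The subclass name `MoreRE` and the choice of `groups()`, `groupdict()`, and `span()` are assumptions made for this example, not part of the recipe. `RE` is condensed to `search()` and `group()` so that the block runs on its own, and the code uses Python 3 syntax.

```python
import re


class RE:
    """Condensed copy of the recipe's RE class (search() and group() only)."""
    _match = None  # class-level: match object saved by the last search()

    def __init__(self, pattern, flags=0):
        self._pattern = re.compile(pattern, flags)

    def search(self, string):
        RE._match = self._pattern.search(string)
        return RE._match

    @staticmethod
    def group(grp=0):
        return RE._match.group(grp)


class MoreRE(RE):
    """Hypothetical extension: forwards a few more match-object methods."""

    @staticmethod
    def groups(default=None):
        return RE._match.groups(default)

    @staticmethod
    def groupdict(default=None):
        return RE._match.groupdict(default)

    @staticmethod
    def span(grp=0):
        return RE._match.span(grp)


reIntStr = MoreRE(r'^\s*(?P<Int>\d+)\s+(?P<Str>\S.*?)\s*$')
if reIntStr.search(' 1 one '):
    pair = MoreRE.groups()        # all groups as a tuple
    named = MoreRE.groupdict()    # named groups as a dict
    where = MoreRE.span('Int')    # (start, end) of the 'Int' group
```

Every added method reads the same class-level `_match`, so the thread-safety caveat of `RE` applies to them as well.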
Why doesn't the 're' module save the last match object and make it available? That would make this recipe unnecessary. Probably because it wouldn't be thread safe. As I wrote it, 'RE' isn't thread safe either. You can make it so by replacing the static function and the global data with an instance method and instance data. I used a static function and a global match object on purpose: written this way, it is always clear that you are using the match object of the last search or match. If you had to access the match object through a class instance, you might accidentally use the wrong one, a programming error that is easily overlooked:
<pre>
if reIntStr.search(line):
    print 'Int:', reIntStr.group('Int')
elif reStrInt.search(line):
    print 'Str:', reIntStr.group('Str')  # oops, wrong match object!
</pre>
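One possible middle ground, which the recipe itself does not use, is to keep the static `group()` but store the last match in `threading.local()`, so that each thread sees only its own last match. The class name `TRE` and the two-thread demonstration are assumptions made for this sketch, which uses Python 3 syntax.

```python
import re
import threading


class TRE:
    """Hypothetical thread-aware variant of RE (not part of the recipe)."""
    _local = threading.local()  # attributes set on it are private to each thread

    def __init__(self, pattern, flags=0):
        self._pattern = re.compile(pattern, flags)

    def search(self, string):
        TRE._local.match = self._pattern.search(string)
        return TRE._local.match

    @staticmethod
    def group(grp=0):
        return TRE._local.match.group(grp)


word = TRE(r'(\w+)')
results = {}
barrier = threading.Barrier(2)


def worker(text):
    if word.search(text):
        # Wait until both threads have searched before either reads its match.
        barrier.wait()
        results[text] = TRE.group(1)


threads = [threading.Thread(target=worker, args=(t,)) for t in ('alpha', 'beta')]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a plain class attribute, the barrier would make one thread read the other's match; with the thread-local store, each thread reads back its own.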
New in version 1.2:
Class 'SR' lets you do the same in a thread-safe manner.
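A short sketch of how the `m|...` test works: the method call binds more tightly than `|`, so `m|pat.search(line)` evaluates the search first and then passes the result to `m.__or__()`, which stores it and returns it unchanged. The pattern and the sample strings are made up for this example, which uses Python 3 syntax.

```python
import re


class SR:
    """Same behaviour as the recipe's SR class."""

    def __or__(self, value):
        self._ = value   # remember the value ...
        return value     # ... and hand it back to the 'if' test

    def group(self, grp=0):
        return self._.group(grp)


m = SR()
digits = re.compile(r'(\d+)')

hit = m | digits.search('abc 42')      # same as m.__or__(digits.search('abc 42'))
hit_text = m.group(1)

miss = m | digits.search('no digits')  # a failed search is stored too, as None
stored_after_miss = m._
```

Because a failed search overwrites the stored value with None, `m.group()` should only be called inside the branch whose test just succeeded.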
Why not assignment in test? Of course, it would all be much simpler if Python syntax would allow this:
Perhaps there should be a special assignment operator for use in if-tests?
No = in expressions. In Python it isn't allowed and probably will never be allowed to use assignment in expressions.
There are some other fiddly ways you can do it, maybe like:
These tricks are of dubious value, though; they introduce new idioms that are hard to justify in the long term.
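On the assignment-in-test question raised above: Python 3.8 and later do accept assignment inside an expression, using the `:=` operator introduced by PEP 572. The sketch below reuses the recipe's two patterns and test lines; the function name `classify` and the returned tuples are assumptions made for this example, and the code needs Python 3.8 or newer.

```python
import re

reIntStr = re.compile(r'^\s*(?P<Int>\d+)\s+(?P<Str>\S.*?)\s*$')
reStrInt = re.compile(r'^\s*(?P<Str>\S.*?)\s+(?P<Int>\d+)\s*$')


def classify(line):
    """Return a tuple describing which pattern matched, or None."""
    if m := reIntStr.search(line):
        return ('Int-Str', m.group('Int'), m.group('Str'))
    elif m := reStrInt.search(line):
        return ('Str-Int', m.group('Str'), m.group('Int'))
    return None


lines = [' 1 one ', ' two 2 ', ' three three ', ' 4 4 ']
results = [classify(line) for line in lines]
```

Here `m` is an ordinary local variable, so no wrapper class and no shared state is involved.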