This recipe demonstrates how to do regular expression matching and replacing with operators on a subclass of 'str'.
import re

class REstr(str):
    cache = {}

    def __div__(self, regex):
        try:
            reg = REstr.cache[regex]
        except KeyError:
            REstr.cache[regex] = reg = re.compile(regex)
        self.sre = reg.search(self)
        return REstr(self.sre.group())

    def __idiv__(self, tpl):
        try:
            regex, repl, count = tpl
        except ValueError:
            regex, repl = tpl
            count = 0
        try:
            reg = REstr.cache[regex]
        except KeyError:
            REstr.cache[regex] = reg = re.compile(regex)
        return REstr(reg.sub(repl, self, count))

    def __call__(self, g):
        return self.sre.group(g)

if __name__ == '__main__':
    a = REstr('abcdebfghbij')
    print "a :", a
    print "Match a / 'b(..)(..)' :",
    print a / 'b(..)(..)'               # find match
    print "a[0], a[1], a[2] :",
    print a[0], a[1], a[2]              # print letters from string
    print "a(0), a(1), a(2) :",
    print a(0), a(1), a(2)              # print matches
    print "a :", a
    a /= 'b.', 'X', 1                   # find and replace once
    print "a :", a
    a /= 'b.', 'X'                      # find and replace all
    print "a :", a
Migrating from Perl to Python, my biggest pain is missing the easy way Perl lets you use regular expressions.
Take this example in Perl: <pre> $a =~ s/b./X/g; </pre> The equivalent in Python looks like: <pre> a = re.sub('b.', 'X', a) </pre> I designed a class derived from 'str' that defines operators which 'str' itself does not use. With this, the example above can be rewritten as: <pre> a /= 'b.', 'X' </pre> Similarly, I defined the operator '/' for simple matching.
Extracting submatches is a bit verbose in Python. In Perl you can write:
<pre>
$a =~ /b(..)(..)/;
$all = $&;
$submatch1 = $1;
$submatch2 = $2;
</pre>
In Python this normally looks like:
<pre>
sre = re.search('b(..)(..)', a)
all = sre.group()
submatch1 = sre.group(1)
submatch2 = sre.group(2)
</pre>
With my example class it can be done like this:
<pre>
a / 'b(..)(..)'
all = a(0)
submatch1 = a(1)
submatch2 = a(2)
</pre>
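The recipe's listing targets Python 2, where '/' dispatches to `__div__` and `__idiv__`. Python 3 calls `__truediv__` and `__itruediv__` instead, so the same idea might be sketched as below. This is a minimal adaptation, not the recipe's original code: it relies on the `re` module's internal pattern cache rather than a class-level dictionary, and raising `ValueError` when nothing matches is an assumption added for illustration.

```python
import re


class REstr(str):
    """str subclass: '/' searches, '/=' substitutes, call syntax reads groups."""

    def __truediv__(self, regex):
        # Python 3 dispatches '/' to __truediv__ (Python 2 used __div__).
        self.sre = re.search(regex, self)
        if self.sre is None:
            # Assumption: fail loudly instead of calling .group() on None.
            raise ValueError("no match for %r" % (regex,))
        return REstr(self.sre.group())

    def __itruediv__(self, args):
        # args is (regex, repl) or (regex, repl, count); count=0 replaces all.
        regex, repl, *rest = args
        count = rest[0] if rest else 0
        return REstr(re.sub(regex, repl, self, count=count))

    def __call__(self, group):
        # Read a group from the most recent '/' search on this object.
        return self.sre.group(group)


a = REstr('abcdebfghbij')
whole = a / 'b(..)(..)'      # search; the match object is stored on a
groups = (a(1), a(2))        # submatches from that search

once = REstr('abcdebfghbij')
once /= 'b.', 'X', 1         # replace the first occurrence only

every = REstr('abcdebfghbij')
every /= 'b.', 'X'           # replace all occurrences
```

Note that '/=' rebinds the name to a new object, so the result of an earlier '/' search is no longer reachable through that name afterwards.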
This class is just a demonstration. It implements only a few basic regular-expression operations. Other methods could be added as needed, but I think it would become messy if you tried to make all of Python's 're' module accessible through this class.
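As an illustration of adding one more method, the sketch below maps '//' to `re.findall`. The choice of '//' and the `__floordiv__` hook are hypothetical and not part of the recipe; the class here is cut down to that single operator and written for Python 3.

```python
import re


class REstr(str):
    def __floordiv__(self, regex):
        # Hypothetical extension: '//' returns all non-overlapping matches.
        return re.findall(regex, self)


a = REstr('abcdebfghbij')
matches = a // 'b.'
```

Each additional operator needs its own convention for arguments and return values, which is where the messiness mentioned above would start to show.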
If you need to search and replace across many lines of text, you may not want to use this class. It compiles and caches regular expressions for reuse, but it is still much slower than calling a compiled regular expression directly.
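For bulk work, the direct approach might look like the following sketch: compile the pattern once and call its `sub` method on each line. The sample `lines` list is made up for illustration.

```python
import re

# Compile once, outside the loop.
pattern = re.compile(r'b.')

# Hypothetical input; in practice this could be the lines of a file.
lines = ['abcdebfghbij', 'no match here', 'bb bx']

# Call the compiled pattern directly for every line.
replaced = [pattern.sub('X', line) for line in lines]
```

This avoids creating a new string-subclass object and doing a cache lookup for every operation.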