
This recipe demonstrates how you can do regular expression matching and replacing with operators.

Python, 49 lines
import re

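class REstr(str):

    cache = {}                          # compiled patterns, shared by all instances

    def __div__(self, regex):           # a / regex (Python 2 name; Python 3 uses __truediv__)
        try:
            reg = REstr.cache[regex]
        except KeyError:
            REstr.cache[regex] = reg = re.compile(regex)
        self.sre = reg.search(self)     # remember the match object for later a(n) calls
        return REstr(self.sre.group())  # raises AttributeError if nothing matched

    def __idiv__(self, tpl):            # a /= regex, repl[, count]
        try:
            regex, repl, count = tpl
        except ValueError:
            regex, repl = tpl
            count = 0                   # count 0 means replace all matches
        try:
            reg = REstr.cache[regex]
        except KeyError:
            REstr.cache[regex] = reg = re.compile(regex)
        return REstr(reg.sub(repl, self, count))

    def __call__(self, g):              # a(g): group g of the most recent match
        return self.sre.group(g)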

if __name__ == '__main__':
    a = REstr('abcdebfghbij')
    print "a :", a

    print "Match a / 'b(..)(..)' :",
    print a / 'b(..)(..)'               # find match

    print "a[0], a[1], a[2] :",
    print a[0], a[1], a[2]              # print letters from string

    print "a(0), a(1), a(2) :",
    print a(0), a(1), a(2)              # print matches

    print "a :", a

    a /= 'b.', 'X', 1                   # find and replace once
    print "a :", a

    a /= 'b.', 'X'                      # find and replace all
    print "a :", a

When migrating from Perl to Python, what I miss most is how easily Perl lets you use regular expressions.

Take this example in Perl:

<pre>
$a =~ s/b./X/g;
</pre>

The equivalent in Python looks like:

<pre>
a = re.sub('b.', 'X', a)
</pre>

I designed a class derived from 'str' that attaches methods to operators that 'str' itself does not use. With this, the example above can be rewritten as:

<pre>
a /= 'b.', 'X'
</pre>

Similarly, I defined the operator '/' for simple matching.
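One detail of the `re.sub` equivalent above is worth spelling out: Perl needs the `/g` flag to replace every match, while `re.sub` replaces every match by default and takes a `count` argument to limit the number of replacements. A short illustration with the standard `re` module, using the sample string from the recipe's demo code:

```python
import re

a = 'abcdebfghbij'

# Like Perl's s/b./X/g: every non-overlapping match is replaced.
replaced_all = re.sub('b.', 'X', a)
print(replaced_all)     # aXdeXghXj

# Like Perl's s/b./X/ without /g: only the first match is replaced.
replaced_once = re.sub('b.', 'X', a, count=1)
print(replaced_once)    # aXdebfghbij
```

This is why the recipe's `/=` operator accepts an optional third element: it is passed through as the `count`.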

Extracting submatches is a bit verbose in Python. In Perl you can have this:

<pre>
$a =~ /b(..)(..)/;
$all = $&;
$submatch1 = $1;
$submatch2 = $2;
</pre>

In Python, this normally looks like:

<pre>
sre = re.search('b(..)(..)', a)
all = sre.group()
submatch1 = sre.group(1)
submatch2 = sre.group(2)
</pre>

With my example class it can be done like this:

<pre>
a / 'b(..)(..)'
all = a(0)
submatch1 = a(1)
submatch2 = a(2)
</pre>
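The recipe's code is written for Python 2, where `/` and `/=` dispatch to `__div__` and `__idiv__`. In Python 3 those operators call `__truediv__` and `__itruediv__` instead. The following is a minimal sketch of how the same idea might look in Python 3. It is not the author's code: the class name `REstr3` and the helper `_compile` are hypothetical, and returning `None` on a failed search is an assumption made here (the original raises `AttributeError` in that case).

```python
import re


class REstr3(str):
    """Python 3 sketch of the recipe's REstr (hypothetical name)."""

    cache = {}  # compiled patterns, keyed by pattern string

    @classmethod
    def _compile(cls, regex):
        # Compile each pattern once and reuse it afterwards.
        try:
            return cls.cache[regex]
        except KeyError:
            reg = cls.cache[regex] = re.compile(regex)
            return reg

    def __truediv__(self, regex):
        # s / regex: search and remember the match object for later s(n) calls.
        self.sre = self._compile(regex).search(self)
        if self.sre is None:
            return None  # assumption: signal "no match" with None
        return REstr3(self.sre.group())

    def __itruediv__(self, tpl):
        # s /= regex, repl[, count]: substitute; count 0 means replace all.
        regex, repl, *rest = tpl
        count = rest[0] if rest else 0
        return REstr3(self._compile(regex).sub(repl, self, count=count))

    def __call__(self, g):
        # s(g): group g of the most recent successful search.
        return self.sre.group(g)


a = REstr3('abcdebfghbij')
print(a / 'b(..)(..)')   # bcdeb
print(a(1), a(2))        # cd eb
a /= 'b.', 'X', 1        # replace once
print(a)                 # aXdebfghbij
a /= 'b.', 'X'           # replace all
print(a)                 # aXdeXghXj
```

Note that `/=` returns a new object, so the stored match from an earlier `/` is not carried over to the rebound name.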

This class is just a demonstration. It implements only a few basic methods for using regular expressions. Other methods could be added as needed, but I think it would become messy if you tried to make all of Python's re module accessible through this class.

If you need to do a search and replace on many lines of text, you may not want to use this class. The class compiles and caches regular expressions for reuse, but it is still much slower than calling the methods of a compiled regular expression directly in a loop.
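To check the overhead on your own data, a comparison along the following lines could be used. This is only a sketch: the sample lines are made up, the `Wrapped` class and the function names are hypothetical stand-ins for the recipe's per-call cache lookup and object construction, and no particular timing result is claimed, since numbers vary by machine and Python version.

```python
import re
import timeit

lines = ['abcdebfghbij'] * 1000    # made-up sample data

pattern = re.compile('b.')


def direct():
    # Call the compiled pattern's sub() method directly on every line.
    return [pattern.sub('X', line) for line in lines]


class Wrapped(str):
    """Hypothetical stand-in for a str subclass such as the recipe's class."""


cache = {}


def cached_lookup():
    # Mimic the per-call work of the class: cache lookup plus a new str-subclass object.
    out = []
    for line in lines:
        try:
            reg = cache['b.']
        except KeyError:
            reg = cache['b.'] = re.compile('b.')
        out.append(Wrapped(reg.sub('X', line)))
    return out


# Both approaches must produce the same text before timing them is meaningful.
same_result = direct() == cached_lookup()

t_direct = timeit.timeit(direct, number=20)
t_cached = timeit.timeit(cached_lookup, number=20)
print("direct: %.4fs  cached lookup: %.4fs" % (t_direct, t_cached))
```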

Created by Peter Kleiweg on Sun, 29 Aug 2004 (PSF)