This recipe shows how to use the Python standard re module to perform single-pass multiple string substitution using a dictionary.
Suppose you have a dictionary that defines a one-to-one mapping between strings. The keys are the strings (or regular expression patterns) you want to replace, and the corresponding values are their replacements. You could call re.sub once for each (key, value) pair in the dictionary, but that processes the whole text and creates a new copy of it on every call. It is much better to make all the changes in a single pass, so that the text is processed and copied only once. Fortunately, re.sub's callback facility makes this possible.
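For comparison, here is a minimal sketch of the one-call-per-pair approach described above. The function name replace_multipass and the sample mapping are made up for illustration, and the keys are assumed to be literal strings (hence re.escape). Besides copying the text once per key, this version lets a later substitution rewrite the output of an earlier one:

```python
import re

def replace_multipass(text, mapping):
    """Naive approach: one re.sub call, and one new copy of the text, per key."""
    for key, value in mapping.items():
        # Assumption: keys are literal strings, so they are escaped before use.
        text = re.sub(re.escape(key), value, text)
    return text

# Hypothetical mapping that swaps two words.
swap = {"cat": "dog", "dog": "cat"}
result = replace_multipass("cat chases dog", swap)
print(result)  # cat chases cat  (the second pass undoes the first)
```

The words are not swapped: the first pass turns "cat" into "dog", and the second pass turns every "dog", including the new one, back into "cat".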
First, we build a regular expression from the set of keys we want to match. This expression is an alternation of the form "(a1|a2|...|an)" and can be generated with a simple one-liner (see source).
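One way to build such an alternation is sketched below. The helper name build_pattern is hypothetical, and two choices here are assumptions rather than part of the recipe as described: the keys are treated as literal strings and escaped with re.escape, and longer keys are placed first so that a key which is a prefix of another does not shadow it.

```python
import re

def build_pattern(mapping):
    """Compile an alternation a1|a2|...|an from the keys of a dictionary."""
    # Assumption: keys are literal strings, so regex metacharacters are escaped.
    # Longest keys first, so that "foobar" is tried before its prefix "foo".
    keys = sorted(mapping, key=len, reverse=True)
    return re.compile("|".join(re.escape(key) for key in keys))

pattern = build_pattern({"foo": "1", "foobar": "2", "a.b": "3"})
matches = pattern.findall("foobar foo a.b axb")
print(matches)  # ['foobar', 'foo', 'a.b']
```

If the keys are meant to be regular expression patterns rather than literal strings, drop the re.escape call; in that case the matched text is no longer guaranteed to equal a key, so a plain dictionary lookup on it may fail.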
Then, instead of passing re.sub a replacement string, we pass it a callable object. re.sub calls this object for every match, with the match object as its only argument, and expects it to return the replacement string. The callback only has to look up the matched text in the dictionary and return the corresponding value.
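The callback mechanism can be sketched as follows. This is an illustration under the assumption that the keys are literal strings; the names multiple_replace and lookup are chosen here and are not necessarily those used in the recipe's source.

```python
import re

def multiple_replace(text, mapping):
    """Replace every key of `mapping` found in `text`, in a single pass."""
    if not mapping:
        # An empty alternation would match the empty string everywhere.
        return text
    pattern = re.compile("|".join(re.escape(key) for key in mapping))

    def lookup(match):
        # match.group(0) is the matched text, which is one of the keys.
        return mapping[match.group(0)]

    return pattern.sub(lookup, text)

result = multiple_replace("cat chases dog", {"cat": "dog", "dog": "cat"})
print(result)  # dog chases cat
```

Because the text is scanned only once, a replacement is never matched again, so a mapping that swaps two words works as expected.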
I propose two implementations. The first is a straightforward lambda-based function; the second uses a callable dictionary-like object. The second option is better if you want to perform additional processing on each match, such as counting the substitutions or implementing more complex, context-dependent behaviour.
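The two variants might look roughly like this. It is a sketch, not the recipe's actual source: the names multiple_replace and Replacer, the count attribute, and the use of re.escape (which assumes literal, non-empty keys) are all choices made for this example.

```python
import re

def multiple_replace(text, mapping):
    """Lambda-based variant: the callback is a plain dictionary lookup."""
    pattern = re.compile("|".join(map(re.escape, mapping)))
    return pattern.sub(lambda match: mapping[match.group(0)], text)

class Replacer(dict):
    """Callable dictionary variant that also counts the substitutions made."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.count = 0  # hypothetical extra state kept across matches

    def __call__(self, match):
        # re.sub calls the instance itself once for each match.
        self.count += 1
        return self[match.group(0)]

    def replace(self, text):
        pattern = re.compile("|".join(map(re.escape, self)))
        return pattern.sub(self, text)

colours = {"red": "green", "blue": "yellow"}
simple = multiple_replace("red, blue, red", colours)

replacer = Replacer(colours)
counted = replacer.replace("red, blue, red")
print(simple)          # green, yellow, green
print(replacer.count)  # 3
```

The class costs a few more lines, but it gives the callback a place to keep state between matches, which a lambda cannot do as cleanly.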