
Useful if you need to replace many different characters with the same replacement string. It accepts any number of find-replace pairs.

Does not work with words, only characters. You can, however, replace single characters with words. I may go back and re-implement it using tuples for the keys, but this would make searching the text for any matches pretty expensive, I'd imagine. At that point, it's mostly a job for a regex, and those tools already exist.

Python, 39 lines
"""simplified extension of the replace function in python"""

def replacen(text, kwargs):
    """any single character of `text` found in a key of `kwargs` is replaced by `kwargs[key]`
    >>> consonants = replacen('abcdefghijklmnopqrstuvwxyz', {'aeiou':''})
    """
    
    try:
        text = [str(i) for i in text]
    except (ValueError, TypeError):
        raise TypeError("`text` parameter must have valid str type")
        
    #check the contents of each key, make sure there's no overlap:
    collisions = any_key_collisions(kwargs)
    if collisions:
        raise KeyError("keys have duplicate find-replace strings: '%s'" % collisions)
    
    #map every character of every key to that key's replacement,
    #then substitute in a single pass over the text
    char_map = {}
    for key, value in kwargs.items():
        for char in key:
            char_map[char] = value
    return ''.join(char_map.get(i, i) for i in text)

def any_key_collisions(dictionary):
    """returns the characters that appear in more than one key ('' if none)"""
    seen = set()
    dups = set()
    for key in dictionary.keys():
        #set(key) so a character repeated inside one key is not a collision
        for char in set(key):
            if char in seen:
                dups.add(char)
            seen.add(char)
    return ''.join(sorted(dups))
    
if __name__ == '__main__':
    original = "\"This is a quote, 'from a famous book'.\""
    no_punc = replacen(original, {'"\'.' : '', ',': ' --'})
    print(original + '\n' + no_punc)

2 comments

Larry Hastings 9 years, 8 months ago
def replacen(s, d):
    for key, value in d.items():
        for c in key:
            s = s.replace(c, value)
    return s

I think this recipe is pointless. Any problem which has a four-line solution is trivial--your solution in forty lines is plodding and unnecessary. Your recipe isn't an example of good Python that will benefit novices, nor is it useful copy-and-paste code to solve a credible problem for Python experts in a hurry. Who is the intended audience here?

Andrew Yurisich (author) 9 years, 8 months ago

replace(), as it is, is very heavy-handed. It re-reads the whole of the s you've posted once per character, and each pass plows over the changes made by the previous one. Imagine:

>>> seq = "CATTGACGTATGACTTAGCATTAGACAGGATACAGATAGACAGGACTAGATT"
>>> for k, v in {'A': 'T', 'T': 'A'}.items():
...     seq.replace(k, v)
... 
'CTTTGTCGTTTGTCTTTGCTTTTGTCTGGTTTCTGTTTGTCTGGTCTTGTTT'
'CAAAGACGAAAGACAAAGCAAAAGACAGGAAACAGAAAGACAGGACAAGAAA'

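Except, instead of 52 elements in seq, it's closer to 500MB, and iterating through it multiple times would be a waste. Worse, chaining the calls produces the wrong end result: the second pass cannot tell original letters from already-replaced ones, so A and T collapse into a single letter instead of being swapped.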
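To make the difference concrete, here is a minimal sketch (Python 3 syntax, which is an assumption, since the recipe itself targets Python 2) that compares chained replace() calls with a single pass over the input. The helper names `swap_sequential` and `swap_single_pass` are hypothetical and not part of the recipe.

```python
def swap_sequential(seq, mapping):
    """Chained str.replace: each pass sees the output of the previous one."""
    for old, new in mapping.items():
        seq = seq.replace(old, new)
    return seq


def swap_single_pass(seq, mapping):
    """Look at each original character exactly once; unmapped ones pass through."""
    return ''.join(mapping.get(char, char) for char in seq)


pairs = {'A': 'T', 'T': 'A'}
short_seq = "CATTGACGTA"

print(swap_single_pass(short_seq, pairs))  # A and T are swapped
print(swap_sequential(short_seq, pairs))   # A and T collapse into one letter
```

The single-pass version never re-examines a character it has already replaced, which is the property the recipe is after.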

But that's a bad example: here's another.

On (admittedly) bad web sites, non-ASCII characters are not supported, such as ā, ē, ī, ķ, ļ, for the Latvian language. They show up as little blocks when they get posted to the page. There are some workarounds that people use, which could be implemented this way:

ethnocentric_lv = replacen(lv_comment, {'ā': 'aa', 'č': 'ch',... 'ļ': 'lj',... 'ž': 'zh'})
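For comparison, the same kind of transliteration can be sketched with the built-in `str.translate`, assuming Python 3, where `str.maketrans` accepts a dict whose values may be longer than one character. The table below holds only the four pairs quoted above; the elided ones are deliberately left out, and the names `LV_ASCII` and `transliterate` are hypothetical.

```python
# Only the pairs spelled out in the comment above; the rest is elided there.
LV_ASCII = str.maketrans({
    '\u0101': 'aa',  # a with macron
    '\u010d': 'ch',  # c with caron
    '\u013c': 'lj',  # l with cedilla
    '\u017e': 'zh',  # z with caron
})


def transliterate(text, table=LV_ASCII):
    """Map each listed character to its ASCII workaround in one pass."""
    return text.translate(table)


print(transliterate('\u010da\u013ci'))
```

Characters without an entry in the table are left unchanged.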

But then again, that has very limited application, admittedly. So, why did I write this? Mainly because this is not exactly github material. This is just a mental note of what I was thinking that day, which was that replace() is a bulldozer of a tool. Yes, I could have used it for my original purpose:

import os

dbox_loc = raw_input('Tell me where your dropbox folder is: ')
stripped = dbox_loc.replace('"', '')
stripped = stripped.replace("'", '')
if os.path.isdir(stripped):
    ...

At first I thought: this won't work. What if there's an apostrophe in the path? Then I thought: why am I doing it this way in the first place? Is there a better way? And it led me here. As my introduction explains, I'd like to expand it to take tuples as dictionary keys, allowing for more useful replacing of words or phrases (or characters, too), and lining up multiple requests for a single pass over the input. Sure, this isn't even close to that, but it's something. Long story short, yes you're right. It really doesn't have a place here, and by using ActiveState's recipe service as a code-centric blog, it should be flagged for deletion.
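A rough sketch of what that word-and-phrase version could look like, using the standard `re` module rather than tuple keys (a swapped-in technique, not the author's implementation). The function name `replace_phrases` is hypothetical, and Python 3 is assumed.

```python
import re


def replace_phrases(text, replacements):
    """Replace every key of `replacements` with its value in a single pass."""
    if not replacements:
        return text
    # Try longer keys first so a short key cannot shadow a longer one
    # that starts with the same characters.
    keys = sorted(replacements, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(key) for key in keys))
    # Each match is looked up once; replaced text is never re-scanned.
    return pattern.sub(lambda match: replacements[match.group(0)], text)


print(replace_phrases("the cat sat", {'cat': 'dog', 'dog': 'cat', 'sat': 'stood'}))
```

Because the substitution happens in one scan, swapping 'cat' and 'dog' does not cascade the way chained replace() calls would.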

Created by Andrew Yurisich on Fri, 23 Mar 2012 (MIT)

Required Modules

  • (none specified)