This function accepts a string and returns a string with whitespace (one space) inserted between words that begin with a capital letter.
    import re

    def split_on_caps(text):
        # Each match is one uppercase letter plus the non-uppercase characters after it.
        words = re.findall('[A-Z][^A-Z]*', text)
        # Join the pieces with single spaces instead of growing a string in a loop.
        return " ".join(words)

    # test
    if __name__ == "__main__":
        print(split_on_caps("DonkeyIsAGreatBeastYIP"))
        # ---> Donkey Is A Great Beast Y I P
Useful for getting human-readable versions of function names, e.g. thisFunctionDoesSomething() --> this Function Does Something()
Another way to do this without using RE is to use the following in place of the re.findall():
    search = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    sentence = 'The Quick Brown Fox Bites The Lazy Dog'  # string received
    i = 0
    for letter in sentence:
        if letter in search:
            print('letter %s found at pos: %d' % (letter, i))
        i += 1
Issues: It doesn't return any text before the first uppercase letter. This is simple to fix.
Helpful info: #python on freenode and http://www.amk.ca/python/howto/regex/. Thanks to RexFi for the re help.
Doesn't work as specified. This won't work as advertised with, for instance, 'thisFunctionDoesSomething'. Since 'this' doesn't match the [A-Z][^A-Z]* regexp, it won't show up in the result of re.findall.
This function will add a space in each uppercase-lowercase boundary.
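A minimal sketch of such a function (not the commenter's original code), assuming "boundary" means an ASCII lowercase letter directly followed by an uppercase one. The name space_at_case_boundaries is hypothetical.

```python
import re

def space_at_case_boundaries(text):
    # Insert one space wherever a lowercase ASCII letter is
    # immediately followed by an uppercase ASCII letter.
    return re.sub(r'([a-z])([A-Z])', r'\1 \2', text)

print(space_at_case_boundaries("thisFunctionDoesSomething"))
# this Function Does Something
print(space_at_case_boundaries("DonkeyIsAGreatBeastYIP"))
# Donkey Is AGreat Beast YIP
```

Unlike the recipe, this keeps text before the first capital, and it leaves runs of capitals such as "YIP" together.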
Completely Useless. thisFunctionDoesSomething() is a perfectly-readable function name.
Regex Correction. First off, I'd like to say that this recipe is a complete waste of time. Please don't ask me why I'm spending time fixing it.
Secondly, there's an error in your regex.
That can be fixed with the following...
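One possible correction, sketched under the assumption that the error in question is the dropped text before the first capital (this is not the commenter's own pattern). The second alternative in the pattern picks up a leading lowercase run, and the name split_on_caps_keep_prefix is hypothetical.

```python
import re

def split_on_caps_keep_prefix(text):
    # First alternative: a capital plus the non-capitals after it.
    # Second alternative: a run of non-capitals, which can only match
    # at the very start, before any capital has been seen.
    pattern = r'[A-Z][^A-Z]*|[^A-Z]+'
    return " ".join(re.findall(pattern, text))

print(split_on_caps_keep_prefix("thisFunctionDoesSomething"))
# this Function Does Something
print(split_on_caps_keep_prefix("DonkeyIsAGreatBeastYIP"))
# Donkey Is A Great Beast Y I P
```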
Simple. No regex.

    def split_uppercase(string):
        x = ''
        for i in string:
            if i.isupper():
                x += ' %s' % i
            else:
                x += i
        return x.strip()
I like luis gonzalez's idea, but appending to a string is expensive in Python, as the entire string has to be copied each time. Appending to a list is cheaper, and typically saves enough time that you don't mind doing the join at the end. I also spiced things up a bit with some extra options.
Use it like this:
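A minimal sketch of the list-and-join idea, with usage at the bottom. This is not the commenter's original code: the sep parameter is a hypothetical stand-in for the unspecified extra options.

```python
def split_uppercase(text, sep=" "):
    pieces = []
    for ch in text:
        # Add a separator before each capital, except at the very start.
        if ch.isupper() and pieces:
            pieces.append(sep)
        pieces.append(ch)
    # A single join at the end avoids repeated string copies.
    return "".join(pieces)

print(split_uppercase("DonkeyIsAGreatBeast"))
# Donkey Is A Great Beast
print(split_uppercase("thisFunctionDoesSomething", sep="_"))
# this_Function_Does_Something
```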
Any of these solutions using re, string.uppercase, str.isupper, etc. only works for ASCII uppercase, or at best is locale dependent. One approach that works for non-English text, and is locale independent, is to use [:upper:] and regex.UNICODE with the regex module.
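The regex module is a third-party package, so as a standard-library stand-in for the same idea (a swapped-in technique, assuming Python 3, where strings are Unicode text), the sketch below tests each character's Unicode general category instead of using [:upper:]. The function name is hypothetical.

```python
import unicodedata

def split_on_unicode_caps(text):
    pieces = []
    for ch in text:
        # "Lu" is the Unicode general category "Letter, uppercase",
        # which does not depend on the current locale.
        if unicodedata.category(ch) == "Lu" and pieces:
            pieces.append(" ")
        pieces.append(ch)
    return "".join(pieces)

result = split_on_unicode_caps("ÉcoleÎleÑandú")
# result is "École Île Ñandú"
```

Because the category comes from the Unicode character database, accented capitals such as É and Ñ are treated the same way as A to Z.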