Based on a comment by user http://code.activestate.com/recipes/users/2629617/ on http://code.activestate.com/recipes/440698/, slightly modified.

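Splits a string on uppercase characters: a space is inserted before every uppercase character that follows a non-uppercase character.

Example:

>>> print(split_uppercase("thisIsIt and SoIsThis"))
this Is It and  So Is This

Note the two spaces after 'and': a space is inserted before 'S' even though one is already there.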

Python, 13 lines
def split_uppercase(str):
	x = ''
	i = 0
	for c in str:
		# add a space before an uppercase character that follows a non-uppercase one
		if i == 0: 
			x += c
		elif c.isupper() and not str[i-1].isupper():
			x += ' %s' % c
		else:
			x += c
		i += 1
	return x.strip()
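
To see how the recipe treats a few inputs, here is a minimal sketch that restates the same logic. The name split_uppercase_py3 and the parameter name text are hypothetical, chosen so the snippet is self-contained and does not shadow the built-in str; every sample input except the first is an assumption made for illustration.

```python
def split_uppercase_py3(text):
    """Insert a space before each uppercase character that follows a
    non-uppercase character (same rule as the recipe above)."""
    out = ''
    for i, c in enumerate(text):
        # The first character is never prefixed with a space.
        if i > 0 and c.isupper() and not text[i - 1].isupper():
            out += ' ' + c
        else:
            out += c
    return out.strip()


print(split_uppercase_py3("thisIsIt and SoIsThis"))  # this Is It and  So Is This
print(split_uppercase_py3("parseHTTPResponse"))      # parse HTTPResponse
print(split_uppercase_py3("ThisIsIt"))               # This Is It
```

Runs of uppercase characters such as "HTTP" stay together, because a space is only inserted when the previous character is not uppercase.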

3 comments

Raphael Marvie 14 years, 4 months ago

An alternative is to use the re module:

import re
def split_uppercase(value):
    return re.sub(r'([A-Z])', r' \1', value)

This gives the same result for the example above:

>>> split_uppercase("thisIsIt and SoIsThis")
'this Is It and  So Is This'

Rogier Steehouder 14 years, 4 months ago

Sorry, but people might use this as a learning tool, so I need to point out a few mistakes:

  1. str is the built-in string type and should not be used as a variable name.
  2. Why count i yourself? Use for i, c in enumerate(str): instead.
  3. Why count at all? Just remember whether the last character was uppercase or not.
  4. Strings are immutable, so repeated concatenation in a loop can be slow in CPython. The accepted method is to append the pieces to a list and join them after the loop.

My version would be:

def split_uppercase(s):
    r = []
    l = False
    for c in s:
        # l being: last character was not uppercase
        if l and c.isupper():
            r.append(' ')
        l = not c.isupper()
        r.append(c)
    return ''.join(r)

You can get rid of the extra space by using l = c.islower() instead of l = not c.isupper().
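
A minimal sketch of that variant, for comparison. The function name split_uppercase_lower and the longer variable names are hypothetical, and the second sample input is an assumption used to show a side effect.

```python
def split_uppercase_lower(s):
    """Insert a space before an uppercase character only when the
    previous character is lowercase."""
    parts = []
    prev_is_lower = False
    for c in s:
        if prev_is_lower and c.isupper():
            parts.append(' ')
        # islower() is False for spaces and digits, so they never trigger a split
        prev_is_lower = c.islower()
        parts.append(c)
    return ''.join(parts)


print(split_uppercase_lower("thisIsIt and SoIsThis"))  # this Is It and So Is This
print(split_uppercase_lower("area51Rocks"))            # area51Rocks
```

The trade-off is that an uppercase character following a digit or punctuation is no longer split off, as the second call shows.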

Also, the regular expression given by Raphael is wrong: it inserts a space before every uppercase character, even one that follows another uppercase character. A better one would be re.sub(r'([^A-Z])([A-Z])', r'\1 \2', value). Keep in mind that neither regex takes accented characters into account, because [A-Z] only covers ASCII letters.
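
The difference between the two expressions is easiest to see side by side. This is a small sketch; the function names and the sample inputs are hypothetical, and only the two patterns come from the comments above.

```python
import re


def split_every_upper(value):
    # Pattern from the first comment: a space before every ASCII uppercase letter
    return re.sub(r'([A-Z])', r' \1', value)


def split_after_non_upper(value):
    # Pattern from this comment: a space only between a non-uppercase
    # character and the ASCII uppercase letter that follows it
    return re.sub(r'([^A-Z])([A-Z])', r'\1 \2', value)


print(split_every_upper("parseHTTPResponse"))      # parse H T T P Response
print(split_after_non_upper("parseHTTPResponse"))  # parse HTTPResponse

# Accented capitals are outside [A-Z], so the regex leaves this unchanged,
# while str.isupper() does recognise them.
print(split_after_non_upper("élanÉtoile"))         # élanÉtoile
print("É".isupper())                               # True
```

The loop-based versions rely on str.isupper(), so they do split before accented uppercase characters.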

Roger 14 years, 4 months ago

LOL, first of all, what a horrible way of naming a variable! Of all names, you chose to use "str"? This recipe sucks.

Created by activestate on Fri, 11 Dec 2009 (MIT)