
Function to strip leading and/or trailing whitespace from strings nested in complex list, dictionary, and tuple structures. It is recursive, so it covers every item in the structure. I needed this, searched for a while without finding one, and ended up writing my own, so I wanted to share it.

For example:

a = ["\rText \r\n", "This one is fine", ["stuff ", [" Something Else"], 4, "Another ", "one", " with"], "\twhitespace\r\n"]
print cleanWhiteSpace(a)

Result: ["Text", "This one is fine", ["stuff", ["Something Else"], 4, "Another", "one", "with"], "whitespace"]

Python, 20 lines
import types

def cleanWhiteSpace(obj):
	objType = type(obj)
	if(objType is types.StringType):		# String
		# Clean regular string
		return obj.lstrip().rstrip()
	elif((objType is types.ListType) or (objType is types.TupleType)):		# List or Tuple
		out = [] 
		for ele in obj:		# Iterate the elements
			out.append(cleanWhiteSpace(ele))	# Recurse into this function for the element
		return out
	elif(objType is types.DictType):		# Dictionary 
		out = {}
		for ele in obj:		# Iterate the elements
			out[ele] = cleanWhiteSpace(obj[ele])	# Recurse into this function for the element
		return out
	else:
		# Not a string, list, tuple, or dict: return it unchanged
		return obj

2 comments

Edward Loper 15 years, 8 months ago

A few recommendations:

  • Use isinstance(...) rather than checking type identity
  • Use obj.strip() rather than obj.lstrip().rstrip()
  • It's standard in Python to use lowercase_with_underscores for function names, not camelCase. The same goes for variable names (obj_type, not objType).
  • It would be a little cleaner to do the
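
To illustrate the first recommendation, here is a minimal Python 3 sketch (so str stands in for the Python 2 string types). The Label subclass and both helper names are hypothetical, made up only for this illustration; the point is that a type-identity check skips subclasses while isinstance accepts them.

```python
class Label(str):
    """Hypothetical str subclass, used only for this illustration."""


def strip_by_type_identity(obj):
    # Mirrors the recipe's style of check: only exact str instances match.
    if type(obj) is str:
        return obj.strip()
    return obj


def strip_by_isinstance(obj):
    # isinstance also accepts subclasses of str.
    if isinstance(obj, str):
        return obj.strip()
    return obj


padded = Label("  padded  ")
print(repr(strip_by_type_identity(padded)))  # subclass is skipped, whitespace kept
print(repr(strip_by_isinstance(padded)))     # subclass is stripped
```
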

Putting all that together, I'd recommend the following rewrite:

>>> def clean_whitespace(obj):
...     if isinstance(obj, basestring):
...         return obj.strip()
...     elif isinstance(obj, list):
...         return [clean_whitespace(o) for o in obj]
...     elif isinstance(obj, tuple):
...         return tuple(clean_whitespace(o) for o in obj)
...     elif isinstance(obj, dict):
...         return dict((k, clean_whitespace(v)) for (k,v) in obj.items())
...     else:
...         return obj

(Note that it does not recurse into dictionary keys; this is to be consistent with your version. It would be easy to change it to do so.)
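
For readers who do want keys cleaned as well, one possible variant is sketched below. It is written for Python 3 (str instead of basestring, a dict comprehension instead of dict(...)), and the name clean_whitespace_with_keys is hypothetical. One assumption to be aware of: keys that become equal after stripping (say " a" and "a ") collapse into a single entry, and the later one wins.

```python
def clean_whitespace_with_keys(obj):
    """Strip strings recursively, including string dictionary keys."""
    if isinstance(obj, str):
        return obj.strip()
    elif isinstance(obj, list):
        return [clean_whitespace_with_keys(o) for o in obj]
    elif isinstance(obj, tuple):
        return tuple(clean_whitespace_with_keys(o) for o in obj)
    elif isinstance(obj, dict):
        # Keys pass through the same function. Tuple keys stay hashable
        # because tuples are rebuilt as tuples, not lists.
        return {
            clean_whitespace_with_keys(k): clean_whitespace_with_keys(v)
            for k, v in obj.items()
        }
    else:
        return obj


data = {" name ": [" x ", (" y",)], 3: " z "}
print(clean_whitespace_with_keys(data))
```
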

Garron Moore 15 years, 8 months ago

Taking the previous recommendation as an example, you could go a step further in abstracting the functionality to work with any object and operation.

def recurse_into(obj, baseaction, basetype=basestring):
    if isinstance(obj, basetype):
        return baseaction(obj)
    elif isinstance(obj, list):
        return [recurse_into(o, baseaction, basetype) for o in obj]
    elif isinstance(obj, tuple):
        return tuple(recurse_into(o, baseaction, basetype) for o in obj)
    elif isinstance(obj, dict):
        return dict((k, recurse_into(v, baseaction, basetype)) for (k, v) in obj.items())
    else:
        return obj

def generate_recurse(baseaction, basetype=basestring):
    def f(obj):
        return recurse_into(obj, baseaction, basetype)
    return f

To use this to accomplish the original task of cleaning whitespace:

import string
clean_whitespace = generate_recurse(string.strip)
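
The snippet above relies on Python 2: basestring and the string.strip function were both removed in Python 3. A rough Python 3 equivalent, as a sketch, would default basetype to str and pass the unbound method str.strip as the action. The double_numbers helper is a hypothetical extra, included only to show the same machinery applied to a different type and operation.

```python
def recurse_into(obj, baseaction, basetype=str):
    """Apply baseaction to every basetype instance nested in lists, tuples, and dict values."""
    if isinstance(obj, basetype):
        return baseaction(obj)
    elif isinstance(obj, list):
        return [recurse_into(o, baseaction, basetype) for o in obj]
    elif isinstance(obj, tuple):
        return tuple(recurse_into(o, baseaction, basetype) for o in obj)
    elif isinstance(obj, dict):
        return {k: recurse_into(v, baseaction, basetype) for k, v in obj.items()}
    else:
        return obj


def generate_recurse(baseaction, basetype=str):
    """Return a one-argument function bound to the given action and type."""
    def f(obj):
        return recurse_into(obj, baseaction, basetype)
    return f


# str.strip takes the string as its first argument, so it works as the action.
clean_whitespace = generate_recurse(str.strip)

# Hypothetical second use: double every int, leave everything else alone.
double_numbers = generate_recurse(lambda n: n * 2, basetype=int)

print(clean_whitespace(["\rText \r\n", ("stuff ", 4), {"k": " v "}]))
print(double_numbers(["a", (1, 2), {"k": 3}]))
```
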
Created by Will on Wed, 6 Aug 2008 (MIT)