The default compare function used when sorting a list of strings uses the ordinal value of the string characters for comparison. This results in the typical ASCII sort result of 'B' < 'a'. If a user is going to see the list, it is generally better to do a case-insensitive comparison.
1 2 3 4 5 6 7 8 9 | def caseinsensitive_sort(stringList):
"""case-insensitive string comparison sort
doesn't do locale-specific compare
though that would be a nice addition
usage: stringList = caseinsensitive_sort(stringList)"""
tupleList = [(x.lower(), x) for x in stringList]
tupleList.sort()
return [x[1] for x in tupleList]
|
While you can provide your own compare method to pass to the sort method such as:
def compare(a, b): return cmp(a.lower(), b.lower())
stringList.sort(compare)
or use a lambda
stringList.sort(lambda a, b: cmp(a.lower(), b.lower()))
this results in the lower() conversion for every comparison. Thanks to Martin v. Loewis for making this all clear.
you can amend the recipe to sort in-place as well. list.sort, significantly, sorts the list object in place (as links to this recipe have mentioned). The method returns None, not the list object, to remind you of this side effect.
If you wish to constrain your sort implementation to behave as list.sort -- either to subclass list or merely to have your sort routines be consistent with list.sort -- use the function in the recipe, but at the end of the function, do
and that will effect the sort "in place".
IIRC, list.sort in CPython, implemented natively in C, takes some precautions to avoid problems with concurrent use of the list by another thread while it's being sorted by the thread doing list.sort(). I don't know whether those precautions are important enough to use that they preclude using your sort function with a full-on replacement of the list contents at the end.
You could try using a compare function that uses a dict object to "memoize" the results of upper on each element.
While correct, the above is about twice as slow as list.sort(lambda a,b: cmp(a.upper(), b.upper()) for a list of 52 unique strings sharing 26 unique upper-case representations.
Schwartzian Transform. This is an example of a Schwartzian Transform (a Perl concept that works in Python too). It's in the Python FAQ:
http://www.python.org/cgi-bin/faqw.py?req=show&file=faq04.051.htp
The link in Guido's ancient comment is broken. I think its equivalent would likely now be the to the section titled The Old Way Using Decorate-Sort-Undecorate in the Python Sorting HOWTO -- which refers to it by another common name for the technique.