This is a recipe that does a case insensitive sort. The normal sort methods of lists has 'B'<'a', which means that it would sort 'Pear' to come before 'apple' in a list. You can pass in a function to the sort method to change this... but this can be slow. This is a function that transforms the list, uses the sort method and then transforms it back.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 | # 06-07-04
#v1.0.0
# caselesssort.py
# A sort function for lists of strings that is case insensitive.
# Copyright Michael Foord
# You are free to modify, use and relicense this code.
# No warranty express or implied for the accuracy, fitness to purpose or otherwise for this code....
# Use at your own risk !!!
# E-mail michael AT foord DOT me DOT uk
# Maintained at www.voidspace.org.uk/atlantibots/pythonutils.html
"""
The built in sort method for lists is case sensitive.
This means it can be unsuitable for sorting some lists of strings.
e.g. ['Apple', 'Pear', 'apple'].sort()
leaves 'Apple' and 'apple' at opposite ends of the list.
You can pass in a function to the sort method - but this can be very slow.
cSort still uses the sort method, so there isn't much performance hit, but it is caseless.
cSort can handle non-string members in the list without failing.
In addition cSort will sort any sets of entries for which entry1.lower() == entry2.lower()
i.e. cSort(['fish', 'FISH', 'fIsh'])
returns ['FISH', 'fIsh', 'fish']
You can turn this behaviour off by passing cSort an optional 'False' parameter.
i.e. cSort(['fish', 'FISH', 'fIsh'], False)
returns ['fish', 'FISH', 'fIsh']
"""
def cSort(inlist, minisort=True):
sortlist = []
newlist = []
sortdict = {}
for entry in inlist:
try:
lentry = entry.lower()
except AttributeError:
sortlist.append(lentry)
else:
try:
sortdict[lentry].append(entry)
except KeyError:
sortdict[lentry] = [entry]
sortlist.append(lentry)
sortlist.sort()
for entry in sortlist:
try:
thislist = sortdict[entry]
if minisort: thislist.sort()
newlist = newlist + thislist
except KeyError:
newlist.append(entry)
return newlist
if __name__ == '__main__':
list1 = ['pish', 'fish', 'FISH', 'Fish', 'PISH', 'FIsh', 'fiSH', 'Pish','piSH']
list2 = list(list1)
print 'Here is an unsorted list :'
print list1
list1.sort()
print 'Here is a list sorted using list.sort() :'
print list1
print 'Here is the list sorted using cSort(list) :'
print cSort(list2)
print 'Here is the list sorted using cSort(list, False) :'
print cSort(list2, False)
"""
TODO/ISSUES
CHANGELOG
06-07-04 Version 1.0.0
A working caseless sort.
Will be part of the caseless module, but also stands on it's own.
"""
|
In the caseless module I am using this to become the sort method for a subclass of list. That class depends on python 2.2 as it subclasses list - but this function ought to work on versions earlier than that.
There is possibly a neater solution that subclasses string and stores the real value of the string as an attribute and you sort the apparent values - don't think it's much quicker.
This is already in the cookbook, only three lines of code. See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/170242
That solution is short but requires a list of strings. If heterogenous lists are possible adding a type check is easy enough, but I wonder at the practical applications of sorting lists of different types of objects.
Greg Jorgensen PDXperts LLC Portland, Oregon USA
Hmm.... Mine will also sort the list members where member1.lower() == member2.lower()
However - the one you point to is still elegant. Mine is similar in principal but stores a dictionary of all the keys and then sorts the keys - but it does the type checking as well.