Welcome, guest | Sign In | My Account | Store | Cart

sort a sequence by applying decoration only as needed

Python, 57 lines
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 def lazyDSU_sort(seq, keys, warn=False): """Return sorted seq, breaking ties by applying keys only when needed. If ``warn`` is True then an error will be raised if there were no keys remaining to break ties. Examples ======== Here, sequences are sorted by length, then sum: >>> seq, keys = [[[1, 2, 1], [0, 3, 1], [1, 1, 3], [2], [1]], [ ... lambda x: len(x), ... lambda x: sum(x)]] ... >>> lazyDSU_sort(seq, keys) [[1], [2], [1, 2, 1], [0, 3, 1], [1, 1, 3]] If ``warn`` is True, an error will be raised if there were not enough keys to break ties: >>> lazyDSU_sort(seq, keys, warn=True) Traceback (most recent call last): ... ValueError: not enough keys to break ties Notes ===== The decorated sort is one of the fastest ways to sort a sequence for which special item comparison is desired: the sequence is decorated, sorted on the basis of the decoration (e.g. making all letters lower case) and then undecorated. If one wants to break ties for items that have the same decorated value, a second key can be used. But if the second key is expensive to compute then it is inefficient to decorate all items with both keys: only those items having identical first key values need to be decorated. This function applies keys successively only when needed to break ties. """ from collections import defaultdict d = defaultdict(list) keys = list(keys) f = keys.pop(0) for a in seq: d[f(a)].append(a) seq = [] for k in sorted(d.keys()): if len(d[k]) > 1: if keys: d[k] = lazyDSU_sort(d[k], keys, warn) elif warn: raise ValueError('not enough keys to break ties') seq.extend(d[k]) else: seq.append(d[k][0]) return seq

One problem with the DSU (decorate, sort, undecorate) idiom for sorting is that one must calculate the decoration -- all decorations, if multiple are being used -- before one knows how deep of a decoration is needed to break ties. This recipe applies additional decoration only when needed.

See also this recipe for discussion of DSU sorting.

 Created by Chris Smith on Wed, 12 Sep 2012 (MIT)

### Required Modules

• (none specified)