Welcome, guest | Sign In | My Account | Store | Cart

Modeled after the key= argument to list.sort() and sorted().

Python, 63 lines
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63``` ```from heapq import nlargest as _nlargest, nsmallest as _nsmallest from __builtin__ import min as _min, max as _max from itertools import tee, izip, imap, count from operator import itemgetter __all__ = ['nsmallest', 'nlargest', 'min', 'max'] def nsmallest(n, iterable, key=None): """Find the n smallest elements in a dataset. Equivalent to: sorted(iterable, key=key)[:n] """ if key is None: return _nsmallest(n, iterable) in1, in2 = tee(iterable) it = izip(imap(key, in1), count(), in2) # decorate result = _nsmallest(n, it) return map(itemgetter(2), result) # undecorate def nlargest(n, iterable, key=None): """Find the n largest elements in a dataset. Equivalent to: sorted(iterable, key=key, reverse=True)[:n] """ if key is None: return _nlargest(n, iterable) in1, in2 = tee(iterable) it = izip(imap(key, in1), count(), in2) # decorate result = _nlargest(n, it) return map(itemgetter(2), result) # undecorate def min(iterable, key=None): if key is None: return _min(iterable) it = iter(iterable) try: min_elem = it.next() except StopIteration: raise ValueError('min() arg is an empty sequence') min_k = key(min_elem) for elem in it: k = key(elem) if k < min_k: min_elem = elem min_k = k return min_elem def max(iterable, key=None): if key is None: return _max(iterable) it = iter(iterable) try: max_elem = it.next() except StopIteration: raise ValueError('max() arg is an empty sequence') max_k = key(max_elem) for elem in it: k = key(elem) if k > max_k: max_elem = elem max_k = k return max_elem ```

The key= argument is especially useful with bag-style dictionaries where the values represent item counts:

``````nlargest(10, bag.iteritems(), key=attrgetter(1))   # ten most frequent keys
``````

This code runs under Py2.4. Because it uses itertools, the extension runs at C speed (entirely avoiding the eval-loop). And, because iterators are used throughout, the extended functions retain the memory friendliness of the original functions.

The decorate step includes calls to count(). This serves as tie breaker when equal keys are encountered. As a result, the sorting is stable and there are no full record comparisons.

The new min() and max() functions use regular for-loops. Because the loops are so tight, the eval-loop overhead is small. Since overhead is less than the cost of decorating, comparing tuples, and undecorating, the itertools approach is not used here. Created by Raymond Hettinger on Mon, 6 Dec 2004 (PSF)