Welcome, guest | Sign In | My Account | Store | Cart

Add key= argument to min(), max(), heapq.nsmallest() and nlargest() (Python recipe) by Raymond Hettinger
ActiveState Code (http://code.activestate.com/recipes/355690/)

Modeled after the key= argument to list.sort() and sorted().

      from heapq import nlargest as _nlargest, nsmallest as _nsmallest
from __builtin__ import min as _min, max as _max
from itertools import tee, izip, imap, count
from operator import itemgetter

__all__ = ['nsmallest', 'nlargest', 'min', 'max']

def nsmallest(n, iterable, key=None):
    """Find the n smallest elements in a dataset.

    Equivalent to:  sorted(iterable, key=key)[:n]
    """
    if key is None:
        return _nsmallest(n, iterable)
    in1, in2 = tee(iterable)
    it = izip(imap(key, in1), count(), in2)                 # decorate
    result = _nsmallest(n, it)
    return map(itemgetter(2), result)                       # undecorate

def nlargest(n, iterable, key=None):
    """Find the n largest elements in a dataset.

    Equivalent to:  sorted(iterable, key=key, reverse=True)[:n]
    """
    if key is None:
        return _nlargest(n, iterable)
    in1, in2 = tee(iterable)
    it = izip(imap(key, in1), count(), in2)                 # decorate
    result = _nlargest(n, it)
    return map(itemgetter(2), result)                       # undecorate


def min(iterable, key=None):
    if key is None:
        return _min(iterable)
    it = iter(iterable)
    try:
        min_elem = it.next()
    except StopIteration:
        raise ValueError('min() arg is an empty sequence')
    min_k = key(min_elem)
    for elem in it:
        k = key(elem)
        if k < min_k:
            min_elem = elem
            min_k = k
    return min_elem

def max(iterable, key=None):
    if key is None:
        return _max(iterable)
    it = iter(iterable)
    try:
        max_elem = it.next()
    except StopIteration:
        raise ValueError('max() arg is an empty sequence')
    max_k = key(max_elem)
    for elem in it:
        k = key(elem)
        if k > max_k:
            max_elem = elem
            max_k = k
    return max_elem

      

The key= argument is especially useful with bag-style dictionaries where the values represent item counts:

nlargest(10, bag.iteritems(), key=attrgetter(1))   # ten most frequent keys

This code runs under Py2.4. Because it uses itertools, the extension runs at C speed (entirely avoiding the eval-loop). And, because iterators are used throughout, the extended functions retain the memory friendliness of the original functions.

The decorate step includes calls to count(). This serves as tie breaker when equal keys are encountered. As a result, the sorting is stable and there are no full record comparisons.

The new min() and max() functions use regular for-loops. Because the loops are so tight, the eval-loop overhead is small. Since overhead is less than the cost of decorating, comparing tuples, and undecorating, the itertools approach is not used here.

Tags: shortcuts

Created by Raymond Hettinger on Mon, 6 Dec 2004 (PSF)

◄	Python recipes (4591)	►
◄	Raymond Hettinger's recipes (97)	►

Required Modules

Other Information and Tasks

Licensed under the PSF License
Viewed 13618 times
Revision 3 (updated 13 years ago)

Accounts

Code Recipes

Feedback & Information

ActiveState

© 2024 ActiveState Software Inc. All rights reserved. ActiveState®, Komodo®, ActiveState Perl Dev Kit®, ActiveState Tcl Dev Kit®, ActivePerl®, ActivePython®, and ActiveTcl® are registered trademarks of ActiveState. All other marks are property of their respective owners.

Add key= argument to min(), max(), heapq.nsmallest() and nlargest() (Python recipe) by Raymond Hettinger ActiveState Code (http://code.activestate.com/recipes/355690/)