Welcome, guest | Sign In | My Account | Store | Cart

You often see frequency counts done with dicts, requiring serveral lines of code. Here is a way to do it in one line using itertools and list comprehensions. This revised version was suggested by Raymon Hettinger. It is O(n log n).

Python, 3 lines
 ```1 2 3``` ```>>> from itertools import groupby >>> [(k, len(list(g))) for k, g in groupby(sorted(myList))] [('1', 4), ('2', 1), ('3', 2), ('4', 1)] ```

David Eppstein 17 years, 8 months ago

Wrong tradeoff. Your implementation takes a linear-time call to count for each member in the set. The somewhat less concise dictionary implementation takes only a single linear-time pass through a dictionary. Since when has conciseness been a virtue in Python?

Jason Whitlark (author) 17 years, 8 months ago

For readability, not conciseness. True, it is less efficient, but as a practical matter, I was using it for frequency counts on a list of ~5000 strings, and could not perceive a slowdown. Personally, Id prefer to have 1 line instead of 6, knowing that I can replace it if performance ever becomes a problem. Especially since this version is understandable at a glance.

As Guido says in his optimization anecdote: Rule number one: only optimize when there is a proven speed bottleneck.

Samuel Reynolds 17 years, 5 months ago

Create a function. So create a histogram function

``````def histogram( A, flAsList=False ):
"""Return histogram of values in array A."""
H = {}
for val in A:
H[val] = H.get(val,0) + 1
if flAsList:
return H.items()
return H
``````

Then the call is self-documenting:

``````freq = histogram( myList )
``````
Raymond Hettinger 16 years, 1 month ago

Channeling Guido. Guido almost certainly would find the OP's code to be abhorrent. Admonitions against premature optimization do not equate to a directive to write insanely inefficient code that does not scale. Compression to a single line is only a worthy goal if there is no sacrifice in clarity or performance.

Raymond Hettinger 16 years, 1 month ago

O(n log n) variation. Equivalent to: sort myList | uniq -c.

``````>>> from itertools import groupby
>>> [(k, len(list(g))) for k, g in groupby(sorted(myList))]
[('1', 4), ('2', 1), ('3', 2), ('4', 1)]
``````

Thanks Raymond, for the improvement. I have revised the recipe to match your implementation.

Nick Galbreath 11 years, 9 months ago

At least on my computer, this can be made ~2x faster by eliminating the temporary list.

``````from itertools import groupby
def ilen(iterable):
count = 0
for i in iterable:
count += 1
return count
[(k, ilen(g)) for k, g in groupby(sorted(myList))]
``````

I'm sure there is a better implementation for "ilen". Yeah, i know the recipe said "one-liner", but hopes this helps anyways.

 Created by Jason Whitlark on Sun, 11 Apr 2004 (PSF)

### Required Modules

• (none specified)