Welcome, guest | Sign In | My Account | Store | Cart

This class can be used to tally objects, for example when analyzing log files. It has two separate ways of calculating the "score board", one quicker for Python >=2.4 and one that works with older versions (though I doubt it works with anything <2.0).

As the class uses a dictionary, only hashables can be counted. Other than that, you can mix and match them however you want.

Python, 62 lines
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
from sys import version_info as pyver


class Tally:
    """ 
    A tally or histogram class

    Counts the occurances of objects (hashables only) given to it.
    The number of seen samples is stored in the object's samples 
    instance variable.

    """
    def __init__(self):
        self.samples = 0L
        self.values = {}

    def incr(self, k, amount = 1):
        """
        Increase count for k
        """
        self.samples += 1
        self.values[k] = self.values.get(k, 0) + amount

    def decr(self, k, amount = 1):
        """
        Decrease count for k
        """
        self.samples += 1
        self.values[k] = self.values.get(k, 0) - amount

    def val(self, k):
        """
        Return count for k
        """
        return self.values.get(k, 0)

    def getkeys(self):
        """
        Return all histogram keys
        """
        return self.values.keys()

    def counts(self, desc = False):
        '''
        Return list of keys, sorted by values.
        If desc is True, return a descending sort.
        '''
        if (pyver[0] < 2) or (pyver[0] == 2 and pyver[1] < 4):
            # This is for Python versions <2.4
            i = map(lambda items: list(items), self.values.items())
            map(lambda rev: rev.reverse(), i)
            i.sort()
            if desc:
                i.reverse()
            return i

        else:
            # This only works with Python >=2.4 but is much
            # faster than the code above.

            return sorted(( list(reversed(items)) \
                for items in self.values.iteritems() ), reverse = desc)

I keep using this class when writing log file analyzers. Usually, a count of clients, IPs, messages, events etc. is at least part of the analysis. Using a dictionary makes add items relatively fast. If this class is too slow, I use the code inline, removing all function/method calls (but making the code less readable). I haven't taken any measurements, but the class may benefit from Psyco.

3 comments

thanos vassilakis 16 years, 11 months ago  # | flag

A simple counter. I and others use something like this:

class Counter(dict):
    def count(self, what):
        try:
            self[what] +=1
        except KeyError:
            self[what] =1
        return self

For sample I just

sum(someCounter.values())

Also Your sort could be a lot simpler: See other recipes.

Matteo Dell'Amico 16 years, 11 months ago  # | flag

Look at recipes for bag. There is a recipe for bags that solves the same problem, so you might want to check it.

Also, in python 2.5, defaultdict(int) does pretty much what you need.

klausman-aspn (author) 16 years, 10 months ago  # | flag

thx. Thanks for the comments.

The recipe is a typical result of bein entrenched. I originally wrote it in the times of 1.5.2, when most of the fancier stuff in Python either wasn't there or was slow. As it worked for me, I never bothered to rethink the entire approach.

Lesson learned.

Created by klausman-aspn on Mon, 21 May 2007 (PSF)
Python recipes (4591)
klausman-aspn's recipes (1)

Required Modules

Other Information and Tasks