ActiveState Code

Recipe 65251: Calculating Apache hits per IP


This function returns a dictionary containing the hit counts for each individual IP that has accessed your Apache web server.

Python
def CalculateApacheIpHits(logfile_pathname):

    # make a dictionary to store IPs and their hit counts
    IpHitListing = {}

    # read the logfile line by line; the with block closes the file when done
    with open(logfile_pathname, "r") as logfile:
        for line in logfile:

            # split the line to isolate the IP (the first field)
            Ip = line.split(" ")[0]

            # ensure length of the IP is proper (7 to 15 characters): see discussion
            if 6 < len(Ip) < 16:
                # increase by 1 if the IP exists, else set hit count to 1
                IpHitListing[Ip] = IpHitListing.get(Ip, 0) + 1

    return IpHitListing

# example usage
HitsDictionary = CalculateApacheIpHits("/usr/local/nusphere/apache/logs/access_log")
print(HitsDictionary["127.0.0.1"])

Discussion

This function is useful for several things. For one, I often use it in my code to determine how many of my "hits" actually originate from locations other than my local host. It has also been used to chart which IPs most actively view pages served by a particular installation of Apache.
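Both uses can be sketched on top of the returned dictionary. The helper names `top_ips` and `external_hits`, the `local_ips` parameter, and the sample counts below are illustrative assumptions, not part of the original recipe.

```python
from collections import Counter

def top_ips(hits, n=5):
    """Return the n (ip, count) pairs with the highest counts, busiest first."""
    return Counter(hits).most_common(n)

def external_hits(hits, local_ips=("127.0.0.1",)):
    """Sum the hits that did not come from the listed local addresses."""
    return sum(count for ip, count in hits.items() if ip not in local_ips)

# Made-up counts shaped like the dictionary returned by CalculateApacheIpHits
sample = {"127.0.0.1": 40, "192.0.2.10": 7, "198.51.100.23": 12}

print(top_ips(sample, 2))     # [('127.0.0.1', 40), ('198.51.100.23', 12)]
print(external_hits(sample))  # 19
```

The list returned by `top_ips` is already sorted by count, so it can be passed directly to whatever charting tool is in use.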

As for the method of "validating" the IP, it is as follows: 1) a dotted-quad IP address will never be longer than 15 characters (4 sets of triplets and 3 periods); 2) it will never be shorter than 7 characters (4 single digits and 3 periods). The purpose of this check is not to enforce stringent validation (for that we could use a regular expression), but rather to avoid putting blatantly "garbage" data into the dictionary.
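For comparison, the stricter regular-expression check mentioned above might look like the following. This is a minimal sketch, assuming the first field of each log line is a dotted-quad IPv4 address; logs that record hostnames or IPv6 addresses would need a different test. The names `DOTTED_QUAD` and `looks_like_ipv4` are hypothetical.

```python
import re

# Four dot-separated groups of one to three ASCII digits
DOTTED_QUAD = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")

def looks_like_ipv4(text):
    """Return True if text is a dotted quad with every octet in 0..255."""
    match = DOTTED_QUAD.fullmatch(text)
    if match is None:
        return False
    return all(int(octet) <= 255 for octet in match.groups())

print(looks_like_ipv4("255.255.255.255"))  # True
print(looks_like_ipv4("999.1.1.1"))        # False
print(looks_like_ipv4("example.com"))      # False

# The length bounds from the discussion: shortest and longest dotted quads
print(len("1.1.1.1"), len("255.255.255.255"))  # 7 15
```

The length test in the recipe is cheaper and accepts any string of 7 to 15 characters, so a value such as `example.com` would pass it while failing the check above.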

Comments

  1. At 8:09 a.m. on 26 Jun 2001, Peter Bengtsson said:

    line.split ??? Ip = line.split(" ")[0]

    AttributeError: 'string' object has no attribute 'split'

  2. At 5:39 a.m. on 27 Jun 2001, Mark Nenadov (the author) said:

    Older version of Python. You must be using Python 1.5.2, or maybe 1.6, or something?

    ".split" (without importing 'string') is a feature introduced in Python 2.0+

    To make that work in older versions you will need to change it to something like:

    import string

    Ip = string.split(line, " ")[0]

  3. At 1:09 a.m. on 27 Aug 2001, Ahsan A said:

    Strop. In Python 2.0 and above the string module is used for backward compatibility. Instead, a builtin strop module is used.

    And now, even strop is obsolete, with the string functions being built-in

    If you use the string module nevertheless, there is no overhead.

  4. At 1:03 a.m. on 8 Mar 2004, Ivo Woltring said:

    Thanks for the code, but there is a small bug. The part that tests for a good IP is not wholly correct:

    # ensure length of the ip is proper: see discussion

    if 6 < len(Ip) < 15:

  5. At 1:13 a.m. on 8 Mar 2004, Ivo Woltring said:

    again the small bug. # ensure length of the ip is proper: see discussion

    if 6 < len(Ip) < 15:

    should be:

    if 6 < len(Ip) < 16:
