
Recently I had a discussion with another developer about how particular web servers perform in certain situations. Since web server testing frameworks measure how long a 'request' takes in different ways, I thought I would take the guesswork out of it and measure everything explicitly.

There are generally five portions to a web request:

1. create the connection
2. start sending the request
3. finish sending the request
4. start receiving the response
5. finish reading the response

This recipe measures the time each portion takes, repeating the request against a single file many times, and prints the results in a reasonably readable fashion.
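To make those five measurement points concrete, here is a minimal sketch (separate from the full recipe below) that times a single request. The host, port, and path are placeholders; it uses HTTP/1.0 rather than the recipe's HTTP/1.1 so that the server closes the connection once the response is complete:

import socket
import time

def time_one_request(host, port, path):
    # returns the four durations: connect, send request, first data, rest of data
    t = [time.time()]
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((host, port))
    t.append(time.time())           # connection created
    s.sendall('GET %s HTTP/1.0\r\n\r\n'%path)
    t.append(time.time())           # request sent
    while 1:
        data = s.recv(65536)
        if not data:
            break
        if len(t) == 3:
            t.append(time.time())   # first piece of the response arrived
    s.close()
    while len(t) < 5:
        t.append(time.time())       # response fully read (pads empty responses)
    return [t[i+1] - t[i] for i in xrange(4)]

print time_one_request('localhost', 80, '/links.html')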

Python, 191 lines
'''
test_web_server.py
- a script to test some basic metrics of a web server

Sample output:
D:\test_web_server>test_web_server.py -c10 -t15 -p987 -s -f/links.html
**************official results (1934):
           connect  request get some get rest    total
          -------- -------- -------- -------- --------
average:      0.02     0.00     0.02     0.05     0.08
median:       0.02     0.00     0.02     0.04     0.08
std dev:      0.02     0.00     0.01     0.02     0.03
minimum:      0.00     0.00     0.00     0.00     0.01
maximum:      0.47     0.03     0.22     0.27     0.55
approximate requests/second:  128.266347738
Total bytes transferred:      31779488
Bytes transferred per second: 2103208


In this particular run, it was downloading a 16281 byte file called
links.html from a web server running on the local host.

1.1
 - Adds handling of connection creation failures, folds such failures
   into the connect time, and keeps a running total of them.
 - Adds support for zero total results.
1.2
 - Adds data transfer rates and totals.
1.3
 - Adds support for the Host: header for HTTP/1.1 servers.
'''


import sys
import socket
import threading
import Queue
import time
import optparse

results = Queue.Queue()
refused = 0L
transferred = 0L
reflock = threading.Lock()

endtime = None

def worker(host, port, file, include_host):
    C = 0   # connections refused so far
    D = 0   # bytes received so far
    if include_host:
        request = 'GET /%s HTTP/1.1\r\nHost: %s\r\n\r\n'%(file, host)
    else:
        request = 'GET /%s HTTP/1.1\r\n\r\n'%file
    
    t = [time.time()]
    
    while time.time() < endtime:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.connect((host, port))
        except socket.error:
            C += 1
            s.close()
            continue
        t.append(time.time())   # connection created
        s.sendall(request)
        t.append(time.time())   # request fully sent
        try:
            while 1:
                _ = s.recv(65536)
                if not _:
                    break
                elif len(t) == 3:
                    t.append(time.time())   # first piece of the response arrived
                D += len(_)
        except socket.error:
            pass
        s.close()
        while len(t) < 5:
            t.append(time.time())   # response fully read (pads empty responses)
        # convert the absolute timestamps into per-portion durations
        t2 = []
        x = t.pop(0)
        while t:
            y = t.pop(0)
            t2.append(y-x)
            x = y
        results.put(t2)
        t = [time.time()]
    reflock.acquire()
    global refused, transferred
    refused += C
    transferred += D
    reflock.release()

def _stats(r):
    #returns the average, median, standard deviation, min and max of a sequence
    tot = sum(r)
    avg = tot/len(r)
    sdsq = sum([(i-avg)**2 for i in r])
    s = list(r)
    s.sort()
    return avg, s[len(s)//2], (sdsq/(len(r)-1 or 1))**.5, min(r), max(r)

x = ('average: ', 'median:  ', 'std dev: ', 'minimum: ', 'maximum: ')

def stats(r, e):
    # append each request's total time as a fifth column
    for i in r:
        i.append(sum(i))
    
    # compute the stats for each column, then transpose so that each output
    # row holds one statistic across all five columns
    s = zip(*map(_stats, zip(*r)))
    print "           connect  request get some get rest    total"
    print "          -------- -------- -------- -------- --------"
    for i,j in zip(x, s):
        print i, "%8.2f %8.2f %8.2f %8.2f %8.2f"%j
    print "approximate requests/second: ", len(r)/float(e)

if __name__ == '__main__':
    usage = "usage: %prog -c<count> -t<time> -H<host> -p<port> -f<file>"
    
    parser = optparse.OptionParser(usage)
    parser.add_option('-c', '--count', dest='count', type='int',
        help='Number of simultaneous threads (default 5)', default=5,
        action='store')
    parser.add_option('-t', '--time', dest='time', type='int',
        help='At least how long in seconds to run the test (default 60)',
        default=60, action='store')
    parser.add_option('-H', '--host', dest='host',
        help='The host of the web server (default localhost)',
        default='localhost', action='store')
    parser.add_option('-i', '--include', dest='include', action='store_true',
        help='if passed, will include Host: header as specified with -H in the request',
        default=False)
    parser.add_option('-p', '--port', dest='port', type='int',
        help='Port to connect to on (default 80)', default=80, action='store')
    parser.add_option('-f', '--file', dest='file',
        help='the file to download', action='store')
    parser.add_option('-s', '--single', dest='single', action='store_true',
        help='if passed, will only produce one table of output', default=False)
    
    options, args = parser.parse_args()
    
    if options.file is None:
        parser.error('need file to fetch')
    
    starttime = time.time()
    endtime = starttime + options.time
    for i in xrange(options.count):
        threading.Thread(target=worker,
            args=(options.host, options.port,
                  options.file.lstrip('/\\'), options.include)).start()
    if not options.single:
        while endtime > time.time():
            time.sleep(.1)
        
        r = []
        while results.qsize():
            r.append(results.get())
        rc = len(r)
        if r:
            print "**************official results (%i):"%(len(r))
            stats(r, options.time)
        
        while threading.activeCount() > 1:
            time.sleep(.1)
        
        r = []
        while results.qsize():
            r.append(results.get())
        if r:
            print "**************late finishers (%i):"%(len(r))
            stats(r, time.time()-endtime)
    
        print "effective requests/second:   ", (rc+len(r))/(time.time()-starttime)
    
    else:
        while threading.activeCount() > 1:
            time.sleep(.1)
        
        r = []
        while results.qsize():
            r.append(results.get())
        if r:
            print "**************official results (%i):"%(len(r))
            stats(r, time.time()-starttime)
    
    print "Total bytes transferred:     ", transferred
    print "Bytes transferred per second:", int(transferred/(time.time()-starttime))
    
    if refused:
        print "Connections refused:         ", refused
    

This recipe repeatedly downloads one file from a given server, for some amount of time, with some number of clients. It can be used as a load test and/or a benchmark of a particular web server's performance. Obviously it would be convenient for it to request multiple different files, perhaps to simulate some sort of standard web traffic, but it is rainy outside, and I don't particularly feel like doing so.
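For anyone who does want multiple files, a low-effort variation (an untested sketch; the files list would have to come from somewhere new, say a comma-separated -f option) is to have each worker rotate through several paths instead of building one fixed request string:

import itertools

def make_requests(host, files, include_host):
    # endlessly yields ready-to-send request strings, rotating through the paths
    for file in itertools.cycle(files):
        if include_host:
            yield 'GET /%s HTTP/1.1\r\nHost: %s\r\n\r\n'%(file, host)
        else:
            yield 'GET /%s HTTP/1.1\r\n\r\n'%file

# in worker(), replace the fixed request string with
#     requests = make_requests(host, files, include_host)
# and fetch a fresh request at the top of the while loop:
#     request = requests.next()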

6 comments

Danny Adair 18 years, 5 months ago

Why not ab? AFAIK, ApacheBench does all that and more.

Josiah Carlson (author) 18 years, 5 months ago

I needed a simple tool that could measure times exactly where I specified them. It wasn't meant to be a full-on benchmarking tool, so I am not surprised that ApacheBench supports additional options. I am surprised, however, that ab does not measure a superset of the times I do, and that it does not include the median and standard deviation in its statistics. It does offer data transfer totals and rates, which I should add to my recipe, not just for completeness, but to save people from having to work out those numbers by hand afterwards (as I have).

Ralph Kammerlander 18 years, 3 months ago

Runs out of the box... It is late Friday evening and I needed something to stress the TCP stack/HTTP server of an embedded device over the weekend. Found your script... It perfectly matches my needs. This is a really good example of a handy code snippet.

Noah Spurrier 17 years, 8 months ago

Thanks! I hacked this to return stats on the top 10 open socket connections (from 'netstat -n --inet'). When the stddev of the max goes over 3, I send an alert. ApacheBench may do more, but this is hackable. Took me all of 20 minutes...

Mark Sweeting 17 years, 7 months ago

Update for HTTP/1.1 hosts. This is a great script - thanks for sharing.

I had a problem hitting an HTTP/1.1 server as it was expecting a Host: header, so I replaced the following line:

s.sendall('GET /%s HTTP/1.1\r\n\r\n'%file)

with

s.sendall('GET /%s HTTP/1.1\r\nHost: %s\r\n\r\n'% (file,host))

Josiah Carlson (author) 17 years, 5 months ago

I have added support for passing the Host: header. Thank you for the suggestion.