
A simple HTTP server, intended to be as simple as the standard module SimpleHTTPServer, built upon the asyncore/asynchat modules (uses non-blocking sockets). Provides a Server class (copied from medusa http_server) and a RequestHandler class. RequestHandler handles both GET and POST methods and inherits from SimpleHTTPServer.SimpleHTTPRequestHandler.

It can be easily extended by overriding the handle_data() method in the RequestHandler class.

Python, 248 lines
"""Simple HTTP server based on the asyncore / asynchat framework

Under asyncore, every time a socket is created it enters a table which is
scanned through select calls by the asyncore.loop() function

All events (a client connecting to a server socket, a client sending data,
a server receiving data) are handled by the instances of classes derived
from asyncore.dispatcher

Here the server is represented by an instance of the Server class

When a client connects to it, its handle_accept() method creates an
instance of RequestHandler, one for each HTTP request. It is derived
from asynchat.async_chat, a class where incoming data on the connection
is processed when a "terminator" is received. The terminator can be :
- a string : here we'll use the string \r\n\r\n to handle the HTTP request
line and the HTTP headers
- an integer (n) : the data is processed when n bytes have been read. This
will be used for HTTP POST requests

The data is processed by a method called found_terminator. In RequestHandler,
found_terminator is first set to handle_request_line to handle the HTTP
request line (including the decoding of the query string) and the headers. 
If the method is POST, terminator is set to the number of bytes to read
(the content-length header), and found_terminator is set to handle_post_data

After that, the handle_data() method is called and the connection is closed

Subclasses of RequestHandler only have to override the handle_data() method
"""

import asynchat, asyncore, socket, SimpleHTTPServer, select, urllib
import posixpath, sys, cgi, cStringIO, os, traceback, shutil

class CI_dict(dict):
    """Dictionary with case-insensitive keys
    Replacement for the deprecated mimetools.Message class
    """

    def __init__(self, infile, *args):
        self._ci_dict = {}
        lines = infile.readlines()
        for line in lines:
            k,v=line.split(":",1)
            self._ci_dict[k.lower()] = self[k] = v.strip()
        self.headers = self.keys()
    
    def getheader(self,key,default=""):
        return self._ci_dict.get(key.lower(),default)
    
    def get(self,key,default=""):
        return self._ci_dict.get(key.lower(),default)
    
    def __getitem__(self,key):
        return self._ci_dict[key.lower()]
    
    def __contains__(self,key):
        return key.lower() in self._ci_dict
        
class socketStream:

    def __init__(self,sock):
        """Wrap a (non-blocking) socket and create an output buffer"""
        self.sock = sock
        self.buffer = cStringIO.StringIO()
        self.closed = 1   # compatibility with SocketServer
    
    def write(self, data):
        """Buffer the input, then send as many bytes as possible"""
        self.buffer.write(data)
        if self.writable():
            buff = self.buffer.getvalue()
            # next try/except clause suggested by Robert Brown
            try:
                sent = self.sock.send(buff)
            except socket.error:
                # The connection failed : abort writing and drop
                # everything that is still buffered
                sent = len(buff)

            # reset the buffer to the data that has not yet been sent
            self.buffer=cStringIO.StringIO()
            self.buffer.write(buff[sent:])
            
    def finish(self):
        """When all data has been received, send what remains
        in the buffer"""
        data = self.buffer.getvalue()
        # send data
        while len(data):
            while not self.writable():
                pass
            sent = self.sock.send(data)
            data = data[sent:]

    def writable(self):
        """Used as a flag to know if something can be sent to the socket"""
        return select.select([],[self.sock],[])[1]

class RequestHandler(asynchat.async_chat,
    SimpleHTTPServer.SimpleHTTPRequestHandler):

    protocol_version = "HTTP/1.1"
    MessageClass = CI_dict

    def __init__(self,conn,addr,server):
        asynchat.async_chat.__init__(self,conn)
        self.client_address = addr
        self.connection = conn
        self.server = server
        # set the terminator : when it is received, this means that the
        # http request is complete ; control will be passed to
        # self.found_terminator
        self.set_terminator ('\r\n\r\n')
        self.rfile = cStringIO.StringIO()
        self.found_terminator = self.handle_request_line
        self.request_version = "HTTP/1.1"
        # buffer the response and headers to avoid several calls to select()
        self.wfile = cStringIO.StringIO()

    def collect_incoming_data(self,data):
        """Collect the data arriving on the connection"""
        self.rfile.write(data)

    def prepare_POST(self):
        """Prepare to read the request body"""
        bytesToRead = int(self.headers.getheader('content-length'))
        # set terminator to length (will read bytesToRead bytes)
        self.set_terminator(bytesToRead)
        self.rfile = cStringIO.StringIO()
        # control will be passed to a new found_terminator
        self.found_terminator = self.handle_post_data
    
    def handle_post_data(self):
        """Called when a POST request body has been read"""
        self.rfile.seek(0)
        self.do_POST()
        self.finish()
            
    def do_GET(self):
        """Begins serving a GET request"""
        # nothing more to do before handle_data()
        self.body = {}
        self.handle_data()
        
    def do_POST(self):
        """Begins serving a POST request. The request data must be readable
        on a file-like object called self.rfile"""
        ctype, pdict = cgi.parse_header(self.headers.getheader('content-type'))
        self.body = cgi.FieldStorage(fp=self.rfile,
            headers=self.headers, environ = {'REQUEST_METHOD':'POST'},
            keep_blank_values = 1)
        self.handle_data()

    def handle_data(self):
        """Method to override"""
        f = self.send_head()
        if f:
            self.copyfile(f, self.wfile)

    def handle_request_line(self):
        """Called when the http request line and headers have been received"""
        # prepare attributes needed in parse_request()
        self.rfile.seek(0)
        self.raw_requestline = self.rfile.readline()
        self.parse_request()

        if self.command in ['GET','HEAD']:
            # if method is GET or HEAD, call do_GET or do_HEAD and finish
            method = "do_"+self.command
            if hasattr(self,method):
                getattr(self,method)()
                self.finish()
        elif self.command=="POST":
            # if method is POST, call prepare_POST, don't finish yet
            self.prepare_POST()
        else:
            self.send_error(501, "Unsupported method (%s)" %self.command)

    def end_headers(self):
        """Send the blank line ending the MIME headers, send the buffered
        response and headers on the connection, then set self.wfile to
        this connection
        This is faster than sending the response line and each header
        separately because of the calls to select() in socketStream"""
        if self.request_version != 'HTTP/0.9':
            self.wfile.write("\r\n")
        self.start_resp = cStringIO.StringIO(self.wfile.getvalue())
        self.wfile = socketStream(self.connection)
        self.copyfile(self.start_resp, self.wfile)

    def handle_error(self):
        traceback.print_exc(sys.stderr)
        self.close()

    def copyfile(self, source, outputfile):
        """Copy all data between two file objects
        Set a big buffer size"""
        shutil.copyfileobj(source, outputfile, length = 128*1024)

    def finish(self):
        """Send data, then close"""
        try:
            self.wfile.finish()
        except AttributeError: 
            # if end_headers() wasn't called, wfile is a StringIO
            # this happens for error 404 in self.send_head() for instance
            self.wfile.seek(0)
            self.copyfile(self.wfile, socketStream(self.connection))
        self.close()

class Server(asyncore.dispatcher):
    """Copied from http_server in medusa"""
    def __init__ (self, ip, port,handler):
        self.ip = ip
        self.port = port
        self.handler = handler
        asyncore.dispatcher.__init__ (self)
        self.create_socket (socket.AF_INET, socket.SOCK_STREAM)

        self.set_reuse_addr()
        self.bind ((ip, port))

        # lower this to 5 if your OS complains
        self.listen (1024)

    def handle_accept (self):
        try:
            conn, addr = self.accept()
        except socket.error:
            self.log_info ('warning: server accept() threw an exception', 'warning')
            return
        except TypeError:
            self.log_info ('warning: server accept() threw EWOULDBLOCK', 'warning')
            return
        # creates an instance of the handler class to handle the request/response
        # on the incoming connection
        self.handler(conn,addr,self)

if __name__=="__main__":
    # launch the server on the specified port
    port = 8081
    s=Server('',port,RequestHandler)
    print "SimpleAsyncHTTPServer running on port %s" %port
    try:
        asyncore.loop(timeout=2)
    except KeyboardInterrupt:
        print "Ctrl+C pressed. Shutting down."

The standard Python distribution provides two modules for asynchronous socket programming, asyncore and asynchat. Although they seemed very interesting, it took me some time to begin to understand how they work. The medusa framework is based on them and works very well, but I'm afraid its documentation is not very clear either.

For an example of a server built upon SimpleAsyncHTTPServer, see Karrigell_async in the web development tool I work on, called Karrigell (http://karrigell.sourceforge.net).
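The terminator mechanism described in the module docstring can be illustrated without asynchat itself. What follows is only a sketch, written for Python 3 (the recipe targets Python 2, and asyncore/asynchat were removed from the standard library in Python 3.12). The names `TerminatorBuffer` and `parse_request` are hypothetical and are not part of the recipe; the sketch mimics the idea of switching from a string terminator to an integer one, not the actual asynchat implementation.

```python
class TerminatorBuffer:
    """Toy stand-in for the terminator handling of asynchat.async_chat."""

    def __init__(self, terminator, callback):
        self.terminator = terminator  # bytes delimiter, int byte count, or None
        self.callback = callback      # called with each complete message
        self.pending = b""

    def feed(self, data):
        """Add incoming bytes; fire the callback for each complete message."""
        self.pending += data
        while self.terminator is not None:
            term = self.terminator
            if isinstance(term, int):
                if len(self.pending) < term:
                    return            # body not complete yet
                cut, skip = term, 0
            else:
                cut = self.pending.find(term)
                if cut < 0:
                    return            # delimiter not seen yet
                skip = len(term)
            message = self.pending[:cut]
            self.pending = self.pending[cut + skip:]
            self.callback(message)    # the callback may change the terminator


def parse_request(chunks):
    """Feed network-sized chunks through the buffer; return what was parsed."""
    parsed = {}

    def on_body(body):
        parsed["body"] = body
        buf.terminator = None         # request complete, stop parsing

    def on_head(head):
        lines = head.decode("latin-1").split("\r\n")
        parsed["request_line"] = lines[0]
        headers = {}
        for line in lines[1:]:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip()
        parsed["headers"] = headers
        length = int(headers.get("content-length", "0"))
        if lines[0].startswith("POST") and length > 0:
            buf.terminator = length   # next message: exactly `length` bytes
            buf.callback = on_body
        else:
            on_body(b"")

    buf = TerminatorBuffer(b"\r\n\r\n", on_head)
    for chunk in chunks:
        buf.feed(chunk)
    return parsed
```

Feeding a POST request split at arbitrary points, for instance `parse_request([b"POST /form HTTP/1.1\r\nContent-", b"Length: 7\r\n\r", b"\na=1&b", b"=2"])`, shows the same two phases as RequestHandler: the head is delivered when `\r\n\r\n` arrives, then the body once the announced number of bytes has been read.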

11 comments

Bernhard Mulder 20 years, 3 months ago

Socket blocking. The write method of socketStream can raise an error on long files, because 'the socket operation can not complete without blocking'.

To fix that, the response should be written out asynchronously. Not sure about the details.

Wil Wil2k 19 years, 10 months ago

Socket blocking... reloaded. I'm running into this very problem now: sockets blocking while working with larger files.

So I'm wondering: is there anyone who does know the details on how to approach this asynchronously?

I'm relatively new to Python and my otherwise wonderfully progressing project is getting stuck on this very point :-/

Any help, suggestions or comments would be very welcome! :)

Bernhard Mulder 19 years, 8 months ago

Use non-blocking writes. You have to buffer the output and set the socket to non-blocking.

You can see a working example of this if you download leo (at leo.sourceforge.net) and look at the http plugin. This code, adapted for Leo, is based on this recipe

Konstantin Andreev 19 years, 2 months ago

EWOULDBLOCK. EWOULDBLOCK is raised when the output buffer is full and can't accept your data. You should use the select function to wait while the OS sends your output buffer to the other side...
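The advice above (wait with select until the output buffer has room, then send again) might look like the following. This is a hedged sketch for Python 3, not code from the recipe: the function name `send_all_nonblocking` and its `timeout` parameter are assumptions made for illustration.

```python
import select
import socket


def send_all_nonblocking(sock, data, timeout=5.0):
    """Send all of `data` on a non-blocking socket, waiting with select()."""
    view = memoryview(data)
    sent_total = 0
    while sent_total < len(view):
        # Wait here (not inside send) until the kernel buffer has room.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise TimeoutError("socket not writable within %s s" % timeout)
        try:
            # send() may accept only part of the data; keep the count.
            sent_total += sock.send(view[sent_total:])
        except BlockingIOError:
            # EWOULDBLOCK/EAGAIN: the buffer filled up again, wait and retry.
            continue
    return sent_total
```

Note that this helper still waits inside a loop of its own, much like socketStream.finish() does. In a fully asynchronous server the waiting would be left to the event loop, which calls the handler back when the socket becomes writable.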

Pierre Quentel (author) 19 years, 2 months ago

New version. I have modified the socketStream class following the comments above : the socket is set to non-blocking, the output is buffered, and before sending data a select() is done to check if it is writable

I have made some performance tests with openload and found that this server is very fast, even with big files. I would appreciate comments on its speed and stability

Pierre Quentel (author) 19 years, 1 month ago

Faster and faster. The recipe can be improved :

  • speed up the writable() method :

    return select.select([],[self.sock],[])[1]

(no use testing if self.sock is in the list)

  • make it more stable : for very large files, buffering everything is not a good solution. It's better to buffer the chunks that arrive and try to get rid of them as soon as possible, before all the data has been received

An improved version can be found at http://clubs.voila.fr/vault/karrigell/Public/SimpleAsyncHTTPServer.py

Josiah Carlson 19 years, 1 month ago

Using a cStringIO as a buffer is still not terribly fast. In fact, it is not significantly better than using a plain string with slicing.

One is far better off using a deque of strings (from 2.4, or this recipe http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/259179 ), with proper slicing and/or the use of buffer() objects.
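As an illustration of the deque-based buffer suggested above, here is a minimal sketch for Python 3. The class `DequeBuffer` and its method names are hypothetical; `memoryview` is used where the comment mentions `buffer()` objects, since it plays that role in Python 3 and lets partial sends be handled without copying the unsent tail.

```python
from collections import deque


class DequeBuffer:
    """FIFO output buffer made of byte chunks (hypothetical helper)."""

    def __init__(self):
        self.chunks = deque()

    def write(self, data):
        """Queue a chunk; appending to a deque does not copy older data."""
        if data:
            self.chunks.append(memoryview(data))

    def __len__(self):
        return sum(len(chunk) for chunk in self.chunks)

    def peek(self, max_bytes):
        """Return up to max_bytes from the front chunk without removing it."""
        if not self.chunks:
            return b""
        return self.chunks[0][:max_bytes]

    def consume(self, count):
        """Drop `count` bytes that the socket reported as sent."""
        while count and self.chunks:
            first = self.chunks[0]
            if count >= len(first):
                count -= len(first)
                self.chunks.popleft()
            else:
                # Partial send: keep only the unsent tail of the first chunk.
                self.chunks[0] = first[count:]
                count = 0
```

A writer would call `sent = sock.send(buf.peek(65536))` followed by `buf.consume(sent)` each time the socket is writable, so no single large string has to be rebuilt after every send.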

You are also not exploiting the asyncore.loop() mechanism to handle writing to the asynchronous sockets at all, so calling the server asynchronous is misleading.

I've started converting what you have into something that is actually asynchronous, but it probably won't be done tonight (various social engagements will remove me from the task).

Josiah Carlson 19 years, 1 month ago

Truly asynchronous version. I've got a version that I know works for GET requests (no connection pipelining, there is something hokey with clients that request it). It handles all reads and writes to sockets using asyncore.loop(), and on the standard GET requests, doesn't read entire files from disk (allowing one to serve up unlimited-sized files without worrying about memory consumption).

I ran my version in my personal projects folder, which contains around a gig of files, of sizes ranging from a few kilobytes, to 100 megabytes. I then performed two concurrent "wget -m http://localhost:8081" calls (in different paths) on Windows 2k (machine is a P4 2.8ghz with hyperthreading, 1.5 gigs memory...).

On small files, it was able to serve up 15-30 files/second. On large files (the 10+ meg files), it was able to serve up at 15+ megs/second (so says adding the speed reported by wget). The server never broke 7 megs of resident memory, and tended to hang below 10% processor utilization.

The fully async version is available here:

http://www.ics.uci.edu/~jcarlson/SimpleAsyncHTTPServer.py
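Serving a file without reading it into memory in one go, as described in the comment above, can be sketched with a small producer object. This is an assumption-laden illustration for Python 3 and not the code behind the link: the class name `FileProducer`, the `chunk_size` default, and the `demo` function are hypothetical. The `more()` method name follows the producer convention used by asynchat, where a producer returns data until it returns an empty string.

```python
import io


class FileProducer:
    """Hands out the next piece of a file each time it is asked."""

    def __init__(self, fileobj, chunk_size=64 * 1024):
        self.fileobj = fileobj
        self.chunk_size = chunk_size

    def more(self):
        """Return the next chunk, or b"" once the file is exhausted."""
        if self.fileobj is None:
            return b""
        chunk = self.fileobj.read(self.chunk_size)
        if not chunk:
            # End of file: release the handle and keep returning b"".
            self.fileobj.close()
            self.fileobj = None
        return chunk


def demo():
    """Drain a 10-byte in-memory 'file' four bytes at a time."""
    producer = FileProducer(io.BytesIO(b"0123456789"), chunk_size=4)
    chunks = []
    while True:
        chunk = producer.more()
        if not chunk:
            return chunks
        chunks.append(chunk)
```

With this shape, at most one chunk per connection is held in memory at a time, which is consistent with the low resident memory reported above, although the linked version may achieve it differently.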

Pierre Quentel (author) 18 years, 6 months ago

Updated version. I take the opportunity of Josiah Carlson's more complete recipe to update this one. This is the version used in the current version of Karrigell (the link provided in my previous comment is broken).

David Weil 18 years ago

2 things:

  • I can't get POST working

  • there are unused modules imported

Is there a new version?

ivan dimitrov 17 years, 5 months ago

How to add "Up to higher level directory". Hi, how can I add an "Up to higher level directory" link in the subfolders?