A simple HTTP server, intended to be as easy to use as the standard SimpleHTTPServer module, built upon the asyncore/asynchat modules (it uses non-blocking sockets). Provides a Server class (copied from medusa's http_server) and a RequestHandler class. RequestHandler handles both GET and POST methods and inherits from SimpleHTTPServer.SimpleHTTPRequestHandler.
It can easily be extended by overriding the handle_data() method of the RequestHandler class.
"""Simple HTTP server based on the asyncore / asynchat framework

Under asyncore, every time a socket is created it enters a table which is
scanned through select calls by the asyncore.loop() function

All events (a client connecting to a server socket, a client sending data,
a server receiving data) are handled by the instances of classes derived
from asyncore.dispatcher

Here the server is represented by an instance of the Server class

When a client connects to it, its handle_accept() method creates an
instance of RequestHandler, one for each HTTP request. It is derived
from asynchat.async_chat, a class where incoming data on the connection
is processed when a "terminator" is received. The terminator can be:
- a string: here we'll use the string \r\n\r\n to handle the HTTP request
  line and the HTTP headers
- an integer (n): the data is processed when n bytes have been read. This
  will be used for HTTP POST requests

The data is processed by a method called found_terminator. In RequestHandler,
found_terminator is first set to handle_request_line to handle the HTTP
request line (including the decoding of the query string) and the headers.
If the method is POST, the terminator is set to the number of bytes to read
(the Content-Length header), and found_terminator is set to handle_post_data

After that, the handle_data() method is called and the connection is closed

Subclasses of RequestHandler only have to override the handle_data() method
"""
import asynchat, asyncore, socket, SimpleHTTPServer, select, urllib
import posixpath, sys, cgi, cStringIO, os, traceback, shutil

class CI_dict(dict):
    """Dictionary with case-insensitive keys

    Replacement for the deprecated mimetools.Message class
    """

    def __init__(self, infile, *args):
        self._ci_dict = {}
        lines = infile.readlines()
        for line in lines:
            k, v = line.split(":", 1)
            self._ci_dict[k.lower()] = self[k] = v.strip()
        self.headers = self.keys()

    def getheader(self, key, default=""):
        return self._ci_dict.get(key.lower(), default)

    def get(self, key, default=""):
        return self._ci_dict.get(key.lower(), default)

    def __getitem__(self, key):
        return self._ci_dict[key.lower()]

    def __contains__(self, key):
        return key.lower() in self._ci_dict

class socketStream:

    def __init__(self, sock):
        """Initiate a socket (non-blocking) and a buffer"""
        self.sock = sock
        self.buffer = cStringIO.StringIO()
        self.closed = 1   # compatibility with SocketServer

    def write(self, data):
        """Buffer the input, then send as many bytes as possible"""
        self.buffer.write(data)
        if self.writable():
            buff = self.buffer.getvalue()
            # next try/except clause suggested by Robert Brown
            try:
                sent = self.sock.send(buff)
            except socket.error:
                # catch socket exceptions and abort
                # writing the buffer
                sent = len(data)
            # reset the buffer to the data that has not yet been sent
            self.buffer = cStringIO.StringIO()
            self.buffer.write(buff[sent:])

    def finish(self):
        """When all data has been received, send what remains
        in the buffer"""
        data = self.buffer.getvalue()
        # send data
        while len(data):
            while not self.writable():
                pass
            sent = self.sock.send(data)
            data = data[sent:]

    def writable(self):
        """Used as a flag to know if something can be sent to the socket"""
        return select.select([], [self.sock], [])[1]

class RequestHandler(asynchat.async_chat,
        SimpleHTTPServer.SimpleHTTPRequestHandler):

    protocol_version = "HTTP/1.1"
    MessageClass = CI_dict

    def __init__(self, conn, addr, server):
        asynchat.async_chat.__init__(self, conn)
        self.client_address = addr
        self.connection = conn
        self.server = server
        # set the terminator: when it is received, this means that the
        # http request is complete; control will be passed to
        # self.found_terminator
        self.set_terminator('\r\n\r\n')
        self.rfile = cStringIO.StringIO()
        self.found_terminator = self.handle_request_line
        self.request_version = "HTTP/1.1"
        # buffer the response and headers to avoid several calls to select()
        self.wfile = cStringIO.StringIO()

    def collect_incoming_data(self, data):
        """Collect the data arriving on the connection"""
        self.rfile.write(data)

    def prepare_POST(self):
        """Prepare to read the request body"""
        bytesToRead = int(self.headers.getheader('content-length'))
        # set terminator to length (will read bytesToRead bytes)
        self.set_terminator(bytesToRead)
        self.rfile = cStringIO.StringIO()
        # control will be passed to a new found_terminator
        self.found_terminator = self.handle_post_data

    def handle_post_data(self):
        """Called when a POST request body has been read"""
        self.rfile.seek(0)
        self.do_POST()
        self.finish()

    def do_GET(self):
        """Begin serving a GET request"""
        # nothing more to do before handle_data()
        self.body = {}
        self.handle_data()

    def do_POST(self):
        """Begin serving a POST request. The request data must be readable
        on a file-like object called self.rfile"""
        ctype, pdict = cgi.parse_header(self.headers.getheader('content-type'))
        self.body = cgi.FieldStorage(fp=self.rfile,
            headers=self.headers, environ={'REQUEST_METHOD': 'POST'},
            keep_blank_values=1)
        self.handle_data()

    def handle_data(self):
        """Method to override"""
        f = self.send_head()
        if f:
            self.copyfile(f, self.wfile)

    def handle_request_line(self):
        """Called when the http request line and headers have been received"""
        # prepare attributes needed in parse_request()
        self.rfile.seek(0)
        self.raw_requestline = self.rfile.readline()
        self.parse_request()
        if self.command in ['GET', 'HEAD']:
            # if method is GET or HEAD, call do_GET or do_HEAD and finish
            method = "do_" + self.command
            if hasattr(self, method):
                getattr(self, method)()
                self.finish()
        elif self.command == "POST":
            # if method is POST, call prepare_POST, don't finish yet
            self.prepare_POST()
        else:
            self.send_error(501, "Unsupported method (%s)" % self.command)

    def end_headers(self):
        """Send the blank line ending the MIME headers, send the buffered
        response and headers on the connection, then set self.wfile to
        this connection

        This is faster than sending the response line and each header
        separately because of the calls to select() in socketStream"""
        if self.request_version != 'HTTP/0.9':
            self.wfile.write("\r\n")
        self.start_resp = cStringIO.StringIO(self.wfile.getvalue())
        self.wfile = socketStream(self.connection)
        self.copyfile(self.start_resp, self.wfile)

    def handle_error(self):
        traceback.print_exc(file=sys.stderr)
        self.close()

    def copyfile(self, source, outputfile):
        """Copy all data between two file objects
        Set a big buffer size"""
        shutil.copyfileobj(source, outputfile, length=128*1024)

    def finish(self):
        """Send data, then close"""
        try:
            self.wfile.finish()
        except AttributeError:
            # if end_headers() wasn't called, wfile is a StringIO
            # this happens for error 404 in self.send_head() for instance
            self.wfile.seek(0)
            self.copyfile(self.wfile, socketStream(self.connection))
        self.close()

class Server(asyncore.dispatcher):
    """Copied from http_server in medusa"""

    def __init__(self, ip, port, handler):
        self.ip = ip
        self.port = port
        self.handler = handler
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind((ip, port))
        # lower this to 5 if your OS complains
        self.listen(1024)

    def handle_accept(self):
        try:
            conn, addr = self.accept()
        except socket.error:
            self.log_info('warning: server accept() threw an exception',
                'warning')
            return
        except TypeError:
            self.log_info('warning: server accept() threw EWOULDBLOCK',
                'warning')
            return
        # create an instance of the handler class to handle the
        # request/response on the incoming connection
        self.handler(conn, addr, self)

if __name__ == "__main__":
    # launch the server on the specified port
    port = 8081
    s = Server('', port, RequestHandler)
    print "SimpleAsyncHTTPServer running on port %s" % port
    try:
        asyncore.loop(timeout=2)
    except KeyboardInterrupt:
        print "Ctrl+C pressed. Shutting down."
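The terminator mechanism described in the module docstring can be sketched without any sockets. The parser below is a hypothetical stand-alone helper, not part of the recipe: it first collects bytes until the string terminator \r\n\r\n (request line + headers), then switches to an integer terminator taken from the Content-Length header, mirroring what handle_request_line() and prepare_POST() do.

```python
class TerminatorParser:
    """Sketch of asynchat's two terminator modes: a byte-string
    terminator for the request line and headers, then an integer
    terminator (a byte count) for the POST body."""

    def __init__(self):
        self.buffer = b""
        self.terminator = b"\r\n\r\n"   # first: request line + headers
        self.head = None
        self.body = None

    def feed(self, data):
        """Feed incoming bytes, as collect_incoming_data() would."""
        self.buffer += data
        if isinstance(self.terminator, bytes):
            pos = self.buffer.find(self.terminator)
            if pos == -1:
                return                   # headers not complete yet
            self.head = self.buffer[:pos]
            self.buffer = self.buffer[pos + len(self.terminator):]
            # like prepare_POST(): switch to reading Content-Length bytes
            for line in self.head.split(b"\r\n")[1:]:
                k, _, v = line.partition(b":")
                if k.lower() == b"content-length":
                    self.terminator = int(v)
        if isinstance(self.terminator, int):
            if len(self.buffer) >= self.terminator:
                self.body = self.buffer[:self.terminator]
```

Feeding it a POST request split across two chunks shows the switch: the head is available after the first chunk, the body only once the declared number of bytes has arrived.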
The standard Python distribution provides two modules for asynchronous socket programming, asyncore and asynchat. Although they seemed very interesting, it took me some time to begin to understand how they work. The medusa framework is based on them and works very well, but I'm afraid it does not provide very clear documentation either.
For an example of a server built upon SimpleAsyncHTTPServer, see Karrigell_async in the web development tool I work on, called Karrigell (http://karrigell.sourceforge.net).
socket blocking. The write method of socketStream can raise an exception on long files, because "the socket operation could not complete without blocking".
To fix that, the response should be written out asynchronously. I'm not sure about the details.
socket blocking .. reloaded. I'm running into this very problem now: sockets blocking while working with larger files.
So I'm wondering: does anyone know the details of how to approach this asynchronously?
I'm relatively new to Python and my otherwise wonderfully progressing project is getting stuck on this very point :-/
Any help, suggestions or comments would be very welcome! :)
Use non-blocking writes. You have to buffer the output and set the socket to non-blocking.
You can see a working example of this if you download Leo (at leo.sourceforge.net) and look at the http plugin. That code, adapted for Leo, is based on this recipe.
EWOULDBLOCK. EWOULDBLOCK is raised when the output buffer is full and can't accept your data. You should use the select function to wait until the OS has sent your output buffer to the other side.
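The advice above can be sketched in a few lines: buffer nothing, but before each send() wait with select() until the socket is writable, so the call never fails with EWOULDBLOCK. send_when_writable is a hypothetical helper, not part of the recipe.

```python
import select
import socket

def send_when_writable(sock, data):
    """Send all of `data` on a non-blocking socket, waiting with
    select() until the OS buffer can accept more bytes."""
    sock.setblocking(False)
    while data:
        # blocks until the socket appears in the writable list
        select.select([], [sock], [])
        sent = sock.send(data)      # safe: socket is writable now
        data = data[sent:]
```

A quick way to exercise it locally is socket.socketpair(), which gives two connected sockets without any network setup.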
New version. I have modified the socketStream class following the comments above: the socket is set to non-blocking, the output is buffered, and before sending data a select() is done to check that the socket is writable.
I have made some performance tests with openload and found that this server is very fast, even with big files. I would appreciate comments on its speed and stability.
Faster and faster. The recipe can be improved by speeding up the writable() method:
return select.select([],[self.sock],[])[1]
(there is no use testing whether self.sock is in the list returned by select).
An improved version can be found at http://clubs.voila.fr/vault/karrigell/Public/SimpleAsyncHTTPServer.py
Using a cStringIO as a buffer is still not terribly fast; in fact, it is not significantly better than using a plain string with slicing.
One is far better off using a deque of strings (available from Python 2.4, or via the recipe at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/259179), with proper slicing and/or the use of buffer() objects.
You are also not exploiting the asyncore.loop() mechanism to handle writing to the asynchronous sockets at all, so calling the server asynchronous is misleading.
I've started converting what you have into something that is actually asynchronous, but it probably won't be done tonight (various social engagements will remove me from the task).
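The deque-of-strings suggestion above can be sketched as follows. OutputBuffer is a hypothetical class, not part of the recipe: write() appends whole chunks in O(1), and send_some() pops only the oldest chunk, pushing back any unsent tail, so no large string is ever re-sliced.

```python
from collections import deque

class OutputBuffer:
    """Deque-based output buffer: cheap appends, and sending
    re-slices at most one chunk per call."""

    def __init__(self):
        self.chunks = deque()

    def write(self, data):
        self.chunks.append(data)

    def __len__(self):
        return sum(len(c) for c in self.chunks)

    def send_some(self, send):
        """Send the oldest chunk through `send` (e.g. sock.send);
        return the number of bytes actually sent."""
        if not self.chunks:
            return 0
        chunk = self.chunks.popleft()
        sent = send(chunk)
        if sent < len(chunk):
            # push the unsent remainder back to the front
            self.chunks.appendleft(chunk[sent:])
        return sent
```

The `send` callable is whatever partial-write primitive the event loop offers; a stub that accepts a few bytes per call is enough to see the chunking behaviour.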
Truly asynchronous version. I've got a version that I know works for GET requests (no connection pipelining; there is something hokey with clients that request it). It handles all reads and writes to sockets using asyncore.loop(), and on standard GET requests it doesn't read entire files from disk, allowing one to serve unlimited-sized files without worrying about memory consumption.
I ran my version in my personal projects folder, which contains around a gig of files, of sizes ranging from a few kilobytes, to 100 megabytes. I then performed two concurrent "wget -m http://localhost:8081" calls (in different paths) on Windows 2k (machine is a P4 2.8ghz with hyperthreading, 1.5 gigs memory...).
On small files, it was able to serve up 15-30 files/second. On large files (the 10+ meg files), it was able to serve up at 15+ megs/second (summing the speeds reported by wget). The server never broke 7 megs of resident memory, and tended to stay below 10% processor utilization.
The fully async version is available here:
http://www.ics.uci.edu/~jcarlson/SimpleAsyncHTTPServer.py
Updated version. Josiah Carlson's more complete recipe gives me the opportunity to update this one. This is the version used in the current release of Karrigell (the link provided in my previous comment is broken).
Two things: I can't get POST working, and there are unused modules imported.
Is there a new version?
How to add "Up to higher level directory". Hi, how do I add an "Up to higher level directory" link in the subfolders?