ActiveState Code

Recipe 101276: Check web page exists


For when you need to check a web page is still working.

Python
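from httplib import HTTP    # Python 2 only (the module is http.client in Python 3)
from urlparse import urlparse

def checkURL(url):
    """Return 1 if a HEAD request for url gets a 200 response, else 0."""
    p = urlparse(url)
    h = HTTP(p[1])                       # p[1] is the host part of the URL
    h.putrequest('HEAD', p[2] or '/')    # p[2] is the path; use '/' when it is empty
    h.putheader('Host', p[1])            # the HTTP/1.0 class does not add Host itself
    h.endheaders()
    if h.getreply()[0] == 200: return 1
    else: return 0

if __name__ == '__main__':
    # These checks need network access.
    assert checkURL('http://slashdot.org')
    assert not checkURL('http://slashdot.org/notadirectory')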

Discussion

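This check also catches things like redirects, because any status other than 200 (a 301 or 302, for example) counts as a failure. It sends a HEAD request, which means the entire document isn't downloaded; the server returns only the headers, so the page is just checked for existence. If the HTTP status code is not 200, checkURL returns 0.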
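The httplib and urlparse modules exist only in Python 2. As a rough sketch of the same HEAD check for Python 3, assuming the standard-library http.client and urllib.parse modules, something like the following may work. The name check_url, the timeout parameter, the HTTPS handling, and the choice to return False on connection errors are illustrative assumptions, not part of the original recipe.

```python
from http.client import HTTPConnection, HTTPSConnection, HTTPException
from urllib.parse import urlsplit


def check_url(url, timeout=10):
    """Return True if a HEAD request for url is answered with status 200.

    Hypothetical Python 3 counterpart of checkURL; the timeout default
    and the error handling are assumptions made for this sketch.
    """
    parts = urlsplit(url)
    if not parts.netloc:
        return False  # no host to connect to, e.g. a bare word

    # Pick the connection class from the URL scheme.
    conn_class = HTTPSConnection if parts.scheme == "https" else HTTPConnection

    # An empty path is not a valid request target, so fall back to "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    try:
        conn = conn_class(parts.netloc, timeout=timeout)
        try:
            conn.request("HEAD", target)  # headers only, no document body
            return conn.getresponse().status == 200
        finally:
            conn.close()
    except (OSError, HTTPException):
        # DNS failure, refused connection, timeout, TLS or protocol error.
        return False
```

As in the recipe, a redirect counts as a failure here, since only a 200 status returns True. For example, check_url('http://slashdot.org/notadirectory') should return False if the server answers with a 404.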
