Most viewed recipes tagged "web" (ActiveState Code Recipes, 2014-03-08T17:34:38-08:00) http://code.activestate.com/recipes/tags/web/views/
Convert string to hex (Python)
2006-08-18T05:17:53-07:00 Mykola Kharechko http://code.activestate.com/recipes/users/2950701/ http://code.activestate.com/recipes/496969-convert-string-to-hex/
<p style="color: grey">
Python
recipe 496969
by <a href="/recipes/users/2950701/">Mykola Kharechko</a>
(<a href="/recipes/tags/web/">web</a>).
</p>
<p>Quote a string by converting each character to its hex representation, and back again.</p>
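<p>The idea can be sketched in a few lines with the standard binascii module (an illustration, not the recipe's own code):</p>

```python
import binascii

def to_hex(s):
    """Quote a string by converting each byte to its two-digit hex form."""
    return binascii.hexlify(s.encode("utf-8")).decode("ascii")

def from_hex(h):
    """Reverse the quoting: parse pairs of hex digits back into characters."""
    return binascii.unhexlify(h).decode("utf-8")
```

<p>For example, to_hex("abc") gives "616263", and from_hex turns it back into "abc".</p>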
Http client to POST using multipart/form-data (Python)
2002-08-23T07:56:39-07:00 Wade Leftwich http://code.activestate.com/recipes/users/98656/ http://code.activestate.com/recipes/146306-http-client-to-post-using-multipartform-data/
<p style="color: grey">
Python
recipe 146306
by <a href="/recipes/users/98656/">Wade Leftwich</a>
(<a href="/recipes/tags/web/">web</a>).
</p>
<p>A scripted web client that will post data to a site as if from a form using ENCTYPE="multipart/form-data". This is typically used to upload files, but also gets around a server's (e.g. ASP's) limitation on the amount of data that can be accepted via a standard POST (application/x-www-form-urlencoded).</p>
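<p>The core of such a client is assembling the multipart/form-data body by hand. A minimal sketch (my own illustration, not the recipe's code): each field and file becomes a section separated by a boundary string that must not occur in the data.</p>

```python
import uuid

def encode_multipart(fields, files):
    """Build a multipart/form-data request body by hand.

    fields: dict of name -> string value.
    files:  dict of name -> (filename, bytes).
    Returns (content_type, body_bytes) ready to POST."""
    boundary = uuid.uuid4().hex  # random, so it won't collide with the data
    lines = []
    for name, value in fields.items():
        lines += [b"--" + boundary.encode(),
                  b'Content-Disposition: form-data; name="%s"' % name.encode(),
                  b"", value.encode()]
    for name, (filename, data) in files.items():
        lines += [b"--" + boundary.encode(),
                  b'Content-Disposition: form-data; name="%s"; filename="%s"'
                  % (name.encode(), filename.encode()),
                  b"Content-Type: application/octet-stream",
                  b"", data]
    lines += [b"--" + boundary.encode() + b"--", b""]  # closing boundary
    body = b"\r\n".join(lines)
    return "multipart/form-data; boundary=%s" % boundary, body
```

<p>The returned content type goes in the Content-Type header, and the body is sent as-is; this is the same wire format whether the server side is ASP, CGI, or anything else.</p>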
Simple HTTP server supporting SSL secure communications (Python)
2008-08-02T16:04:56-07:00 Sebastien Martini http://code.activestate.com/recipes/users/2637141/ http://code.activestate.com/recipes/442473-simple-http-server-supporting-ssl-secure-communica/
<p style="color: grey">
Python
recipe 442473
by <a href="/recipes/users/2637141/">Sebastien Martini</a>
(<a href="/recipes/tags/https/">https</a>, <a href="/recipes/tags/openssl/">openssl</a>, <a href="/recipes/tags/ssl/">ssl</a>, <a href="/recipes/tags/web/">web</a>).
Revision 8.
</p>
<p>This recipe describes how to set up a simple HTTP server supporting SSL secure communications. It extends the SimpleHTTPServer standard module to support the SSL protocol. With this recipe, only the server is authenticated while the client remains unauthenticated (i.e. the server will not request a client certificate). Thus, the client (typically the browser) will be able to verify the server identity and secure its communications with the server.</p>
<p>This recipe requires that you already know the basics of SSL and how to set up <a href="http://www.openssl.org">OpenSSL</a>. This recipe is mostly derived from the examples provided with the <a href="http://pyopenssl.sourceforge.net">pyOpenSSL</a> package.</p>
<h5>In order to apply this recipe, follow these few steps:</h5>
<ol>
<li>Install the OpenSSL package in order to generate key and certificate. Note: you probably already have this package installed if you are under Linux, or *BSD.</li>
<li>Install the pyOpenSSL package, a Python binding for the OpenSSL library. You'll need to import this module to access OpenSSL's components.</li>
<li>Generate a self-signed certificate, consisting of a certificate and a private key, for your server with the following command (it outputs both into a single file named server.pem):
<code>openssl req -new -x509 -keyout server.pem -out server.pem -days 365 -nodes</code></li>
<li>Assuming you saved this recipe in SimpleSecureHTTPServer.py, start the server (with the appropriate rights):
<code>python SimpleSecureHTTPServer.py</code></li>
<li>Finally, browse to <a href="https://localhost">https://localhost</a>, or <a href="https://localhost:port" rel="nofollow">https://localhost:port</a> if your server listens on a port other than 443.</li>
</ol>
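<p>The steps above can also be sketched with the modern standard-library ssl module instead of pyOpenSSL (an alternative to the recipe's own approach, assuming the combined key-and-certificate server.pem from step 3):</p>

```python
import http.server
import ssl

def make_context(certfile=None):
    """TLS context for one-way authentication: the server presents its
    certificate, but never requests one from the client (verify_mode
    stays CERT_NONE), matching the recipe's behaviour."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    if certfile:
        ctx.load_cert_chain(certfile)  # server.pem holds key + certificate
    return ctx

def serve(certfile="server.pem", port=443):
    """Wrap SimpleHTTPRequestHandler's listening socket in TLS."""
    httpd = http.server.HTTPServer(
        ("", port), http.server.SimpleHTTPRequestHandler)
    httpd.socket = make_context(certfile).wrap_socket(
        httpd.socket, server_side=True)
    httpd.serve_forever()

# serve()  # then browse to https://localhost/
```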
Minimal http upload cgi (Python)
2004-03-20T01:47:04-08:00 Noah Spurrier http://code.activestate.com/recipes/users/103276/ http://code.activestate.com/recipes/273844-minimal-http-upload-cgi/
<p style="color: grey">
Python
recipe 273844
by <a href="/recipes/users/103276/">Noah Spurrier</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 3.
</p>
<p>This is a bare-bones CGI file upload. It will display an upload form and save the uploaded files to disk.</p>
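<p>The heart of such a script is parsing the multipart/form-data body of the POST. A sketch using the standard email parser (the recipe itself is a Python 2-era CGI script, so this is a modern illustration of the same step, not its code):</p>

```python
from email.parser import BytesParser
from email.policy import default

def extract_uploads(content_type, body):
    """Parse a multipart/form-data body and return {filename: bytes}
    for every file field, as an upload CGI would before saving to disk.

    content_type: the request's Content-Type header (carries the boundary).
    body: the raw request body bytes."""
    # Prepend the header so the email parser sees a well-formed MIME message.
    msg = BytesParser(policy=default).parsebytes(
        b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body)
    uploads = {}
    for part in msg.iter_parts():
        if part.get_filename():  # only file fields carry a filename
            uploads[part.get_filename()] = part.get_payload(decode=True)
    return uploads
```

<p>A real CGI script would read content_type from the CGI environment and body from stdin, then write each value to disk.</p>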
Simple Web Crawler (Python)
2011-01-31T21:57:58-08:00 James Mills http://code.activestate.com/recipes/users/4167757/ http://code.activestate.com/recipes/576551-simple-web-crawler/
<p style="color: grey">
Python
recipe 576551
by <a href="/recipes/users/4167757/">James Mills</a>
(<a href="/recipes/tags/crawler/">crawler</a>, <a href="/recipes/tags/network/">network</a>, <a href="/recipes/tags/parsing/">parsing</a>, <a href="/recipes/tags/web/">web</a>).
Revision 2.
</p>
<p>NOTE: This recipe has been updated with suggested improvements since the last revision.</p>
<p>This is a simple web crawler I wrote to test websites and links. It will traverse all links found to any given depth.</p>
<p>See --help for usage.</p>
<p>I'm posting this recipe because this kind of problem has been asked about on the Python Mailing List a number of times... I thought I'd share my simple little implementation based on the standard library and BeautifulSoup.</p>
<p>--JamesMills</p>
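<p>The link-extraction step at the heart of such a crawler can be sketched with the standard library alone (the recipe itself uses BeautifulSoup; this illustration substitutes html.parser). Crawling to a given depth is then a matter of repeating this over each fetched page.</p>

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkParser(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""
    def __init__(self, base):
        super().__init__()
        self.base = base
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Relative links become absolute, ready to enqueue.
                    self.links.append(urljoin(self.base, value))

def extract_links(base_url, html):
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links
```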
HTTP basic authentication (Python)
2004-10-05T14:05:47-07:00 Michael Foord http://code.activestate.com/recipes/users/1565518/ http://code.activestate.com/recipes/305288-http-basic-authentication/
<p style="color: grey">
Python
recipe 305288
by <a href="/recipes/users/1565518/">Michael Foord</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 3.
</p>
<p>A script demonstrating how to manually do basic authentication over http.</p>
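<p>Doing basic authentication manually amounts to base64-encoding "user:password" and sending it in the Authorization header. A sketch (not the recipe's code):</p>

```python
import base64

def basic_auth_header(user, password):
    """Build the value for an 'Authorization' request header by hand:
    the scheme name 'Basic' plus base64 of "user:password"."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    return "Basic " + token.decode("ascii")
```

<p>For instance, the RFC 2617 example credentials "Aladdin" / "open sesame" produce "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==".</p>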
A simple XML-RPC server (Python)
2001-10-13T11:34:19-07:00 Brian Quinlan http://code.activestate.com/recipes/users/118989/ http://code.activestate.com/recipes/81549-a-simple-xml-rpc-server/
<p style="color: grey">
Python
recipe 81549
by <a href="/recipes/users/118989/">Brian Quinlan</a>
(<a href="/recipes/tags/web/">web</a>).
</p>
<p>This recipe demonstrates the creation of a simple XML-RPC server using the SimpleXMLRPCServer class. It requires either Python 2.2 or later or the XML-RPC package from PythonWare (<a href="http://www.pythonware.com/products/xmlrpc/index.htm" rel="nofollow">http://www.pythonware.com/products/xmlrpc/index.htm</a>) to run.</p>
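<p>A minimal server in the spirit of the recipe, using the Python 3 module name xmlrpc.server (the recipe predates it; the exposed functions here are my own illustration):</p>

```python
from xmlrpc.server import SimpleXMLRPCServer

def make_server(port=8000):
    """Create an XML-RPC server exposing a couple of functions."""
    server = SimpleXMLRPCServer(("localhost", port), logRequests=False)
    server.register_function(pow)                         # exposed as "pow"
    server.register_function(lambda a, b: a + b, "add")   # exposed as "add"
    return server

# make_server().serve_forever()
# Clients then call the functions via xmlrpc.client.ServerProxy.
```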
HTMLTags - generate HTML in Python (Python)
2009-10-24T10:30:38-07:00 Pierre Quentel http://code.activestate.com/recipes/users/1552957/ http://code.activestate.com/recipes/366000-htmltags-generate-html-in-python/
<p style="color: grey">
Python
recipe 366000
by <a href="/recipes/users/1552957/">Pierre Quentel</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 11.
</p>
<p>The HTMLTags module defines a class for each valid HTML tag, written in uppercase letters. To create a piece of HTML, the general syntax is:</p>
<pre class="prettyprint"><code>t = TAG(innerHTML, key1=val1,key2=val2,...)
</code></pre>
<p>so that "print t" results in:</p>
<pre class="prettyprint"><code><TAG key1="val1" key2="val2" ...>innerHTML</TAG>
</code></pre>
<p>For instance:</p>
<pre class="prettyprint"><code>print A('bar', href="foo") ==> <A href="foo">bar</A>
</code></pre>
E-mail Address Validation (Python)
2001-07-27T13:37:26-07:00 Mark Nenadov http://code.activestate.com/recipes/users/114221/ http://code.activestate.com/recipes/65215-e-mail-address-validation/
<p style="color: grey">
Python
recipe 65215
by <a href="/recipes/users/114221/">Mark Nenadov</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 5.
</p>
<p>This function simply validates an e-mail address. Ignore this recipe and go to my "StringValidator" recipe, which is a much better solution.</p>
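<p>A sketch of what such a validator typically checks (the recipe's exact rules aren't reproduced here, so this pattern is an assumption of mine, and deliberately looser than the full RFC 5322 address grammar):</p>

```python
import re

# One '@', no whitespace, and at least one '.' in the domain part.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def is_valid_email(address):
    """Cheap plausibility check, not a guarantee the mailbox exists."""
    return EMAIL_RE.match(address) is not None
```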
SSL Client Authentication over HTTPS (Python)
2002-02-28T13:43:27-08:00 Rob Riggs http://code.activestate.com/recipes/users/217820/ http://code.activestate.com/recipes/117004-ssl-client-authentication-over-https/
<p style="color: grey">
Python
recipe 117004
by <a href="/recipes/users/217820/">Rob Riggs</a>
(<a href="/recipes/tags/web/">web</a>).
</p>
<p>A 16-line python application that demonstrates SSL client authentication over HTTPS. We also explain the basics of how to set up Apache to require SSL client authentication. This assumes at least Python-2.2 compiled with SSL support, and Apache with mod_ssl.</p>
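<p>On the client side, presenting a certificate boils down to loading it into the TLS context. A sketch with the modern ssl and http.client modules (the recipe targets Python 2.2's httplib, so this is an updated illustration; client.pem is a hypothetical combined certificate-and-key file):</p>

```python
import ssl
import http.client

def client_context(certfile=None, keyfile=None):
    """TLS context that can present a client certificate. Whether one is
    actually required is decided by the server (e.g. Apache's mod_ssl
    'SSLVerifyClient require' directive)."""
    ctx = ssl.create_default_context()  # verifies the server, as usual
    if certfile:
        ctx.load_cert_chain(certfile, keyfile)  # our cert for the handshake
    return ctx

# conn = http.client.HTTPSConnection("example.com",
#                                    context=client_context("client.pem"))
# conn.request("GET", "/")
# print(conn.getresponse().status)
```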
Calculating the distance between zip codes (Python)
2006-04-25T20:40:00-07:00 Kevin Ryan http://code.activestate.com/recipes/users/1654599/ http://code.activestate.com/recipes/393241-calculating-the-distance-between-zip-codes/
<p style="color: grey">
Python
recipe 393241
by <a href="/recipes/users/1654599/">Kevin Ryan</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 2.
</p>
<p>I came across the mention of a formula labeled "The Great Circle Distance Formula" that purported to calculate the distance between any two points on the earth given their longitude and latitude points (the reference was in a Linux Magazine article). So, I looked up some information and cooked up a Python version of the calculation. There are references in the code where you can obtain approximate zip code data for free (e.g., if you wanted to enhance your website by adding a "Search within x mi's" feature), as well as references to the GCDF if you have further interest. Enjoy!</p>
<p>04/25/2006 update: I've decided to update this recipe with an object oriented bent where the information is cached once the object is instantiated. I've also added command line access to automatically download the zipcode file from the census website (just use 'python zips.py -d' and it will download a copy to your harddrive under 'zips.txt'). Lastly, I've added some unit testing so that if any future changes are made this will automatically run and tell me if anything pops out as wrong.</p>
Simple XML RPC server over HTTPS (Python)
2006-06-07T07:17:34-07:00 Laszlo Nagy http://code.activestate.com/recipes/users/2914829/ http://code.activestate.com/recipes/496786-simple-xml-rpc-server-over-https/
<p style="color: grey">
Python
recipe 496786
by <a href="/recipes/users/2914829/">Laszlo Nagy</a>
(<a href="/recipes/tags/web/">web</a>).
</p>
<p>A simple program that demonstrates how to write an XML-RPC server that uses HTTPS for transporting XML data.</p>
cookielib Example (Python)
2004-12-28T11:26:41-08:00 Michael Foord http://code.activestate.com/recipes/users/1565518/ http://code.activestate.com/recipes/302930-cookielib-example/
<p style="color: grey">
Python
recipe 302930
by <a href="/recipes/users/1565518/">Michael Foord</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 2.
</p>
<p>cookielib is a library new in Python 2.4. Prior to Python 2.4 it existed as ClientCookie, but it's not a drop-in replacement: some of the functionality of ClientCookie has been moved into urllib2.</p>
<p>This example shows code for fetching URIs (with cookie handling, including loading and saving) that will work unchanged on a machine with Python 2.4 (and cookielib), a machine with ClientCookie installed, or a machine with neither. (Obviously, on the machine with neither, the cookies won't be handled or saved.)</p>
<p>Where either cookielib or ClientCookie is available, the cookies will be saved in a file. If that file already exists, the cookies will first be loaded from it. The file format is a useful plain-text format, and the attributes of each cookie are accessible in the CookieJar instance (once loaded).</p>
<p>This may be helpful to those just using ClientCookie as the ClientCookie documentation doesn't appear to document the LWPCookieJar class which is needed for saving and loading cookies.</p>
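<p>The load-use-save pattern the recipe describes looks like this in Python 3 terms, where cookielib became http.cookiejar (the cookie file name is illustrative):</p>

```python
import http.cookiejar  # cookielib's Python 3 name
import urllib.request
import os

def make_opener(cookiefile="cookies.lwp"):
    """urllib opener whose requests share a cookie jar; the jar is loaded
    from disk when the file already exists, and can be saved after use.
    LWPCookieJar writes a readable plain-text format."""
    jar = http.cookiejar.LWPCookieJar(cookiefile)
    if os.path.exists(cookiefile):
        jar.load(ignore_discard=True)
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar

# opener, jar = make_opener()
# opener.open("http://example.com/")   # cookies handled automatically
# jar.save(ignore_discard=True)        # now inspectable in cookies.lwp
```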
Python FTP Client (Python)
2007-06-21T12:13:54-07:00 N S http://code.activestate.com/recipes/users/4040791/ http://code.activestate.com/recipes/521925-python-ftp-client/
<p style="color: grey">
Python
recipe 521925
by <a href="/recipes/users/4040791/">N S</a>
(<a href="/recipes/tags/web/">web</a>).
</p>
<p>This is a lightweight FTP client. I find it useful for my purposes. You may notice some weird code, but I assure you, it is legitimate. Python was being stubborn, so I had to circumvent some of the rules.</p>
Simple HTTP server based on asyncore/asynchat (Python)
2005-10-16T08:26:59-07:00 Pierre Quentel http://code.activestate.com/recipes/users/1552957/ http://code.activestate.com/recipes/259148-simple-http-server-based-on-asyncoreasynchat/
<p style="color: grey">
Python
recipe 259148
by <a href="/recipes/users/1552957/">Pierre Quentel</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 8.
</p>
<p>A simple HTTP server, intended to be as simple as the standard module SimpleHTTPServer, built upon the asyncore/asynchat modules (it uses non-blocking sockets). Provides a Server class (copied from the medusa http_server) and a RequestHandler class. RequestHandler handles both GET and POST methods and inherits from SimpleHTTPServer.SimpleHTTPRequestHandler.</p>
<p>It can easily be extended by overriding the handle_data() method in the RequestHandler class.</p>
My first application server (Python)
2009-02-23T11:53:57-08:00 Pierre Quentel http://code.activestate.com/recipes/users/1552957/ http://code.activestate.com/recipes/392879-my-first-application-server/
<p style="color: grey">
Python
recipe 392879
by <a href="/recipes/users/1552957/">Pierre Quentel</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 8.
</p>
<p>For those who want to start dynamic web programming, but don't know what to choose among the many Python web frameworks, this program might be a good starting point.</p>
<p>ScriptServer is a minimalist application server, handling both GET and POST requests, including multipart/form-data for file uploads and HTTP redirections, with in-memory session management. It can run Python scripts and template files using the standard string substitution format.</p>
<p>The scripts run in the same process as the server, avoiding the CGI overhead. The environment variables are provided in the namespace where the script runs.</p>
<p>To start the server, run </p>
<pre class="prettyprint"><code>python ScriptServer.py
</code></pre>
<p>In your web browser, enter <a href="http://localhost" rel="nofollow">http://localhost</a>; this will show you a listing of the directory. Put your scripts in the same directory as ScriptServer.</p>
Simple AJAX with javascript JSON parser (Python)
2005-10-14T06:58:34-07:00 Wensheng Wang http://code.activestate.com/recipes/users/1513433/ http://code.activestate.com/recipes/440637-simple-ajax-with-javascript-json-parser/
<p style="color: grey">
Python
recipe 440637
by <a href="/recipes/users/1513433/">Wensheng Wang</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 2.
</p>
<p>This JSON parser works well with a stringified Python list or dictionary. It is taken from the <a href="http://json.org" rel="nofollow">json.org</a> JavaScript JSON parser, with small modifications.</p>
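<p>On the Python side of such an AJAX exchange, the standard json module (added to the library after this recipe was written) now does the serialization; this sketch is a modern note rather than the recipe's code:</p>

```python
import json

def ajax_response(data):
    """Serialize a Python list/dict for a browser-side JSON parser.
    json.dumps emits double-quoted strings, which repr() of a Python
    dict does not, so it is safer than sending str(data)."""
    return json.dumps(data)
```

<p>For example, ajax_response({"a": [1, 2]}) yields the string '{"a": [1, 2]}', which any compliant JavaScript JSON parser will accept.</p>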
HTML Scraper (Python)
2004-09-06T08:18:49-07:00 Michael Foord http://code.activestate.com/recipes/users/1565518/ http://code.activestate.com/recipes/286269-html-scraper/
<p style="color: grey">
Python
recipe 286269
by <a href="/recipes/users/1565518/">Michael Foord</a>
(<a href="/recipes/tags/web/">web</a>).
Revision 5.
</p>
<p>A simple HTML 'parser' that will 'read' through an HTML file and call functions on data and tags etc.
Useful if you need to implement a straightforward parser that just extracts information from the file <em>or</em> modifies tags etc.</p>
<p>Shouldn't choke on bad HTML.</p>
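<p>The approach can be sketched with the standard HTMLParser: subclass it, and the parser calls your methods on tags and data as it reads through the file (an illustration of the technique, not the recipe's own class):</p>

```python
from html.parser import HTMLParser

class Scraper(HTMLParser):
    """Walk an HTML document, recording tags and collecting text."""
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)          # called once per opening tag

    def handle_data(self, data):
        if data.strip():               # skip whitespace-only runs
            self.text.append(data.strip())

def scrape(html):
    s = Scraper()
    s.feed(html)  # html.parser is lenient and shouldn't choke on bad HTML
    return s.tags, s.text
```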
A Simple Webcrawler (Python)
2012-03-03T02:37:30-08:00 John http://code.activestate.com/recipes/users/4181142/ http://code.activestate.com/recipes/578060-a-simple-webcrawler/
<p style="color: grey">
Python
recipe 578060
by <a href="/recipes/users/4181142/">John</a>
(<a href="/recipes/tags/crawler/">crawler</a>, <a href="/recipes/tags/html/">html</a>, <a href="/recipes/tags/page/">page</a>, <a href="/recipes/tags/parser/">parser</a>, <a href="/recipes/tags/scraping/">scraping</a>, <a href="/recipes/tags/urllib/">urllib</a>, <a href="/recipes/tags/urlopen/">urlopen</a>, <a href="/recipes/tags/web/">web</a>).
</p>
<p>This is my simple web crawler. It takes as input a list of seed pages (web urls) and 'scrapes' each page of all its absolute path links (i.e. links in the format <a href="http://" rel="nofollow">http://</a>) and adds those to a dictionary. The web crawler can take all the links found in the seed pages and then scrape those as well. You can continue scraping as deep as you like. You can control how "deep you go" by specifying the depth variable passed into the WebCrawler class function start_crawling(seed_pages,depth). Think of the depth as the recursion depth (or the number of web pages deep you go before returning back up the tree).</p>
<p>To make this web crawler a little more interesting I added some bells and whistles. I added the ability to pass into the WebCrawler class constructor a regular expression object. The regular expression object is used to "filter" the links found during scraping. For example, in the code below you will see:</p>
<pre class="prettyprint"><code>cnn_url_regex = re.compile('(?<=[.]cnn)[.]com') # cnn_url_regex is a regular expression object
w = WebCrawler(cnn_url_regex)
</code></pre>
<p>This particular regular expression says:</p>
<p>1) Find the first occurrence of the string '.com'</p>
<p>2) Then, looking backwards from where '.com' was found, it attempts to find '.cnn'</p>
<p>Why do this?</p>
<p>You can control where the crawler crawls. In this case I am constraining the crawler to operate on webpages within cnn.com.</p>
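<p>The filtering behaviour of that regular expression can be checked directly:</p>

```python
import re

# The recipe's filter: '[.]com' matches only when immediately preceded by
# '.cnn' ('(?<=...)' is a lookbehind assertion; it consumes no characters).
cnn_url_regex = re.compile('(?<=[.]cnn)[.]com')

assert cnn_url_regex.search('http://www.cnn.com/world/')    # kept
assert not cnn_url_regex.search('http://www.bbc.com/news/') # rejected
```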
<p>Another feature I added was the ability to parse a given page looking for specific HTML tags. I chose the <h1> tag as an example. Once an <h1> tag is found, I store all the words I find in the tag in a dictionary that gets associated with the page URL.</p>
<p>Why do this?</p>
<p>My thought was that if I scraped the page for text I could eventually use this data for a search engine request. Say I searched for 'Lebron James'. And suppose that one of the pages my crawler scraped found an article that mentions Lebron James many times. In response to a search request I could return the link with the Lebron James article in it.</p>
<p>The web crawler is described in the WebCrawler class. It has 2 functions the user should call:</p>
<p>1) start_crawling(seed_pages,depth)</p>
<p>2) print_all_page_text() # this is only used for debug purposes</p>
<p>The rest of WebCrawler's functions are internal functions that should not be called by the user (think private in C++).</p>
<p>Upon construction of a WebCrawler object, it creates a MyHTMLParser object. The MyHTMLParser class inherits from the built-in Python class HTMLParser. I use the MyHTMLParser object when searching for the <h1> tag. The MyHTMLParser class creates instances of a helper class named Tag. The Tag class is used to create a "linked list" of tags.</p>
<p>So to get started with WebCrawler make sure to use Python 2.7.2. Enter the code a piece at a time into IDLE in the order displayed below. This ensures that you import libs before you start using them.</p>
<p>Once you have entered all the code into IDLE, you can start crawling the 'interwebs' by entering the following:</p>
<pre class="prettyprint"><code>import re
cnn_url_regex = re.compile('(?<=[.]cnn)[.]com')
w = WebCrawler(cnn_url_regex)
w.start_crawling(['http://www.cnn.com/2012/02/24/world/americas/haiti-pm-resigns/index.html?hpt=hp_t3'],1)
</code></pre>
<p>Of course you can enter any page you want. But the regular expression object is already setup to filter on <a href="http://cnn.com" rel="nofollow">cnn.com</a>. Remember the second parameter passed into the start_crawling function is the recursion depth.</p>
<p>Happy Crawling!</p>
Composing a POSTable HTTP request with multipart/form-data Content-Type to simulate a form/file upload. (Python)
2014-03-08T17:34:38-08:00 István Pásztor http://code.activestate.com/recipes/users/4189380/ http://code.activestate.com/recipes/578846-composing-a-postable-http-request-with-multipartfo/
<p style="color: grey">
Python
recipe 578846
by <a href="/recipes/users/4189380/">István Pásztor</a>
(<a href="/recipes/tags/field/">field</a>, <a href="/recipes/tags/file/">file</a>, <a href="/recipes/tags/form/">form</a>, <a href="/recipes/tags/html/">html</a>, <a href="/recipes/tags/httpclient/">httpclient</a>, <a href="/recipes/tags/mime/">mime</a>, <a href="/recipes/tags/multipart/">multipart</a>, <a href="/recipes/tags/post/">post</a>, <a href="/recipes/tags/upload/">upload</a>, <a href="/recipes/tags/web/">web</a>).
Revision 5.
</p>
<p>This code is useful if you are using an HTTP client and you want to simulate a request similar to that of a browser submitting a form with several input fields (including file upload fields). I've used this with Python 2.x.</p>