
Use John J. Lee's ClientCookie and ClientForm modules to easily access password-protected web applications. A group on yahoo.com is used as an example.

Python, 29 lines
import sys
sys.path.append('ClientCookie-1.0.3')
import ClientCookie
sys.path.append('ClientForm-0.1.17')
import ClientForm

# Create special URL opener (for User-Agent) and cookieJar
cookieJar = ClientCookie.CookieJar()

opener = ClientCookie.build_opener(ClientCookie.HTTPCookieProcessor(cookieJar))
opener.addheaders = [("User-agent","Mozilla/5.0 (compatible)")]
ClientCookie.install_opener(opener)
fp = ClientCookie.urlopen("http://login.yahoo.com")
forms = ClientForm.ParseResponse(fp)
fp.close()

# print forms on this page
for form in forms: 
    print "***************************"
    print form

form = forms[0]
form["login"]  = "yahoo-user-id" # use your userid
form["passwd"] = "password"      # use your password
fp = ClientCookie.urlopen(form.click())
fp.close()
fp = ClientCookie.urlopen("http://groups.yahoo.com/group/mygroup") # use your group
lines = fp.readlines()  # page content, available only because we are logged in
fp.close()

Many web applications require the user to fill out a login form. This recipe shows a very easy way to do it in Python so that you can get data from the site for scraping purposes.

I simply establish a logged-in session, kept alive by cookies, with a site (groups.yahoo.com) that requires you to fill out a form. The recipe should be easily adaptable to other sites such as eBay or PayPal. The task is easy using John J. Lee's ClientCookie and ClientForm modules.
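To make the form-filling step more concrete, here is a minimal sketch of the idea using only the Python 3 standard library. It is not ClientForm and covers far less: it looks only at the `<input>` fields of the first form on a page. The class `FormFieldParser`, the function `build_login_request`, the sample HTML, and the example.com URL are all hypothetical names made up for illustration. No network request is sent.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class FormFieldParser(HTMLParser):
    """Collect the action URL and named <input> fields of the first form."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action", "")
        elif tag == "input" and self._in_form and attrs.get("name"):
            # Keep default values (e.g. hidden fields) so they are sent back.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_login_request(page_url, html, values):
    """Return a POST Request for the first form, with `values` filled in."""
    parser = FormFieldParser()
    parser.feed(html)
    fields = dict(parser.fields)
    fields.update(values)
    data = urlencode(fields).encode("ascii")
    return Request(urljoin(page_url, parser.action or ""), data=data)


# Hypothetical login page, used only for illustration.
SAMPLE_HTML = """
<form method="post" action="/login">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="login">
  <input type="password" name="passwd">
</form>
"""

request = build_login_request(
    "http://example.com/start",
    SAMPLE_HTML,
    {"login": "yahoo-user-id", "passwd": "password"},
)
```

The resulting `request` plays roughly the role of `form.click()` in the recipe: it is an object that an opener can submit. Sending the hidden fields back along with the user name and password matters, because many login forms reject submissions that omit them.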

I downloaded the modules from:
http://wwwsearch.sourceforge.net/ClientCookie/src/ClientCookie-1.0.3.tar.gz
http://wwwsearch.sourceforge.net/ClientForm/src/ClientForm-0.1.17.tar.gz

After untarring the tar.gz files, I used the Python code above to access my Yahoo account (Look, Ma! No browser!).

I am using Python version 2.3.4 on Fedora Core 3.
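On later Python versions, most of what ClientCookie does is available in the standard library (cookielib and urllib2 from Python 2.4, http.cookiejar and urllib.request in Python 3). The following is a rough Python 3 sketch of the opener setup from the recipe, under that assumption. The helper name `make_cookie_opener` is made up for illustration, and nothing is fetched when the code runs.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener(user_agent="Mozilla/5.0 (compatible)"):
    """Build an opener that stores and resends cookies across requests."""
    cookie_jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(cookie_jar)
    )
    # Same User-agent header as in the recipe.
    opener.addheaders = [("User-agent", user_agent)]
    return opener, cookie_jar


opener, cookie_jar = make_cookie_opener()
# Calls such as opener.open(url) would now share cookies through cookie_jar.
# No request is made here.
```

Because every request goes through the same opener, any session cookie set by the login response is sent again automatically on later requests, which is what keeps you logged in.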

Note that this kind of form-based authentication is nothing like HTTP basic authentication. Therefore, you can't simply put the username and password in the URL as in:

http://username:password@login.yahoo.com # This does not work.

Refer to Mike Foord's recipe at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/305288 to find out how to access sites that use HTTP basic authentication.
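For contrast, here is a minimal Python 3 sketch of what HTTP basic authentication sends. It is not the code from the recipe referenced above. The helper `basic_auth_header`, the example.com URL, and the credentials are hypothetical and used only for illustration. No request is sent.

```python
import base64
import urllib.request


def basic_auth_header(username, password):
    """Return the Authorization header value used by HTTP basic auth."""
    credentials = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Hypothetical URL and credentials, for illustration only.
request = urllib.request.Request("http://example.com/protected")
request.add_header("Authorization", basic_auth_header("username", "password"))
```

With basic authentication the credentials travel in a request header on every request. A login form instead expects them as POST data once and then tracks the session with cookies, which is why the recipe needs a cookie jar.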

All kudos to John J. Lee

3 comments

George Geller (author) 16 years, 8 months ago

SSL support is required. This recipe requires Python's socket library to be compiled with SSL support. See http://docs.python.org/lib/module-httplib.html. SSL support is already included in the Python install on Fedora Core 3 (and, I am guessing, in other Linux installs as well). Based on a report by a Python user on Windows, at least one version of Python for Windows does not have SSL support.

Without SSL support you get an exception that looks, in part, like:

File "C:\Python24\lib\urllib2.py", line 1053, in unknown_open

raise URLError('unknown url type: %s' % type)

URLError: urlopen error unknown url type: https

George
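A quick way to check for the SSL problem described in the comment above, sketched for Python 3 (the recipe itself targets Python 2, where the details differ). The function name `https_supported` is made up for illustration. The check relies on urllib.request defining `HTTPSHandler` only when HTTPS connections are available.

```python
import urllib.request


def https_supported():
    """Return True if this Python build can open https:// URLs."""
    try:
        import ssl  # noqa: F401  (missing when Python is built without SSL)
    except ImportError:
        return False
    return hasattr(urllib.request, "HTTPSHandler")


print("HTTPS support available:", https_supported())
```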

Lorne Walker 12 years, 9 months ago

This worked like a charm for logging in to Yahoo! with Python. Thanks!

rcmichelle 11 years, 2 months ago

i like to use windows password software: http://password-genius.com/