Most viewed recipes tagged "htmlparser"
http://code.activestate.com/recipes/tags/htmlparser/views/
2013-12-14T00:28:36-08:00
ActiveState Code Recipes
Simple Web Crawler (Python)
2009-08-18T13:21:49-07:00
manuelaraoz
http://code.activestate.com/recipes/users/4171484/
http://code.activestate.com/recipes/576884-simple-web-crawler/
<p style="color: grey">
Python
recipe 576884
by <a href="/recipes/users/4171484/">manuelaraoz</a>
(<a href="/recipes/tags/crawler/">crawler</a>, <a href="/recipes/tags/htmlparser/">htmlparser</a>, <a href="/recipes/tags/urllib2/">urllib2</a>).
</p>
<p>A simple class that starts at a URL and follows links to a desired depth.</p>
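The recipe's own code is not reproduced in this listing, so the following is only a minimal sketch of the idea of a depth-limited crawl, not the author's implementation. It assumes Python 3 (`urllib.request` rather than the `urllib2` module the recipe is tagged with), and the names `LinkExtractor`, `fetch_html`, and `crawl` are hypothetical. The `fetch` parameter is an assumption added so the crawl logic can be exercised without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher (hypothetical helper): download and decode a page."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_depth, fetch=fetch_html):
    """Breadth-first crawl; returns {url: depth} for every URL discovered."""
    seen = {start_url: 0}
    frontier = [start_url]
    for depth in range(max_depth):
        next_frontier = []
        for url in frontier:
            try:
                html = fetch(url)
            except Exception:
                continue  # skip pages that cannot be fetched
            parser = LinkExtractor()
            parser.feed(html)
            for href in parser.links:
                absolute = urljoin(url, href)  # resolve relative links
                if absolute not in seen:
                    seen[absolute] = depth + 1
                    next_frontier.append(absolute)
        frontier = next_frontier
    return seen
```

Calling `crawl("http://example.com/", 2)` would fetch the start page and the pages it links to, and record (without fetching) the links found on those pages. A production crawler would also need politeness delays and `robots.txt` handling, which this sketch omits.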
Pretty and Stated HTMLParsers (Python)
2013-12-14T00:28:36-08:00
Ádám Szieberth
http://code.activestate.com/recipes/users/4188745/
http://code.activestate.com/recipes/578787-pretty-and-stated-htmlparsers/
<p style="color: grey">
Python
recipe 578787
by <a href="/recipes/users/4188745/">Ádám Szieberth</a>
(<a href="/recipes/tags/html/">html</a>, <a href="/recipes/tags/htmlparser/">htmlparser</a>, <a href="/recipes/tags/state/">state</a>).
Revision 2.
</p>
<p>Extensions of html.parser.HTMLParser().</p>
<p>PrettyHTMLParser() does not split data into chunks at HTML entities.
StatedHTMLParser() can have many state-dependent handlers, which helps a lot when parsing HTML pages.</p>
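The recipe's classes are not shown in this listing, so the sketch below only illustrates the two behaviours described, using the standard `html.parser.HTMLParser`. `ChunkRecorder`, `record`, and `StateDispatchParser` are hypothetical names, not the recipe's API. The first part shows how text is split around entities when `convert_charrefs=False` and delivered whole when it is `True`; the second shows one possible way to dispatch to handlers based on a `state` attribute.

```python
from html.parser import HTMLParser


class ChunkRecorder(HTMLParser):
    """Records every text chunk and entity reference the parser reports."""

    def __init__(self, convert_charrefs):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Only called when convert_charrefs is False.
        self.events.append(("entity", name))


def record(html, convert_charrefs):
    """Parse html and return the list of recorded events."""
    parser = ChunkRecorder(convert_charrefs)
    parser.feed(html)
    parser.close()
    return parser.events


class StateDispatchParser(HTMLParser):
    """Hypothetical sketch: route text to a handler named after the state."""

    def __init__(self):
        super().__init__()
        self.state = "outside"
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.state = "heading"

    def handle_endtag(self, tag):
        if tag == "h1":
            self.state = "outside"

    def handle_data(self, data):
        # Look up a method such as data_in_heading; ignore unknown states.
        handler = getattr(self, "data_in_" + self.state, None)
        if handler is not None:
            handler(data)

    def data_in_heading(self, data):
        self.titles.append(data)
```

With `record("<p>Tom &amp; Jerry</p>", False)` the text arrives as two data chunks around an entity event, while passing `True` yields a single `"Tom & Jerry"` chunk. Naming handlers after states keeps each state's logic in its own method; the recipe may organise its handlers differently.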