Popular recipes tagged "crawler" but not "urllib"
http://code.activestate.com/recipes/tags/crawler-urllib/
2011-01-31T21:57:58-08:00
ActiveState Code Recipes

Simple Web Crawler (Python)
2009-08-18T13:21:49-07:00
manuelaraoz
http://code.activestate.com/recipes/users/4171484/
http://code.activestate.com/recipes/576884-simple-web-crawler/
<p style="color: grey"> Python recipe 576884 by <a href="/recipes/users/4171484/">manuelaraoz</a> (<a href="/recipes/tags/crawler/">crawler</a>, <a href="/recipes/tags/htmlparser/">htmlparser</a>, <a href="/recipes/tags/urllib2/">urllib2</a>). </p>
<p>A simple class that starts at a URL and follows links to a desired depth.</p>

Simple Web Crawler (Python)
2011-01-31T21:57:58-08:00
James Mills
http://code.activestate.com/recipes/users/4167757/
http://code.activestate.com/recipes/576551-simple-web-crawler/
<p style="color: grey"> Python recipe 576551 by <a href="/recipes/users/4167757/">James Mills</a> (<a href="/recipes/tags/crawler/">crawler</a>, <a href="/recipes/tags/network/">network</a>, <a href="/recipes/tags/parsing/">parsing</a>, <a href="/recipes/tags/web/">web</a>). Revision 2. </p>
<p>NOTE: This recipe has been updated with suggested improvements since the last revision.</p>
<p>This is a simple web crawler I wrote to test websites and links. It will traverse all links found, to any given depth.</p>
<p>See --help for usage.</p>
<p>I'm posting this recipe because this kind of question has been asked on the Python Mailing List a number of times, so I thought I'd share my simple little implementation based on the standard library and BeautifulSoup.</p>
<p>--JamesMills</p>
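
Both summaries describe the same idea: start at one URL and follow links down to a chosen depth. Neither recipe's code appears in this listing, so the following is only a minimal sketch of that idea and not either author's implementation. It assumes Python 3 and the standard library only (html.parser and urllib, rather than the urllib2 or BeautifulSoup the recipes mention). The names LinkExtractor, fetch_html, and crawl are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher (an assumption of this sketch): download and decode a page."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, max_depth, fetch=fetch_html):
    """Breadth-first crawl from start_url, following links up to max_depth hops.

    Returns a dict mapping each discovered URL to the depth at which it was
    first seen. URLs found at max_depth are recorded but not fetched.
    """
    seen = {start_url: 0}
    frontier = [start_url]
    for depth in range(max_depth):
        next_frontier = []
        for url in frontier:
            try:
                html = fetch(url)
            except Exception:
                continue  # skip pages that cannot be fetched
            parser = LinkExtractor()
            parser.feed(html)
            for href in parser.links:
                # Resolve relative links and drop any #fragment part.
                absolute, _fragment = urldefrag(urljoin(url, href))
                if absolute not in seen:
                    seen[absolute] = depth + 1
                    next_frontier.append(absolute)
        frontier = next_frontier
    return seen
```

Calling crawl("http://example.com/", 2) would fetch the start page and every page it links to, and record the links found on those pages without fetching them. The fetch parameter accepts any callable that takes a URL and returns HTML text, which makes it possible to exercise the traversal against an in-memory dict of pages instead of the network. A real crawler would likely also want to restrict itself to one host, respect robots.txt, and rate-limit requests, none of which this sketch attempts.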