Downloads a website's favicon and saves it to a file. If the website has no favicon, a default icon is used instead.

Python, 43 lines
import sys
import shutil
import urllib2
import urlparse
import lxml.html


HEADERS = {
    'User-Agent': 'urllib2 (Python %s)' % sys.version.split()[0],
    'Connection': 'close',
}


def get_favicon(url, path='favicon.ico', alt_icon_path='alticon.ico'):

    if not url.endswith('/'):
        url += '/'

    # First try the conventional location: <site root>/favicon.ico
    request = urllib2.Request(url + 'favicon.ico', headers=HEADERS)
    try:
        icon = urllib2.urlopen(request).read()
    except (urllib2.HTTPError, urllib2.URLError):
        # Fall back to the icon declared in the page's <link> tags
        request = urllib2.Request(url, headers=HEADERS)
        try:
            # 2048 bytes should be enough for most websites
            content = urllib2.urlopen(request).read(2048)
        except (urllib2.HTTPError, urllib2.URLError):
            shutil.copyfile(alt_icon_path, path)
            return
        icon_path = lxml.html.fromstring(content).xpath(
            '//link[@rel="icon" or @rel="shortcut icon"]/@href'
        )
        if not icon_path:
            # The page declares no icon: use the default one
            shutil.copyfile(alt_icon_path, path)
            return
        # The href may be relative or absolute; urljoin handles both
        icon_url = urlparse.urljoin(url, icon_path[0])
        request = urllib2.Request(icon_url, headers=HEADERS)
        try:
            icon = urllib2.urlopen(request).read()
        except (urllib2.HTTPError, urllib2.URLError):
            shutil.copyfile(alt_icon_path, path)
            return

    with open(path, 'wb') as icon_file:
        icon_file.write(icon)


if __name__ == '__main__':
    get_favicon('http://code.activestate.com', 'favicon.ico')
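
The fallback step of the recipe (find the icon declared in the page's <link> tags, then turn its href into a full URL) can be tried without any network access. The sketch below is an illustration, not part of the original recipe: it assumes Python 3, swaps lxml for the standard library's html.parser, and the names IconLinkParser and find_icon_url are hypothetical. It also matches rel by token, so it is slightly more permissive than the XPath expression above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collect href values of <link> tags whose rel contains the token "icon".

    Hypothetical helper for illustration; the recipe itself uses lxml.
    """

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != 'link':
            return
        attrs = dict(attrs)
        # rel may hold several tokens, e.g. "shortcut icon"
        rel_tokens = (attrs.get('rel') or '').lower().split()
        if 'icon' in rel_tokens and attrs.get('href'):
            self.hrefs.append(attrs['href'])


def find_icon_url(page_url, html):
    """Return the absolute URL of the first declared icon, or None."""
    parser = IconLinkParser()
    parser.feed(html)
    if not parser.hrefs:
        return None
    # urljoin resolves relative hrefs and leaves absolute ones untouched
    return urljoin(page_url, parser.hrefs[0])


if __name__ == '__main__':
    sample = '<html><head><link rel="shortcut icon" href="/static/fav.ico"></head></html>'
    print(find_icon_url('http://example.com/', sample))
```

Resolving the href with urljoin rather than plain string concatenation matters because sites declare icons with root-relative paths, page-relative paths, and full URLs on other hosts.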

1 comment

Jaime 7 years ago

Favicon retrieval is a fun coding problem to solve, but with all the redirect following and page loads, it usually takes 3-4 seconds to retrieve a favicon. If you need icons faster, http://grabicon.com is a free service that lets you specify icon size, and generates unique defaults for sites that don't have an icon. This provides a uniform user experience for web/mobile applications because icons are guaranteed to be the size you requested, and never missing.

It's this simple:

http://grabicon.com/icon?domain=wikipedia.org

Or specify size, if you don't want the default 32 pixel icon:

http://grabicon.com/icon?domain=wikipedia.org&size=16

There's also a jQuery snippet to automatically add icons to the external links on your web page.
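
The two URL forms shown in this comment could be built from Python roughly as follows. This is a minimal sketch in Python 3: the function name grabicon_url is hypothetical, and it assumes the service accepts only the domain and size query parameters exactly as the comment describes. It builds the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Endpoint taken from the URLs quoted in the comment above
GRABICON_ENDPOINT = 'http://grabicon.com/icon'


def grabicon_url(domain, size=None):
    """Build an icon URL for `domain`; `size` is optional (hypothetical helper)."""
    params = [('domain', domain)]
    if size is not None:
        # Omit the parameter entirely to get the service's default size
        params.append(('size', size))
    return GRABICON_ENDPOINT + '?' + urlencode(params)


if __name__ == '__main__':
    print(grabicon_url('wikipedia.org'))
    print(grabicon_url('wikipedia.org', size=16))
```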