
A Python program for downloading a YouTube video from the command line. Originally posted at: https://github.com/krishnasun82/youtap

Usage: python youtap.py "<youtube-link>"

The link goes in double quotes because it sometimes contains an '&' (ampersand), which a UNIX shell interprets as "run the preceding command in the background".

Python, 81 lines
#!/usr/bin/python

__author__ = ('Sundar Srinivasan')

import re
import sys
import urllib2

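def getVideoUrl(content):
    # Everything after 'fmt_url_map=' up to the end of that line.
    fmtre = re.search('(?<=fmt_url_map=).*', content)
    if fmtre is None:
        # Marker absent, e.g. because the page format has changed.
        return None
    grps = fmtre.group(0).split('&amp;')
    vurls = urllib2.unquote(grps[0])
    # Entries are separated by '|'; pick the one tagged itag=5,
    # which the main block saves as a .flv file.
    for vurl in vurls.split('|'):
        if vurl.find('itag=5') > 0:
            return vurl
    return None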

def getTitle(content):
    title = content.split('</title>', 1)[0].split('<title>')[1]
    return sanitizeTitle(title)

def sanitizeTitle(rawtitle):
    rawtitle = urllib2.unquote(rawtitle)
    lines = rawtitle.split('\n')
    title = ''
    for line in lines:
        san = unicode(re.sub('[^\w\s-]', '', line).strip())
        san = re.sub('[-\s]+', '_', san)
        title = title + san
    ffr = title[:4]
    title = title[5:].split(ffr, 1)[0]
    return title

def downloadVideo(f, resp):
    # Total size as reported by the server's Content-Length header.
    totalSize = int(resp.info().getheader('Content-Length').strip())
    currentSize = 0
    CHUNK_SIZE = 32768

    while True:
        # Read the response in fixed-size chunks rather than all at once.
        data = resp.read(CHUNK_SIZE)

        # An empty read means there is no more data to receive.
        if not data:
            break
        currentSize += len(data)
        f.write(data)

        print('============> ' +
              str(round(float(currentSize*100)/totalSize, 2)) +
              '% of ' + str(totalSize) + ' bytes')
        if currentSize >= totalSize:
            break

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print("Usage: python youtap.py \"<youtube-url>\"")
        sys.exit(1)
    # Keep only the part of the URL before the first '&'.
    urlname = sys.argv[1].split('&', 1)[0]
    print('Downloading: ' + urlname)
    try:
        resp = urllib2.urlopen(urlname)
    except urllib2.HTTPError as e:
        # Report the actual status code instead of assuming 404.
        print('Bad URL: ' + str(e.code))
        sys.exit(1)
    content = resp.read()
    videoUrl = getVideoUrl(content)
    if not videoUrl:
        print('Video URL cannot be found')
        sys.exit(1)
    title = getTitle(content)
    filename = title + '.flv'
    print('Creating file: ' + filename)
    f = open(filename, 'wb')
    print('Download begins...')

    ## Download video
    video = urllib2.urlopen(videoUrl)
    downloadVideo(f, video)
    f.flush()
    f.close()
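
The chunked loop in downloadVideo() can be tried without a network connection. The following is a minimal sketch in Python 3 (the recipe itself is Python 2), assuming in-memory streams in place of the HTTP response and the output file. The name copy_in_chunks and the sample payload are hypothetical and are not part of the recipe.

```python
import io

CHUNK_SIZE = 32768  # same chunk size as the recipe


def copy_in_chunks(src, dst, total_size, chunk_size=CHUNK_SIZE):
    """Copy src to dst chunk by chunk; return the percentages reported."""
    copied = 0
    progress = []
    while True:
        data = src.read(chunk_size)
        if not data:  # an empty read means the stream is exhausted
            break
        dst.write(data)
        copied += len(data)
        percent = round(copied * 100.0 / total_size, 2)
        progress.append(percent)
        print('============> %s%% of %d bytes' % (percent, total_size))
    return progress


# In-memory streams stand in for the HTTP response and the .flv file.
payload = b'x' * 100000
source = io.BytesIO(payload)
target = io.BytesIO()
steps = copy_in_chunks(source, target, len(payload))
```

Reading in fixed-size chunks keeps memory use bounded regardless of the video size, and each pass through the loop gives a natural point at which to report progress.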


8 comments

Robin Becker 12 years, 9 months ago

You might consider changing the open at line 78 to

f = open(filename, 'wb')

That way, Windows users will get uncorrupted results.

Sundar Srinivasan (author) 12 years, 9 months ago

That makes sense. Thanks, Robin! Changed it as you suggested.

Abhinav 12 years, 9 months ago

Hi,

I am new to Python. I have just worked through the basics and am now trying something more serious. I am studying your code, youtap.py, and trying to understand every line of it. Could you or anyone else please explain the downloadVideo() function? I am not able to follow it. I know it takes your time, but any explanation would be appreciated.

Regards, Abhinav

Abhinav 12 years, 8 months ago

This code no longer works for me. I downloaded a few videos successfully, but when I tried again some days later I kept getting the error below:

Traceback (most recent call last):
  File "Youtap.py", line 69, in <module>
    videoUrl = getVideoUrl(content)
  File "Youtap.py", line 10, in getVideoUrl
    grps = fmtre.group(0).split('&')
AttributeError: 'NoneType' object has no attribute 'group'

Please help!

Regards, Abhinav

Sundar Srinivasan (author) 12 years, 8 months ago

@Abhinav The search for the regular expression "(?<=fmt_url_map=).*" is returning None, meaning no match was found. Does this happen for all videos or only some? I will look at this when I have time.
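
To illustrate the failure described here: re.search() returns None when the pattern is absent, and calling .group() on None raises the AttributeError shown in the traceback. The following is a minimal sketch in Python 3, assuming made-up sample strings in place of real page content. The name find_fmt_url_map is hypothetical and is not part of the recipe.

```python
import re

# Same pattern as the recipe: everything after 'fmt_url_map=' on that line.
FMT_PATTERN = re.compile(r'(?<=fmt_url_map=).*')


def find_fmt_url_map(content):
    """Return the text after 'fmt_url_map=', or None if the marker is missing."""
    match = FMT_PATTERN.search(content)
    if match is None:
        # Without this check, match.group(0) would raise
        # AttributeError: 'NoneType' object has no attribute 'group'.
        return None
    return match.group(0)


# Hypothetical samples: one page with the marker, one without.
old_style = 'flashvars=a=1&fmt_url_map=5%7Chttp%3A%2F%2Fexample.invalid'
new_style = '<html><body>no flash variables here</body></html>'

found = find_fmt_url_map(old_style)
missing = find_fmt_url_map(new_style)
```

Checking for None before using the match lets the caller report that the video URL could not be found instead of stopping with a traceback.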

David Adler 11 years, 6 months ago

I think it is doing this for all videos. Where do we now find the source?

Sundar Srinivasan (author) 11 years, 6 months ago

@David, this code was written a long time ago. It worked while YouTube videos were Flash-based. I guess YouTube has since moved all or most of its videos to a non-Flash format.

"Where do we now find the source?" The source code is right here: https://github.com/krishnasun82/youtap. Please feel free to change it. You may want to download the HTML of a YouTube page and look for "fmt_url_map". It may be that they have just changed their HTML format; if so, the fix might be very simple.

p@ntut$ 11 years, 6 months ago

YouTube changed its service. Please try this script: code.activestate.com/recipes/578288-tubenick-download-youtube-videos/?in=user-4183895