This script reads the contents of a web file and copies them into a local file, named the same as the web file.
Python, 26 lines
#!/usr/bin/env python
"""File downloading from the web."""

def download(url):
    """Copy the contents of a file from a given URL to a local file."""
    import urllib
    webFile = urllib.urlopen(url)
    localFile = open(url.split('/')[-1], 'w')
    localFile.write(webFile.read())
    webFile.close()
    localFile.close()

if __name__ == '__main__':
    import sys
    if len(sys.argv) == 2:
        try:
            download(sys.argv[1])
        except IOError:
            print 'Filename not found.'
    else:
        import os
        print 'usage: %s http://server.com/path/to/filename' % os.path.basename(sys.argv[0])
A fast alternative to the Unix "wget" command. Windows does not ship with a comparable tool by default, so this script is a handy substitute there.
urllib.urlretrieve will do this. One problem with the code as given is that the entire file is read into memory and then written out; only briefly, but for a moment your program could get very memory hungry. urlretrieve writes the file out in chunks. shutil.copyfileobj will also copy between file objects in a chunked manner.
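As a sketch of the chunked approach (here in Python 3, with an in-memory buffer standing in for the web file object, and the 64 KiB chunk size an arbitrary choice):

```python
import io
import shutil

# Stand-in for the file object urlopen() would return (assumption for the demo).
web_file = io.BytesIO(b"x" * (1024 * 1024))  # 1 MiB of payload

with open("local.bin", "wb") as local_file:
    # Copies in fixed-size chunks, so memory use stays bounded
    # no matter how large the source is.
    shutil.copyfileobj(web_file, local_file, length=64 * 1024)
```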
Tip appreciated. Thank you a lot, Ian. I will bear this in mind.
What's its use? Would it be better to take an optional local file name? Your current code might clobber an important local file.
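One way to address this (a sketch in Python 3's urllib.request; `download` and `local_name` are made-up names for illustration) is to accept an optional filename and only fall back to the URL's last path segment when none is given:

```python
import urllib.request

def local_name(url, filename=None):
    """Pick the local filename: the caller's choice, else the URL's last segment."""
    return filename if filename else url.split('/')[-1]

def download(url, filename=None):
    """Copy the file at `url` to a local file, letting the caller pick the name."""
    target = local_name(url, filename)
    with urllib.request.urlopen(url) as web_file, open(target, 'wb') as local_file:
        local_file.write(web_file.read())
    return target
```

For example, download('http://server.com/path/pic.jpg', 'safe-copy.jpg') would never touch a pre-existing local pic.jpg.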
It really depends on what the use context is.
wget is available for Windows. http://www.gnu.org/software/wget/
I use this extensively for downloading images, using something similar to this:
os.system('wget %s -a log.log' % fullurl)
Under Windows, images fail: you need to set the b flag in the write statement.
localFile = open(url.split('/')[-1], 'wb')
Windows isn't good at telling the difference between text and binary files, so we have to tell it.
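The point is easy to demonstrate; in Python 3 the distinction is enforced outright, since bytes can only be written to a file opened with the b flag (file names here are arbitrary):

```python
# Image payloads contain bytes like \r and \x1a that Windows text mode
# would translate or truncate, corrupting the file.
data = b'\x89PNG\r\n\x1a\n'

with open('image.bin', 'wb') as f:  # binary mode: bytes pass through untouched
    f.write(data)

with open('image.bin', 'rb') as f:
    assert f.read() == data
```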
Is there a handy way to list all files under an http url, then download them one-by-one using urlib?
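Not directly: HTTP itself has no "list directory" operation, so the usual approach is to fetch whatever index page the server serves for that URL and scrape the links out of it. A sketch with Python 3's html.parser (the sample page below is made up; in practice it would come from urlopen(url).read()):

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href':
                    self.links.append(value)

# A literal stand-in for a server-generated index page.
page = '<a href="cat.jpg">cat</a> <a href="dog.jpg">dog</a>'
parser = LinkCollector()
parser.feed(page)
# parser.links now holds the filenames to download one by one.
```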
Probably you could have used the os module here (import os) — it is compatible with both Windows and Linux.