
Walk a directory path, listing the files and directories in HTML format.

Python, 53 lines
from __future__ import generators # needed for Python 2.2
import sys

def walktree(top = ".", depthfirst = True):
    """Walk the directory tree, starting from top. Credit to Noah Spurrier and Doug Fort."""
    import os, stat
    names = os.listdir(top)
    if not depthfirst:
        yield top, names
    for name in names:
        try:
            st = os.lstat(os.path.join(top, name))
        except os.error:
            continue
        if stat.S_ISDIR(st.st_mode):
            for (newtop, children) in walktree (os.path.join(top, name), depthfirst):
                yield newtop, children
    if depthfirst:
        yield top, names

def makeHTMLtable(top, depthfirst=False):
    from xml.sax.saxutils import escape # To quote out things like &
    ret = ['<table class="fileList">\n']
    for top, names in walktree(top, depthfirst): # pass the traversal order through
        ret.append('   <tr><td class="directory">%s</td></tr>\n'%escape(top))
        for name in names:
            ret.append('   <tr><td class="file">%s</td></tr>\n'%escape(name))
    ret.append('</table>')
    return ''.join(ret) # Much faster than += method

def makeHTMLpage(top, depthfirst=False):
    return '\n'.join(['<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"',
                      '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">',
                      '<html>',
                      '<head>',
                      '   <title>Search results</title>',
                      '   <style type="text/css">',
                      '      table.fileList { text-align: left; }',
                      '      td.directory { font-weight: bold; }',
                      '      td.file { padding-left: 4em; }',
                      '   </style>',
                      '</head>',
                      '<body>',
                      '<h1>Search Results</h1>',
                      makeHTMLtable(top, depthfirst),
                      '</body>',
                      '</html>'])
                   
if __name__ == '__main__':
    if len(sys.argv) == 2:
        top = sys.argv[1]
    else:
        top = '.'
    print(makeHTMLpage(top))

It can sometimes be handy to get a look at an entire directory tree, and HTML is a good way to display it. This script walks the directory tree with a recursive generator (walktree(), built on os.listdir and os.lstat rather than os.path.walk) and outputs the results as XHTML 1.1 with CSS. You can use this module in your own programs, since makeHTMLtable() returns the table independently of the rest of the HTML document, so you can insert your own CSS. This could be useful in a file search, a file manager, or any number of other things.
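As a rough illustration of reusing only the table, the sketch below wraps it in a different page with a different stylesheet. It is self-contained, so it carries a condensed copy of the recipe's generator; make_table(), my_page() and MY_CSS are hypothetical names invented for this example, and the sorted() call is an addition that only serves to make the output order stable.

```python
import os
import stat
from xml.sax.saxutils import escape


def walktree(top=".", depthfirst=True):
    """Condensed copy of the recipe's generator: yields (directory, names)."""
    names = sorted(os.listdir(top))  # sorted only to make the output stable
    if not depthfirst:
        yield top, names
    for name in names:
        path = os.path.join(top, name)
        try:
            is_dir = stat.S_ISDIR(os.lstat(path).st_mode)
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        if is_dir:
            for item in walktree(path, depthfirst):
                yield item
    if depthfirst:
        yield top, names


def make_table(top, depthfirst=False):
    """Build only the <table> fragment, using the recipe's CSS class names."""
    rows = ['<table class="fileList">']
    for directory, names in walktree(top, depthfirst):
        rows.append('   <tr><td class="directory">%s</td></tr>' % escape(directory))
        for name in names:
            rows.append('   <tr><td class="file">%s</td></tr>' % escape(name))
    rows.append('</table>')
    return '\n'.join(rows)


# Hypothetical stylesheet supplied by the embedding program.
MY_CSS = "td.directory { color: navy; } td.file { padding-left: 2em; }"


def my_page(top):
    """Hypothetical wrapper: our own page layout around the reusable table."""
    return '\n'.join([
        '<html><head><style type="text/css">%s</style></head>' % MY_CSS,
        '<body>',
        make_table(top),
        '</body></html>',
    ])
```

Calling my_page('.') returns the complete page as a string. Because the table only refers to the classes fileList, directory and file, any stylesheet that targets those classes can restyle it without touching the table code.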

References: the Python standard library's os.path.walk, and Recipe 161542, which is where the code for the walktree() function comes from. Thanks!
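For readers on Python 2.3 or later, the standard library's os.walk can stand in for the hand-written generator (os.path.walk itself takes a callback and no longer exists in Python 3). The following is a minimal sketch of that swap, not part of the original recipe; walktree_oswalk is a hypothetical name chosen for this example.

```python
import os


def walktree_oswalk(top=".", depthfirst=True):
    """Yield (directory, names) pairs like the recipe's walktree(), via os.walk.

    depthfirst=True maps to topdown=False, so a directory is yielded only
    after all of its subdirectories have been yielded.
    """
    for dirpath, dirnames, filenames in os.walk(top, topdown=not depthfirst):
        # os.walk separates directories from files; the recipe reports them
        # in one list, so they are joined here (directories first).
        yield dirpath, dirnames + filenames
```

Two behavioural differences are worth keeping in mind: the names come back grouped as directories followed by files rather than in os.listdir order, and os.walk by default silently skips directories it cannot list, whereas the recipe's os.listdir(top) call raises an error.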

5 comments

Anand 20 years, 5 months ago

Any advantages. Does this have any advantage over the os.path.walk() function?

Thanks

-Anand

Dan Perl 19 years, 7 months ago

I will use it. I am working on a framework/toolkit for file management (okay, here's a plug for it: pyfmf.sourceforge.net/default.html). I'm going to use some of the code in this recipe to generate an HTML listing. Thanks! I'll make sure to put in a credit for you, Peter.

Dave Pawson 19 years, 5 months ago

A slight modification. I needed to retrieve all files named note[0-9]*.txt.

Based on your work, my version is below. I've retained your function names for comparison.

Many thanks. Much appreciated.

import re
import sys

def walktree(top = ".", depthfirst = True):
    """Walk the directory tree, starting from top. Credit to Noah Spurrier and Doug Fort."""
    import os, stat
    names = os.listdir(top)
    if not depthfirst:
        yield top, names
    for name in names:
        try:
            st = os.lstat(os.path.join(top, name))
        except os.error:
            continue
        if stat.S_ISDIR(st.st_mode):
            for (newtop, children) in walktree (os.path.join(top, name), depthfirst):
                yield newtop, children
    if depthfirst:
        yield top, names

def makeHTMLtable(top, filePattern, depthfirst=False):
    from xml.sax.saxutils import escape # To quote out things like &
    ret = ['\n']
    for top, names in walktree(top, depthfirst):
        ret.append("%s\n" %escape(top))
        for name in names:
            # Keep only names that match filePattern from the start of the name
            if (re.match(filePattern,name) is not None):
                ret.append('\t%s\n'%escape(name))
    return ''.join(ret) # Much faster than += method

def makeHTMLpage(top, filePattern, depthfirst=False):
    # filePattern is now an explicit parameter instead of riding in depthfirst's slot
    return '\n'.join([makeHTMLtable(top, filePattern, depthfirst)])

print(makeHTMLpage("/home/dpawson/enotes/", "note[0-9]*.txt$"))

pete moore 19 years, 3 months ago

HTMLgen. I have a need for something similar in a documentation exercise. If this code was refactored to use HTMLgen, that would be even better for future reporting possibilities, re. graphical extras such as bar charts, etc.