ActiveState Code

Recipe 200131: Recursive directory listing in HTML


Walk a directory path, listing the files and directories in HTML format.

Python
from __future__ import generators # needed for Python 2.2
import sys

def walktree(top = ".", depthfirst = True):
    """Walk the directory tree, starting from top. Credit to Noah Spurrier and Doug Fort."""
    import os, stat
    names = os.listdir(top)
    if not depthfirst:
        yield top, names
    for name in names:
        try:
            st = os.lstat(os.path.join(top, name))
        except os.error:
            continue
        if stat.S_ISDIR(st.st_mode):
            for (newtop, children) in walktree (os.path.join(top, name), depthfirst):
                yield newtop, children
    if depthfirst:
        yield top, names

def makeHTMLtable(top, depthfirst=False):
    from xml.sax.saxutils import escape # To quote out things like &
    ret = ['<table class="fileList">\n']
    for top, names in walktree(top, depthfirst): # pass the caller's ordering through
        ret.append('   <tr><td class="directory">%s</td></tr>\n'%escape(top))
        for name in names:
            ret.append('   <tr><td class="file">%s</td></tr>\n'%escape(name))
    ret.append('</table>')
    return ''.join(ret) # Much faster than += method

def makeHTMLpage(top, depthfirst=False):
    return '\n'.join(['<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"',
                      '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">',
                      '<html>',
                      '<head>',
                      '   <title>Search results</title>',
                      '   <style type="text/css">',
                      '      table.fileList { text-align: left; }',
                      '      td.directory { font-weight: bold; }',
                      '      td.file { padding-left: 4em; }',
                      '   </style>',
                      '</head>',
                      '<body>',
                      '<h1>Search Results</h1>',
                      makeHTMLtable(top, depthfirst),
                      '</body>',
                      '</html>'])
                   
if __name__ == '__main__':
    if len(sys.argv) == 2:
        top = sys.argv[1]
    else: top = '.'
    print(makeHTMLpage(top))

Discussion

It can sometimes be handy to get a look at an entire directory tree, and HTML is a good way to display it. This script walks the directory tree with a recursive generator, walktree(), built on os.listdir and os.lstat, and outputs the results as XHTML 1.1 with CSS. You can also use this module in your own programs, since makeHTMLtable() returns the table independently of the rest of the HTML document, leaving you free to supply your own CSS. This could be useful in a file search, a file manager, or any number of other things.
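
As a rough illustration of reusing the table on its own, the sketch below wraps an already-built table fragment in a page with a different stylesheet. The names wrap_table and CUSTOM_CSS are hypothetical and not part of the recipe; the only thing taken from the recipe is the set of class names (fileList, directory, file) that its table uses.

```python
from xml.sax.saxutils import escape

# Hypothetical stylesheet: it targets the class names the recipe's table
# emits (fileList, directory, file) but is otherwise made up for this sketch.
CUSTOM_CSS = """\
table.fileList { font-family: monospace; }
td.directory { background: #eee; font-weight: bold; }
td.file { padding-left: 2em; }
"""

def wrap_table(table_html, css=CUSTOM_CSS, title="Directory listing"):
    """Return a small HTML page embedding an already-built table fragment."""
    return "\n".join([
        "<html>",
        "<head>",
        "   <title>%s</title>" % escape(title),
        '   <style type="text/css">',
        css.rstrip(),
        "   </style>",
        "</head>",
        "<body>",
        table_html,
        "</body>",
        "</html>",
    ])

if __name__ == "__main__":
    # Stand-in fragment so the sketch runs on its own.
    sample = ('<table class="fileList">\n'
              '   <tr><td class="directory">.</td></tr>\n'
              '</table>')
    print(wrap_table(sample, title="Docs & notes"))
```

With the recipe's module imported, the stand-in fragment would presumably be replaced by a call such as wrap_table(makeHTMLtable(top)).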

References: the Python standard library's os.path.walk, and Recipe 161542, which is where the code for the walktree() function comes from. Thanks!
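
For comparison with the standard library, here is a minimal sketch of the same traversal written on top of os.walk, the generator-based walker available since Python 2.3 (os.path.walk itself was removed in Python 3). The name walktree_oswalk is hypothetical. The sketch is not a drop-in replacement: it sorts the entries, and os.walk silently skips directories it cannot list, whereas the recipe's os.listdir call would raise.

```python
import os

def walktree_oswalk(top=".", depthfirst=True):
    """Yield (directory, names) pairs, much like the recipe's walktree()."""
    # topdown=False makes os.walk yield a directory after its subdirectories,
    # which corresponds to depthfirst=True in the recipe.
    for dirpath, dirnames, filenames in os.walk(top, topdown=not depthfirst):
        # The recipe yields one combined list of entries; os.walk splits
        # them into directories and files, so merge them back together.
        yield dirpath, sorted(dirnames + filenames)

if __name__ == "__main__":
    for directory, names in walktree_oswalk(".", depthfirst=False):
        print("%s: %d entries" % (directory, len(names)))
```

Like the recipe, which checks entries with os.lstat, os.walk does not descend into symbolic links to directories by default.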

Comments

  1. At 6:49 a.m. on 14 Oct 2003, Anand Balachandran Pillai said:

    Any advantages. Does this have any advantage over the os.path.walk() function?

    Thanks

    -Anand

  3. At 9 a.m. on 27 Aug 2004, Dan Perl said:

    I will use it. I am working on a framework/toolkit for file management (okay, here's a plug for it: pyfmf.sourceforge.net/default.html). I'm going to use some of the code in this recipe to generate an HTML listing. Thanks! I'll make sure to put in a credit for you, Peter.

  4. At 10:22 a.m. on 9 Oct 2004, Dave Pawson said:

    A slight modification. I needed to retrieve all files named note[0-9]*.txt.

    Based on your work it is as below: I've retained your function names for comparison.

    Many thanks. Much appreciated.

    import sys
    
    def walktree(top = ".", depthfirst = True):
        """Walk the directory tree, starting from top. Credit to Noah Spurrier and Doug Fort."""
        import os, stat, types
        names = os.listdir(top)
        if not depthfirst:
            yield top, names
        for name in names:
            try:
                st = os.lstat(os.path.join(top, name))
            except os.error:
                continue
            if stat.S_ISDIR(st.st_mode):
                for (newtop, children) in walktree (os.path.join(top, name), depthfirst):
                    yield newtop, children
        if depthfirst:
            yield top, names
    import re
    
    def makeHTMLtable(top,filePattern, depthfirst=False):
        from xml.sax.saxutils import escape # To quote out things like &
        ret = ['\n']
        for top, names in walktree(top, depthfirst):
            ret.append("%s\n" %escape(top))
            for name in names:
                if (re.match(filePattern,name) is not None):
                    ret.append('\t%s\n'%escape(name))
        return ''.join(ret) # Much faster than += method
    
    def makeHTMLpage(top, filePattern, depthfirst=False):
        # Forward the filename pattern explicitly to makeHTMLtable.
        return '\n'.join([makeHTMLtable(top, filePattern, depthfirst)])
    
    print(makeHTMLpage("/home/dpawson/enotes/", "note[0-9]*.txt$"))
    
  5. At 6:02 a.m. on 18 Dec 2004, pete moore said:

    HTMLgen. I have a need for something similar in a documentation exercise. If this code was refactored to use HTMLgen, that would be even better for future reporting possibilities, re. graphical extras such as bar charts, etc.
