
For one task I needed to collect the file names under a given directory together with their distance (depth) from it. The standard os.walk function does not return a depth value.

One solution is to write a function that calculates the relative distance from the top directory to each file.
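
A minimal sketch of that first option, assuming Python 2.6 or later so that os.path.relpath is available. The names depth_below and files_with_depth are illustrative only and do not come from the recipe; the depth of a file is taken to be the 0-based level of the directory that contains it.

```python
import os

def depth_below(top, path):
    """Return how many directory levels the file `path` lies below `top`.

    A file directly inside `top` has depth 0, matching the 0-based
    level that the recipe assigns to directories.
    """
    rel = os.path.relpath(os.path.dirname(path), top)
    if rel == os.curdir:
        return 0
    # 'a' -> 1 level, 'a/b' -> 2 levels, and so on
    return rel.count(os.sep) + 1

def files_with_depth(top):
    """Yield (depth, full_path) for every file found under `top`."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            full = os.path.join(dirpath, name)
            yield depth_below(top, full), full
```

This keeps os.walk untouched, at the cost of recomputing a relative path for every file.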

Another solution, the one presented here, is to modify os.walk so that it returns the depth level as the fourth value of each tuple.

Python, 54 lines
from os.path import join, isdir, islink
from os import error, listdir

# modified os.walk() function from Python 2.4 standard library
def walk2(top, topdown=True, onerror=None, deeplevel=0): # fix 0
    """Modified directory tree generator.

    For each directory in the directory tree rooted at top (including top
    itself, but excluding '.' and '..'), yields a 4-tuple

        dirpath, dirnames, filenames, deeplevel

    dirpath is a string, the path to the directory.  dirnames is a list of
    the names of the subdirectories in dirpath (excluding '.' and '..').
    filenames is a list of the names of the non-directory files in dirpath.
    Note that the names in the lists are just names, with no path components.
    To get a full path (which begins with top) to a file or directory in
    dirpath, do os.path.join(dirpath, name). 

    ----------------------------------------------------------------------
    + deeplevel is the 0-based depth level relative to the top directory
    ----------------------------------------------------------------------
    ...

    """

    try:
        names = listdir(top)
    except error as err:  # 'as' form works on Python 2.6+ and Python 3
        if onerror is not None:
            onerror(err)
        return

    dirs, nondirs = [], []
    for name in names:
        if isdir(join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)

    if topdown:
        yield top, dirs, nondirs, deeplevel # fix 1
    for name in dirs:
        path = join(top, name)
        if not islink(path):
            for x in walk2(path, topdown, onerror, deeplevel+1): # fix 2
                yield x
    if not topdown:
        yield top, dirs, nondirs, deeplevel # fix 3


if __name__ == '__main__':
    for top, dirs, files, deeplevel in walk2('.'):
        print('%d : %s' % (deeplevel, top))
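
As a usage sketch for the stated goal (file names together with their distance from the top directory), the helper below pairs each file with the level of its directory. To stay self-contained it carries a compact Python 3 restatement of walk2; the names walk_with_depth and collect_files, and the max_depth parameter, are hypothetical additions and not part of the recipe.

```python
import os

def walk_with_depth(top, topdown=True, onerror=None, depth=0):
    """Python 3 restatement of walk2: yields (dirpath, dirnames, filenames, depth)."""
    try:
        names = os.listdir(top)
    except OSError as err:
        if onerror is not None:
            onerror(err)
        return
    dirs, nondirs = [], []
    for name in names:
        if os.path.isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            nondirs.append(name)
    if topdown:
        yield top, dirs, nondirs, depth
    for name in dirs:
        path = os.path.join(top, name)
        if not os.path.islink(path):  # do not follow symlinked directories
            yield from walk_with_depth(path, topdown, onerror, depth + 1)
    if not topdown:
        yield top, dirs, nondirs, depth

def collect_files(top, max_depth=None):
    """Return a list of (depth, path) pairs for the files under top.

    max_depth is an assumed convenience: directories at this level are
    still listed, but nothing below them is visited.
    """
    result = []
    for dirpath, dirs, files, depth in walk_with_depth(top):
        result.extend((depth, os.path.join(dirpath, name)) for name in files)
        if max_depth is not None and depth >= max_depth:
            dirs[:] = []  # prune in place so the generator skips subdirectories
    return result
```

Because the generator yields the same dirs list it later iterates over, clearing that list in top-down mode stops the descent, just as it does with os.walk.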


2 comments

Gabriel Genellina 14 years, 6 months ago

There is a simpler approach that doesn't require altering or duplicating the original os.walk code.

os.walk invokes os.path.join to build the 'top' directory name on each iteration, so the number of path separators (that is, os.sep) in each directory name is related to its depth. Just subtract the starting count to obtain a relative depth.

Note that mixing \ and / in the initial directory (both are allowed on Windows) doesn't affect the result, nor does using absolute or relative directory names.

import os

top = 'd:\\Documents and Settings/gabriel'
startinglevel = top.count(os.sep)
for dirpath, dirs, files in os.walk(top):
    level = dirpath.count(os.sep) - startinglevel
    print('%d : %s' % (level, dirpath))
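
The separator-counting idea can also be wrapped in a generator. The sketch below is an assumption-laden variant, not the commenter's code: the name walk_levels is hypothetical, and the os.path.normpath call is added because a trailing separator on the starting directory (for example 'dir/') would otherwise inflate the starting count and shift every level by one.

```python
import os

def walk_levels(top):
    """Yield (level, dirpath, dirnames, filenames) using separator counts.

    level is 0 for top itself, 1 for its subdirectories, and so on.
    A top that is a filesystem root (such as '/') is not handled here.
    """
    # normpath drops a trailing separator, so the starting count is
    # comparable with the paths that os.walk builds via os.path.join.
    top = os.path.normpath(top)
    start = top.count(os.sep)
    for dirpath, dirnames, filenames in os.walk(top):
        yield dirpath.count(os.sep) - start, dirpath, dirnames, filenames
```

A call such as `for level, dirpath, dirs, files in walk_levels('.')` then gives the level alongside the usual os.walk values.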

Denis Barmenkov (author) 14 years, 5 months ago

Good approach, thank you!

My goal was a simple iterator which works in one line :)