ActiveState Code

Recipe 499305: Locating files throughout a directory tree


os.walk is a very nice replacement for os.path.walk, which I never did feel comfortable with. There's one very common usage pattern, though, which still benefits from a simple helper function: locating all files matching a given filename pattern within a directory tree.

Python
import os, fnmatch

def locate(pattern, root=os.curdir):
    '''Locate all files matching supplied filename pattern in and below
    supplied root directory.'''
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)

Discussion

An example: I needed to identify all malformed XML files within my project. With the help of locate (and ElementTree), all I needed was:

<pre>
from xml.parsers.expat import ExpatError
import xml.etree.ElementTree as ElementTree

for xml in locate("*.xml"):
    try:
        ElementTree.parse(xml)
    except (SyntaxError, ExpatError):
        print(xml, "\tBADLY FORMED!")
</pre>

Thanks to Marius Gedminas and Tom Lynn for improvements incorporated in this code.
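To make the behaviour of locate concrete, here is a minimal sketch that runs it against a throwaway directory tree. The `demo` helper and the file layout are hypothetical, made up purely for illustration; `locate` itself is the recipe's function, repeated so the block is self-contained.

```python
import fnmatch
import os
import tempfile


def locate(pattern, root=os.curdir):
    """Yield absolute paths of files matching pattern in and below root."""
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for filename in fnmatch.filter(files, pattern):
            yield os.path.join(path, filename)


def demo():
    """Build a small temporary tree and return the matches relative to it."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical layout: one match at the top level, one nested
        # two levels down, and one file that should not match.
        os.makedirs(os.path.join(tmp, "sub", "deeper"))
        for rel in ("a.xml", os.path.join("sub", "deeper", "b.xml"), "notes.txt"):
            with open(os.path.join(tmp, rel), "w") as handle:
                handle.write("<root/>")
        root = os.path.abspath(tmp)
        return sorted(os.path.relpath(found, root) for found in locate("*.xml", tmp))
```

Calling `demo()` should return the two `.xml` paths, including the nested one, and leave out `notes.txt`. Because locate is a generator, matches are produced lazily as os.walk descends the tree.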

Comments

  1. At 2:21 p.m. on 12 dec 2006, Lucas Schifferle said:

    a simple question. How could the script be modified so that it locates all files in the current (or specified) root directory only - not its sub-directories? Thanks.

  2. At 3:07 p.m. on 12 dec 2006, Rafal Zawadzki said:

    os.listdir().

    import os
    os.listdir()

  3. At 3:20 p.m. on 12 dec 2006, Rafal Zawadzki said:

    Naturally, I mean listdir as a keyword only.

    import os, fnmatch
    
    def locate(pattern, root=os.curdir):
        '''Locate all files matching supplied filename pattern in the
        supplied root directory only (sub-directories are not searched).'''
        #for path, dirs, files in os.walk(os.path.abspath(root)):
        files = os.listdir(os.path.abspath(root))
        for filename in fnmatch.filter(files, pattern):
            yield filename
    
  4. At 3:46 p.m. on 12 dec 2006, Lucas Schifferle said:

    a simple solution (?). Thanks for the help!

    A solution to locate files matching a pattern inside the root dir only (so not its sub-dirs) could be:

    def locate2(pattern, root=os.curdir):
        filename = fnmatch.filter(os.listdir(os.path.abspath(root)), pattern)
        for fn in filename:
            yield os.path.join(root, fn)
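One caveat with the listdir-based variants: os.listdir returns the names of sub-directories as well as files, so a directory whose name happens to match the pattern would also be yielded. A minimal sketch that keeps regular files only follows; the name `locate_files_only` is hypothetical and not part of the recipe.

```python
import fnmatch
import os


def locate_files_only(pattern, root=os.curdir):
    """Yield paths of regular files directly inside root that match pattern."""
    root = os.path.abspath(root)
    for name in fnmatch.filter(os.listdir(root), pattern):
        full_path = os.path.join(root, name)
        # os.listdir returns directory names too, so keep regular files only.
        if os.path.isfile(full_path):
            yield full_path
```

Unlike locate2, this sketch yields paths joined onto the absolute form of root, which matches what the original locate does.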
    
  5. At 3:51 p.m. on 12 dec 2006, Lucas Schifferle said:

    writing at the same time... Thanks again for the help!

  6. At 7:43 a.m. on 10 jan 2007, Simon Brunning (the author) said:

    Locating files in a single directory. The glob module in Python's standard library will find files matching a pattern in a single directory. Check out the docs.
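As a rough illustration of the glob approach, the sketch below wraps glob.glob in a helper. The name `locate_in_dir` is hypothetical, and glob.escape assumes Python 3.4 or later.

```python
import glob
import os


def locate_in_dir(pattern, root=os.curdir):
    """Return sorted paths in root (not its sub-directories) matching pattern."""
    # glob.escape protects any wildcard characters that occur in root itself.
    # A plain pattern such as "*.txt" does not descend into sub-directories.
    return sorted(glob.glob(os.path.join(glob.escape(root), pattern)))
```

Two differences from the fnmatch-based helpers are worth keeping in mind: glob skips names beginning with a dot unless the pattern itself starts with one, and it returns matching directories as well as files.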

  7. At 7:33 a.m. on 15 mar 2008, Martin Laloux said:

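    glob example.

    import os, glob

    dor = os.curdir  # replace with the path you want

    for dir, subdir, files in os.walk(dor):
        for file in files:
            # glob.fnmatch is the fnmatch module as imported by glob
            if glob.fnmatch.fnmatch(file, "*.txt"):
                pass  # do what you want with os.path.join(dir, file)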
    
