os.walk is a very nice replacement for os.path.walk, which I never did feel comfortable with. There's one very common pattern of usage, though, which still benefits from a simple helper function: locating all files matching a given file-name pattern within a directory tree.
    import os, fnmatch

    def locate(pattern, root=os.curdir):
        '''Locate all files matching supplied filename pattern in and below
        supplied root directory.'''
        for path, dirs, files in os.walk(os.path.abspath(root)):
            for filename in fnmatch.filter(files, pattern):
                yield os.path.join(path, filename)
An example - I needed to identify all malformed XML files within my project. With the help of locate (and ElementTree) all I needed was:
    from xml.parsers.expat import ExpatError
    import xml.etree.ElementTree as ElementTree

    for xml in locate("*.xml"):
        try:
            ElementTree.parse(xml)
        except (SyntaxError, ExpatError):
            # single-argument print() behaves the same on Python 2 and 3
            print(xml + "\tBADLY FORMED!")
Thanks to Marius Gedminas and Tom Lynn for improvements incorporated in this code.
A simple question. How could the script be modified so that it locates files in the current (or specified) root directory only, not in its sub-directories? Thanks.
os.listdir().
    import os
Naturally, I mean listdir only as a keyword.
A simple solution (?). Thanks for the help!
A solution to locate files matching a pattern inside the root dir only (so not its sub-dirs) could be:
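As a minimal sketch of that idea, assuming Python 3 and a hypothetical helper name `locate_flat` (not part of the recipe), the os.walk loop can be swapped for a single os.listdir call so that only the root directory itself is inspected:

```python
import fnmatch
import os


def locate_flat(pattern, root=os.curdir):
    """Yield files matching pattern directly inside root (no sub-directories).

    locate_flat is a hypothetical name used for illustration only.
    """
    root = os.path.abspath(root)
    for name in fnmatch.filter(os.listdir(root), pattern):
        path = os.path.join(root, name)
        if os.path.isfile(path):  # skip directories whose names happen to match
            yield path
```

Usage mirrors the recipe, e.g. `for path in locate_flat("*.xml"): ...`.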
Writing at the same time... Thanks again for the help!
Locating files in a single directory. The glob module in Python's standard library will find files matching a pattern in a single directory. Check out the docs.
glob example.
    import os, glob
    dor = the path you want
    for dir, subdir, files in os.walk(dor):
The glob module is much simpler than that:
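A rough sketch of the glob approach for a single directory, assuming Python 3.4 or later (for `glob.escape`); the helper name `glob_in_dir` is made up for this example:

```python
import glob
import os


def glob_in_dir(pattern, root=os.curdir):
    """Return sorted paths in root (only) whose names match a glob pattern.

    glob_in_dir is a hypothetical name used for illustration only.
    """
    # Escape root so that characters such as "[" in the directory name
    # are taken literally; only `pattern` is treated as a glob.
    return sorted(glob.glob(os.path.join(glob.escape(root), pattern)))
```

One difference from the fnmatch-based recipe: glob does not match names starting with a dot unless the pattern itself starts with a dot.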
Hi, I want to search for folders whose names follow a general pattern like: pattern = "[0-9]+.[0-9]+.[0-9]+.[0-9]+". That is, a folder name consists only of numbers in four dot-separated groups, for example: 184.108.40.206
But I am unable to find them using the code above with the pattern defined as I described. Please help me ASAP.
@Arup What you have there is a regex pattern rather than a glob pattern, so you'll need methods from the re module rather than the fnmatch module. You'll need to adapt locate() accordingly.
@Simon, thanks for pointing this out. Can you please help me locate folders with pattern="[0-9]+.[0-9]+.[0-9]+.[0-9]+" using the re module?
@Arup How about this?
    for filename in glob.glob(...):
        # keep names made up only of digits and dots
        if filename.replace(".", "").isdigit():
            print(filename)
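For the regex route mentioned earlier in the thread (re instead of fnmatch), an adapted locate() might look roughly like the sketch below. It assumes Python 3.4 or later for `fullmatch`; the names `locate_dirs` and `DOTTED_QUAD` are made up here. Note that the dots are escaped, since an unescaped `.` in a regex matches any character.

```python
import os
import re

# Escaped dots plus a full match: "1.2.3.4" matches, "1a2b3c4" does not.
DOTTED_QUAD = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+")


def locate_dirs(regex, root=os.curdir):
    """Yield directories below root whose names fully match a compiled regex.

    locate_dirs is a hypothetical name used for illustration only.
    """
    for path, dirs, files in os.walk(os.path.abspath(root)):
        for dirname in dirs:
            if regex.fullmatch(dirname):
                yield os.path.join(path, dirname)
```

It would be called as `for folder in locate_dirs(DOTTED_QUAD): ...`.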
How can I use / modify the above script so that it locates a file only under directories matching a certain pattern below a root directory?
What I mean is: I have a directory installedApps containing the directories a.ear, b.ear, ... x.ear, y.ear, z.ear along with other directories at the same level, and I want to search for the file pattern web*.xml only under those .ear subdirectories, without traversing the other directories at the same level.
Can this be done?
Something like this? (Untested.)
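One possible sketch, assuming Python 3: pick the matching sub-directories of the root with fnmatch first, then walk only those. The helper name `locate_under` and its `dir_pattern` parameter are invented for this example.

```python
import fnmatch
import os


def locate_under(pattern, dir_pattern, root=os.curdir):
    """Yield files matching pattern, searching only below those immediate
    sub-directories of root whose names match dir_pattern.

    locate_under is a hypothetical name used for illustration only.
    """
    root = os.path.abspath(root)
    for name in sorted(fnmatch.filter(os.listdir(root), dir_pattern)):
        top = os.path.join(root, name)
        if not os.path.isdir(top):  # ignore plain files named like "x.ear"
            continue
        for path, dirs, files in os.walk(top):
            for filename in fnmatch.filter(files, pattern):
                yield os.path.join(path, filename)
```

For the case described above, the call would be `locate_under("web*.xml", "*.ear", "installedApps")`.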