This script recursively scans a given path and applies a cleaning 'action' to matching files and folders. By default, files and folders matching the specified (.endswith) patterns are deleted. Alternatively, _quoted_ glob patterns can be used with the '-g' or '--glob' option.
By design, the script lists targets and asks permission before applying cleaning actions. It should be easy to extend this script with further actions and also more intelligent pattern matching functions.
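The two matching modes described above can be illustrated with a small sketch. The helper name make_matcher and its parameters are hypothetical and only mirror the idea of the script's matchers; this is not the script's own API.

```python
from fnmatch import fnmatch

def make_matcher(patterns, use_glob=False, negate=False):
    """Return a predicate that tests a path against any of the patterns."""
    if use_glob:
        # glob mode: shell-style wildcards such as "*.pyc"
        match = lambda s: any(fnmatch(s, p) for p in patterns)
    else:
        # default mode: plain suffix comparison
        match = lambda s: any(s.endswith(p) for p in patterns)
    if negate:
        # negated mode selects everything the patterns do NOT match
        return lambda s: not match(s)
    return match

is_target = make_matcher(['.svn', '.pyc'])
print(is_target('project/module.pyc'))
print(is_target('project/module.py'))
```

A further matcher (for example one based on regular expressions) would presumably slot in as one more branch returning a predicate of the same shape.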
The getch (single key confirmation) functionality comes courtesy of http://code.activestate.com/recipes/134892/
To use it, place the script in your path and call it something like 'clean':
Usage: clean [options] patterns

    deletes files/folder patterns:
        clean .svn .pyc
        clean -p /tmp/folder .svn .csv .bzr .pyc
        clean -g "*.pyc"
        clean -ng "*.py"

    converts line endings from windows to unix:
        clean -e .py
        clean -e -p /tmp/folder .py

Options:
    -h, --help            show this help message and exit
    -p PATH, --path=PATH  set path
    -n, --negated         clean everything except specified patterns
    -e, --endings         clean line endings
    -g, --glob            clean with glob patterns
    -v, --verbose
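The '-e' option converts Windows line endings to Unix ones. As a rough illustration of that idea only, here is a byte-level sketch; the function name to_unix_endings is hypothetical, and unlike the script's own clean_endings method it leaves trailing whitespace untouched.

```python
import os
import tempfile

def to_unix_endings(path):
    """Rewrite the file at `path` with CRLF replaced by LF.

    Returns True if the file was changed, False otherwise.
    """
    with open(path, 'rb') as src:
        data = src.read()
    converted = data.replace(b'\r\n', b'\n')
    if converted != data:
        with open(path, 'wb') as dst:
            dst.write(converted)
    return converted != data

# usage on a throwaway file
fd, demo = tempfile.mkstemp(suffix='.py')
os.write(fd, b'print(1)\r\nprint(2)\r\n')
os.close(fd)
to_unix_endings(demo)
with open(demo, 'rb') as fh:
    print(repr(fh.read()))
os.remove(demo)
```

Working in binary mode avoids any newline translation by the platform, which is why the sketch does not open the file as text.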
#!/usr/bin/env python
"""
This script recursively scans a given path and applies a cleaning 'action'
to matching files and folders. By default files and folders matching the
specified (.endswith) patterns are deleted. Alternatively, _quoted_ glob
patterns can be used with the '-g' option.

By design, the script lists targets and asks permission before applying
cleaning actions. It should be easy to extend this script with further
cleaning actions and more intelligent pattern matching techniques.

The getch (single key confirmation) functionality comes courtesy of
http://code.activestate.com/recipes/134892/

To use it, place the script in your path and call it something like 'clean':

    Usage: clean [options] patterns

    deletes files/folder patterns:
        clean .svn .pyc
        clean -p /tmp/folder .svn .csv .bzr .pyc
        clean -g "*.pyc"
        clean -ng "*.py"

    converts line endings from windows to unix:
        clean -e .py
        clean -e -p /tmp/folder .py

    Options:
        -h, --help            show this help message and exit
        -p PATH, --path=PATH  set path
        -n, --negated         clean everything except specified patterns
        -e, --endings         clean line endings
        -g, --glob            clean with glob patterns
        -v, --verbose
"""
from __future__ import print_function
import os, sys, shutil
from fnmatch import fnmatch
from optparse import OptionParser
from os.path import join, isdir, isfile
# to enable single-character confirmation of choices
try:
    import sys, tty, termios

    def getch(txt):
        print(txt, end=' ')
        sys.stdout.flush()  # show the prompt before blocking on input
        fd = sys.stdin.fileno()
        old_settings = termios.tcgetattr(fd)
        try:
            tty.setraw(sys.stdin.fileno())
            ch = sys.stdin.read(1)
        finally:
            termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)
        return ch
except ImportError:
    import msvcrt

    def getch(txt):
        print(txt, end=' ')
        sys.stdout.flush()
        # getwch returns a text character (getch returns bytes on Python 3)
        return msvcrt.getwch()

# -----------------------------------------------------
# main class
class Cleaner(object):
    """recursively cleans patterns of files/directories
    """
    def __init__(self, path, patterns):
        self.path = path
        self.patterns = patterns
        self.matchers = {
            # a matcher is a boolean function which takes a string and tries
            # to match it against any one of the specified patterns,
            # returning False otherwise
            'endswith': lambda s: any(s.endswith(p) for p in patterns),
            'glob': lambda s: any(fnmatch(s, p) for p in patterns),
        }
        self.actions = {
            # action: (path_operating_func, matcher)
            'endswith_delete': (self.delete, 'endswith'),
            'glob_delete': (self.delete, 'glob'),
            'convert': (self.clean_endings, 'endswith'),
        }
        self.targets = []
        self.cum_size = 0.0

    def __repr__(self):
        return "<Cleaner: path:%s , patterns:%s>" % (
            self.path, self.patterns)

    def _apply(self, func, confirm=False):
        """applies a function to each target path
        """
        i = 0
        desc = func.__doc__.strip()
        for target in self.targets:
            if confirm:
                question = "\n%s '%s' (y/n/q)? " % (desc, target)
                answer = getch(question)
                if answer in ['y', 'Y']:
                    func(target)
                    i += 1
                elif answer in ['q']: #i.e. quit
                    break
                else:
                    continue
            else:
                func(target)
                i += 1
        if i:
            self.log("Applied '%s' to %s items (%sK)" % (
                desc, i, int(round(self.cum_size/1024.0, 0))))
        else:
            self.log('No action taken')

    @staticmethod
    def _onerror(func, path, exc_info):
        """ Error handler for shutil.rmtree.

            If the error is due to an access error (read only file)
            it attempts to add write permission and then retries.

            If the error is for another reason it re-raises the error.

            Usage : ``shutil.rmtree(path, onerror=onerror)``

            original code by Michael Foord
            bug fix suggested by Kun Zhang
        """
        import stat
        if not os.access(path, os.W_OK):
            # Is the error an access error ?
            os.chmod(path, stat.S_IWUSR)
            func(path)
        else:
            raise

    def log(self, txt):
        print('\n' + txt)

    def do(self, action, negate=False):
        """finds pattern and approves action on results
        """
        func, matcher = self.actions[action]
        if not negate:
            show = lambda p: p if self.matchers[matcher](p) else None
        else:
            show = lambda p: p if not self.matchers[matcher](p) else None

        results = self.walk(self.path, show)
        if results:
            question = "%s item(s) found. Apply '%s' to all (y/n/c)? " % (
                len(results), func.__doc__.strip())
            answer = getch(question)
            self.targets = results
            if answer in ['y','Y']:
                self._apply(func)
            elif answer in ['c', 'C']:
                self._apply(func, confirm=True)
            else:
                self.log("Action cancelled.")
        else:
            self.log("No results.")

    def walk(self, path, func, log=True):
        """walk path recursively collecting results of function application
        """
        results = []

        def visit(root, target, prefix):
            for i in target:
                item = join(root, i)
                obj = func(item)
                if obj:
                    results.append(obj)
                    self.cum_size += os.path.getsize(obj)
                    if log:
                        print(prefix, obj)

        for root, dirs, files in os.walk(path):
            visit(root, dirs, ' +-->')
            visit(root, files,' |-->')
        return results

    def delete(self, path):
        """delete path
        """
        if isfile(path):
            os.remove(path)
        if isdir(path):
            shutil.rmtree(path, onerror=self._onerror)

    def clean_endings(self, path):
        """convert windows endings to unix endings
        """
        # open() replaces file(), which only exists on Python 2
        with open(path) as old:
            lines = old.readlines()
        # rstrip() drops the line ending along with any trailing whitespace
        string = "".join(l.rstrip()+'\n' for l in lines)
        with open(path, 'w') as new:
            new.write(string)

    @classmethod
    def cmdline(cls):
        usage = """usage: %prog [options] patterns

        deletes files/folder patterns:
            %prog .svn .pyc
            %prog -p /tmp/folder .svn .csv .bzr .pyc
            %prog -g "*.pyc"
            %prog -gn "*.py"

        converts line endings from windows to unix:
            %prog -e .py
            %prog -e -p /tmp/folder .py"""

        parser = OptionParser(usage)
        parser.add_option("-p", "--path",
                          dest="path", help="set path")
        parser.add_option("-n", "--negated",
                          action="store_true", dest="negated",
                          help="clean everything except specified patterns")
        parser.add_option("-e", "--endings",
                          action="store_true", dest="endings",
                          help="clean line endings")
        parser.add_option("-g", "--glob",
                          action="store_true", dest="glob",
                          help="clean with glob patterns")
        parser.add_option("-v", "--verbose",
                          action="store_true", dest="verbose")

        (options, patterns) = parser.parse_args()
        if len(patterns) == 0:
            parser.error("incorrect number of arguments")
        if not options.path:
            options.path = '.'
        if options.verbose:
            print('options:', options)
            print('finding patterns: %s in %s' % (patterns, options.path))

        cleaner = cls(options.path, patterns)

        # convert line endings from windows to unix
        if options.endings and options.negated:
            cleaner.do('convert', negate=True)
        elif options.endings:
            # plain -e must not negate the match
            cleaner.do('convert')
        # glob delete
        elif options.negated and options.glob:
            cleaner.do('glob_delete', negate=True)
        elif options.glob:
            cleaner.do('glob_delete')
        # endswith delete (default)
        elif options.negated:
            cleaner.do('endswith_delete', negate=True)
        else:
            cleaner.do('endswith_delete')


if __name__ == '__main__':
    Cleaner.cmdline()
works as advertised!
One improvement I could see is to add another confirmation option besides Y or N. clean may find a long list of files to delete, and you want to delete all but one or two. How about adding a C (for Confirm) option so that you are prompted to confirm the deletion of each file in the result?
For bonus marks, how about a negation -n option? clean -n .py would delete everything BUT the .py files
thanks S
Very nice, it was very useful for cleaning all my LaTeX boilerplate files: .aux .log .dvi and .out files.
Glad you found it useful. Also thanks for the suggestions. I've added the 'c' confirm option and the -n option as well... also code structure is a little cleaner (-:
As a side-note, I also created one version with fnmatch (glob-like) matching and also regex matching, but subsequently dropped it because that just added needless complexity and I also found quoting regexes off the command line to be somewhat counter-intuitive.
AK
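For readers curious how the 'c' (confirm each item) behaviour discussed above can be structured, here is a minimal sketch. The function apply_with_confirmation and its ask parameter are hypothetical; passing the prompt in as a function is a choice made here so the loop can be exercised without a keyboard, and is not how the recipe itself is written.

```python
def apply_with_confirmation(targets, action, ask):
    """Apply `action` to each target the user approves.

    `ask` receives a question and returns a single character:
    'y' applies the action, 'q' stops the loop, anything else skips.
    Returns the list of targets the action was applied to.
    """
    done = []
    for target in targets:
        answer = ask("Apply to '%s' (y/n/q)? " % target)
        if answer in ('y', 'Y'):
            action(target)
            done.append(target)
        elif answer in ('q', 'Q'):
            break
    return done

# scripted answers stand in for a real single-key prompt
answers = iter(['y', 'n', 'q'])
handled = apply_with_confirmation(
    ['a.pyc', 'b.pyc', 'c.pyc', 'd.pyc'],
    action=lambda t: None,
    ask=lambda question: next(answers))
print(handled)  # ['a.pyc']
```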
A newer version with getch (single key confirmation) functionality coming courtesy of http://code.activestate.com/recipes/134892/
AK
I've heard about using '_svn' instead of '.svn' directories with some IDEs (MS Visual Studio?).
I'm not aware of this, but _svn should work as well. Just do 'clean _svn'
New version with glob pattern matching just uploaded.
Very nice, but it will throw an exception on Windows. Please refer to http://trac.pythonpaste.org/pythonpaste/ticket/359, add an onerror handler function, and change line #164 to resolve the problem.
164: shutil.rmtree(path, onerror=onerror)
def onerror(func, path, exc_info): """ Error handler for shutil.rmtree.
Thanks to Kun for the bug report about Windows and suggested fix which I've included in the latest version.
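The fix discussed here relies on the onerror callback of shutil.rmtree. Below is a self-contained sketch of that pattern; the handler name make_writable_and_retry is made up for illustration, and the demo tree is a throwaway temporary directory. Note that newer Python releases (3.12 and later) deprecate onerror in favour of onexc, so this follows the older interface the recipe uses.

```python
import os
import shutil
import stat
import tempfile

def make_writable_and_retry(func, path, exc_info):
    """rmtree error handler: clear a read-only flag and retry, else re-raise."""
    if not os.access(path, os.W_OK):
        # likely a read-only file: add write permission and try again
        os.chmod(path, stat.S_IWUSR)
        func(path)
    else:
        # some other failure: propagate the original exception
        raise exc_info[1]

# build a small tree containing a read-only file
root = tempfile.mkdtemp()
locked = os.path.join(root, 'readonly.txt')
with open(locked, 'w') as fh:
    fh.write('data')
os.chmod(locked, stat.S_IREAD)

shutil.rmtree(root, onerror=make_writable_and_retry)
print(os.path.exists(root))
```

On Unix-like systems a read-only file inside a writable directory can usually be removed without the handler ever being called; the handler matters mainly on Windows, where the read-only attribute blocks deletion.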
New features:
python 2.6+ and 3.+ compatible
added report on cumulative size of files involved in cleaning operations
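As an illustration of the cumulative size report, here is a small standalone sketch. The function cumulative_size is hypothetical and counts matching files only, whereas the script accumulates sizes for every matched path while walking.

```python
import os
import shutil
import tempfile

def cumulative_size(root, suffixes):
    """Sum the sizes in bytes of files under `root` ending in any suffix."""
    total = 0
    for folder, dirs, files in os.walk(root):
        for name in files:
            if any(name.endswith(s) for s in suffixes):
                total += os.path.getsize(os.path.join(folder, name))
    return total

# throwaway directory with files of known size
root = tempfile.mkdtemp()
for name, nbytes in [('a.pyc', 2048), ('b.pyc', 1024), ('keep.py', 512)]:
    with open(os.path.join(root, name), 'wb') as fh:
        fh.write(b'x' * nbytes)

size = cumulative_size(root, ['.pyc'])
print('%sK' % int(round(size / 1024.0)))  # 3K
shutil.rmtree(root)
```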