This utility was born from the fact that I keep forgetting how to use "sed", and I suck at Perl. It brings ad-hoc command-line piping sensibilities to the Python interpreter. (Version 1.2 does better outputting of list-like results, thanks to Mark Eichin.)
#!/usr/bin/env python

# updated 2005.07.21, thanks to Jacob Oscarson
# updated 2006.03.30, thanks to Mark Eichin

import sys
import re
import getopt

# parse options for module imports
opts, args = getopt.getopt(sys.argv[1:], 'm:')
opts = dict(opts)
if '-m' in opts:
    for imp in opts['-m'].split(','):
        locals()[imp] = __import__(imp.strip())

cmd = ' '.join(args)
if not cmd.strip():
    cmd = 'line'                        # no-op

codeobj = compile(cmd, 'command', 'eval')
write = sys.stdout.write

for numz, line in enumerate(sys.stdin):
    line = line[:-1]
    num = numz + 1
    words = [w for w in line.strip().split(' ') if len(w)]
    result = eval(codeobj, globals(), locals())
    if result is None or result is False:
        continue
    elif isinstance(result, list) or isinstance(result, tuple):
        result = ' '.join(map(str, result))
    else:
        result = str(result)
    write(result)
    if not result.endswith('\n'):
        write('\n')
Save the script as 'pyline' somewhere on your path, e.g. /usr/local/bin/pyline, and make it executable (e.g. chmod +x /usr/local/bin/pyline).
When working at the command line, it's very useful to pipe multiple commands together. Common tools used in pipes include 'head' (show the top lines of a file), 'tail' (show the bottom lines), 'grep' (search the text for a pattern), 'sed' (reformat the text), etc. However, Python is found lacking in this regard, because it's hard to write the kind of one-liner that works well in an ad-hoc pipe statement.
Pyline tries to solve this problem. Use pyline to apply a Python expression to every line of standard input, and return a value to be sent to standard output. The expression can use any installed Python modules. In the context of the expression, the variable "line" holds the string value of the line; "words" is a list of all the non-empty, space-separated words; and "num" is the line number (starting with 1).
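To make those three names concrete, here is a minimal sketch of how they could be built for a single input line. The helper name pyline_context is hypothetical (it is not part of the script), and it assumes the same splitting rule as the script: split on single spaces and drop the empty pieces.

```python
def pyline_context(raw_line, index):
    """Build the names an expression sees for one input line (index is 0-based)."""
    # The script drops the last character unconditionally; this sketch only
    # drops it when it is a newline, which differs for an unterminated last line.
    line = raw_line[:-1] if raw_line.endswith("\n") else raw_line
    # Non-empty, space-separated words (runs of spaces produce no empty words).
    words = [w for w in line.strip().split(" ") if len(w)]
    return {"line": line, "words": words, "num": index + 1}


ctx = pyline_context("GET  /index.html HTTP/1.0\n", 0)
# Evaluate an expression against those names, roughly as pyline does.
print(eval("words[1]", {}, ctx))
```

Note that the split is on spaces only, so tab-separated fields end up inside a single "word".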
Here are a few examples:
Print out the first 20 characters of every line in the tail of my Apache access log:
tail access_log | pyline "line[:20]"
Print just the URLs in the access log (the seventh "word" in the line):
tail access_log | pyline "words[6]"
Here's a trickier one, showing how to do an import. List the current directory, showing only files that are larger than 1 kilobyte:
ls | pyline -m os "os.path.isfile(line) and os.stat(line).st_size > 1024 and line"
I didn't say it was pretty. ;-) The "-m a,b,c" option imports modules a, b and c for use in the subsequent expression. The "isfile and stat and line" form shows how to do filtering: if the expression evaluates to False or None, no line is sent to stdout.
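The filtering and output rules can be sketched as a small standalone function. The name render is hypothetical; the logic mirrors the result handling in the script's main loop, returning None where the script would skip the line.

```python
def render(result):
    """Return the text pyline would write for a result, or None to skip the line."""
    # Only the exact values None and False suppress output.
    if result is None or result is False:
        return None
    # Lists and tuples are joined with single spaces.
    if isinstance(result, (list, tuple)):
        text = " ".join(map(str, result))
    else:
        text = str(result)
    # A trailing newline is added unless the text already ends with one.
    return text if text.endswith("\n") else text + "\n"


for value in (False, None, 0, "", ["a", 1], "kept"):
    print(repr(value), "->", repr(render(value)))
```

One consequence worth knowing: other falsy values such as 0 or an empty string are not filtered. An expression like "len(words) and line" evaluates to 0 on a blank line, so a "0" is printed rather than the line being dropped.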
This last tricky example re-implements the 'md5sum' command, to return the MD5 digest values of all the .py files in the current directory.
ls *.py | pyline -m md5 "'%s %s' % (md5.new(file(line).read()).hexdigest(), line)"
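The expression above relies on the md5 module and the file() builtin, which no longer exist in Python 3. As a sketch of the same idea on a current interpreter, assuming the standard hashlib module is acceptable, the per-line work could look like this. The function name md5sum_line is hypothetical, used only for illustration.

```python
import hashlib


def md5sum_line(path):
    """Format one output line: the MD5 hex digest of the file, then its name."""
    # Read in binary mode so the digest is computed over the raw bytes.
    with open(path, "rb") as handle:
        digest = hashlib.md5(handle.read()).hexdigest()
    return "%s %s" % (digest, path)
```

With pyline itself, the equivalent one-liner would presumably pass "-m hashlib" and use hashlib.md5(open(line, 'rb').read()).hexdigest() inside the expression.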
Hopefully you get the idea. I've found it to be an invaluable addition to my command-line toolkit.
Windows users: it works under Windows, but name it "pyline.py" instead of "pyline", and call it via a batch file so that the piping works properly.