This recipe shows how to execute a Unix shell command and capture its output and error streams in Python. By contrast, os.system() leaves both streams attached to the calling process's own stdout and stderr (typically the terminal), so the caller cannot capture them.

Python, 45 lines
import os, popen2, fcntl, FCNTL, select

def makeNonBlocking(fd):
    fl = fcntl.fcntl(fd, FCNTL.F_GETFL)
    try:
        fcntl.fcntl(fd, FCNTL.F_SETFL, fl | FCNTL.O_NDELAY)
    except AttributeError:
        fcntl.fcntl(fd, FCNTL.F_SETFL, fl | FCNTL.FNDELAY)


def getCommandOutput(command):
    child = popen2.Popen3(command, 1) # capture stdout and stderr from command
    child.tochild.close()             # don't need to talk to child
    outfile = child.fromchild
    outfd = outfile.fileno()
    errfile = child.childerr
    errfd = errfile.fileno()
    makeNonBlocking(outfd)            # don't deadlock!
    makeNonBlocking(errfd)
    outdata = errdata = ''
    outeof = erreof = 0
    while 1:
        ready = select.select([outfd,errfd],[],[]) # wait for input
        if outfd in ready[0]:
            outchunk = outfile.read()
            if outchunk == '': outeof = 1
            outdata = outdata + outchunk
        if errfd in ready[0]:
            errchunk = errfile.read()
            if errchunk == '': erreof = 1
            errdata = errdata + errchunk
        if outeof and erreof: break
        select.select([],[],[],.1) # give a little time for buffers to fill
    err = child.wait()
    if err != 0:
        raise RuntimeError, '%s failed w/ exit code %d\n%s' % (command, err, errdata)
    return outdata

def getCommandOutput2(command):
    child = os.popen(command)
    data = child.read()
    err = child.close()
    if err:
        raise RuntimeError, '%s failed w/ exit code %d' % (command, err)
    return data

The presented getCommandOutput(command) function will execute a command and will return the command's output. If the command fails, an exception will be raised with the text captured from the command's stderr.
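
For readers on current Python 3, where the popen2 module no longer exists, the same contract (return the output, raise with the stderr text on failure) can be sketched with the standard subprocess module. This is a minimal sketch rather than the recipe author's code; the name get_command_output is a hypothetical rename, and capture_output assumes Python 3.7 or later.

```python
import subprocess

def get_command_output(command):
    """Run a shell command; return its stdout, or raise with its stderr."""
    # shell=True mirrors the recipe, which hands a command string to the shell.
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError('%s failed w/ exit code %d\n%s'
                           % (command, result.returncode, result.stderr))
    return result.stdout
```

subprocess.run() reads both pipes concurrently on the caller's behalf, so the deadlock concern is handled inside the library.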

Most of the complexity of this code is due to the difficulty of capturing both the output and error streams of the child process at the same time. Normal (blocking) read calls may deadlock if the child is trying to write to one stream while the parent is waiting for data on the other, so the streams must be set to non-blocking and select() must be used to wait for data on them.

Note: the second select call adds a 0.1 second sleep after each read. This (counterintuitively) allows the code to run much faster since it gives the child time to put more data in the buffer. Without this, the parent may try to read only a few chars at a time which can be very expensive.

If you only want to capture the output and don't mind the error stream going to the terminal, you can use the much simpler code presented in getCommandOutput2. If you want to suppress the error stream altogether, you can append '2>/dev/null' to the command (e.g. 'ls -1 2>/dev/null').
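
As a rough modern counterpart to the simpler variant, the sketch below captures only stdout and discards stderr inside Python instead of appending '2>/dev/null' to the command. The function name get_stdout_only is hypothetical, and subprocess.DEVNULL assumes Python 3.3 or later.

```python
import subprocess

def get_stdout_only(command):
    """Return a command's stdout; discard stderr instead of displaying it."""
    result = subprocess.run(command, shell=True, text=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)  # same effect as 2>/dev/null
    if result.returncode != 0:
        raise RuntimeError('%s failed w/ exit code %d'
                           % (command, result.returncode))
    return result.stdout
```

Omitting the stderr argument altogether would let the error stream go to the terminal, as getCommandOutput2 does.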

Note: Python (as of version 2.0) now includes the os.popen4 function which combines the output and error streams of the child process. However, the streams are combined in a potentially very messy way depending on how they are buffered in the child process.
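
In current Python 3 the combined-stream behaviour can be approximated by redirecting stderr into the stdout pipe. The sketch below is an illustration under that assumption (get_combined_output is a hypothetical name); the interleaving caveat still applies, since the order of the merged text depends on how the child buffers each stream.

```python
import subprocess

def get_combined_output(command):
    """Return (exit_code, text) with stderr merged into stdout."""
    result = subprocess.run(command, shell=True, text=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)  # both streams share one pipe
    return result.returncode, result.stdout
```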

7 comments

Bradey Honsinger 20 years, 1 month ago

Empty Select Necessary? Wouldn't a call to time.sleep(0.1) work as well as the select([],[],[],.1) call? I might be missing something, but it seems like the only thing a call to select with three empty lists will do is sleep until the timeout.

Tobias Polzin 19 years, 3 months ago

Concatenating Output (Slightly) Inefficient. The program's output can be concatenated in a more efficient way:

    outdata = []
[...]
      outdata.append(outchunk)
[...]
    return string.join(outdata,"")

With outdata=outdata+outchunk, the data is copied over and over. This is probably not a real problem, but at least in theory it may lead to quadratic running time. (The same is true, of course, for errdata...)
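
The same idea in Python 3, where string.join() is gone and pipe data arrives as bytes, might look like the sketch below. The helper name collect and its chunks parameter are hypothetical stand-ins for the read loop in the recipe.

```python
def collect(chunks):
    """Accumulate byte chunks in a list and join them once at the end."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)    # appending does not copy earlier data
    return b"".join(parts)     # one final copy of everything
```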

Tobias Polzin 19 years, 3 months ago

Different Approach: Using Tempfiles. I found the approach using tempfiles better:

import os, tempfile

class save_popen2:
    """This is a deadlock-safe version of popen2 (no stdin) that returns
    an object with errorlevel, out, and err"""
    def __init__(self,command):
        outfile=tempfile.mktemp()
        errfile=tempfile.mktemp()
        # the shell redirects both streams to files, so no pipe can fill up
        self.errorlevel=os.system("( %s ) > %s 2> %s" %
                             (command,outfile,errfile)) >> 8
        self.out=open(outfile,"r").read()
        self.err=open(errfile,"r").read()
        os.remove(outfile)
        os.remove(errfile)
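
A Python 3 variant of the temp-file idea is sketched below, assuming the subprocess module is acceptable in place of os.system(). It uses tempfile.TemporaryFile because tempfile.mktemp() is deprecated and open to a race between choosing the name and creating the file. The function name run_with_tempfiles is hypothetical.

```python
import subprocess
import tempfile

def run_with_tempfiles(command):
    """Return (exit_code, stdout_text, stderr_text) using temporary files."""
    # TemporaryFile creates each file securely and deletes it on close.
    with tempfile.TemporaryFile() as out, tempfile.TemporaryFile() as err:
        code = subprocess.call(command, shell=True, stdout=out, stderr=err)
        out.seek(0)   # the child wrote through the shared file offset
        err.seek(0)
        return code, out.read().decode(), err.read().decode()
```

Because the child writes to files rather than pipes, neither stream can fill a pipe buffer and stall, which is the property the class above relies on.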

Jonathan Cano 17 years, 8 months ago

Cygwin and FCNTL.FNDELAY. Regarding this code:

def makeNonBlocking(fd):
    fl = fcntl.fcntl(fd, FCNTL.F_GETFL)
    try:
        fcntl.fcntl(fd, FCNTL.F_SETFL, fl | FCNTL.O_NDELAY)
    except AttributeError:
        fcntl.fcntl(fd, FCNTL.F_SETFL, fl | FCNTL.FNDELAY)

I don't see the .FNDELAY object defined in either the "fcntl" module or the "os" module.

What should I do here to adapt the recipe code to cygwin?

I'm using cygwin on windows 2000 and windows XP.

 (6:0) $ python
Python 2.2.3 (#1, Jun  8 2003, 14:58:23)
[GCC 3.2 20020927 (prerelease)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

~

 (7:0) $ bash --version
GNU bash, version 2.05b.0(9)-release (i686-pc-cygwin)
Copyright (C) 2002 Free Software Foundation, Inc.
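
On current Python 3 there is no separate FCNTL module: the F_GETFL and F_SETFL constants live in fcntl, and the non-blocking flag is available as os.O_NONBLOCK. The sketch below shows that spelling for POSIX-like platforms; it has not been checked against the Cygwin setup described above, and make_non_blocking is a hypothetical rename of the recipe's helper.

```python
import fcntl
import os

def make_non_blocking(fd):
    """Set O_NONBLOCK on a file descriptor (POSIX-like platforms only)."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
```

From Python 3.5 onward, os.set_blocking(fd, False) does the same thing in one call.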

Pádraig Brady 17 years, 5 months ago

Major performance bug.

I based my program on getCommandOutput for a while, but the performance just got worse with newer versions of Linux (different implementations of fork, I suppose). Anyway, I had a closer look, and getCommandOutput is needlessly looping over the descriptors, i.e. you need to remove outfd or errfd from the select if outeof==1 or erreof==1 respectively. Then you can remove the select(....,.1) kludge.

I've done my own extensively tested version here, which has the added functionality of an optional timeout:

http://www.pixelbeat.org/libs/subProcess.py

Donovan Baarda 16 years, 5 months ago

Several problems with this recipe. There are several problems with this recipe... I know, I've made them all :-) I even wrote up a draft PEP addressing what I saw as Python problems, which I abandoned after I understood things better. See the following thread:

http://mail.python.org/pipermail/python-dev/2005-March/052263.html

The problems are:

1) Using file objects in non-blocking mode is bad. File objects' behaviour is not clearly defined in non-blocking mode. Sure, file.read() kinda works, but file.write() is broken, so you would hit serious problems with this approach if you also had to write data to the popened process. You should use os.read() and os.write() instead.

2) You don't need non-blocking mode. The os.read() and os.write() functions will do incomplete reads/writes in blocking mode. They will only block if nothing at all could be read/written. However, you should never attempt to read/write to a blocked file, because select() tells you which files can be at least partially read/written and hence will not block.

3) As pointed out by Pádraig Brady, it is bad to include "finished" files in the select() lists. When os.read() returns nothing, you have reached the end of the input, so close it and remove it from the select() list for any further iterations. The same goes for output files; close and remove them when you have written all your data. Otherwise your select statement immediately completes, returning the empty file, and your polling loop races around doing nothing until one of the other files also has something.

4) That second select is ugly. As Bradey Honsinger said, a sleep would have been better to achieve what you wanted here. However, I suspect that the main reason you saw speedups by adding this is that you reduce the "loop racing" explained by point 3).

This recipe needs to be re-written...
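
The four points above can be sketched in Python 3 roughly as follows: blocking descriptors, os.read() instead of file objects, finished descriptors dropped from the select() list, and no sleep. This is one possible reading of the advice, not the commenter's own code; the name get_command_output_select, the 65536-byte read size, and the use of subprocess.Popen in place of popen2 are assumptions.

```python
import os
import select
import subprocess

def get_command_output_select(command):
    """Return the command's stdout as bytes; raise RuntimeError on failure."""
    child = subprocess.Popen(command, shell=True, stdin=subprocess.DEVNULL,
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd = child.stdout.fileno()
    err_fd = child.stderr.fileno()
    chunks = {out_fd: [], err_fd: []}   # per-descriptor list of byte chunks
    open_fds = [out_fd, err_fd]         # descriptors that have not hit EOF
    while open_fds:
        ready, _, _ = select.select(open_fds, [], [])
        for fd in ready:
            data = os.read(fd, 65536)   # returns whatever is available now
            if data:
                chunks[fd].append(data)
            else:
                open_fds.remove(fd)     # EOF: never select on it again
    child.stdout.close()
    child.stderr.close()
    status = child.wait()
    if status != 0:
        errdata = b"".join(chunks[err_fd]).decode(errors="replace")
        raise RuntimeError('%s failed w/ exit code %d\n%s'
                           % (command, status, errdata))
    return b"".join(chunks[out_fd])
```

Since select() only reports descriptors that have data or have reached end of file, each os.read() returns immediately even though the pipes stay in blocking mode.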

Øyvind Hvamstad 16 years, 4 months ago

Windows equivalent? I tried a slightly modified version of this recipe. It works well on Linux, but I have a hard time making it work on Windows. The main problem is that Windows doesn't allow select on files. I only do this for stdout and stderr, and these are pipes, right? In Linux pipes are just exposed as files, but they are really sockets. Does anyone know how to make it work on Windows?
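
One portable way around this is to avoid select() entirely, since on Windows it accepts only sockets, and instead drain each pipe from its own thread. The sketch below assumes Python 3 and the subprocess module; get_command_output_threads is a hypothetical name, and with shell=True the command string is interpreted by the platform's own shell, so its syntax differs between Linux and Windows.

```python
import subprocess
import threading

def get_command_output_threads(command):
    """Return (exit_code, stdout_bytes, stderr_bytes) without select()."""
    child = subprocess.Popen(command, shell=True, stdin=subprocess.DEVNULL,
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    results = {}

    def drain(name, pipe):
        results[name] = pipe.read()   # blocks until the child closes the pipe
        pipe.close()

    readers = [threading.Thread(target=drain, args=("out", child.stdout)),
               threading.Thread(target=drain, args=("err", child.stderr))]
    for reader in readers:
        reader.start()
    for reader in readers:
        reader.join()
    return child.wait(), results["out"], results["err"]
```

Each thread blocks only on its own pipe, so a child that fills one stream cannot stall the reading of the other. Popen.communicate() gives the same result in a single call.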