Real-time intercepting of stdout from another process in Python


Question

I'd like to run a system process, intercept its output, and modify it in real time, line by line, in a Python script.

My best attempt, which waits for the process to complete before printing, is:

#!/usr/bin/env python
import subprocess

cmd = "waitsome.py"
proc = subprocess.Popen(cmd, shell=True, bufsize=256, stdout=subprocess.PIPE)
for line in proc.stdout:
    print ">>> " + line.rstrip()

The script waitsome.py simply prints a line every half a second:

#!/usr/bin/env python
import time
from sys import stdout

print "Starting"
for i in range(0,20):
    time.sleep(0.5)
    print "Hello, iteration", i
    stdout.flush()

Is there an easy solution to get subprocess to allow iterating over the output in real time? Do I have to use threads?

Once upon a time, I scripted in Perl, and this was a piece of cake:

open(CMD, "waitsome.py |");
while (<CMD>) {
    print ">>> $_";
}
close(CMD);
7/5/2009 11:31:08 PM

Accepted Answer

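Looping over a file object with a for statement goes through an internal read-ahead buffer in Python 2.*, so lines arrive in fairly large chunks instead of as the child writes them -- a known issue with all Python 2.* implementations. It works as you intend in Python 3.1, with the final loop being slightly different (the pipe yields bytes there, so each line is decoded before printing):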

for line in proc.stdout:
    print(">>> " + line.decode().rstrip())
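
For reference, here is a self-contained sketch of that Python 3 loop. It assumes Python 3.7 or later (for the text=True argument, which decodes the output for you); the inline child script is a shortened stand-in for waitsome.py, and the name prefix_lines is made up for illustration:

```python
import subprocess
import sys

# Hypothetical stand-in for waitsome.py: print a line, flush, sleep briefly.
CHILD = (
    "import sys, time\n"
    "for i in range(3):\n"
    "    print('Hello, iteration', i)\n"
    "    sys.stdout.flush()\n"
    "    time.sleep(0.1)\n"
)

def prefix_lines(argv, prefix=">>> "):
    """Yield each output line of the child, prefixed, as it is read."""
    # text=True decodes bytes to str; bufsize=1 asks for line buffering
    # on the parent's side of the pipe.
    with subprocess.Popen(argv, stdout=subprocess.PIPE,
                          text=True, bufsize=1) as proc:
        for line in proc.stdout:
            yield prefix + line.rstrip()

# Run the child with the same interpreter that runs this script.
for out in prefix_lines([sys.executable, "-c", CHILD]):
    print(out)
```

Passing the command as a list avoids shell=True; the child still has to flush its own output, as waitsome.py does, or nothing reaches the pipe until its buffer fills.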

If upgrading to Python 3.1 is impractical (and I know it will often be!), go the other way and write the loop in an old-fashioned manner -- the following version of the loop does work as you intend in Python 2.*:

while True:
    line = proc.stdout.readline()
    if not line:
        break
    print ">>> " + line.rstrip()
7/6/2009 1:51:07 AM

This whole readline loop can be encapsulated in a generator:

def subprocess_readlines(out):
    while True:
        line = out.readline()
        if not line:
            return
        yield line

And called as:

for line in subprocess_readlines(proc.stdout):
    print ">>>", line.rstrip()
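
The generator above does the same job as the two-argument form of the built-in iter(), which keeps calling readline until it returns the sentinel value. A minimal sketch in Python 3 syntax, where the pipe yields bytes and the sentinel is b""; the helper name read_lines and the inline child script (a shortened stand-in for waitsome.py) are illustrative assumptions, not part of the original answers:

```python
import subprocess
import sys

# Hypothetical stand-in for waitsome.py, shortened to three iterations.
CHILD = (
    "import sys, time\n"
    "for i in range(3):\n"
    "    print('Hello, iteration %d' % i)\n"
    "    sys.stdout.flush()\n"
    "    time.sleep(0.1)\n"
)

def read_lines(argv):
    """Yield the child's output lines one at a time, decoded and stripped."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
    try:
        # iter(callable, sentinel) calls proc.stdout.readline() repeatedly
        # and stops when it returns b"", which signals end of file.
        for raw in iter(proc.stdout.readline, b""):
            yield raw.decode().rstrip()
    finally:
        # Release the pipe and reap the child even if the caller stops early.
        proc.stdout.close()
        proc.wait()

for line in read_lines([sys.executable, "-c", CHILD]):
    print(">>> " + line)
```

Under Python 2 the iter(proc.stdout.readline, b"") expression should behave the same way, since b"" and "" are the same object type there.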

Licensed under: CC-BY-SA with attribution
Not affiliated with: Stack Overflow