Unstable/inconsistent behaviour of progress.py example code

Issue #39 resolved
CDuv created an issue

In want to use the sarge library to execute a simple cmd1 | cmd2 shell command from my Python script an grab it's output (both stdout and stderr) live while the shell command executes. I was previously using subprocess.Popen() + subprocess.PIPE + communicate() but the output was buffered.

As a working base I tested the progress.py code detailed in the tutorial.

File "test_sarge_live.py":

import optparse # because of 2.6 support
import sys
import threading
import time
import logging

from sarge import capture_stdout

logging.basicConfig(level=logging.DEBUG, filename='/tmp/test_sarge_live.log',
                    filemode='w', format='%(asctime)s %(threadName)-10s %(name)-15s %(lineno)4d %(message)s')

def progress(capture, options):
    lines_seen = 0
    messages = {
        'line 25\n': 'Getting going ...\n',
        'line 50\n': 'Well on the way ...\n',
        'line 75\n': 'Almost there ...\n',
    }
    while True:
        s = capture.readline()
        if not s and lines_seen:
            break
        if options.dots:
            sys.stderr.write('.')
        else:
            msg = messages.get(s)
            if msg:
                sys.stderr.write(msg)
        lines_seen += 1
    if options.dots:
        sys.stderr.write('\n')
    sys.stderr.write('Done - %d lines seen.\n' % lines_seen)

def main():
    parser = optparse.OptionParser()
    parser.add_option('-n', '--no-dots', dest='dots', default=True,
                      action='store_false', help='Show dots for progress')
    options, args = parser.parse_args()

    #~ p = capture_stdout('ncat -k -l -p 42421', async=True)
    p = capture_stdout('python lister.py -d 0.1 -c 100', async=True)

    t = threading.Thread(target=progress, args=(p.stdout, options))
    t.start()

    while(p.returncodes[0] is None):
        # We could do other useful work here. If we have no useful
        # work to do here, we can call readline() and process it
        # directly in this loop, instead of creating a thread to do it in.
        p.commands[0].poll()
        time.sleep(0.05)
    t.join()

if __name__ == '__main__':
    sys.exit(main())

But running it gives very inconsistent output:

On Ubuntu v16.04 using Python v2.7.12 and sarge v0.1.4 I get theses (at random):

  • This ('NoneType' object has no attribute 'returncode' + sys.excepthook is missing)
Traceback (most recent call last):
File "test_sarge_live.py", line 55, in <module>
  sys.exit(main())
File "test_sarge_live.py", line 46, in main
  while(p.returncodes[0] is None):
File "/usr/local/lib/python2.7/dist-packages/sarge/__init__.py", line 1072, in returncodes
  result = [c.process.returncode for c in self.commands]
AttributeError: 'NoneType' object has no attribute 'returncode'
..
Done - 2 lines seen.
close failed in file object destructor:                                                                                                                                                                                                                         
sys.excepthook is missing
lost sys.stderr
  • Or also this (no 'NoneType' object has no attribute 'returncode')
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
  • Or even this:
Traceback (most recent call last):
.  File "test_sarge_live.py", line 55, in <module>

sys.exit(main())Done - 1 lines seen.

File "test_sarge_live.py", line 46, in main
  while(p.returncodes[0] is None):
IndexError: list index out of range
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
File "/usr/lib/python2.7/threading.py", line 754, in run
File "/usr/local/lib/python2.7/dist-packages/sarge/__init__.py", line 1136, in run_node
File "/usr/local/lib/python2.7/dist-packages/sarge/__init__.py", line 1282, in run_command_node
File "/usr/local/lib/python2.7/dist-packages/sarge/__init__.py", line 656, in run
File "/usr/lib/python2.7/threading.py", line 585, in set
File "/usr/lib/python2.7/threading.py", line 407, in notifyAll
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
  • And sometimes this no failure output:
.
Done - 1 lines seen.

I also tested on Docker container Debian v9.4 using Python v2.7.13 and sarge v0.1.4: I get the same outputs.

When I kill (via [Ctrl]+[C]) one instance and immediately run the script again I get:

.Traceback (most recent call last):
  File "test_sarge_live.py", line 55, in <module>
    sys.exit(main())
  File "test_sarge_live.py", line 46, in main
    while(p.returncodes[0] is None):
IndexError: list index out of range

Done - 1 lines seen.
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
  File "/usr/lib/python2.7/threading.py", line 754, in run
  File "/usr/local/lib/python2.7/dist-packages/sarge/__init__.py", line 1136, in run_node
  File "/usr/local/lib/python2.7/dist-packages/sarge/__init__.py", line 1282, in run_command_node
  File "/usr/local/lib/python2.7/dist-packages/sarge/__init__.py", line 656, in run
  File "/usr/lib/python2.7/threading.py", line 585, in set
  File "/usr/lib/python2.7/threading.py", line 407, in notifyAll
<type 'exceptions.TypeError'>: 'NoneType' object is not callable
root@1ad9fddcff28:/app# close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr

but I think this is normal.

Is the tutorial code still accurate?

Comments (8)

  1. Vinay Sajip repo owner

    No, I think the tutorial example is flawed. The problem is caused by a race condition - if you insert e.g. a time.sleep(0.01) before the t = threading.Thread(target=progress, args=(p.stdout, options)) line, it should work as expected. I'll update the tutorial when I get a chance. Try it, and report back if it still doesn't behave.

  2. CDuv reporter

    Adding time.sleep(0.01) before t = threading.Thread(target=progress, args=(p.stdout, options)) made the script behave "consistently" and without any error. Thanks :)

  3. CDuv reporter

    Well, the behaviour is consistent, it always and only outputs:

    .
    Done - 1 lines seen.
    

    There are missing lines (because python lister.py -d 0.1 -c 100 does indeed prints 100 lines). It looks like progress.py does not catch the full stdout of lister.py.

  4. Vinay Sajip repo owner

    Try changing that time.sleep value, and have a look at the log. Perhaps the timeouts need tweaking. On my machine, after I made the change, the test script ran and showed the 100 dots.

  5. CDuv reporter

    I noticed the python progress.py execution time is always ~10s just like python lister.py -d 0.1 -c 100.

    Sleeping 0.1s gives me 2 dots/lines (but not always), sleeping 5s gives me 50 dots/lines, sleeping 10s or more gives me my 100 dots/lines.

    Log file does contains queued chunk of length X: 'line Y\n':

    13:29:20,203 MainThread sarge.parse  810 starting parse of 'python lister.py -d 0.1 -c 100'
    13:29:20,203 MainThread sarge.parse  952 returning CommandNode(command=['python'] redirects={})
    13:29:20,203 MainThread sarge.parse  952 returning CommandNode(command=['lister.py'] redirects={})
    13:29:20,203 MainThread sarge.parse  952 returning CommandNode(command=['-d'] redirects={})
    13:29:20,203 MainThread sarge.parse  952 returning CommandNode(command=['0.1'] redirects={})
    13:29:20,203 MainThread sarge.parse  952 returning CommandNode(command=['-c'] redirects={})
    13:29:20,203 MainThread sarge.parse  952 returning CommandNode(command=['100'] redirects={})
    13:29:20,203 MainThread sarge.parse  895 returning CommandNode(command=['python', 'lister.py', '-d', '0.1', '-c', '100'] redirects={})
    13:29:20,203 MainThread sarge.parse  878 returning CommandNode(command=['python', 'lister.py', '-d', '0.1', '-c', '100'] redirects={})
    13:29:20,203 MainThread sarge.parse  849 returning CommandNode(command=['python', 'lister.py', '-d', '0.1', '-c', '100'] redirects={})
    13:29:20,204 MainThread sarge.parse  833 returning CommandNode(command=['python', 'lister.py', '-d', '0.1', '-c', '100'] redirects={})
    13:29:20,219 MainThread sarge       1022 thread Thread-1 started to run node: CommandNode(command=['python', 'lister.py', '-d', '0.1', '-c', '100'] redirects={})
    13:29:20,220 Thread-1   sarge       1266 started: CommandNode(command=['python', 'lister.py', '-d', '0.1', '-c', '100'] redirects={}), None, True
    13:29:20,220 Thread-1   sarge        595 Command('python lister.py -d 0.1 -c 100') created
    13:29:20,220 Thread-1   sarge        631 About to call Popen: ['python', 'lister.py', '-d', '0.1', '-c', '100'], {'stdin': None, 'stderr': None, 'stdout': -1}
    13:29:20,221 Thread-1   sarge        634 Popen: Command('python lister.py -d 0.1 -c 100'), {'stdin': None, 'stderr': None, 'stdout': -1} -> {'_child_created': True, 'returncode': None, 'stdout': <open file '<fdopen>', mode 'rb' at 0x7f0a1684f810>, 'stdin': None, 'pid': 746, 'stderr': None, 'universal_newlines': False}
    13:29:20,222 Thread-1   sarge        212 Created thread Thread-2 as reader for Capture-1
    13:29:20,240 Thread-1   sarge        216 Capture-1: reader thread kicked off, waiting start
    13:29:20,240 Thread-1   sarge        218 Capture-1: reader thread now started
    13:29:20,240 Thread-1   sarge        657 returning Command('python lister.py -d 0.1 -c 100') (Popen(returncode=None stdin=None stdout=<open file '<fdopen>', mode 'rb' at 0x7f0a1684f810> stderr=None))
    13:29:20,240 Thread-2   sarge        234 Capture-1: reader thread about to read line
    13:29:20,247 Thread-2   sarge        244 queued chunk of length 7: 'line 1\n'
    13:29:20,347 Thread-2   sarge        244 queued chunk of length 7: 'line 2\n'
    [...]                               
    13:29:30,072 Thread-2   sarge        244 queued chunk of length 8: 'line 99\n'
    13:29:30,175 Thread-2   sarge        244 queued chunk of length 9: 'line 100\n'
    13:29:30,176 Thread-2   sarge        253 Capture-1: finished reading stream <open file '<fdopen>', mode 'rb' at 0x7f0a1684f810>
    

    Very weird...

  6. Vinay Sajip repo owner

    Weird indeed. I've now run the test_progress.py script on Windows, OS X and Linux (Ubuntu 16.04 x64), with no unexpected results on Python 2.x and 3.x except for buffering on Python 3.x (resolved with a call to sys.stderr.flush(). I presume you've tried with the latest repo version?

  7. CDuv reporter

    The progress.py I was using was taken from the documentation, and a little outdated

    I got it working just fine by adding the timeout=1.0 keyword argument to capture.readline() (as the latest test_progress.py has), still using v0.1.4 of the library.

    (Now, back to my main goal...)

    Thanks.

  8. Log in to comment