MaskPrimers throws "Error in sibling process detected" errors during Illumina MiSeq 2x250 BCR mRNA example

Issue #41 resolved
Rian Stockbower created an issue

On line 4. Using Windows. Python 3.5.1. Here's the trace:

M:\dump\presto
> MaskPrimers.py score -s M1_quality-pass.fastq -p VPrimers.fasta --start 4 --mode mask --outname M1-FWD --log MPV.log
      START> MaskPrimers
    COMMAND> score
   SEQ_FILE> M1_quality-pass.fastq
PRIMER_FILE> VPrimers.fasta
       MODE> mask
    BARCODE> False
  MAX_ERROR> 0.2
  START_POS> 4
 REV_PRIMER> False
      NPROC> 4

Error processing sequence with ID: ERR346600.1.
PID 10020:  Error in sibling process detected. Cleaning up.
Process Process-2:
Traceback (most recent call last):
  File "c:\program files\python35\lib\multiprocessing\process.py", line 254, in _bootstrap
    self.run()
  File "c:\program files\python35\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Program Files\Python35\Scripts\MaskPrimers.py", line 384, in processMPQueue
    align = align_func(in_seq, **align_args)
  File "C:\Program Files\Python35\Scripts\MaskPrimers.py", line 244, in scorePrimers
    score = sum([score_dict[(c1, c2)] for c1, c2 in chars])
  File "C:\Program Files\Python35\Scripts\MaskPrimers.py", line 244, in <listcomp>
    score = sum([score_dict[(c1, c2)] for c1, c2 in chars])
KeyError: ('T', '\xc2')
PID 8072:  Error in sibling process detected. Cleaning up.
PID 3392:  Error in sibling process detected. Cleaning up.
PID 7416:  Error in sibling process detected. Cleaning up.
PID 10960:  Error in sibling process detected. Cleaning up.

The first two steps went fine.

Python doesn't cleanly exit after this; I have to manually kill the process using process explorer.

Comments (7)

  1. Jason Vander Heiden

    Thanks for reporting this, @rianjs. This is due to some junk characters in the example V-primer files, which we had supposedly fixed in issue #40, but it seems we only fixed it on Linux systems. Windows still seems to be seeing an \xc2 character somewhere.

    I remade the primer files and posted a new version of the Greiff2014_Example.tar.gz file on the readthedocs site. I ran a test on Windows 7 with Python 3.4.3 and it seems to be behaving now. Let me know if it gives you more trouble.
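    A stray byte such as 0xC2 commonly appears when UTF-8 encoded text (for example a non-breaking space, bytes C2 A0) ends up in a file that is then read one byte per character. The sketch below is not part of pRESTO; it is a minimal, hypothetical checker that flags any character in a FASTA sequence line that falls outside an assumed IUPAC nucleotide alphabet. The names `IUPAC_DNA`, `find_bad_characters`, and `scan_file` are made up for illustration.

```python
# Assumed alphabet: IUPAC nucleotide codes (an assumption, not taken from pRESTO).
IUPAC_DNA = set("ACGTURYSWKMBDHVN")


def find_bad_characters(fasta_text):
    """Return (line_number, column, character) for each unexpected character."""
    problems = []
    for line_no, line in enumerate(fasta_text.split("\n"), start=1):
        if line.startswith(">"):
            continue  # header lines may contain arbitrary text
        for col, char in enumerate(line, start=1):
            if char.upper() not in IUPAC_DNA:
                problems.append((line_no, col, char))
    return problems


def scan_file(path):
    """Check a FASTA file on disk for characters outside the assumed alphabet."""
    # latin-1 maps every byte to exactly one character, so decoding never fails
    # and a raw 0xC2 byte is reported as '\xc2'.
    with open(path, encoding="latin-1") as handle:
        return find_bad_characters(handle.read())
```

    Running `scan_file("VPrimers.fasta")` on a clean primer file should return an empty list; any tuple in the result points at the line and column of a character to remove.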

    Regarding the need to manually kill the processes, it should spit out a message about "terminating child processes" after all processes finish their current task and then kill everything. If it's not doing that, I need to take a look at it. We have an old issue (#6) about trying to improve the check for exceptions in sibling processes that we need to work on... It might take a little bit of time to get to that though. Sorry for the inconvenience.

  2. Rian Stockbower reporter

    Thanks, Jason. I'll give the new primer files a try in the near future and report back.

    > Regarding the need to manually kill the processes, it should spit out a message about "terminating child processes" after all processes finish their current task and then kill everything. If it's not doing that, I need to take a look at it. We have an old issue (#6) about trying to improve the check for exceptions in sibling processes that we need to work on... It might take a little bit of time to get to that though. Sorry for the inconvenience.

    It does indeed say that it's terminating child processes with the offending PIDs (see original post). What's not clear in my OP is that what I've dumped is complete, and not edited in any way. I.e., despite the messages, I haven't been dropped back to the Windows shell prompt. Ctrl-C doesn't work, either. Process explorer indicates that py.exe is still running as a child process of cmd.exe. Killing py.exe brings me back to the Windows prompt the way you'd expect Ctrl-C to.

    I can attach a screenshot of the process ownership if you like, so you can see what I mean. If you need any additional information, lmk, and I'll see if I can paste/attach it here.

  3. Jason Vander Heiden

    What it should do:

    PS C:\Users\VMUser\Downloads\Greiff2014> MaskPrimers.py score -s .\ERR346600_1.fastq -p .\BadPrimers.fasta --start 4
          START> MaskPrimers
        COMMAND> score
       SEQ_FILE> ERR346600_1.fastq
    PRIMER_FILE> BadPrimers.fasta
           MODE> mask
        BARCODE> False
      MAX_ERROR> 0.2
      START_POS> 4
     REV_PRIMER> False
          NPROC> 1
    
    Error processing sequence with ID: ERR346600.1.
    Process Process-2:
    Traceback (most recent call last):
      File "C:\Python34\lib\multiprocessing\process.py", line 254, in _bootstrap
        self.run()
      File "C:\Python34\lib\multiprocessing\process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "C:\Python34\Scripts\MaskPrimers.py", line 384, in processMPQueue
        align = align_func(in_seq, **align_args)
      File "C:\Python34\Scripts\MaskPrimers.py", line 244, in scorePrimers
        score = sum([score_dict[(c1, c2)] for c1, c2 in chars])
      File "C:\Python34\Scripts\MaskPrimers.py", line 244, in <listcomp>
        score = sum([score_dict[(c1, c2)] for c1, c2 in chars])
    KeyError: ('A', '\xc3')
    PID 3720:  Error in sibling process detected. Cleaning up.
    PID 3052:  Error in sibling process detected. Cleaning up.
    ERROR:  Exiting due to child process error
    Terminating child processes...  Done.
    

    The last two lines are what's missing. They are printed by the function that kills all the processes, and that's where it's getting stuck. I can reproduce the locked processes with --nproc 4, in both Linux and Windows, so something is screwy with either catching the exception or communicating it to the process that owns the other processes.

    I'll take a look at it. Thanks!
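    For readers unfamiliar with this failure mode, one common way to propagate an error between sibling processes is a shared flag that every worker checks and that the parent acts on after a bounded wait. The sketch below illustrates that general pattern only; it is not pRESTO's implementation, and `score`, `worker`, and `run_pool` are hypothetical names.

```python
import multiprocessing as mp
import queue


def score(record):
    """Hypothetical stand-in for per-sequence work: fails on non-ASCII input."""
    record.encode("ascii")  # raises UnicodeEncodeError on junk characters
    return len(record)


def worker(tasks, failed):
    """Process tasks until a None sentinel arrives or any worker reports failure."""
    while not failed.is_set():
        try:
            record = tasks.get(timeout=0.1)
        except queue.Empty:
            continue  # nothing queued yet; re-check the failure flag
        if record is None:
            return  # sentinel: no more work
        try:
            score(record)
        except Exception:
            failed.set()  # signal siblings and the parent, then stop
            return


def run_pool(records, nproc=4, join_timeout=5.0):
    """Run workers; return True on success, False if any worker failed."""
    tasks, failed = mp.Queue(), mp.Event()
    for item in list(records) + [None] * nproc:
        tasks.put(item)
    procs = [mp.Process(target=worker, args=(tasks, failed)) for _ in range(nproc)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join(join_timeout)  # bounded wait so a stuck worker cannot hold up the parent indefinitely
    for proc in procs:
        if proc.is_alive():
            proc.terminate()  # force-stop stragglers
            proc.join()
    return not failed.is_set()
```

    On Windows, `run_pool` has to be called from under an `if __name__ == "__main__":` guard, because child processes there are started by re-importing the main module. The sketch also assumes the task queue stays small.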

  4. Rian Stockbower reporter

    The new primers file fixed the error. Hooray trailing newlines, I suppose.

    BTW, the very last sample command uses a regex, which doesn't work in the Windows CLI. Not a big deal, but I thought I'd mention it. Windows users have to split it out into two distinct invocations:

    ParseLog.py -l MPV.log -f ID PRIMER ERROR
    ParseLog.py -l MPC.log -f ID PRIMER ERROR
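    A wildcard in a file argument is normally expanded by the shell, and the Windows command prompt does not do that expansion. As a possible workaround, the expansion can be done in Python instead. The sketch below assumes the sample command uses a pattern such as `MP*.log` (the exact pattern is not quoted in this thread) and only builds and prints one ParseLog.py command per matching log; `build_commands` is a hypothetical helper.

```python
import glob


def build_commands(pattern, fields):
    """Expand a wildcard in Python and build one ParseLog.py argument list per log."""
    commands = []
    for log_file in sorted(glob.glob(pattern)):
        commands.append(["ParseLog.py", "-l", log_file, "-f"] + list(fields))
    return commands


if __name__ == "__main__":
    # "MP*.log" is an assumed pattern; adjust it to match the actual log names.
    for command in build_commands("MP*.log", ["ID", "PRIMER", "ERROR"]):
        print(" ".join(command))
```

    The printed lines can be pasted into the prompt one at a time, which amounts to the same two invocations shown above.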
    

    Here's the screenshot of the process ownership of cmd > py.exe > python.exe when the processes bomb out. I don't know if it's helpful or not: python-dies.png
