MaskPrimers throws "Error in sibling process detected" errors during Illumina MiSeq 2x250 BCR mRNA example
On line 4. Using Windows. Python 3.5.1. Here's the trace:
M:\dump\presto
> MaskPrimers.py score -s M1_quality-pass.fastq -p VPrimers.fasta --start 4 --mode mask --outname M1-FWD --log MPV.log
START> MaskPrimers
COMMAND> score
SEQ_FILE> M1_quality-pass.fastq
PRIMER_FILE> VPrimers.fasta
MODE> mask
BARCODE> False
MAX_ERROR> 0.2
START_POS> 4
REV_PRIMER> False
NPROC> 4
Error processing sequence with ID: ERR346600.1.
PID 10020: Error in sibling process detected. Cleaning up.
Process Process-2:
Traceback (most recent call last):
File "c:\program files\python35\lib\multiprocessing\process.py", line 254, in _bootstrap
self.run()
File "c:\program files\python35\lib\multiprocessing\process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "C:\Program Files\Python35\Scripts\MaskPrimers.py", line 384, in processMPQueue
align = align_func(in_seq, **align_args)
File "C:\Program Files\Python35\Scripts\MaskPrimers.py", line 244, in scorePrimers
score = sum([score_dict[(c1, c2)] for c1, c2 in chars])
File "C:\Program Files\Python35\Scripts\MaskPrimers.py", line 244, in <listcomp>
score = sum([score_dict[(c1, c2)] for c1, c2 in chars])
KeyError: ('T', '\xc2')
PID 8072: Error in sibling process detected. Cleaning up.
PID 3392: Error in sibling process detected. Cleaning up.
PID 7416: Error in sibling process detected. Cleaning up.
PID 10960: Error in sibling process detected. Cleaning up.
The first two steps went fine.
Python doesn't cleanly exit after this; I have to manually kill the process using process explorer.
Comments (7)
-
reporter Thanks, Jason. I'll give the new primer files a try in the near future and report back.
Regarding the need to manually kill the processes, it should spit out a message about "terminating child processes" after all processes finish their current task and then kill everything. If it's not doing that, I need to take a look at it. We have an old issue (#6) about trying to improve the check for exceptions in sibling processes that we need to work on... It might take a little bit of time to get to that though. Sorry for the inconvenience.
It does indeed say that it's terminating child processes with the offending PIDs (see original post). What's not clear in my OP is that what I've dumped is complete, and not edited in any way. I.e., despite the messages, I haven't been dropped back to the Windows shell prompt. Ctrl-C doesn't work, either. Process explorer indicates that py.exe is still running as a child process of cmd.exe. Killing py.exe brings me back to the Windows prompt the way you'd expect Ctrl-C to.
I can attach a screenshot of the process ownership if you like, so you can see what I mean. If you need any additional information, let me know, and I'll see if I can paste or attach it here.
-
What it should do:
PS C:\Users\VMUser\Downloads\Greiff2014> MaskPrimers.py score -s .\ERR346600_1.fastq -p .\BadPrimers.fasta --start 4
START> MaskPrimers
COMMAND> score
SEQ_FILE> ERR346600_1.fastq
PRIMER_FILE> BadPrimers.fasta
MODE> mask
BARCODE> False
MAX_ERROR> 0.2
START_POS> 4
REV_PRIMER> False
NPROC> 1
Error processing sequence with ID: ERR346600.1.
Process Process-2:
Traceback (most recent call last):
File "C:\Python34\lib\multiprocessing\process.py", line 254, in _bootstrap
self.run()
File "C:\Python34\lib\multiprocessing\process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "C:\Python34\Scripts\MaskPrimers.py", line 384, in processMPQueue
align = align_func(in_seq, **align_args)
File "C:\Python34\Scripts\MaskPrimers.py", line 244, in scorePrimers
score = sum([score_dict[(c1, c2)] for c1, c2 in chars])
File "C:\Python34\Scripts\MaskPrimers.py", line 244, in <listcomp>
score = sum([score_dict[(c1, c2)] for c1, c2 in chars])
KeyError: ('A', '\xc3')
PID 3720: Error in sibling process detected. Cleaning up.
PID 3052: Error in sibling process detected. Cleaning up.
ERROR: Exiting due to child process error
Terminating child processes... Done.
The last two lines are what's missing. They are printed by the function that kills all the processes, and that's where it's getting stuck. I can reproduce the locked processes with --nproc 4 on both Linux and Windows, so something is screwy with either catching the exception or communicating it to the process that owns the other processes. I'll take a look at it. Thanks!
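For readers following along, here is a minimal sketch of one common way to signal a failure between sibling worker processes. It is not pRESTO's actual code: the names score, worker, run and the WEIGHTS table are hypothetical stand-ins. Each worker clears a shared "alive" flag when it hits an exception, and the parent keeps draining the result queue until every worker has exited, so nothing blocks on a full pipe.

```python
import multiprocessing as mp
import queue

# Hypothetical per-character weights; any other character raises KeyError,
# much like the score_dict lookup in the traceback above.
WEIGHTS = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 0}


def score(seq):
    """Toy scoring function (an assumption, not the real primer scoring)."""
    return sum(WEIGHTS[ch] for ch in seq)


def worker(tasks, results, alive):
    """Score tasks until a None sentinel arrives or a sibling reports failure."""
    try:
        while alive.is_set():
            try:
                seq = tasks.get(timeout=0.1)
            except queue.Empty:
                continue
            if seq is None:
                break
            results.put(("ok", seq, score(seq)))
    except Exception as exc:
        # Tell the siblings to stop and report the failure to the parent.
        alive.clear()
        results.put(("error", None, repr(exc)))


def run(seqs, nproc=4):
    """Run the workers and return (failed, collected results)."""
    tasks, results = mp.Queue(), mp.Queue()
    alive = mp.Event()
    alive.set()
    for seq in seqs:
        tasks.put(seq)
    for _ in range(nproc):
        tasks.put(None)  # one stop sentinel per worker
    procs = [mp.Process(target=worker, args=(tasks, results, alive))
             for _ in range(nproc)]
    for proc in procs:
        proc.start()
    collected = []
    # Keep draining results while any worker is alive, so no worker blocks
    # on a full pipe when it tries to exit.
    while any(proc.is_alive() for proc in procs) or not results.empty():
        try:
            collected.append(results.get(timeout=0.1))
        except queue.Empty:
            pass
    failed = not alive.is_set()
    if failed:
        # Unread tasks may still be buffered; do not wait on them at exit.
        tasks.cancel_join_thread()
    for proc in procs:
        proc.join()
    return failed, collected
```

On Windows, run(...) has to be called from under an `if __name__ == "__main__":` guard, because child processes are started by re-importing the main module.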
-
reporter The new primers file fixed the error. Hooray trailing newlines, I suppose.
BTW, the very last sample command uses a regex, which doesn't work in the Windows CLI. Not a big deal, but I thought I'd mention it. Windows users have to split it out into two distinct invocations:
ParseLog.py -l MPV.log -f ID PRIMER ERROR
ParseLog.py -l MPC.log -f ID PRIMER ERROR
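If typing the command once per log gets tedious, a small Python helper can do the wildcard expansion that cmd.exe does not. This is only a sketch: the pattern MP[CV].log is an assumption about what the example command used, and parselog_commands is a hypothetical helper that builds the argument lists without running anything.

```python
import glob


def parselog_commands(pattern, fields=("ID", "PRIMER", "ERROR")):
    """Build one ParseLog.py argument list per log file matching the pattern."""
    return [
        ["ParseLog.py", "-l", path, "-f", *fields]
        for path in sorted(glob.glob(pattern))
    ]


if __name__ == "__main__":
    # Print the commands so they can be pasted into the Windows prompt.
    for command in parselog_commands("MP[CV].log"):
        print(" ".join(command))
```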
Here's the screenshot of the process ownership of cmd > py.exe > python.exe when the processes bomb out. I don't know if it's helpful or not: python-dies.png
-
Thanks. I've modified all the example workflows to be more Windows friendly.
-
- changed status to resolved
Primary issue resolved. Exception checking rework staying open in #6.
Thanks for reporting this, @rianjs. This is due to some junk characters in the example V-primer files, which we had supposedly fixed in issue #40, but it seems we only fixed it on Linux systems. Windows still seems to be seeing an \xc2 character somewhere.
I remade the primer files and posted a new version of the Greiff2014_Example.tar.gz file on the readthedocs site. I ran a test on Windows 7 with Python 3.4.3 and it seems to be behaving now. Let me know if it gives you more trouble.
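To check a primer file for this kind of junk before running MaskPrimers, something like the following sketch may help. It is not part of pRESTO: find_bad_characters is a hypothetical helper and the allowed alphabet is assumed to be the IUPAC nucleotide codes. It reads raw bytes and decodes them as Latin-1 so that a stray byte such as 0xC2 shows up as '\xc2', the same way it appears in the KeyError. One possible source of such a byte, for instance, is a UTF-8 non-breaking space, which is encoded as the pair C2 A0.

```python
# Assumed alphabet: IUPAC nucleotide codes (including U and N).
IUPAC = set("ACGTURYSWKMBDHVN")


def find_bad_characters(fasta_bytes):
    """Return (line number, record header, bad characters) for each
    sequence line that contains anything outside the IUPAC alphabet."""
    problems = []
    header = None
    for line_no, raw in enumerate(fasta_bytes.splitlines(), start=1):
        # Latin-1 maps every byte to exactly one character, so nothing is lost.
        line = raw.decode("latin-1")
        if line.startswith(">"):
            header = line[1:]
            continue
        bad = sorted({ch for ch in line if ch.upper() not in IUPAC})
        if bad:
            problems.append((line_no, header, bad))
    return problems
```

A typical call would be `find_bad_characters(open("VPrimers.fasta", "rb").read())`. An empty list means every sequence line uses only IUPAC characters.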
Regarding the need to manually kill the processes, it should spit out a message about "terminating child processes" after all processes finish their current task and then kill everything. If it's not doing that, I need to take a look at it. We have an old issue (#6) about trying to improve the check for exceptions in sibling processes that we need to work on... It might take a little bit of time to get to that though. Sorry for the inconvenience.