Read quality = 0 stops FilterSeq
Hi,
I’m using FilterSeq on some very poor-quality .fastq files. When a read has failed completely (all N), FilterSeq crashes, apparently because the quality of the whole read is 0.
Example read:
@SN863:625:H5M7YBCX3:1:1101:1036:5108 2:N:0:CGATGTTTATCT
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
+
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Error:
$ python3.7 presto-0.5.13/bin/FilterSeq.py quality --inner -q 25 --failed --outdir ./data/fastq_trimmed/ -s ./data/fastq_raw/4256_A_run624_CGATGTTTGGGG_S4_L001_R2_001.fastq
START> FilterSeq
COMMAND> quality
FILE> 4256_A_run624_CGATGTTTGGGG_S4_L001_R2_001.fastq
INNER> True
MIN_QUAL> 25.0
NPROC> 12
PROGRESS> 11:47:42 | | 0% ( 0) 0.0 min
PID 92134> Error in sibling process detected. Cleaning up.
ERROR> Error processing sequence with ID: SN863:625:H5M7YBCX3:1:1101:1036:5108.
PID 92121> Error in sibling process detected. Cleaning up.
Process Process-8:
Traceback (most recent call last):
  File "/Users/CMonzo/.conda/envs/MPI/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/Users/CMonzo/.conda/envs/MPI/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/CMonzo/.conda/envs/MPI/lib/python3.7/site-packages/presto/Multiprocessing.py", line 402, in processSeqQueue
    result = process_func(data, **process_args)
  File "/Users/CMonzo/.conda/envs/MPI/lib/python3.7/site-packages/presto/Sequence.py", line 1289, in filterQuality
    q = sum(quals) / len(quals)
ZeroDivisionError: division by zero
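
The last traceback frame points at len(quals) being 0 rather than at the scores themselves being 0. One plausible reading, which is an assumption and not confirmed from the presto source, is that --inner drops leading and trailing N positions before averaging, so an all-N read leaves an empty list. A minimal sketch of a guarded mean follows; inner_quals and passes_quality are hypothetical names for illustration, not presto functions, and this is not the fix that was actually committed.

```python
def inner_quals(seq, quals):
    """Return the quality scores left after dropping leading/trailing N positions."""
    start = len(seq) - len(seq.lstrip("Nn"))  # number of leading Ns
    end = len(seq.rstrip("Nn"))               # index just past the last non-N base
    return quals[start:end]                   # empty when the read is all N


def passes_quality(seq, quals, min_qual=25.0, inner=True):
    """Hypothetical quality check that fails empty reads instead of dividing by zero."""
    if inner:
        quals = inner_quals(seq, quals)
    if not quals:
        # Nothing left to average: treat the read as failed.
        return False
    return sum(quals) / len(quals) >= min_qual
```

With a guard like this, an all-N read would simply be reported as failed, which is what the --failed output is meant to capture.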
Comments (4)
- changed status to resolved
Done in 315c82f.
Thanks for reporting this. We’ll take a look. This looks easy to fix.
Until we post a fix, I suspect you can get these files to run by first passing them through FilterSeq.py missing to remove reads that are mostly (or entirely) Ns.
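
If running the missing subcommand first is not convenient, the same idea can be sketched in plain Python: drop every read whose sequence is entirely N before handing the file to FilterSeq. This is only a rough stand-in for the workaround, not presto code. It assumes a well-formed FASTQ with exactly four lines per record, and drop_all_n_reads is a hypothetical helper name.

```python
import io


def drop_all_n_reads(in_handle, out_handle):
    """Copy 4-line FASTQ records, skipping reads whose sequence is all N.

    Returns a (kept, dropped) tuple of record counts.
    """
    kept = dropped = 0
    while True:
        record = [in_handle.readline() for _ in range(4)]
        if not record[0].strip():
            break  # end of file (or a trailing blank line)
        seq = record[1].strip()
        if seq.strip("Nn"):
            # At least one called base remains, so keep the record unchanged.
            out_handle.writelines(record)
            kept += 1
        else:
            dropped += 1
    return kept, dropped


# Small in-memory demonstration: one all-N read, one normal read.
demo = io.StringIO("@r1\nNNNN\n+\n!!!!\n@r2\nACGT\n+\nIIII\n")
out = io.StringIO()
print(drop_all_n_reads(demo, out))  # (1, 1)
```

For real data, pass open file handles for the raw .fastq and a new output file instead of the StringIO objects, then run FilterSeq.py quality on the output.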