AssemblePairs gets stuck
There is no reliable way to reproduce this (it doesn’t happen every time, but it does happen ~9 out of 10 times in my experience on Farnam), but AssemblePairs.py sequential --aligner blastn tends to get stuck before finishing, anywhere between 5% and 95% of the way through. @Kenneth Hoehn suggested that it might be a file system issue with blastn, and that a potential alternative would be to replace blastn with something like a Smith-Waterman algorithm implemented in native Python.
Comments (6)
reporter Oh, there’s actually already an issue (#65) for this! I created this one after bringing it up at subgroup meeting, where Steve said that we should make a note of it even if it’s hard to reproduce.
I haven’t tried using it with usearch. As a workaround, I’ve been breaking my input files into chunks and passing them individually to AssemblePairs. It still hangs sometimes, or there is a core dump, and when I compare the number of input reads against the number of output reads (passed + failed), a discrepancy appears. But this way at least I only have to re-run one small chunk, as opposed to waiting for the entire lot to run through again.
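The chunking workaround and the read-count check described above could be scripted roughly as follows. This is only a sketch: the function names and output file naming are made up for illustration, and it assumes plain 4-line FASTQ records. For paired-end data, both read files would need to be split with the same chunk size so that mates stay in the same chunk.

```python
import itertools
from pathlib import Path


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming exactly 4 lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def split_fastq(path, chunk_size, out_dir):
    """Write consecutive chunks of `chunk_size` records; return the chunk paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(path).stem
    chunk_paths = []
    with open(path) as handle:
        for index in itertools.count():
            # Read the next chunk_size records (4 lines each) from the input.
            lines = list(itertools.islice(handle, 4 * chunk_size))
            if not lines:
                break
            # Hypothetical naming scheme: <input stem>_chunk0000.fastq, ...
            chunk_path = out_dir / f"{stem}_chunk{index:04d}.fastq"
            chunk_path.write_text("".join(lines))
            chunk_paths.append(chunk_path)
    return chunk_paths


def reads_accounted_for(n_input, n_pass, n_fail):
    """True when every input read ended up in the pass or the fail output."""
    return n_input == n_pass + n_fail
```

After running AssemblePairs on one chunk, `reads_accounted_for` can be fed the counts from `count_fastq_reads` on the chunk and on its pass and fail outputs, so that only chunks with a discrepancy need to be re-run.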
reporter @Hailong Meng mentioned that he experienced a similar issue with MaskPrimers (which also calls blastn?)
-
No, MaskPrimers doesn’t call blastn. I do recall having what appeared to be disk-related issues on Farnam, though. So it could be something outside our control.
-
I had the problem with the older version of the presto image. With the new version, it seems fine for me.
-
reporter - changed status to duplicate
Duplicate of #65.
Does it get stuck if you use usearch?
Also, native Smith-Waterman in Python is really slow. There are some C-coded Striped Smith-Waterman libraries for Python out there, so that would be a better approach. Last I checked, I couldn’t get them to install, but that was years ago.
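For a sense of why a pure-Python Smith-Waterman is slow: the textbook algorithm scores one cell per pair of positions, so the inner loop below runs len(a) * len(b) times in interpreted Python for every pair of sequences. This is a minimal sketch, not pRESTO code. The function name and the match/mismatch/gap scores are assumptions chosen for illustration, and it uses a linear gap penalty and returns only the best local score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a vs. b with a linear gap penalty."""
    # Only the previous row of the score matrix is needed for the score.
    previous = [0] * (len(b) + 1)
    best = 0
    for char_a in a:
        current = [0]
        for j, char_b in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if char_a == char_b else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best
```

Striped implementations in C compute the same recurrence but process many cells per instruction with SIMD, which is where the speed difference comes from.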
The blastn/usearch wrapper setup is pretty inefficient right now. Passing bigger chunks of sequences (maybe even just one chunk) into blastn/usearch for reference alignment might help.
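The batching idea could look something like this. It is a sketch under assumptions, not how the current wrapper is written: both helper names are hypothetical, and the only blastn options used are the standard -query, -db, -outfmt and -num_threads. Each batch would be written to one query FASTA file and aligned with a single blastn call; in tabular output (outfmt 6) the first column is the query ID, so hits can be mapped back to individual sequences afterwards.

```python
def fasta_batches(records, batch_size):
    """Yield FASTA-formatted strings holding up to `batch_size` (id, seq) pairs."""
    batch = []
    for seq_id, seq in records:
        batch.append(f">{seq_id}\n{seq}\n")
        if len(batch) == batch_size:
            yield "".join(batch)
            batch = []
    if batch:
        # Final, possibly smaller, batch.
        yield "".join(batch)


def blastn_command(query_fasta, database, threads=1):
    """Build one blastn call for a whole batch, with tabular (outfmt 6) output."""
    return [
        "blastn",
        "-query", str(query_fasta),
        "-db", str(database),
        "-outfmt", "6",
        "-num_threads", str(threads),
    ]
```

Fewer, larger calls mean fewer process launches and temporary files, which may also reduce exposure to the file system problems suspected above.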