inferNovelAlleles multiprocessing fails

Issue #11 resolved
Jason Vander Heiden created an issue

Known issue. Just putting this here for people to see in case they check:

It's not consistently reproducible on a small data set, but if you repeat the exact same commands a few times it'll usually pop up. You will get one of two errors:

1. Error in unserialize(socklist[[n]]) : error reading from connection
2. "Evaluation error: 'translateCharUTF8' must be called on a CHARSXP."

It consistently fails on a large data set a user sent, but the data set is huge and it takes a long time to run. Here are the commands to replicate:

library(tigger)
library(alakazam)
ighv <- readIgFasta("imgt_human_IGHV.fasta")
db <- readChangeoDb("partA_db-pass_parse-select.tab.gz")
novel_df <- findNovelAlleles(db, ighv, nproc=7)

Test files are on the Dropbox share at: Share/Temporary/madhu_tmp.

I was testing on farnam using R 3.4.1.

I made some minor code changes that didn't seem to help. If I were to continue debugging, the next steps would be to split most of the foreach loop into a separate private function, and then further divide that into smaller private functions until I found the block where it fails, adding an outfile to the makeCluster call to capture debug output from the workers. I also might try using a different foreach backend.
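For anyone picking this up, here is a minimal sketch of the outfile idea. It uses only the base parallel package rather than foreach so that it stays self-contained, and process_chunk is a hypothetical stand-in for a block split out of the loop body, not a function from tigger.

```r
library(parallel)

# Hypothetical stand-in for one block factored out of the foreach loop body.
process_chunk <- function(i) {
    # message() writes to stderr, which outfile redirects on the workers.
    message("worker starting chunk ", i)
    i^2
}

# Worker stdout/stderr go to this file instead of being discarded.
log_file <- tempfile(fileext = ".log")
cl <- makeCluster(2, outfile = log_file)

# Run the block on the workers and always shut the cluster down afterwards.
result <- tryCatch(
    unlist(parLapply(cl, 1:4, process_chunk)),
    finally = stopCluster(cl)
)

# Inspect what the workers printed before any failure.
worker_log <- readLines(log_file, warn = FALSE)
```

If a worker dies partway through, the last "starting chunk" line in the log file should point at the input that triggered it.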

It's also possible it's running out of memory, but I tried with 32GB allocated so that seems unlikely.

My guess is that the Rcpp error is being issued from some tidyverse function, and that the type mismatch is somehow due to missing data. The usual suspects for that sort of thing are a missing simplify=TRUE in an sapply(), needing to index a data.frame with [[columns]] instead of [columns] because the data.frame got converted to a tibble at some point, or a regex that isn't catching every case it should (like a weird allele name).
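To illustrate how two of those suspects can combine, here is a small base R sketch. The allele names and the extract_num helper are made up for illustration and are not taken from the tigger code. A regex that misses one name makes sapply() quietly return a list instead of a character vector, which is the kind of type mismatch that could upset downstream compiled code.

```r
# The last name lacks the "*NN" suffix, so the regex below will not match it.
alleles <- c("IGHV1-2*01", "IGHV3-23*04", "IGHV4-59")

# regmatches() returns character(0) when there is no match, so the results
# have unequal lengths and sapply() cannot simplify them to a vector.
allele_num <- sapply(alleles, function(x) {
    regmatches(x, regexpr("(?<=\\*)[0-9]+", x, perl = TRUE))
})
is.list(allele_num)

# Hypothetical helper: make the missing case explicit with NA.
extract_num <- function(x) {
    m <- regmatches(x, regexpr("(?<=\\*)[0-9]+", x, perl = TRUE))
    if (length(m) == 0) NA_character_ else m
}

# vapply() enforces one character value per input, or fails loudly.
allele_num_safe <- vapply(alleles, extract_num, character(1), USE.NAMES = FALSE)
```

Swapping sapply() for vapply() with a declared return type is one way to turn this sort of silent type change into an immediate, readable error.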

Comments (10)

  1. ssnn

    To add more fun, I got a different error:

    novel_df <- findNovelAlleles(db, ighv, nproc=7)
    Error in serialize(data, node$con) : error writing to connection
    
  2. Jason Vander Heiden reporter

    Will do.

    Also, looks like someone doesn't have the EOL Extension installed.

    I think I'm going to switch the .hgeol file to LF (unix style) to avoid this in the future.

  3. Jason Vander Heiden reporter

    Well, it could've been me for all I know... So long as @dgadala has the extension installed we should be okay in the future, but we might have a few "full replacement" commits.

  4. Jason Vander Heiden reporter

    It seems to be working fine now for me as well. I tested with the 2017-09-12 singularity image on farnam.
