Error in {: task 658 failed - "missing value where TRUE/FALSE needed"

Greetings,

‌

Thanks for the highly useful shazam package. I've been using it and encountered an error with collapseClones when running (the repertoire file repertoire_sample.tsv is attached):

repertoire = read.table("repertoire_sample.tsv", sep='\t')
clonal_sequences = shazam::collapseClones(
    db = repertoire[, c("sequence_alignment", "germline_alignment", "clone_id")], 
    cloneColumn="clone_id", 
    sequenceColumn="sequence_alignment", 
    germlineColumn="germline_alignment", 
    regionDefinition=NULL, #shazam::IMGT_V, 
    method="thresholdedFreq", minimumFrequency=0.6,
    includeAmbiguous=FALSE, breakTiesStochastic=FALSE, 
    nproc=12)

‌

The error log is:

Error in {: task 658 failed - "missing value where TRUE/FALSE needed"
Traceback:

1. shazam::collapseClones(db = repertoire[1:5000, c("sequence_alignment", 
 .     "germline_alignment", "clone_id")], cloneColumn = "clone_id", 
 .     sequenceColumn = "sequence_alignment", germlineColumn = "germline_alignment", 
 .     regionDefinition = NULL, method = "thresholdedFreq", minimumFrequency = 0.6, 
 .     includeAmbiguous = FALSE, breakTiesStochastic = FALSE, nproc = 1)
2. foreach(idx = 1:length(uniqueClonesIdx), .verbose = FALSE, .errorhandling = "stop") %dopar% 
 .     {
 .         cloneIdx <- uniqueClonesIdx[[idx]]
 .         cloneDb <- db[cloneIdx, ]
 .         clone_num <- unique(cloneDb[["fields_clone_id"]])
 .         if (length(unique(cloneDb[[juncLengthColumn]])) > 1) {
 .             stop("Expecting all sequences in the same clone with the same junction lenght.")
 .         }
 .         cloneRegionDefinition <- regionDefinition
 .         if (regionDefinitionName %in% c("IMGT_VDJ_BY_REGIONS", 
 .             "IMGT_VDJ")) {
 .             cloneRegionDefinition <- getCloneRegion(clone_num = clone_num, 
 .                 db = cloneDb, seq_col = sequenceColumn, juncLengthColumn = juncLengthColumn, 
 .                 clone_col = "fields_clone_id", regionDefinition = regionDefinition)
 .         }
 .         calcClonalConsensus(db = cloneDb, sequenceColumn = sequenceColumn, 
 .             germlineColumn = germlineColumn, muFreqColumn = muFreqColumn, 
 .             regionDefinition = cloneRegionDefinition, method = method, 
 .             minimumFrequency = minimumFrequency, includeAmbiguous = includeAmbiguous, 
 .             breakTiesStochastic = breakTiesStochastic, breakTiesByColumns = breakTiesByColumns)
 .     }
3. e$fun(obj, substitute(ex), parent.frame(), e$data)

‌

The call doesn’t return an error if I significantly reduce the input db size:

clonal_sequences = shazam::collapseClones(
    db = repertoire[1:10, c("sequence_alignment", "germline_alignment", "clone_id")], 
    cloneColumn="clone_id", 
    sequenceColumn="sequence_alignment", 
    germlineColumn="germline_alignment", 
    regionDefinition=NULL, #shazam::IMGT_V, 
    method="thresholdedFreq", minimumFrequency=0.6,
    includeAmbiguous=FALSE, breakTiesStochastic=FALSE, 
    nproc=12)

‌

Even though I could run it successfully if I reduce the size of repertoire, I would like to understand what is the source of the error. It would also be useful to add an informative error message.

‌

Thank you and looking forward,

Eugen

Comments (2)