weird warning message when running `reassignAlleles()`
Issue #14
resolved
I have not seen this before for any other subject, only this one. The message is not helpful either.
> vCallGeno = reassignAlleles(clip_db=db, genotype_db=genoSeqs, v_call="V_CALL",
+ method="hamming", keep_gene=TRUE)
Warning message:
In V_CALL_GENOTYPED[ind] = sapply(best_alleles, paste, collapse = ",") :
number of items to replace is not a multiple of replacement length
Comments (3)
reporter Fixed in commit f5439b5
reporter - changed status to resolved
I'll try to explain what I think happened:
From `reassignAlleles()`: `dist_mat` appears to always be a matrix, even when its nrow is 1. So `dist_mat` is not a problem.
The last 3 steps, involving `best_match`, `best_alleles`, and `V_CALL_GENOTYPED[ind]`, rely on 2 scenarios.
Scenario 1
- `dist_mat` has nrow >= 1
- `min(dist_mat[i, ])` is matched by a single value for all i (i.e. no ties for the minimum), thus `best_match` is a vector
- `best_alleles` is a vector, with each entry being a single allele
- each slot in `V_CALL_GENOTYPED[ind]` gets assigned a single entry from `best_alleles`
Scenario 2
- `dist_mat` has nrow > 1
- `min(dist_mat[i, ])` is matched by multiple values for some i (i.e. ties for the minimum), thereby rendering `best_match` as a list
- `best_alleles` is a list, with some entries containing a vector of multiple alleles
- `sapply(best_alleles, paste, collapse = ",")` works as a de facto `lapply` and concatenates the multiple alleles in the `best_alleles` entries

However, this does not account for a third scenario.
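For reference, Scenarios 1 and 2 can be sketched roughly as below. This is only an illustration: the toy `alleles` vector, the `dist_mat` values, and the helper `pick_with_sapply()` are made up, and the use of `which(... == min(...))` is my assumption about how `best_match` is built, not a copy of the code in `reassignAlleles()`.

```r
# Toy inputs (hypothetical; not taken from reassignAlleles()).
alleles <- c("IGHV1-2*01", "IGHV1-2*02", "IGHV1-2*04")

# Assumed shape of the affected steps, written with sapply.
pick_with_sapply <- function(dist_mat) {
  best_match <- sapply(seq_len(nrow(dist_mat)), function(i) {
    which(dist_mat[i, ] == min(dist_mat[i, ]))
  })
  best_alleles <- sapply(best_match, function(x) alleles[x])
  list(best_match = best_match,
       best_alleles = best_alleles,
       collapsed = sapply(best_alleles, paste, collapse = ","))
}

# Scenario 1: no ties in any row, so everything stays a plain vector.
s1 <- pick_with_sapply(matrix(c(0, 2, 3,
                                4, 1, 5), nrow = 2, byrow = TRUE))
is.list(s1$best_match)   # FALSE
length(s1$collapsed)     # 2, one string per row

# Scenario 2: row 1 has a tie, so sapply falls back to returning a list.
s2 <- pick_with_sapply(matrix(c(1, 1, 3,
                                4, 1, 5), nrow = 2, byrow = TRUE))
is.list(s2$best_match)   # TRUE
unname(s2$collapsed)     # "IGHV1-2*01,IGHV1-2*02" "IGHV1-2*02"
```

In both cases the number of collapsed strings equals the number of rows, so the assignment into `V_CALL_GENOTYPED[ind]` lines up.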
Scenario 3
- `dist_mat` has nrow = 1
- `min(dist_mat[i, ])` is matched by multiple values for i=1. In this case, R coerces the `best_match` returned by `sapply` into a single-column, multi-row matrix.
- `best_alleles` becomes a single vector of multiple alleles
- `sapply(best_alleles, paste, collapse = ",")` does NOT concatenate the multiple alleles in `best_alleles` together, contrary to what was intended.

This creates a situation where `ind` provides a single slot, whereas `sapply(best_alleles, paste, collapse = ",")` provides multiple values.
The root of the problem is that R does not always preserve the type of a data structure, especially in scenarios such as the one above, where a matrix has a row dimension of 1.
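A small sketch of Scenario 3 that triggers the same warning text. As before, the `alleles` vector and `dist_mat` values are invented, and the `which(... == min(...))` step is my assumption about how `best_match` is computed, so treat this as an illustration of the mechanism rather than the actual package code.

```r
# Toy inputs (hypothetical; not taken from reassignAlleles()).
alleles <- c("IGHV1-2*01", "IGHV1-2*02", "IGHV1-2*04")
dist_mat <- matrix(c(1, 1, 3), nrow = 1)  # one row, tie between alleles 1 and 2

# With a single row and a tie, sapply simplifies to a 2 x 1 matrix,
# not a list of length 1.
best_match <- sapply(seq_len(nrow(dist_mat)), function(i) {
  which(dist_mat[i, ] == min(dist_mat[i, ]))
})
dim(best_match)

# sapply then walks over the matrix element by element, so the two tied
# alleles end up as two separate entries instead of one grouped entry.
best_alleles <- sapply(best_match, function(x) alleles[x])
collapsed <- sapply(best_alleles, paste, collapse = ",")
length(collapsed)

# One slot to fill, two values offered: this is where the warning comes from.
V_CALL_GENOTYPED <- NA_character_  # stand-in for a one-sequence column
ind <- 1
warn_msg <- tryCatch({
  V_CALL_GENOTYPED[ind] <- collapsed
  NA_character_
}, warning = function(w) conditionMessage(w))
warn_msg
```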
A more comprehensive fix would be to switch to the more type-stable data structures provided by the likes of H. Wickham's `tibble` package; alas, I'll leave that to future heroes/heroines to come.
I provide a quick, albeit less elegant, fix by explicitly specifying a `list` data structure throughout the affected steps, and keeping that data structure as a list by using `lapply` instead of `sapply` or `apply`.
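The idea behind the fix, sketched on toy data. This is not the code from commit f5439b5: the helper `pick_with_lapply()`, the `alleles` vector, and the `dist_mat` values are hypothetical, and the final `unlist()` is my own assumption about how the list would be turned back into a character vector for assignment.

```r
# Toy inputs (hypothetical; not taken from reassignAlleles()).
alleles <- c("IGHV1-2*01", "IGHV1-2*02", "IGHV1-2*04")

# Same assumed steps as before, but lapply keeps a list with exactly one
# entry per row of dist_mat, no matter how many ties a row has.
pick_with_lapply <- function(dist_mat) {
  best_match <- lapply(seq_len(nrow(dist_mat)), function(i) {
    which(dist_mat[i, ] == min(dist_mat[i, ]))
  })
  best_alleles <- lapply(best_match, function(x) alleles[x])
  # Each list entry collapses to one string, so unlist() yields one value per row.
  unlist(lapply(best_alleles, paste, collapse = ","))
}

# Scenario 3 input: one row with a tie now gives a single collapsed string.
fixed_one_row <- pick_with_lapply(matrix(c(1, 1, 3), nrow = 1))

# Scenario 2 input: still one string per row.
fixed_two_rows <- pick_with_lapply(matrix(c(1, 1, 3,
                                            4, 1, 5), nrow = 2, byrow = TRUE))
```

Because the result always has one element per row, it matches the number of slots in `ind` in all three scenarios.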