regression in distToNearest()

Issue #126 resolved
Peter Blazso created an issue
library(shazam)
seqdb <- read.delim("example.tab")
dist <- distToNearest(seqdb, model="ham", first=FALSE, normalize="len")

The above code worked flawlessly with shazam version 0.1.11; with 0.2.1, however, it aborts with the following error:

Error in groupGenes(db, v_call = vCallColumn, j_call = jCallColumn, junc_len = NULL, :
one or more of { V_CALL, J_CALL } is factor. Must be character.

If this is intended, please highlight this and the correct way of doing things in the documentation, as well!

Best regards,

Peter

Comments (6)

  1. Julian Zhou

    Thanks for reporting this.

    The error is raised now because, going from v0.2.11 to v0.3.0, additional pre-checks were added to alakazam::groupGenes, which shazam::distToNearest calls. These pre-checks verify that the V_CALL and J_CALL columns in the input db are character rather than factor.

    With seqdb <- read.delim("example.tab"):

    > class(seqdb$V_CALL)
    [1] "factor"
    

    Instead, using seqdb <- read.delim("example.tab", stringsAsFactors=FALSE):

    > class(seqdb$V_CALL)
    [1] "character"
    

    In short:

    • For now, adding stringsAsFactors=FALSE when reading in the input db should solve the problem.
    • In the future, we will handle this conversion internally within shazam::distToNearest and note it in the documentation. This will appear in the next release.
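
If the data frame has already been loaded with factor columns, re-reading the file is not the only option. A minimal sketch, assuming a toy data frame in place of the real example.tab (the column values and the is_fac helper name are illustrative, not taken from the report), converts the factor columns to character in place using base R only:

```r
# Toy stand-in for the real input; the values are made up for illustration.
seqdb <- data.frame(
  V_CALL = c("IGHV1-2*01", "IGHV3-23*01"),
  J_CALL = c("IGHJ4*02", "IGHJ6*01"),
  stringsAsFactors = TRUE  # force factor columns, as in the reported case
)

# Flag the factor columns, then convert only those to character.
is_fac <- vapply(seqdb, is.factor, logical(1))
seqdb[is_fac] <- lapply(seqdb[is_fac], as.character)

class(seqdb$V_CALL)
# [1] "character"
```

After this conversion, the data frame should pass the character check described above, so the original distToNearest call can be retried unchanged.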