distToNearest won't work if V_CALL or J_CALL are factors

Issue #132 resolved
ssnn created an issue

A user reported that distToNearest doesn’t work when the data is loaded with read.table. We should either fix this or add a better error message, if we expect data to be loaded with readChangeoDb or with stringsAsFactors=FALSE.

# This works: reading the data with alakazam::readChangeoDb
> library(alakazam)
> db <- readChangeoDb("sample_db.tab")
> distToNearest(db, vCallColumn="V_CALL", jCallColumn="J_CALL", sequenceColumn="JUNCTION")

# This doesn't work: reading the data with read.table instead of readChangeoDb
> db <- read.table("sample_db.tab", sep="\t", header=T)
> distToNearest(db, vCallColumn="V_CALL", jCallColumn="J_CALL", sequenceColumn="JUNCTION")
Error in groupGenes(db, v_call = vCallColumn, j_call = jCallColumn, junc_len = NULL,  :
  one or more of { V_CALL, J_CALL } is factor. Must be character.

# This works: read.table with stringsAsFactors=F added
> db <- read.table("sample_db.tab", sep="\t", header=T, stringsAsFactors=F)
> distToNearest(db, vCallColumn="V_CALL", jCallColumn="J_CALL", sequenceColumn="JUNCTION")
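
For a data frame that has already been loaded with factor columns, one possible workaround is to convert the factors back to character before calling distToNearest. The following is a minimal sketch, not part of the original report: the inline table is made-up toy data that only reuses the column names from the example above, and stringsAsFactors=TRUE is passed explicitly to reproduce the factor columns (since R 4.0.0 the read.table default is FALSE).

```r
# Toy tab-delimited data (hypothetical values; column names follow the report).
tab <- paste0(
    "SEQUENCE_ID\tV_CALL\tJ_CALL\tJUNCTION\n",
    "seq1\tIGHV1-2*01\tIGHJ4*02\tTGTGCGAGA\n",
    "seq2\tIGHV3-23*01\tIGHJ6*01\tTGTGCGAAA\n"
)

# Reproduce the problem: string columns are read in as factors.
db <- read.table(text=tab, sep="\t", header=TRUE, stringsAsFactors=TRUE)
print(is.factor(db$V_CALL))

# Workaround: convert every factor column to character in place.
is_fct <- vapply(db, is.factor, logical(1))
db[is_fct] <- lapply(db[is_fct], as.character)
print(is.character(db$V_CALL))
```

After the conversion, V_CALL and J_CALL are character vectors, which is what the error message asks for.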

Comments (5)

  1. Julian Zhou

    The current error message does indicate what the problem is (the column is a factor, not character). But okay, I added more detail to it, suggesting that stringsAsFactors be set to FALSE if read.table is being used. The change is in commit cf0e7fe in alakazam, since the error actually occurs in alakazam::groupGenes, which in this case is called by shazam::distToNearest.
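
    The kind of check described above could look roughly like the sketch below. This is an illustration only, not the code from the commit: checkCallColumns is a hypothetical helper name, and the message wording is assumed rather than copied from alakazam.

```r
# Hypothetical helper (not the alakazam implementation): stop with an
# actionable message if any of the named columns is a factor.
checkCallColumns <- function(db, columns) {
    is_fct <- vapply(columns, function(x) is.factor(db[[x]]), logical(1))
    if (any(is_fct)) {
        stop("Column(s) ", paste(columns[is_fct], collapse=", "),
             " are factors but must be character. ",
             "If the data was loaded with read.table, set stringsAsFactors=FALSE.",
             call.=FALSE)
    }
    invisible(TRUE)
}

# Usage on toy data: V_CALL is a factor, J_CALL is character.
toy <- data.frame(V_CALL=factor("IGHV1-2*01"), J_CALL="IGHJ4*02",
                  stringsAsFactors=FALSE)
result <- tryCatch(checkCallColumns(toy, c("V_CALL", "J_CALL")),
                   error=function(e) conditionMessage(e))
print(result)
```

    Naming the offending columns and the likely cause in the message lets the user fix the load step without reading the source.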

  2. Jason Vander Heiden

    It actually looks like you fixed this in alakazam last year, but the fix itself was buggy. I think it’s better now.
