Swap plyr for dplyr

Issue #34 closed
Jason Vander Heiden created an issue

Both alakazam and tigger have been moved to dplyr. This will surely cause problems unless we either ensure all plyr functions are called using package references (plyr::) or we replace the plyr calls with corresponding dplyr calls.

Note the message upon load of shm:

You have loaded plyr after dplyr - this is likely to cause problems.
If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
library(plyr); library(dplyr)

We should also swap reshape2 for tidyr.

Alternatively, we could swap the plyr/reshape2 functions for data.table equivalents. I have no preference.
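
A minimal sketch of the two options above, assuming a toy data frame `db` with hypothetical columns `SAMPLE` and `MU_COUNT` (neither is taken from the shm code base). The first call keeps plyr but references it by namespace so it is never attached; the second is one possible dplyr replacement.

```r
# Toy data; the column names are hypothetical, for illustration only.
db <- data.frame(SAMPLE = c("a", "a", "b"),
                 MU_COUNT = c(1, 3, 5),
                 stringsAsFactors = FALSE)

# Option 1: keep plyr, but call it through the namespace so that it is
# never attached and cannot mask dplyr functions.
res_plyr <- plyr::ddply(db, "SAMPLE", plyr::summarise,
                        MU_MEAN = mean(MU_COUNT))

# Option 2: replace the plyr call with a dplyr equivalent.
res_dplyr <- dplyr::summarise(dplyr::group_by(db, SAMPLE),
                              MU_MEAN = mean(MU_COUNT))
```

With option 1 the load-order message should not appear, since `library(plyr)` is never called; option 2 removes the plyr dependency altogether.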

Comments (9)

  1. Jason Vander Heiden reporter
    • changed status to open

    I think we need to do some type casting to remove warnings. I'm getting these from calcBaseline() and groupBaseline():

    Warning messages:
    1: In rbind_all(x, .id) : Unequal factor levels: coercing to character
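
    One possible form of the type casting, sketched under assumptions: the data frames, the column names and the helper factors_to_character() below are hypothetical and are not the real shm objects. Each piece has its factor columns converted to character before binding, so the row bind has no factor levels to reconcile.

```r
# Two toy pieces whose REGION factor has different levels
# (hypothetical column names, not the real shm data frames).
df_cdr <- data.frame(REGION = factor("CDR"), SIGMA = 0.1)
df_fwr <- data.frame(REGION = factor("FWR"), SIGMA = -0.2)

# Hypothetical helper: convert every factor column to character.
factors_to_character <- function(df) {
    is_fct <- vapply(df, is.factor, logical(1))
    df[is_fct] <- lapply(df[is_fct], as.character)
    df
}

# Bind the pieces after casting; REGION is now a plain character column.
bound <- dplyr::bind_rows(factors_to_character(df_cdr),
                          factors_to_character(df_fwr))
```

    Whether this is the right place to cast inside calcBaseline() and groupBaseline() would still need to be checked against the actual code.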
    

    Also, from the bind_rows man page:

    rbind_list and rbind_all have been deprecated. Instead use bind_rows
    
  2. ssnn

    I don't get this warning with the example code for calcBaseline. Are you using dplyr_0.4.3?

    library("alakazam")
    library("shm")
    dbPath <- system.file("extdata", "Influenza.tab", package="shm")
    db <- readChangeoDb(dbPath)
    db_baseline <- calcBaseline(db,
                                sequenceColumn="SEQUENCE_IMGT",
                                germlineColumn="GERMLINE_IMGT_D_MASK",
                                testStatistic="focused",
                                regionDefinition=IMGT_V_NO_CDR3,
                                targetingModel=HS5FModel,
                                nproc=6)
    

    But I do get the warning with groupBaseline, and the source seems to be summarizeBaseline(baseline):

    dplyr::bind_rows(df_baseline_seq, df_baseline_seq_region)
    

    when binding CDR and FWR regions. I have to investigate.

    Is this a bug in dplyr? bind_rows calls rbind_all, which is deprecated:

    bind_rows
    function (..., .id = NULL) 
    {
        dots <- list(...)
        if (is.list(dots[[1]]) && !is.data.frame(dots[[1]]) && !length(dots[-1])) {
            x <- dots[[1]]
        }
        else {
            x <- dots
        }
        if (!is.null(.id)) {
            if (!(is.character(.id) && length(.id) == 1)) {
                stop(".id is not a string", call. = FALSE)
            }
            names(x) <- names(x) %||% seq_along(x)
        }
        rbind_all(x, .id)
    }
    <environment: namespace:dplyr>
    