QC_EdgeR read.table() error with check.names=TRUE

Issue #94 resolved
Guillaume NOELL created an issue

Hello,

While using the QC_EdgeR() and DE_EdgeR() functions of miARma, I may have come upon a bug that appears when some (but not all) sample names start with a number instead of a letter. I am referring to these lines of code from QC_EdgeR():

    #Importing the data
    data <- read.table(file, header=TRUE, sep="\t")
    data<-data[,sort(colnames(data))]
    #Importing targets
    targets<-readTargets(targetfile,row.names="Filename")
    targets<-targets[order(targets$Filename),]

My understanding is that sort(colnames(data)) and order(targets$Filename) are used to get the same sample order for the matrix data and the dataframe targets. However, if some of the sample names start with a number, read.table() by default adds the letter 'X' at the beginning of those names only, which changes the sort order of the columns so that it no longer matches the order of the rows in the targets dataframe. As a result, all subsequent analysis (EdgeR) uses a wrongly ordered group character vector for the differential analysis and any other step that relies on the sample grouping.
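
The mismatch described above can be reproduced with a small sketch. The sample names and counts below are invented for illustration, and a plain data.frame stands in for the table that readTargets() would return, so this is only an approximation of what QC_EdgeR() does internally:

```r
# Hypothetical count table; sample names and values are invented for illustration.
# The header has one field fewer than the data row, so read.table() uses the
# first column as row names.
tsv <- "1C\t10A\tB2\tA3\ngene1\t5\t7\t9\t11\n"

# With the default check.names=TRUE the header is passed through make.names().
data <- read.table(text = tsv, header = TRUE, sep = "\t")
mangled <- colnames(data)      # names starting with a digit gain an "X" prefix
data_order <- sort(colnames(data))

# Stand-in for the targets table (assumption: a data.frame with a Filename column).
targets <- data.frame(Filename = c("1C", "10A", "B2", "A3"),
                      Group = c("ctl", "ctl", "trt", "trt"),
                      stringsAsFactors = FALSE)
targets_order <- targets$Filename[order(targets$Filename)]

# Strip the prefix to compare like with like: the two orderings disagree.
mismatch <- !identical(sub("^X", "", data_order), targets_order)

print(mangled)
print(data_order)
print(targets_order)
print(mismatch)
```

Only the names that begin with a digit are renamed, so they move to the end of the sorted column list while staying at the front of the sorted targets.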

Adding check.names=FALSE to all the read.table() calls found in QC_EdgeR() and DE_EdgeR() appears to fix it. I am not an R expert and I apologize if I am the one not using QC_EdgeR() properly. Please let me know if I am missing something.
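
A minimal sketch of that change, again with invented sample names and a plain data.frame standing in for the readTargets() result. The final reordering by name and the stopifnot() check are extra safeguards I am suggesting here, not something taken from the miARma code:

```r
# Hypothetical count table; sample names and values are invented for illustration.
tsv <- "1C\t10A\tB2\tA3\ngene1\t5\t7\t9\t11\n"

# check.names=FALSE keeps the header exactly as written in the file.
data <- read.table(text = tsv, header = TRUE, sep = "\t", check.names = FALSE)

# Stand-in for the targets table (assumption: a data.frame with a Filename column).
targets <- data.frame(Filename = c("1C", "10A", "B2", "A3"),
                      Group = c("ctl", "ctl", "trt", "trt"),
                      stringsAsFactors = FALSE)

# Same sorting as in the lines quoted from QC_EdgeR().
data <- data[, sort(colnames(data))]
targets <- targets[order(targets$Filename), ]
aligned <- identical(colnames(data), targets$Filename)

# Optional safeguard: index the columns by name instead of relying on two
# independent sorts, and stop if the names do not line up.
data_by_name <- data[, targets$Filename]
stopifnot(identical(colnames(data_by_name), targets$Filename))

print(aligned)
```

Indexing the columns with targets$Filename fails with an error if a sample name is missing, which is usually preferable to a silently misordered group vector.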

Best, Guillaume Noell

Comments (3)

  1. Eduardo Andres Leon

    Hi, Guillaume. We are assuming that sample names do not start with numbers (as suggested in the documentation), but what you recommend is a great solution and it will be included in the coming days. Thanks for your suggestion (we will include a thank-you message within the code).

    Regards
