Understanding miARma-seq plot

Issue #84 resolved
Sebastien created an issue

Hi,

I generated some plots with miARma-seq and I really don't understand how it does the clustering. I have attached some pictures so you can see that the miARma-seq plots are not consistent with my data.

The clustering file is the clustering done by miARma-seq. You can see in it that the Q2 sample is on the same branch as CE1 and CE2. But if you look at the heatmap made by DESeq on the same data, Q1 and Q2 cluster together and Q2 is not related to CE1 and CE2.

I made a scatter plot with edgeR to see what happens, and the correlation between Q2 and CE1 or CE2 is lower than the correlation between Q1 and Q2.

I don't know if my explanation is clear, but these results are a bit hard to understand.

Thank you

Sebastien
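
The comparison described above (which samples correlate best, and which samples end up on the same branch) can be checked independently of any plotting tool. The following is only a minimal sketch in base R: the count values are made up, and the choice of log2(count + 1), Pearson correlation and average linkage is an assumption for illustration, not necessarily what miARma-seq, DESeq or edgeR use internally.

```r
# Toy counts: 6 genes (rows) x 4 samples (columns). All values are invented.
counts <- cbind(
  CE1 = c(100, 200,  50,  10, 400,  30),
  CE2 = c(110, 190,  55,  12, 380,  28),
  Q1  = c( 20,  40, 300, 250,  60, 500),
  Q2  = c( 25,  35, 320, 240,  70, 480)
)
rownames(counts) <- paste0("gene", 1:6)

# Log-transform so that a few highly expressed genes do not dominate
# (assumption: a simple log2(count + 1) transform).
log_counts <- log2(counts + 1)

# Pairwise Pearson correlation between samples (columns).
sample_cor <- cor(log_counts)

# Turn correlation into a distance and cluster the samples.
sample_dist <- as.dist(1 - sample_cor)
hc <- hclust(sample_dist, method = "average")

# Cut the tree into two groups to see which samples share a branch.
groups <- cutree(hc, k = 2)
print(round(sample_cor, 3))
print(groups)
```

Calling `plot(hc)` draws the dendrogram. If a tree built this way on the real data disagrees with the correlations, the two analyses are most likely not starting from the same filtered and normalised matrix.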

Comments (5)

  1. Eduardo Andres Leon

    Hi, inside the lib folder there is another folder called R-scripts. In it you can take a look at the code used to create the clustering plots.

    In my experience, all quality plots made with edgeR and DESeq2 are completely equal (and I have done hundreds of them). I have only found differences in the final DEGs. So I guess the problem is related to the filters (the cpm cutoff: we remove low-expressed genes according to the value you provided, and we take replicates into account). I suggest you use the same kind of filters in both cases; that way you should find that both of them create the same heatmaps.

  2. Sebastien reporter

    Hi, can you give me some information on the filters you apply to the data? I'm checking the R script but I don't understand your filters. That may be because I only know basic R.

    Are your filters implemented here?

      #Filtering the counts
      if(filter=="yes"){
        keep<-rowSums(cpm(dge)>cpmvalue) >= repthreshold
        dge<-dge[keep, ]
        #Recalculating the library size
        dge$samples$lib.size <-colSums(dge$counts)
      }
    

    Sebastien

  3. Eduardo Andres Leon

    Hi Sebastien, yes, this is the filter used to remove low-expressed genes/miRNAs. According to http://www.nature.com/nprot/journal/v8/n9/full/nprot.2013.099.html , all genes expressed below 1 cpm in most of the samples should be removed from your experiment. The line you are highlighting comes from the edgeR guide: if filter=yes in the ini file, it keeps only the genes with more than cpmvalue counts per million in at least repthreshold samples and removes all the others. You should adjust both values depending on your experimental setup.

    Edu
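
To make the effect of the filter concrete, here is a small sketch of the same rule in base R. It is an illustration under stated assumptions, not the miARma-seq code: `filter_low_counts` is a hypothetical helper, the counts are invented, and CPM is computed by hand as count / library size * 1e6, which should match edgeR's `cpm()` only when the normalisation factors are all 1.

```r
# Toy counts: 4 genes (rows) x 4 samples (columns). All values are invented
# and chosen so that every library size is exactly 1,000,000.
counts <- matrix(
  c(500000, 480000, 510000, 490000,   # geneA: high in every sample
         0,      0,      0,      3,   # geneB: above 1 cpm in one sample
         5,      0,      4,      0,   # geneC: above 1 cpm in two samples
    499995, 520000, 489996, 509997),  # geneD: high in every sample
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("CE1", "CE2", "Q1", "Q2"))
)

# Hypothetical helper mirroring: rowSums(cpm(dge) > cpmvalue) >= repthreshold
filter_low_counts <- function(counts, cpmvalue, repthreshold) {
  lib_size <- colSums(counts)
  # Divide each column by its library size, then scale to counts per million.
  cpm_mat <- t(t(counts) / lib_size) * 1e6
  # Keep a gene if it exceeds the cutoff in at least repthreshold samples.
  keep <- rowSums(cpm_mat > cpmvalue) >= repthreshold
  counts[keep, , drop = FALSE]
}

filtered_2 <- filter_low_counts(counts, cpmvalue = 1, repthreshold = 2)
filtered_3 <- filter_low_counts(counts, cpmvalue = 1, repthreshold = 3)

print(rownames(filtered_2))
print(rownames(filtered_3))

# Library sizes after filtering, as the original script recalculates them.
print(colSums(filtered_2))
```

With repthreshold set to 2, geneC survives because it passes the cutoff in two samples; raising repthreshold to 3 removes it. This is why two tools that filter differently can cluster the same samples differently.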
