DIABLO perf multiple issues: Error in max(temp[, 2]) : invalid 'type' (list) of argument

Issue #122 resolved

Jen Modliszewski created an issue 2018-01-18

Hi,

I am trying to run DIABLO on an RNA-Seq and methyl-array dataset. I'm running into a couple of issues when running perf. Does anyone have any insight into what is going on?

1. When running perf, I get this error: Error in max(temp[, 2]) : invalid 'type' (list) of argument
2. I also get this warning multiple times, although based on some previous issues posted here it is possibly due to low/no variability in some variables: 1: In cor(Ak, variates.Ak) : the standard deviation is zero
3. Finally, I sometimes also see this message, which also appears when using rCCA: The SGCCA algorithm did not converge

Some details on the data sets/design:

lapply(data, dim)
$rna
[1]    23 15846
$methyl
[1]     23 242714

summary(Y)
  Aged_Lean  Aged_Obese  Young_Lean Young_Obese
          6           5           5           7

design
       rna methyl
rna    0.0    0.1
methyl 0.1    0.0

diablo.integrated.sgccda.res = block.splsda(X = data, Y = Y, ncomp = 5, design = design)

Thank you for any help/comments. Jen
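
For readers reproducing the setup, the design matrix printed in the description can be built in base R as sketched below. The block names and the 0.1 off-diagonal value are taken from the output above; the variable name blocks is only for illustration.

```r
# Block names as they appear in the description.
blocks <- c("rna", "methyl")

# Start with 0.1 everywhere, then zero the diagonal so that
# a block is not linked to itself.
design <- matrix(0.1, nrow = length(blocks), ncol = length(blocks),
                 dimnames = list(blocks, blocks))
diag(design) <- 0

print(design)
```

The resulting object is what is passed as design = design in the block.splsda call quoted in the description.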

Comments (9)

Jen Modliszewski reporter
- edited description
- 2018-01-18T05:38:44+00:00
Florian Rohart
Hi Jen,
1. Could you run a traceback() after obtaining the bug in perf?
2. This warning comes from low/no variability in some variables, as you mentioned. You can try to remove those variables by adding near.zero.var = TRUE when calling diablo/block.splsda.
3. Sometimes the algorithm does not converge within the default number of iterations (100). You could try increasing this number (max.iter); the value needed depends on your data.
Thanks!
- 2018-01-23T01:16:51+00:00
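
As a rough illustration of point 2, the base-R sketch below shows how a constant column produces the cor() warning quoted in the description, and how such columns can be dropped by hand. The toy matrix and its column names are made up. This plain zero-standard-deviation filter is a simpler stand-in and is not claimed to match the criterion that near.zero.var = TRUE applies inside mixOmics.

```r
# Toy block: 6 samples x 4 features; feature "f2" is constant.
set.seed(1)
X <- cbind(f1 = rnorm(6),
           f2 = rep(3, 6),
           f3 = rnorm(6),
           f4 = c(0, 0, 0, 0, 0, 1))

# Standard deviation per feature; a zero here is what makes cor() warn.
col_sd <- apply(X, 2, sd)
zero_sd <- names(col_sd)[col_sd == 0]

# Drop the constant features before fitting.
X_filtered <- X[, col_sd > 0, drop = FALSE]

# Capture the warning that cor() raises for the constant feature.
w <- tryCatch(cor(X[, "f2"], X[, "f1"]),
              warning = function(w) conditionMessage(w))

# With mixOmics loaded, the two suggestions above correspond to a call
# such as the following (not run here; max.iter = 1000 is an arbitrary
# illustrative value):
# res <- block.splsda(X = data, Y = Y, ncomp = 5, design = design,
#                     near.zero.var = TRUE, max.iter = 1000)
```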

Jen Modliszewski reporter

Hi Florian,

Thanks so much for your response!

Unfortunately, it looks like setting near.zero.var = TRUE and increasing max.iter (to 10000) did not help in my case.

The output from traceback() is below.

13: which(temp[, 2] == max(temp[, 2]))
12: FUN(newX[, i], ...)
11: apply(x, c(1, 2), function(z) {
        temp = aggregate(object$weights, list(z), sum)
        ind = which(temp[, 2] == max(temp[, 2]))
        if (length(ind) == 1) {
            res = temp[ind, 1]
        }
        else {
            res = NA
        }
        res
    })
10: FUN(X[[i]], ...)
9: lapply(temp.all, function(x) {
       apply(x, c(1, 2), function(z) {
           temp = aggregate(object$weights, list(z), sum)
           ind = which(temp[, 2] == max(temp[, 2]))
           if (length(ind) == 1) {
               res = temp[ind, 1]
           }
           else {
               res = NA
           }
           res
       })
   })
8: unlist(lapply(temp.all, function(x) {
       apply(x, c(1, 2), function(z) {
           temp = aggregate(object$weights, list(z), sum)
           ind = which(temp[, 2] == max(temp[, 2]))
           if (length(ind) == 1) {
               res = temp[ind, 1]
           }
           else {
               res = NA
           }
           res
       })
   }))
7: array(unlist(lapply(temp.all, function(x) {
       apply(x, c(1, 2), function(z) {
           temp = aggregate(object$weights, list(z), sum)
           ind = which(temp[, 2] == max(temp[, 2]))
           if (length(ind) == 1) {
               res = temp[ind, 1]
           }
           else {
               res = NA
           }
           res
       })
   })), dim(Y.hat[[1]]), dimnames = list(rownames(newdata[[1]]), 
       colnames(Y), paste("dim", c(1:min(ncomp[-object$indY])), 
           sep = " ")))
6: predict.block.spls(model[[x]], X.test[[x]], dist = "all")
5: predict(model[[x]], X.test[[x]], dist = "all")
4: FUN(X[[i]], ...)
3: lapply(1:M, function(x) {
       predict(model[[x]], X.test[[x]], dist = "all")
   })
2: perf.sgccda(diablo.integrated.sgccda.res3, validation = "Mfold", 
       folds = 3, nrepeat = 10, progressBar = TRUE)
1: perf(diablo.integrated.sgccda.res3, validation = "Mfold", folds = 3, 
       nrepeat = 10, progressBar = TRUE)

Thanks again for your help! Jen

2018-01-24T16:23:57+00:00
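
Frames 7 to 13 of the traceback show where the error is raised: the class predictions are combined by summing object$weights per predicted class and keeping the class with the largest total, with ties set to NA. A minimal base-R sketch of that vote follows. The name weighted_vote and the toy inputs are hypothetical and are not mixOmics code. The last statement only shows that max() gives this error message whenever it is handed a list; it does not establish why temp[, 2] was a list in this run.

```r
# Hypothetical helper mirroring the anonymous function in the traceback.
weighted_vote <- function(pred, weights) {
  # Sum the weights separately for each predicted class.
  temp <- aggregate(weights, list(pred), sum)
  # Rows whose summed weight equals the maximum.
  ind <- which(temp[, 2] == max(temp[, 2]))
  # A unique winner is returned; a tie gives NA.
  if (length(ind) == 1) as.character(temp[ind, 1]) else NA_character_
}

# Two of three predictions agree, so their weights add up to the maximum.
clear_winner <- weighted_vote(c("Aged_Lean", "Young_Lean", "Aged_Lean"),
                              c(0.5, 0.3, 0.4))

# Equal total weight for both classes: no unique winner.
tied <- weighted_vote(c("Aged_Lean", "Young_Lean"), c(0.5, 0.5))

# max() on a list fails with the message reported in this issue.
msg <- tryCatch(max(list(1, 2)), error = function(e) conditionMessage(e))
```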

Florian Rohart
Hi Jen,

Sorry for the delay. I can't replicate this problem, so it's probably something specific to your data. Could you send me (part of) your data (for debugging purposes only) so that it's easier for me to debug and I can send you a fix asap? f.rohart at uq.edu.au

thanks!
- 2018-04-12T05:27:10+00:00
Christina Adler
Hi Florian,

Apologies for hijacking this thread, but I was wondering if the issue was resolved?

I am running into the same problem, same error with perf.diablo. I tried increasing max.iter and setting near.zero.var=TRUE, with no improvement.

I am looking for assistance to see whether it is just my data/low variability (it is a small pilot dataset, n=17, for 16S and ITS) or whether it can be resolved.

Thanks in advance for any assistance

Christina
- 2018-10-07T07:55:27+00:00
Florian Rohart
Hi Christina,

The issue wasn't fixed, as it wasn't identified/replicated on my end. I can't fix something without knowing where the problem lies :) I'd be very grateful if you could send me your data; it will be used for debugging purposes only.

Thanks!
- 2018-10-09T23:22:41+00:00
Christina Adler
Hi Florian,

Thanks heaps for the reply! It may just be my dodgy data!

I will send it through to you, thanks again for any assistance, your time is greatly appreciated!

Christina
- 2018-10-09T23:26:55+00:00
Jen Modliszewski reporter
Hi there - sorry for abandoning this thread! I just wanted to add that I do think it was one of the data sets not having enough variability despite my attempts to filter for that. I have since done this same analysis with several other data sets with no issue. The dataset causing the issue was a bisulfite-methyl seq data set. I never got to the bottom of the issue, since I was using the data set to learn the tool.
- 2018-10-10T18:54:49+00:00
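
One way a variability filter on the full data can still leave a problem is that a feature may vary across all samples yet be constant within a cross-validation training subset. The base-R sketch below illustrates that possibility with made-up data, using 23 samples and 3 folds as in this thread. It is an assumption about a possible mechanism, not a confirmed diagnosis of the reporter's dataset, and the fold assignment here is a plain random split rather than the one perf uses.

```r
set.seed(42)
n <- 23

# "rare" is non-zero in a single sample, so it varies on the full data.
X <- cbind(varies = rnorm(n),
           rare = c(1, rep(0, n - 1)))
sd_full <- apply(X, 2, sd)

# Random assignment of the samples to 3 folds.
folds <- sample(rep(1:3, length.out = n))

# For each fold, list the features that are constant in the training part.
constant_in_training <- lapply(1:3, function(k) {
  train <- X[folds != k, , drop = FALSE]
  colnames(train)[apply(train, 2, sd) == 0]
})

print(constant_in_training)
```

When the single non-zero sample is held out, "rare" has zero standard deviation in the remaining training samples even though it passed a filter on the full data.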
Al J Abadi
- changed status to resolved
reason explained by OP
- 2019-08-15T02:30:27+00:00

Assignee: –

Type: bug

Priority: minor

Status: resolved
