tuning problems with DIABLO

Issue #99 resolved
Kim-Anh Le Cao repo owner created an issue

Dear mixOmics team,

I am trying to use DIABLO. It works, but I have not managed to resolve an error in the tuning step. I can run the subsequent steps (when I supply list.keepX myself) and obtain all the variable plots, but I would really like to debug the tuning step if possible. I also get the same error when I run the perf function on my sgccda.res object.

I have two datasets, metabarcoding (A) and metabolomics (B). Here is my code:

#################DIABLO
#######site +++

A <- datacounts_g_mean
B <- ions_g_mean
data = list(otus = A, ions = B)
lapply(data, dim)

# $otus
# [1]   12 7647
# 
# $ions
# [1]  12 879

reads_g_c$site = factor(reads_g_c$site, levels=c("Kaw", "Kourou", "Nouragues"))
Y = reads_g_c$site
summary(Y)
# Kaw    Kourou Nouragues 
# 4         4         4 

ncomp = 2
design = matrix(1, ncol = length(data), nrow = length(data), dimnames = list(names(data), names(data)))
diag(design) = 0
design 
#       otus ions
# otus    0    1
# ions    1    0

set.seed(123)
test.keepX = list("otus" = c(5:9, seq(10, 18), seq(20,30)),
                  "ions" = c(5:9, seq(10, 18), seq(20,30)))
tune.TCGA = tune.block.splsda(X = data, Y = Y, ncomp = ncomp, constraint=FALSE,
                              test.keepX = test.keepX, design = design, 
                              dist = "max.dist", cpus=4, near.zero.var = TRUE, 
                              mode = "regression", scale = FALSE)


You have provided a sequence of keepX of length: 25 for block otus and 25 for block ions.
This results in 625 models being fitted for each component and each nrepeat, this may take some time to run, be patient!
As code is running in parallel, the progressBar will only show 100% upon completion of each component.

comp 1 
  |                                                                                    |   0%
Error in checkForRemoteErrors(val) : 
  4 nodes produced errors; first error: missing value where TRUE/FALSE needed
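
One thing that could be checked at this point (a minimal sketch, not a confirmed diagnosis) is whether either block contains missing values or constant columns, since both can produce NA in downstream comparisons. The helper `check_block` and the toy blocks below are hypothetical and use base R only; in practice the helper would be applied to the real `data` list.

```r
# Hypothetical helper (not part of mixOmics): count NA cells and
# constant (zero-variance or all-NA) columns in one block.
check_block <- function(x) {
  x <- as.matrix(x)
  col_var <- apply(x, 2, var, na.rm = TRUE)
  c(n_na = sum(is.na(x)),
    n_constant = sum(is.na(col_var) | col_var == 0))
}

# Toy stand-in for the real blocks (assumption: samples in rows, 12 samples).
toy <- list(otus = matrix(1:60, nrow = 12),
            ions = matrix(seq(0.5, 24, by = 0.5), nrow = 12))
toy$otus[, 2] <- 0     # introduce one constant column
toy$ions[3, 1] <- NA   # introduce one missing value

res <- sapply(toy, check_block)
print(res)
```

With the real data this would be `sapply(data, check_block)`. A column can also become constant inside a single cross-validation fold even when it varies across all 12 samples, so a clean result here does not rule that cause out.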


?tune.block.splsda
list.keepX = list("otus" = c(50,50), "ions" = c(50,50)) # set manually, since the tuning step fails

sgccda.res = block.splsda(X = data, Y = Y, ncomp = ncomp, 
                          keepX = list.keepX, design = design)

...all plots ...

set.seed(123)# for reproducibility
perf.diablo = perf(sgccda.res, validation = 'Mfold', M = 3, nrepeat = 10)
Error in if (max(sapply(1:J, function(x) { : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In perf.sgccda(sgccda.res, validation = "Mfold", M = 3, nrepeat = 10) :
  At least one class is not represented in one fold, which may unbalance the error rate.
  Consider a number of folds lower than the minimum in table(Y): 4
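
The warning above can be illustrated with a small base R simulation. This is a sketch under a stated assumption: folds are drawn as a plain random partition of the 12 samples, which may differ from what mixOmics does internally. The function `folds_missing_class` is hypothetical. It estimates how often at least one of the M = 3 folds contains no sample from one of the three sites when each site has only 4 samples.

```r
# Same class layout as in this issue: 3 sites, 4 samples each.
Y <- factor(rep(c("Kaw", "Kourou", "Nouragues"), each = 4))

# Hypothetical helper: draw one random partition into M folds of equal size
# and report whether any fold lacks at least one class.
# Assumption: plain random partition; mixOmics' own fold scheme may differ.
folds_missing_class <- function(y, M) {
  fold <- sample(rep(seq_len(M), length.out = length(y)))
  tab <- table(fold, y)   # rows: folds, columns: classes
  any(tab == 0)
}

set.seed(123)
prop_missing <- mean(replicate(1000, folds_missing_class(Y, M = 3)))
prop_missing
```

A value well above zero would suggest that, with so few samples per class, many random partitions leave a class out of some fold, which is the situation the warning describes.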
