Error "le système est exactement singulier" (system is exactly singular) with tune.block.splsda

Issue #138 resolved
GENET Carine created an issue

Dear all, I obtained the following error when tuning block.splsda and was not able to find what is wrong with my data. My code is below and my files are attached. I used the nearzerovar parameter, but it did not help. Thanks in advance. Regards, Carine

```{r dataload}
rm(list = ls())
library(RColorBrewer)
library(mixOmics)

# Load the phenotype and RNA-seq count tables (samples in rows)
pheno <- read.csv("phenoDM.csv", sep = ";", header = TRUE, row.names = 1)
count <- read.csv("count657_DEG_2018.csv", sep = ";", header = TRUE, row.names = 1)
count$treatment <- c(rep("CTRL", 4), rep("TESTO", 4))

# Column 658 is the treatment label added above, so drop it from the block
data = list(RNAseq = count[, -658], pheno = pheno)
lapply(data, dim)
## $RNAseq
## [1]   8 657
##
## $pheno
## [1]  8 48

design <- matrix(0.1, ncol = length(data), nrow = length(data),
                 dimnames = list(names(data), names(data)))
diag(design) = 0
design
##        RNAseq pheno
## RNAseq    0.0   0.1
## pheno     0.1   0.0

sgccda.res = block.splsda(X = data, Y = count$treatment, ncomp = 3, design = design)
## Design matrix has changed to include Y; each block will be linked to Y.

perf.diablo = perf(sgccda.res, validation = 'loo')
plot(perf.diablo)
perfM.diablo = perf(sgccda.res, validation = 'Mfold', folds = 4, nrepeat = 100)
## Warning messages:
## 1: closing unused connection 4 (<-tls-gps-cgenet.inra.local:11185)
## 2: closing unused connection 3 (<-tls-gps-cgenet.inra.local:11185)
plot(perfM.diablo)
perfM.diablo$choice.ncomp$WeightedVote
##             max.dist centroids.dist mahalanobis.dist
## Overall.ER         1              2                1
## Overall.BER        1              2                1

ncomp = perfM.diablo$choice.ncomp$WeightedVote["Overall.ER", "centroids.dist"]
ncomp
## [1] 2

test.keepX = list(RNAseq = c(10:20, seq(20, 50, 5)), pheno = c(5:20))
test.keepX
## $RNAseq
##  [1] 10 11 12 13 14 15 16 17 18 19 20 20 25 30 35 40 45 50
##
## $pheno
##  [1]  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

tune.TCGA = tune.block.splsda(X = data, Y = count$treatment, ncomp = 2,
                              test.keepX = test.keepX, design = design,
                              validation = 'loo', cpus = 2,
                              dist = "centroids.dist", = TRUE)
## You have provided a sequence of keepX of length: 18 for block RNAseq and 16 for block pheno.
## This results in 288 models being fitted for each component and each nrepeat, this may take
## some time to run, be patient!
## As code is running in parallel, the progressBar will only show 100% upon completion of
## each nrepeat/ component.
##
## comp 1
##   |    |   0%
## Error in checkForRemoteErrors(val) :
##   2 nodes produced errors; first error: routine Lapack dgesv : le système est exactement singulier : U[2,2] = 0
## (English: Lapack routine dgesv: system is exactly singular: U[2,2] = 0)
```
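A possible first check on the data, sketched under the assumption that the singular system comes from columns that are constant or redundant once a sample is held out (the thread does not confirm the cause). The helper name `check_block` and the toy matrix are illustrative and are not part of mixOmics.

```r
# Hypothetical helper (not a mixOmics function): flags columns that are
# constant in the full data, columns that become constant in at least one
# leave-one-out subset, and columns that exactly duplicate an earlier column.
check_block <- function(X) {
  X <- as.matrix(X)
  is_const <- function(v) length(unique(v[!is.na(v)])) <= 1
  const_full <- apply(X, 2, is_const)
  # For each left-out row, test whether each column becomes constant.
  const_loo <- sapply(seq_len(nrow(X)), function(i) {
    apply(X[-i, , drop = FALSE], 2, is_const)
  })
  const_loo <- apply(matrix(const_loo, nrow = ncol(X)), 1, any)
  list(
    constant     = colnames(X)[const_full],
    constant_loo = colnames(X)[const_loo & !const_full],
    duplicated   = colnames(X)[duplicated(t(X))]
  )
}

# Toy data with 8 samples, mimicking the 4 CTRL + 4 TESTO layout.
toy <- cbind(
  a = c(1, 2, 3, 4, 5, 6, 7, 8),
  b = rep(3, 8),                   # constant everywhere
  c = c(0, 0, 0, 0, 0, 0, 0, 1),   # constant once sample 8 is left out
  d = c(1, 2, 3, 4, 5, 6, 7, 8)    # exact copy of column a
)
res <- check_block(toy)
print(res)
```

On the data from this report, the call would be `check_block(data$pheno)`. A clean result would not rule out other causes, such as strongly correlated variables.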

  1. GENET Carine reporter

    I can send you the files, but I want to keep them private. Where should I send them? Regards

  2. Kim-Anh Le Cao repo owner

    Hi Carine, you could try a more 'spaced' grid for your test.keepX and only one component. It seems that it crashes on component 1 anyway. Since perf is working OK, this is not an issue of data input or format. If that fails, yes, send us your data + script at mixomics [at] for debugging on our end.
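
    A more spaced grid could look like the sketch below. The grid values are illustrative assumptions, not values recommended in this thread. The helper `n_models` is a hypothetical name; it multiplies the grid lengths, which reproduces the 18 x 16 = 288 models reported in the tuning message.

    ```r
    # Original grid from the report: 18 values for RNAseq, 16 for pheno.
    orig.keepX <- list(RNAseq = c(10:20, seq(20, 50, 5)), pheno = c(5:20))

    # A coarser grid (illustrative values); it also drops the duplicated 20.
    spaced.keepX <- list(RNAseq = seq(10, 50, 10), pheno = seq(5, 20, 5))

    # Number of keepX combinations fitted per component and per repeat.
    n_models <- function(keepX) prod(lengths(keepX))

    n_models(orig.keepX)    # 288
    n_models(spaced.keepX)  # 20
    ```

    The tuning call would then take test.keepX = spaced.keepX together with ncomp = 1.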


  3. Kim-Anh Le Cao repo owner

    Hi Carine, another option is to use validation='Mfold', folds=4 in your tune function, similar to what you used for perf. Also remove cpus = 2. We are pushing an update in the next few days; perhaps that will also help.

  4. GENET Carine reporter

    Hi Kim-Anh, thanks for your advice. I used Mfold but it failed (that is why I used loo validation: I had already run into this kind of error, and loo validation helped). I also reduced ncomp to 1 and used a more spaced test.keepX grid, and it also failed. Finally, I started all over again and proceeded like this: I took a set of pheno data that previously worked in a DIABLO analysis (8 phenotype variables selected by sPLS-DA analysis) and added, one by one, the pheno variables I wanted to analyse. In the end I discarded some pheno data and it works. So my understanding is that something was wrong with my pheno data, but I was not able to find which variable(s). Regards, Carine
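
The add-one-variable-at-a-time procedure described in the last comment can be automated. The sketch below is an illustration only: `find_failing_columns`, `try_fit` and `toy_fit` are hypothetical names, and the toy fit uses a rank check in place of the real tune.block.splsda call so that the example runs without the private data.

```r
# Hypothetical helper: start from columns known to work, then add candidates
# one at a time; keep those for which try_fit() succeeds, record the others.
find_failing_columns <- function(X, base_cols, candidates, try_fit) {
  kept <- base_cols
  failing <- character(0)
  for (v in candidates) {
    ok <- tryCatch({
      try_fit(X[, c(kept, v), drop = FALSE])
      TRUE
    }, error = function(e) FALSE)
    if (ok) kept <- c(kept, v) else failing <- c(failing, v)
  }
  list(kept = kept, failing = failing)
}

# Toy stand-in for the real tuning call (assumption): raise an error when
# the selected columns are linearly dependent.
toy_fit <- function(X) {
  if (qr(X)$rank < ncol(X)) stop("system is exactly singular")
  invisible(TRUE)
}

set.seed(1)
X <- matrix(rnorm(8 * 3), nrow = 8, dimnames = list(NULL, c("p1", "p2", "p3")))
X <- cbind(X, p4 = X[, "p1"], p5 = rnorm(8))  # p4 duplicates p1

out <- find_failing_columns(X, base_cols = c("p1", "p2"),
                            candidates = c("p3", "p4", "p5"),
                            try_fit = toy_fit)
out$failing
```

With real data, `try_fit` would wrap the tuning call on the pheno subset. The result depends on the order in which candidates are added, so a flagged variable may only be problematic in combination with one added earlier.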

