Error "le système est exactement singulier" (system is exactly singular) with tune.block.splsda

Issue #138 resolved
GENET Carine
created an issue

Dear all, I got the following error when tuning block splsda and could not find what is wrong with my data. My code is below and my files are attached. I used the near.zero.var parameter, but it did not help. Thanks in advance. Regards, Carine

```{r dataload}
rm(list=ls())
library(RColorBrewer)
library(mixOmics)
pheno <- read.csv("phenoDM.csv", sep=";", header=TRUE, row.names = 1)
count <- read.csv("count657_DEG_2018.csv", sep=";", header = TRUE, row.names=1)
count$treatment <- c(rep("CTRL",4), rep("TESTO",4))

data = list(RNAseq=count[,-658], pheno=pheno)
lapply(data, dim)
## $RNAseq
## [1]   8 657
##
## $pheno
## [1]  8 48

design <- matrix(0.1, ncol=length(data), nrow=length(data), dimnames=list(names(data), names(data)))
diag(design) = 0
design
##        RNAseq pheno
## RNAseq    0.0   0.1
## pheno     0.1   0.0

sgccda.res = block.splsda(X = data, Y = count$treatment, ncomp = 3, design = design)
## Design matrix has changed to include Y; each block will be linked to Y.

perf.diablo = perf(sgccda.res, validation = 'loo')
plot(perf.diablo)
perfM.diablo = perf(sgccda.res, validation='Mfold', folds=4, nrepeat= 100)
## Warning messages:
## 1: closing unused connection 4 (<-tls-gps-cgenet.inra.local:11185)
## 2: closing unused connection 3 (<-tls-gps-cgenet.inra.local:11185)
plot(perfM.diablo)
perfM.diablo$choice.ncomp$WeightedVote
##             max.dist centroids.dist mahalanobis.dist
## Overall.ER         1              2                1
## Overall.BER        1              2                1

ncomp = perfM.diablo$choice.ncomp$WeightedVote["Overall.ER", "centroids.dist"]
ncomp
## [1] 2

test.keepX = list(RNAseq=c(10:20, seq(20,50,5)), pheno=c(5:20))
test.keepX
## $RNAseq
##  [1] 10 11 12 13 14 15 16 17 18 19 20 20 25 30 35 40 45 50
##
## $pheno
##  [1]  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

tune.TCGA = tune.block.splsda(X = data, Y = count$treatment, ncomp = 2,
                              test.keepX = test.keepX, design = design,
                              validation = 'loo', cpus=2,
                              dist = "centroids.dist", near.zero.var = TRUE)
## You have provided a sequence of keepX of length: 18 for block RNAseq and 16 for block pheno.
## This results in 288 models being fitted for each component and each nrepeat, this may take
## some time to run, be patient!
## As code is running in parallel, the progressBar will only show 100% upon completion of
## each nrepeat/ component.
##
## comp 1
##   |                                                                 |   0%
## Error in checkForRemoteErrors(val) :
##   2 nodes produced errors; first error: routine Lapack dgesv : le système est exactement singulier : U[2,2] = 0
```
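The thread never establishes which variables trigger the singular system, so the following is only a minimal sketch of generic checks one might run on a block before tuning. It assumes the block is a fully numeric data frame or matrix; `flag_suspect_columns` is a hypothetical helper written for illustration, not a mixOmics function. It flags columns that are constant, columns that become constant when a single sample is left out (as can happen in a leave-one-out fold with only 8 samples), and columns that exactly duplicate an earlier one.

```r
# Hypothetical helper (not part of mixOmics): flag columns of one block that
# could contribute to a singular system in small-sample cross-validation.
flag_suspect_columns <- function(block) {
  x <- as.matrix(block)
  stopifnot(is.numeric(x))
  n <- nrow(x)

  # Columns with a single distinct value on the full data.
  constant <- apply(x, 2, function(v) length(unique(v)) == 1)

  # Columns that become constant once any single sample is left out.
  constant_in_fold <- apply(x, 2, function(v) {
    any(vapply(seq_len(n), function(i) length(unique(v[-i])) == 1, logical(1)))
  })

  # Columns that exactly duplicate an earlier column.
  duplicated_col <- duplicated(t(x))

  data.frame(
    column = colnames(x),
    constant = constant,
    constant_in_fold = constant_in_fold,
    duplicated = duplicated_col,
    row.names = NULL
  )
}

# Toy data standing in for a phenotype block (values are made up).
toy <- data.frame(
  a = c(1, 2, 3, 4, 5, 6, 7, 8),
  b = c(0, 0, 0, 0, 0, 0, 0, 1),  # constant when sample 8 is left out
  c = rep(3, 8),                  # constant everywhere
  d = c(1, 2, 3, 4, 5, 6, 7, 8)   # exact copy of a
)
report <- flag_suspect_columns(toy)
print(report[report$constant | report$constant_in_fold | report$duplicated, ])
```

With the objects from the post, the call would be `flag_suspect_columns(data$pheno)`. A clean report does not rule out other causes, such as columns that are linearly dependent without being exact copies.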

Comments (6)

  1. Kim-Anh Le Cao repo owner

    Hi Carine, you could try a more widely spaced grid for your test.keepX and only one component. It seems that it crashes on component 1 anyway. Since perf is working OK, this is not an issue of data input or format. If that fails, yes, send us your data + script at mixomics [at] math.univ-toulouse.fr for debugging on our end.

    Kim-Anh

  2. Kim-Anh Le Cao repo owner

    Hi Carine, another option is to use validation = 'Mfold', folds = 4 in your tune function, similar to what you used for perf. Also remove cpus = 2. We are pushing an update in the next few days; perhaps that will also help.

  3. GENET Carine reporter

    Hi Kim-Anh, thank you for your advice. I tried Mfold but it failed (that is why I used loo validation: I have already had this kind of error, and loo validation helped). I also reduced ncomp to 1 and used a more widely spaced test.keepX grid, and it also failed. Finally, I started all over again and proceeded like this: I took a set of pheno data that previously worked with the DIABLO analysis (8 phenotype variables selected by sPLS-DA analysis) and added, one by one, the pheno data I want to analyse. In the end I discarded some pheno data and it works. So my understanding is that something was wrong with my pheno data, but I was not able to find which variable(s). Regards, Carine

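The one-by-one procedure described in the reporter's last comment can be sketched as a small loop. This is an illustration under assumptions, not the reporter's actual script: `screen_columns` and `toy_fit` are hypothetical names, and `toy_fit` is a stand-in that fails on linearly dependent columns rather than a real mixOmics call.

```r
# Hypothetical helper: grow a set of columns one at a time, keeping a candidate
# only if the user-supplied fit function runs without raising an error.
screen_columns <- function(block, start_cols, fit_fun) {
  kept <- start_cols
  dropped <- character(0)
  for (col in setdiff(colnames(block), start_cols)) {
    trial <- c(kept, col)
    ok <- tryCatch({
      fit_fun(block[, trial, drop = FALSE])
      TRUE
    }, error = function(e) FALSE)
    if (ok) kept <- trial else dropped <- c(dropped, col)
  }
  list(kept = kept, dropped = dropped)
}

# Stand-in for the real model fit: it stops when the columns are linearly
# dependent, which is one way a singular system can arise.
toy_fit <- function(x) {
  x <- as.matrix(x)
  if (qr(x)$rank < ncol(x)) stop("system is exactly singular")
  invisible(TRUE)
}

# Made-up data: p3 equals p1 + p2, so adding it makes the block rank deficient.
toy <- data.frame(
  p1 = c(1, 0, 2, 5, 3, 8, 1, 4),
  p2 = c(2, 7, 1, 3, 9, 4, 6, 5),
  p3 = c(3, 7, 3, 8, 12, 12, 7, 9),
  p4 = c(5, 1, 4, 2, 8, 3, 7, 6)
)
res <- screen_columns(toy, start_cols = c("p1", "p2"), fit_fun = toy_fit)
print(res)
```

In the setting of this issue, `fit_fun` could wrap the `tune.block.splsda` call from the original post with the trial columns substituted for the pheno block. Note that the result depends on the order in which candidates are tried: a dropped column may only be a problem in combination with one that was kept earlier.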