Error: perf function randomly returning NaNs/NAs for MSEP, R2
I am attempting to use the perf function to tune using 10-fold cross-validation. I noticed that the function sometimes returns NaNs/NAs when calculating the R2 or MSEP. With identical inputs and parameters, the output is sometimes a matrix of values and at other times full of NAs. None of my input data contains any missing values.
I am simply running the following lines of code:
pls_class_16S=pls(table16Sclass,predmat16S,ncomp=2,mode=c("regression"),near.zero.var=TRUE)
spls_class_16S=spls(table16Sclass,predmat16S,ncomp=2,keepX=c(10,10),mode=c("regression"),near.zero.var=TRUE)
tune.pls_class_16S=perf(pls_class_16S,validation="Mfold",folds=10,progressBar=FALSE,criterion = 'all',nrepeat=50)
tune.spls_class_16S=perf(spls_class_16S,validation="Mfold",folds=10,progressBar=FALSE,criterion = 'all',nrepeat=50)
tune.pls_class_16S$Q2.total
tune.spls_class_16S$Q2.total
tune.pls_class_16S$R2
tune.spls_class_16S$R2
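Because the folds are drawn at random, an intermittent failure like this can usually be pinned down by fixing the seed before each run. The following is a minimal sketch, not part of mixOmics: `find_failing_seeds` and `toy_run` are hypothetical names, and `toy_run` is only a stand-in for a perf call so that the example runs without the data.

```r
# Hypothetical helper (not part of mixOmics): call a stochastic function
# under several seeds and report which seeds give NA/NaN in the result.
find_failing_seeds <- function(run, seeds) {
  failing <- integer(0)
  for (s in seeds) {
    set.seed(s)                      # fix the random draw for this run
    result <- run()
    if (anyNA(unlist(result))) {     # anyNA() is TRUE for both NA and NaN
      failing <- c(failing, s)
    }
  }
  failing
}

# Stand-in for a perf() call, on simulated data: standardises a random
# subset of a sparse column, which gives NaN when the subset is constant.
toy_run <- function() {
  x <- c(rep(0, 20), 1, 2, 3)        # 23 samples, mostly zeros
  subset_idx <- sample(length(x), 3)
  (x[subset_idx] - mean(x[subset_idx])) / sd(x[subset_idx])
}

failing <- find_failing_seeds(toy_run, 1:20)
print(failing)
```

With the real objects, `run` could be something like `function() perf(pls_class_16S, validation="Mfold", folds=10, progressBar=FALSE)$R2`, assuming perf draws its folds from R's global random number generator. A seed that reproduces the NAs makes the report much easier to debug.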
Comments (3)
Account Deactivated
dim(table16Sclass)
[1] 23 68
dim(predmat16S)
[1] 23 9
Hi Florian. (I am the user who created this issue!) I would be happy to send this data to you for debugging purposes. Do you have an email or file sharing account I can send it to?
Leave-one-out appears to work with no problems. I have run leave-one-out cross-validation multiple times and it has not produced this issue.
tune.pls_class_16S=perf(pls_class_16S,validation="loo",progressBar=FALSE,criterion = 'all')
tune.spls_class_16S=perf(spls_class_16S,validation="loo",progressBar=FALSE,criterion = 'all')
- changed status to resolved
Thanks for your email. This will be fixed in the next release, v6.1.3.
FYI commit 3d2039c
Hi there,
A bug that appears randomly is hard to identify and solve. What are the dimensions of your data (table16Sclass, predmat16S)? It seems to be data-dependent (as this is the first time it has been reported), and I would guess that, once in a while, a fold is created that contains NA after scaling. Would you be OK with sending the data to us, for debugging purposes only?
Otherwise, do you have the same problem when you perform Leave-One-Out?
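The guess above, that a fold can contain NA after scaling, can be illustrated on simulated data. This is only a sketch: the sparse column is made up, and the thread does not say whether perf scales the held-out samples, the training samples, or both. The point is just that a column which is constant within a subset has zero standard deviation there, so standardising it gives 0/0 = NaN.

```r
# Simulated data only: 23 samples, one sparse column with two non-zero values.
set.seed(1)
n <- 23
x <- numeric(n)
x[c(5, 17)] <- c(3, 7)

# Randomly assign the samples to 10 folds of 2 or 3 samples each.
fold_id <- sample(rep(1:10, length.out = n))

# Standardise the column within each fold and count the NaN values produced.
nan_per_fold <- sapply(1:10, function(k) {
  idx <- which(fold_id == k)
  scaled <- scale(x[idx])            # centre, then divide by the subset's sd
  sum(is.nan(scaled))
})
print(nan_per_fold)
```

Every fold that contains only zeros produces NaN, and which folds those are changes with the random assignment.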