valid function introduces selection bias

Issue #1 resolved
Kim-Anh Le Cao repo owner created an issue

Amrit has identified a selection bias in the current valid function: in the newly added spls.model functions it uses keepX = (object$loadings$X != 0) and keepY = (object$loadings$Y != 0).

As a result, the variables selected on the full data set are reused inside the valid function, which leads to overfitted features and highly optimistic error rates.
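To illustrate why reusing the full-data selection is a problem, here is a minimal base-R sketch, not the mixOmics code. It assumes pure-noise data, a simple "top k by absolute correlation" filter standing in for the sparse selection, and a linear model standing in for the sPLS fit; the helper names top_k and cv_mse are hypothetical.

```r
# Sketch: cross-validated error with variable selection done outside vs.
# inside the folds. X and y are independent noise, so no model should predict well.
set.seed(1)
n <- 50; p <- 1000; k <- 10; n_folds <- 5
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
fold_id <- sample(rep(seq_len(n_folds), length.out = n))

# Hypothetical stand-in for sparse selection: keep the k columns most correlated with y.
top_k <- function(X, y, k) order(abs(cor(X, y)), decreasing = TRUE)[seq_len(k)]

cv_mse <- function(select_inside) {
  full_sel <- top_k(X, y, k)            # selection on the full data set (the biased route)
  sq_err <- numeric(n)
  for (f in seq_len(n_folds)) {
    test <- fold_id == f
    sel <- if (select_inside) {
      top_k(X[!test, , drop = FALSE], y[!test], k)   # re-select on training samples only
    } else {
      full_sel                                       # reuse the full-data selection
    }
    train_df <- data.frame(y = y[!test], X[!test, sel, drop = FALSE])
    test_df  <- data.frame(X[test, sel, drop = FALSE])
    fit <- lm(y ~ ., data = train_df)
    sq_err[test] <- (y[test] - predict(fit, newdata = test_df))^2
  }
  mean(sq_err)
}

res <- c(biased = cv_mse(FALSE), unbiased = cv_mse(TRUE))
print(res)
```

On noise data like this, the "biased" estimate is typically well below the "unbiased" one, even though neither model has any real predictive power. That gap is the optimism described above, and it is why the selection has to be redone on the training samples of each fold.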

The only solution is to go back one step and change the valid function to what it was before, i.e. no S3 method.

Before:

    function(object, validation = c("Mfold", "loo"), folds = 10, max.iter = 500, tol = 1e-06, ...)

Should be changed to:

    function(X,                              ## change
             Y,                              ## change
             ncomp = min(6, ncol(X)),        ## change
             keepX = NULL,                   ## change
             keepY = NULL,                   ## change
             mode = "regression",            ## change
             validation = c("Mfold", "loo"), ## change
             folds = 10,                     ## change
             max.iter = 500,                 ## change
             tol = 1e-06, ...)               ## change
