Error message of missing data in 'X' and/or 'Y' when neither has missing data
Issue #79
resolved
Hello, I am trying to perform PLS on two datasets "snp" (Genotype data.Rdata file attached) and "met" (Metabolites data.Rdata file attached). I am getting the following error message even though neither of the datasets has any missing data.
load("Genotype data.Rdata")
load("Metabolites data.Rdata")
cvd.pls <- pls(snp, met, ncomp = 20)
cvd.val <- perf(cvd.pls, validation = "Mfold", folds = 5)
Error: missing data in 'X' and/or 'Y'. Use 'nipals' for dealing with NAs.
Here is the confirmation that neither dataset has any missing values:
any(is.na(met))
[1] FALSE
any(is.na(snp))
[1] FALSE
Could you help me with this?
Comments (2)
- changed status to resolved
Hello,
An answer was provided by email, but for anyone else who runs into this: the error occurs when your data contain constant variables. One of the first steps of the algorithm is to scale the data, and scaling a constant variable divides by a standard deviation of zero, so that variable becomes NaN, which R treats as missing.
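To illustrate the point about scaling, here is a minimal base R sketch. The toy matrix and its column names are made up for illustration (they are not the poster's data), and base R's scale() stands in for whatever scaling the package performs internally, which is an assumption.

```r
# Hypothetical toy matrix: column "b" is constant.
X <- cbind(a = c(1, 2, 3, 4), b = c(5, 5, 5, 5))

# The raw data contain no missing values.
any(is.na(X))

# Centre each column, then divide it by its standard deviation.
# For column "b" this is 0 / 0, which R evaluates to NaN.
X.scaled <- scale(X)

# is.na() treats NaN as missing, so this check now reports TRUE.
any(is.na(X.scaled))
```

This is why any(is.na(...)) on the original data can return FALSE while the function still complains about missing data.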
Setting near.zero.var = TRUE usually fixes this problem, as does manually removing the variables with zero variance.
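The manual route can be sketched in base R as follows. The toy matrix and the names col.var, keep and X.filtered are hypothetical, chosen only for this example; with the poster's data the same filter would be applied to snp and met before calling pls().

```r
# Hypothetical toy matrix: column "b" has zero variance.
X <- cbind(a = c(1, 2, 3, 4), b = c(5, 5, 5, 5), c = c(0, 0, 1, 2))

# Variance of each column.
col.var <- apply(X, 2, var)

# Keep only the columns that actually vary.
keep <- col.var > 0
X.filtered <- X[, keep, drop = FALSE]

# Column "b" has been dropped.
colnames(X.filtered)
```

A filter on the full dataset may not be enough under cross-validation: a variable that varies overall can still be constant within one training subset, for example a rare genotype that only appears in the held-out samples. That is probably why near.zero.var = TRUE is the more robust of the two options, although the thread does not say so explicitly.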