Error message of missing data in 'X' and/or 'Y' when neither has missing data

Issue #79 resolved
Former user created an issue

Hello, I am trying to perform PLS on two datasets, "snp" (file "Genotype data.Rdata" attached) and "met" (file "Metabolites data.Rdata" attached). I get the following error message even though neither dataset has any missing data.

    load("Genotype data.Rdata")
    load("Metabolites data.Rdata")
    cvd.pls <- pls(snp, met, ncomp = 20)
    cvd.val <- perf(cvd.pls, validation = "Mfold", folds = 5)
    Error: missing data in 'X' and/or 'Y'. Use 'nipals' for dealing with NAs.

Here is confirmation that neither dataset has any missing values:

    any(is.na(met))
    [1] FALSE
    any(is.na(snp))
    [1] FALSE

Could you help me with this?

Comments (2)

  1. Florian Rohart

    Hello,

    An answer was provided by email, but for anyone else who hits this: the error occurs when the data contain constant variables. One of the first steps of the algorithm is to scale the data, and scaling a constant variable divides by a standard deviation of zero, so constant variables become NA.

    Setting the parameter near.zero.var = TRUE usually fixes this problem, as does manually removing the variables with zero variance.
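For illustration, here is a minimal base R sketch of the manual route. It uses a small made-up matrix `X` in place of `snp` (the toy data and the names `col_var` and `X_clean` are assumptions, not taken from the attached files) to show that a constant column contains no NA before scaling but does afterwards, and that dropping zero-variance columns avoids it.

```r
# Toy matrix standing in for `snp` (hypothetical data): column "b" is constant.
X <- cbind(a = c(1, 2, 3, 4),
           b = c(5, 5, 5, 5),
           c = c(2, 4, 1, 3))

# The raw data has no missing values ...
any(is.na(X))          # FALSE

# ... but scaling divides the constant column by a standard deviation of 0,
# which yields NaN (treated as missing by is.na).
any(is.na(scale(X)))   # TRUE

# Manual fix: keep only the columns whose variance is strictly positive.
col_var <- apply(X, 2, var)
X_clean <- X[, col_var > 0, drop = FALSE]

any(is.na(scale(X_clean)))  # FALSE
```

This check only catches variables that are constant over the full dataset. With cross-validation a variable could plausibly also be constant within a single fold, which may be why passing near.zero.var = TRUE to pls(), as suggested above, is the more robust option.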
