Numerical Imputation
Some methods, e.g. PCA, expect the data to have no missing values. In practice, data sets often contain missing values, which can be of the following types:

For phenotypic data: the value of a trait for a taxon is not known or was not recorded.

For genotypic data: the value of some attribute is missing for a SNP.
Two methods are implemented to impute missing values in both phenotypic and genotypic data:

Imputation by Mean: the missing value is replaced by the mean of the non-missing values for the corresponding attribute.

Imputation by k-nearest neighbors: if data is missing for a taxon for one of the traits, the algorithm finds the other taxa (neighbors) that are most like it for the non-missing traits, and uses the average of those neighbors to impute the missing value. Similarly, if the value of some attribute is missing for a SNP, the algorithm finds the k other SNPs that are most similar to it. The default value of k is 5, which can be changed. Three distance measures are available for computing nearest neighbors:
 Euclidean
 Manhattan
 Cosine
The Euclidean distance is chosen as default.
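The two methods described above can be illustrated with a minimal Python sketch. This is not TASSEL's implementation: the function names (`impute_by_mean`, `distance`, `impute_by_knn`), the use of `None` to mark missing values, and the choice to compute distances only over columns observed in both rows are all assumptions made for illustration. Rows stand for taxa (or SNPs) and columns for traits (or attributes).

```python
import math

def impute_by_mean(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def distance(a, b, kind="euclidean"):
    """Distance between two rows, using only columns observed in both (an assumption)."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    if kind == "euclidean":
        return math.sqrt(sum((x - y) ** 2 for x, y in pairs))
    if kind == "manhattan":
        return sum(abs(x - y) for x, y in pairs)
    if kind == "cosine":
        dot = sum(x * y for x, y in pairs)
        na = math.sqrt(sum(x * x for x, _ in pairs))
        nb = math.sqrt(sum(y * y for _, y in pairs))
        return 1.0 - dot / (na * nb) if na and nb else math.inf
    raise ValueError(f"unknown distance: {kind}")

def impute_by_knn(rows, k=5, kind="euclidean"):
    """Fill each None with the average of that column over the k nearest rows."""
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors must have an observed value in column j.
            candidates = [(distance(row, other, kind), other[j])
                          for m, other in enumerate(rows)
                          if m != i and other[j] is not None]
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                result[i][j] = sum(nearest) / len(nearest)
    return result

# Hypothetical data: four taxa, three traits, one missing value.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [8.0, 9.0, 10.0],
]
print(impute_by_mean(data)[1][2])       # mean of the whole third column
print(impute_by_knn(data, k=2)[1][2])   # average over the two most similar taxa
```

In this toy data set the fourth taxon is an outlier, so the mean-based estimate is pulled toward it, while the nearest-neighbor estimate with k=2 uses only the two taxa that resemble the incomplete one.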
Running Numerical Imputation from the command line
Use the ImputationPlugin
Usage: ImputationPlugin <options>
-ByMean <true | false> : If imputation is performed by computing mean of the respective column (Default: false)
-nearestNeighbors <Number of nearest neighbors to be evaluated> : Choice of k in k-nearest neighbors algorithm. Default is 5. (Default: 5)
-distance <Choose Distance type> : Distance choice for computing nearest neighbors. Default choice is Euclidean distance. (Default: Euclidean)