Distance in Hash2Vec
Hash2Vec uses cosine similarity to find close vectors.
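For reference, cosine similarity between two vectors a and b is conventionally defined as shown here. This is the standard textbook formula, not something taken from the Hash2Vec source, so how the library maps it onto the reported distance value is an assumption.

```latex
\cos(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2} \, \sqrt{\sum_{i=1}^{n} b_i^2}}
```

Values close to 1 mean the two vectors point in nearly the same direction.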
Hash2Vec Documentation:
When computing distances in Hash2Vec, take into account how the words were vectorized into numerical vectors.
If the vectors have not been normalized to unit or decimal vectors, the computed distances may be incorrect.
For this reason, the Distance method takes the following parameters: count and accuracy (accuracy defaults to 1).
If the vectors are normalized to unit or decimal vectors, the result is more accurate, and in that case the accuracy will be 0.
    #!c#
    // InputFile: the file that contains the vectorized vectors
    var vocabulary = new Hash2VecBinaryReader().Read("InputFile");
    var distanceList = vocabulary.Distance("test", count: 50, accuracy: 2).ToList();
    distanceList.ForEach(dis =>
        Console.WriteLine("{0}\t\t ||{1,10:F6}", dis.Representation.WordOrNull, dis.DistanceValue));
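The note on normalization can be illustrated without Hash2Vec. The sketch below uses plain arrays and hypothetical helper names (`CosineSketch`, `Dot`, `Norm`, `Normalize`, `Cosine`), none of which belong to the Hash2Vec API. It shows one possible reason why normalization matters: a raw dot product depends on vector length, while after scaling to unit length it agrees with cosine similarity. Whether Hash2Vec works this way internally is an assumption.

```csharp
using System;
using System.Linq;

// Hypothetical helpers for illustration only; not part of Hash2Vec.
public static class CosineSketch
{
    // Dot product of two equal-length vectors.
    public static double Dot(double[] a, double[] b) =>
        a.Zip(b, (x, y) => x * y).Sum();

    // Euclidean (L2) length of a vector.
    public static double Norm(double[] v) => Math.Sqrt(Dot(v, v));

    // Scale a vector to unit length; a zero vector is returned unchanged.
    public static double[] Normalize(double[] v)
    {
        double n = Norm(v);
        return n == 0 ? v : v.Select(x => x / n).ToArray();
    }

    // Cosine similarity: dot product divided by the product of the lengths.
    public static double Cosine(double[] a, double[] b) =>
        Dot(a, b) / (Norm(a) * Norm(b));

    public static void Main()
    {
        double[] a = { 3, 4 };
        double[] b = { 6, 8 }; // same direction as a, twice as long

        Console.WriteLine(Cosine(a, b)); // 1: the vectors point the same way
        Console.WriteLine(Dot(a, b));    // 50: the raw dot product depends on length
        // After normalization the dot product is approximately 1, matching the cosine.
        Console.WriteLine(Dot(Normalize(a), Normalize(b)));
    }
}
```

If the stored vectors are already unit length, the cheaper dot product can stand in for the full cosine computation.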