Fuzzy Search in Hash2Vec
Fuzzy search is implemented in the Hash2Vec library, in the namespace ServiceManager.FuzzySearch.
Hash2Vec Documentation:
The classic implementation of fuzzy search in Hash2Vec computes an arithmetic distance for a given search string: a distance is calculated for each word in the string, and the per-word results are averaged into a single similarity coefficient.
Example: the classic fuzzy search method in Hash2Vec (Hash2VecDistance)
    #!c#
    public static double Hash2VecDistance(this string str, string comparison)
    {
        var coefficient = 0.0;
        var vocabulary = VectorizeData(str);
        var comparisonWords = comparison.Split(' ', '.', ',');
        var distanceToList = new List<DistanceTo>();

        foreach (var word in comparisonWords)
        {
            // Get a collection of distances for the word
            var distanceList = vocabulary.Distance(word, count: 10, accuracy: 0).ToList();

            // Check for an identical vector; if one exists, its distance is 1.0
            var sameVector = vocabulary.Words.FirstOrDefault(
                w => w.NumericVector.EqualsVectors(Hash2Vec.GetHashVector(word)));
            if (sameVector != null)
                distanceList.Add(new DistanceTo(sameVector, 1.0));

            // Keep the best result above 0.75 (null if the word has no such match)
            var result = distanceList
                .OrderByDescending(dis => dis.DistanceValue)
                .FirstOrDefault(dis => dis.DistanceValue > 0.75);
            distanceToList.Add(result);
        }

        // Calculate the similarity coefficient; words without a match contribute 0
        distanceToList.ForEach(dis =>
        {
            if (dis != null)
                coefficient += dis.DistanceValue;
        });

        return coefficient / distanceToList.Count;
    }
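The final averaging step of Hash2VecDistance can be looked at in isolation. The following is a minimal sketch, not library code: `ScoringSketch` and `AverageBestMatch` are hypothetical names, and the similarity values are made up for illustration. It shows that a word with no match above the 0.75 threshold still counts in the denominator and therefore lowers the score.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ScoringSketch
{
    // Hypothetical helper: each inner sequence holds the candidate similarity
    // values found for one word of the comparison string.
    public static double AverageBestMatch(
        IEnumerable<IEnumerable<double>> perWordSimilarities,
        double threshold = 0.75)
    {
        var best = perWordSimilarities
            .Select(values => values
                .Where(v => v > threshold)  // keep only matches above the threshold
                .DefaultIfEmpty(0.0)        // a word with no match contributes 0
                .Max())
            .ToList();

        // Guard against an empty input instead of dividing by zero.
        return best.Count == 0 ? 0.0 : best.Sum() / best.Count;
    }

    public static void Main()
    {
        var similarities = new[]
        {
            new[] { 1.0, 0.4 },  // identical vector found
            new[] { 0.8, 0.6 },  // close match above the threshold
            new[] { 0.5 },       // nothing above the threshold
        };

        // (1.0 + 0.8 + 0.0) / 3 = 0.6
        Console.WriteLine(AverageBestMatch(similarities));
    }
}
```

With three words, two good matches and one miss give a coefficient of 0.6, which is why a cut-off well below 1.0 is typically used when comparing strings of different lengths.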
Hash2Vec contains two fuzzy search methods:
- Hash2VecDistance
- Hash2VecDistanceCorrect
You can also implement your own methods based on the distance.
Example: testing fuzzy search in Hash2Vec
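As a rough illustration of what a custom method could look like, the sketch below averages, for each word of the comparison string, its best similarity to any word of the source string. Everything here is an assumption made for the example: `CustomFuzzySketch`, `CustomDistance`, and `BigramDice` are hypothetical names, and the bigram Dice coefficient is only a stand-in so the code runs without the library. It is not the Hash2Vec distance; in practice the word-level function would wrap the library's own distance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CustomFuzzySketch
{
    // Stand-in word similarity: Dice coefficient over sets of character bigrams.
    // This is NOT Hash2Vec's distance; it only makes the sketch runnable.
    public static double BigramDice(string a, string b)
    {
        if (a == b) return 1.0;
        var x = Bigrams(a);
        var y = Bigrams(b);
        if (x.Count == 0 || y.Count == 0) return 0.0;
        var shared = x.Intersect(y).Count();
        return 2.0 * shared / (x.Count + y.Count);
    }

    private static HashSet<string> Bigrams(string s)
    {
        var set = new HashSet<string>();
        for (var i = 0; i + 1 < s.Length; i++)
            set.Add(s.Substring(i, 2));
        return set;
    }

    // Hypothetical custom score: the average, over the words of `comparison`,
    // of the best similarity to any word of `source`.
    public static double CustomDistance(
        string source, string comparison, Func<string, string, double> wordSimilarity)
    {
        var separators = new[] { ' ', '.', ',' };
        var sourceWords = source.Split(separators, StringSplitOptions.RemoveEmptyEntries);
        var comparisonWords = comparison.Split(separators, StringSplitOptions.RemoveEmptyEntries);
        if (sourceWords.Length == 0 || comparisonWords.Length == 0) return 0.0;

        return comparisonWords
            .Select(word => sourceWords.Max(candidate => wordSimilarity(word, candidate)))
            .Average();
    }

    public static void Main()
    {
        // "vilage" is a deliberate typo to exercise the fuzzy match.
        var score = CustomDistance("milk house in the village", "milk in the vilage", BigramDice);
        Console.WriteLine(score);
    }
}
```

Passing the word-level similarity as a delegate keeps the aggregation rule separate from the distance itself, so the same skeleton can be reused with a different distance or a different aggregation (for example, a minimum instead of an average).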
    #!c#
    // Russian product names: "milk house in the village" vs. "milk in the village"
    var input = "молоко домик в деревне";
    var name = "молоко в деревне";
    var distHash2Vec = input.Hash2VecDistance(name);

    // A result greater than 0.55 is considered a good match
    if (distHash2Vec > 0.55)
        Console.WriteLine("\t{0:###,###.00000} against {1}", distHash2Vec, name);