Crate rust_fuzzy_search
This crate implements fuzzy searching with trigrams.
Fuzzy searching compares strings by similarity rather than by equality: similar strings get a high score (close to 1.0f32), while dissimilar strings get a lower score (closer to 0.0f32).
Fuzzy searching tolerates changes in word order: for example, "John Dep" and "Dep John" will get a high score.
The crate exposes 5 main functions:
- fuzzy_compare takes 2 strings and returns a score representing how similar those strings are.
- fuzzy_search applies fuzzy_compare to a list of strings and returns a list of tuples: (word, score).
- fuzzy_search_sorted is similar to fuzzy_search but sorts the output in descending order of score.
- fuzzy_search_threshold takes an additional f32 as input and returns only the tuples with a score greater than the threshold.
- fuzzy_search_best_n takes an additional usize argument n and returns the first n tuples.
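As a rough illustration of how the four search helpers relate to each other, the sketch below layers them on top of a pluggable scoring function. It is not the crate's source: the names `search`, `search_sorted`, `search_threshold`, `search_best_n` and the stand-in scorer `toy_score` are hypothetical, the signatures are assumptions, and the strict `>` comparison for the threshold is only inferred from the description above.

```rust
/// Stand-in scorer (an assumption, NOT the crate's trigram score):
/// fraction of the characters of `a` that also occur somewhere in `b`.
fn toy_score(a: &str, b: &str) -> f32 {
    let total = a.chars().count();
    if total == 0 {
        return 0.0;
    }
    let hits = a.chars().filter(|&c| b.contains(c)).count();
    hits as f32 / total as f32
}

/// Scores every candidate against `query`, keeping the input order.
fn search<'a, F>(query: &str, list: &[&'a str], compare: F) -> Vec<(&'a str, f32)>
where
    F: Fn(&str, &str) -> f32,
{
    list.iter().map(|&word| (word, compare(query, word))).collect()
}

/// Same as `search`, with the best matches first.
fn search_sorted<'a, F>(query: &str, list: &[&'a str], compare: F) -> Vec<(&'a str, f32)>
where
    F: Fn(&str, &str) -> f32,
{
    let mut res = search(query, list, compare);
    // Descending order: compare b against a. The sort is stable, so ties
    // keep their original relative order.
    res.sort_by(|a, b| b.1.total_cmp(&a.1));
    res
}

/// Same as `search`, keeping only scores strictly above `threshold`
/// (the boundary behaviour is an assumption).
fn search_threshold<'a, F>(
    query: &str,
    list: &[&'a str],
    compare: F,
    threshold: f32,
) -> Vec<(&'a str, f32)>
where
    F: Fn(&str, &str) -> f32,
{
    search(query, list, compare)
        .into_iter()
        .filter(|&(_, score)| score > threshold)
        .collect()
}

/// Same as `search_sorted`, truncated to the `n` best matches.
fn search_best_n<'a, F>(
    query: &str,
    list: &[&'a str],
    compare: F,
    n: usize,
) -> Vec<(&'a str, f32)>
where
    F: Fn(&str, &str) -> f32,
{
    let mut res = search_sorted(query, list, compare);
    res.truncate(n);
    res
}
```

With the list ["abc", "xyz", "abd"] and the query "abc", calling `search_best_n("abc", &list, toy_score, 1)` under this toy scorer keeps only ("abc", 1.0).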
The algorithm used is taken from: https://dev.to/kaleman15/fuzzy-searching-with-postgresql-97o
Basic idea:
- From both strings, extract all groups of 3 adjacent letters ("House" becomes ['  H', ' Ho', 'Hou', 'ous', 'use', 'se ']). Note the 2 spaces added to the head of the string and the one added to the tail; they make the algorithm work on zero-length words.
- Then count the number of trigrams of the first string that are also present in the second string, and divide by the number of trigrams of the first string.
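The two steps above can be sketched in a few lines. This is a minimal illustration written from the description, not the crate's actual code; the function names `trigrams`, `trigram_similarity` and `trigram_demo` are hypothetical.

```rust
/// Pads `s` with two leading spaces and one trailing space, then returns
/// every window of three adjacent characters.
fn trigrams(s: &str) -> Vec<[char; 3]> {
    let padded: Vec<char> = "  "
        .chars()
        .chain(s.chars())
        .chain(" ".chars())
        .collect();
    padded.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// Fraction of the trigrams of `a` that also occur in `b`.
/// Thanks to the padding, even an empty `a` yields one trigram,
/// so the division is always defined.
fn trigram_similarity(a: &str, b: &str) -> f32 {
    let ta = trigrams(a);
    let tb = trigrams(b);
    let shared = ta.iter().filter(|t| tb.contains(t)).count();
    shared as f32 / ta.len() as f32
}

fn trigram_demo() {
    // "House" yields 6 trigrams, starting with [' ', ' ', 'H'].
    println!("{:?}", trigrams("House"));
    // "House" and "Mouse" share 3 of the 6 trigrams: "ous", "use", "se ".
    println!("{}", trigram_similarity("House", "Mouse"));
    // Swapping the words keeps 7 of the 9 trigrams of "John Dep".
    println!("{}", trigram_similarity("John Dep", "Dep John"));
}
```

Note that the score is not symmetric in this sketch: it is normalized by the number of trigrams of the first string only, as the description states.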
Example: Comparing 2 strings
```rust
fn test() {
    use rust_fuzzy_search::fuzzy_compare;
    let score: f32 = fuzzy_compare("kolbasobulko", "kolbasobulko");
    println!("score = {:?}", score);
}
```
Example: Comparing a string with a list of strings and retrieving only the best matches
```rust
fn test() {
    use rust_fuzzy_search::fuzzy_search_best_n;
    let s = "bulko";
    let list: Vec<&str> = vec![
        "kolbasobulko",
        "sandviĉo",
        "ŝatas",
        "domo",
        "emuo",
        "fabo",
        "fazano",
    ];
    let n: usize = 3;
    let res: Vec<(&str, f32)> = fuzzy_search_best_n(s, &list, n);
    for (_word, score) in res {
        println!("{:?}", score)
    }
}
```
Functions
fuzzy_compare | Use this function to compare 2 strings. |
fuzzy_search | Use this function to compare a string ( |
fuzzy_search_best_n | This function is similar to fuzzy_search_sorted but keeps only the |
fuzzy_search_sorted | This function is similar to fuzzy_search but sorts the result in descending order (the best matches are placed at the beginning). |
fuzzy_search_threshold | This function is similar to fuzzy_search but filters out elements with a score lower than the specified one. |