Jaro
class py_stringmatching.similarity_measure.jaro.Jaro
Computes the Jaro measure.
The Jaro measure is an edit-distance-style similarity measure, developed mainly for comparing short strings such as first and last names. Scores range from 0 (no similarity) to 1 (identical strings).
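To make the measure concrete, the following is a minimal pure-Python sketch of the textbook Jaro computation. It is not the library's source code: the function name `jaro_similarity` is hypothetical, and returning 0.0 for empty inputs is an assumption made for this sketch. Characters count as matching when they are equal and lie within a window of `max(len1, len2) // 2 - 1` positions of each other; transpositions are matched characters that appear in a different order.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Textbook Jaro similarity (illustrative sketch, not the library code)."""
    # Assumption for this sketch: an empty input yields a score of 0.0.
    if not s1 or not s2:
        return 0.0

    len1, len2 = len(s1), len(s2)
    # Two equal characters match only if they are at most `window` apart.
    window = max(max(len1, len2) // 2 - 1, 0)

    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Compare matched characters in order; each out-of-order pair
    # contributes half a transposition.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3


print(round(jaro_similarity('MARTHA', 'MARHTA'), 4))  # 0.9444
```

For 'MARTHA' and 'MARHTA', all six characters match and one transposition (T/H) occurs, so the score is (6/6 + 6/6 + 5/6) / 3.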
get_raw_score(string1, string2)
Computes the raw Jaro score between two strings.
- Parameters
string1 (str) – First input string.
string2 (str) – Second input string.
- Returns
Jaro similarity score (float).
- Raises
TypeError – If the inputs are not strings or if one of the inputs is None.
Examples
>>> jaro = Jaro()
>>> jaro.get_raw_score('MARTHA', 'MARHTA')
0.9444444444444445
>>> jaro.get_raw_score('DWAYNE', 'DUANE')
0.8222222222222223
>>> jaro.get_raw_score('DIXON', 'DICKSONX')
0.7666666666666666
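The TypeError contract described under Raises can be illustrated with a small stand-alone check. This is only a sketch of the documented behavior: the helper name `check_string_inputs` and the error messages are hypothetical and are not taken from the library.

```python
def check_string_inputs(*values):
    """Hypothetical helper: raise TypeError for None or non-str inputs."""
    for value in values:
        if value is None:
            raise TypeError("input must not be None")
        if not isinstance(value, str):
            raise TypeError(f"expected str, got {type(value).__name__}")


# Valid inputs pass silently.
check_string_inputs('MARTHA', 'MARHTA')

# None and non-string inputs are rejected before any scoring happens.
for bad_pair in [('MARTHA', None), (42, 'MARHTA')]:
    try:
        check_string_inputs(*bad_pair)
    except TypeError as exc:
        print(f"{bad_pair!r}: TypeError ({exc})")
```

Callers that read names from external data may want to filter out missing values before scoring, since a single None would raise rather than return a score.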
get_sim_score(string1, string2)
Computes the normalized Jaro similarity score between two strings. This method simply calls get_raw_score, because the raw Jaro score already lies between 0 and 1.
- Parameters
string1 (str) – First input string.
string2 (str) – Second input string.
- Returns
Normalized Jaro similarity score (float).
- Raises
TypeError – If the inputs are not strings or if one of the inputs is None.
Examples
>>> jaro = Jaro()
>>> jaro.get_sim_score('MARTHA', 'MARHTA')
0.9444444444444445
>>> jaro.get_sim_score('DWAYNE', 'DUANE')
0.8222222222222223
>>> jaro.get_sim_score('DIXON', 'DICKSONX')
0.7666666666666666