Soundex
Soundex phonetic similarity measure
class py_stringmatching.similarity_measure.soundex.Soundex
Soundex phonetic similarity measure class.
get_raw_score(string1, string2)
Computes the Soundex phonetic similarity between two strings.
Phonetic measures such as Soundex match strings based on how they sound. These measures have been especially effective in matching names, since names are often spelled in different ways that sound the same. For example, Meyer, Meier, and Mire sound the same, as do Smith, Smithe, and Smythe.
Soundex is used primarily to match surnames. It does not work as well for names of East Asian origin, because much of the discriminating power of these names resides in the vowel sounds, which the Soundex code ignores.
Parameters: string1, string2 (str) – Input strings.
Returns: Soundex similarity score (int).
Raises: TypeError – If the inputs are not strings.
Examples
>>> s = Soundex()
>>> s.get_raw_score('Robert', 'Rupert')
1
>>> s.get_raw_score('Sue', 's')
1
>>> s.get_raw_score('Gough', 'Goff')
0
>>> s.get_raw_score('a,,li', 'ali')
1
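The scores above are consistent with comparing the classic four-character American Soundex codes of the two inputs. The following is a minimal sketch of that encoding, not the library's own implementation: the helper names soundex_code and raw_score are hypothetical, and details such as the handling of empty or non-alphabetic input are assumptions that may differ from py_stringmatching.

```python
# Hypothetical helpers for illustration; not part of py_stringmatching.
_GROUPS = (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
           ("l", "4"), ("mn", "5"), ("r", "6"))
_CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex_code(name):
    """Return the 4-character American Soundex code of name."""
    if not isinstance(name, str):
        raise TypeError("input must be a string")
    # Assumption: anything that is not an ASCII letter is dropped.
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""  # nothing to encode
    code = letters[0].upper()            # the first letter is kept as-is
    prev = _CODES.get(letters[0], "")    # its digit still suppresses repeats
    for ch in letters[1:]:
        if ch in "hw":
            continue                     # h and w do not break a run of equal digits
        digit = _CODES.get(ch, "")       # vowels and y map to no digit
        if digit and digit != prev:
            code += digit
            if len(code) == 4:
                break
        prev = digit                     # a vowel resets prev, so repeats count again
    return code.ljust(4, "0")            # pad short codes with zeros


def raw_score(string1, string2):
    """Return 1 if both strings share a Soundex code, else 0."""
    return int(soundex_code(string1) == soundex_code(string2))
```

Under these assumptions, 'Robert' and 'Rupert' both encode to R163, while 'Gough' (G200) and 'Goff' (G100) differ, which matches the examples above. Vowels never contribute a digit, which is why names that differ mainly in their vowels are hard to tell apart.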
get_sim_score(string1, string2)
Computes the normalized Soundex similarity between two strings.
Parameters: string1, string2 (str) – Input strings.
Returns: Normalized Soundex similarity (int).
Raises: TypeError – If the inputs are not strings or if one of the inputs is None.
Examples
>>> s = Soundex()
>>> s.get_sim_score('Robert', 'Rupert')
1
>>> s.get_sim_score('Sue', 's')
1
>>> s.get_sim_score('Gough', 'Goff')
0
>>> s.get_sim_score('a,,li', 'ali')
1