Levenshtein

class py_stringmatching.similarity_measure.levenshtein.Levenshtein

Computes the Levenshtein measure (also known as edit distance).

The Levenshtein distance is the minimum cost of transforming one string into the other. The transformation is carried out using a sequence of the following operations: deleting a character, inserting a character, and substituting one character for another.
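The definition above can be illustrated with a minimal dynamic-programming sketch. This is not the library's implementation: the function name `levenshtein_distance` is hypothetical, and the sketch assumes every delete, insert, and substitute operation costs 1, which is consistent with the example values shown in this section.

```python
def levenshtein_distance(s1, s2):
    """Edit distance between s1 and s2, assuming unit cost per operation."""
    if not isinstance(s1, str) or not isinstance(s2, str):
        raise TypeError("both inputs must be strings")

    # previous[j] is the distance between the prefix of s1 processed so far
    # (initially the empty prefix) and the first j characters of s2.
    previous = list(range(len(s2) + 1))

    for i, c1 in enumerate(s1, start=1):
        # Turning the first i characters of s1 into '' takes i deletions.
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,               # delete c1
                current[j - 1] + 1,            # insert c2
                previous[j - 1] + (c1 != c2),  # substitute, or free match
            ))
        previous = current

    return previous[-1]


print(levenshtein_distance('example', 'samples'))
```

Only two rows of the table are kept at a time, so memory use grows with the length of the second string rather than with the product of both lengths.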

get_raw_score(string1, string2)

Computes the raw Levenshtein distance between two strings.

Parameters: string1, string2 (str) – Input strings.
Returns: Levenshtein distance (int).
Raises: TypeError – If the inputs are not strings.

Examples

>>> lev = Levenshtein()
>>> lev.get_raw_score('a', '')
1
>>> lev.get_raw_score('example', 'samples')
3
>>> lev.get_raw_score('levenshtein', 'frankenstein')
6

get_sim_score(string1, string2)

Computes the normalized Levenshtein similarity score between two strings.

Parameters: string1, string2 (str) – Input strings.
Returns: Normalized Levenshtein similarity (float).
Raises: TypeError – If the inputs are not strings.

Examples

>>> lev = Levenshtein()
>>> lev.get_sim_score('a', '')
0.0
>>> lev.get_sim_score('example', 'samples')
0.5714285714285714
>>> lev.get_sim_score('levenshtein', 'frankenstein')
0.5
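
The similarity values above are consistent with normalizing the raw distance by the length of the longer string, i.e. 1 - distance / max(len(string1), len(string2)). The sketch below shows that relationship under this assumption. The names `edit_distance` and `normalized_similarity` are hypothetical, and returning 1.0 for two empty strings is an assumption of this sketch rather than something stated in this section.

```python
def edit_distance(a, b):
    """Unit-cost Levenshtein distance (illustrative helper, not the library's code)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or free match
            ))
        previous = current
    return previous[-1]


def normalized_similarity(s1, s2):
    """Map the raw distance to a similarity in [0, 1]."""
    longest = max(len(s1), len(s2))
    if longest == 0:
        # Assumption: two empty strings are treated as identical.
        return 1.0
    return 1.0 - edit_distance(s1, s2) / longest


print(normalized_similarity('levenshtein', 'frankenstein'))  # 1 - 6/12 = 0.5
```

Dividing by the longer length keeps the score in [0, 1], because the distance can never exceed the length of the longer string.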