Bag Distance
Bag distance measure
class py_stringmatching.similarity_measure.bag_distance.BagDistance
Bag distance measure class.
get_raw_score(string1, string2)
Computes the bag distance between two strings.
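For two strings X and Y, the bag distance is \(\max( |bag(X) - bag(Y)|, |bag(Y) - bag(X)| )\), where bag(S) is the multiset of characters in S, the minus sign denotes multiset difference, and |B| is the number of elements left in the bag B.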
Parameters: string1, string2 (str) – Input strings
Returns: Bag distance (int)
Raises: TypeError – If the inputs are not strings
Examples
>>> bd = BagDistance()
>>> bd.get_raw_score('cat', 'hat')
1
>>> bd.get_raw_score('Niall', 'Neil')
2
>>> bd.get_raw_score('aluminum', 'Catalan')
5
>>> bd.get_raw_score('ATCG', 'TAGC')
0
>>> bd.get_raw_score('abcde', 'xyz')
5
References
- String Matching with Metric Trees Using an Approximate Distance: http://www-db.disi.unibo.it/research/papers/SPIRE02.pdf
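The bag distance formula can be illustrated with a minimal sketch built on `collections.Counter`. The function name `bag_distance` is hypothetical and this is not the library's own implementation; it only assumes that a bag is the multiset of characters of a string.

```python
from collections import Counter


def bag_distance(string1, string2):
    """Illustrative bag distance (hypothetical helper, not py_stringmatching code)."""
    if not isinstance(string1, str) or not isinstance(string2, str):
        raise TypeError("both inputs must be strings")
    bag1, bag2 = Counter(string1), Counter(string2)
    # Counter subtraction drops non-positive counts, which is exactly
    # the multiset difference used in the formula.
    left_over_1 = sum((bag1 - bag2).values())
    left_over_2 = sum((bag2 - bag1).values())
    return max(left_over_1, left_over_2)


print(bag_distance('Niall', 'Neil'))  # 2: 'a' and 'l' remain on one side, 'e' on the other
```

Because only character counts matter, any two anagrams such as 'ATCG' and 'TAGC' are at distance 0.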
get_sim_score(string1, string2)
Computes the normalized bag similarity between two strings.
Parameters: string1, string2 (str) – Input strings
Returns: Normalized bag similarity (float)
Raises: TypeError – If the inputs are not strings
Examples
>>> bd = BagDistance()
>>> bd.get_sim_score('cat', 'hat')
0.6666666666666667
>>> bd.get_sim_score('Niall', 'Neil')
0.6
>>> bd.get_sim_score('aluminum', 'Catalan')
0.375
>>> bd.get_sim_score('ATCG', 'TAGC')
1.0
>>> bd.get_sim_score('abcde', 'xyz')
0.0
References
- String Matching with Metric Trees Using an Approximate Distance: http://www-db.disi.unibo.it/research/papers/SPIRE02.pdf
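The documentation does not spell out the normalization, but the example scores are consistent with 1 - distance / max(len(string1), len(string2)). The sketch below works under that assumption; the function name `bag_similarity` and the handling of two empty strings are hypothetical choices, not confirmed library behavior.

```python
from collections import Counter


def bag_similarity(string1, string2):
    """Illustrative normalized bag similarity (assumed formula, not library code)."""
    if not isinstance(string1, str) or not isinstance(string2, str):
        raise TypeError("both inputs must be strings")
    bag1, bag2 = Counter(string1), Counter(string2)
    # Bag distance: the larger of the two multiset differences.
    distance = max(sum((bag1 - bag2).values()), sum((bag2 - bag1).values()))
    longest = max(len(string1), len(string2))
    if longest == 0:
        # Assumption: two empty strings are treated as identical.
        return 1.0
    # Assumed normalization: divide by the length of the longer string.
    return 1.0 - distance / longest


print(bag_similarity('aluminum', 'Catalan'))  # 0.375, i.e. 1 - 5/8
```

Dividing by the longer length keeps the score in [0, 1], since the bag distance can never exceed the number of characters in the longer string.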