Triggers

class py_entitymatching.MatchTrigger

add_action(value)
Adds an action to the match trigger. If the result of a rule is the same value as the condition status, then the action will be carried out. The condition status can be added with the function add_cond_status.
- Args:
value (integer): The action. Currently only the values 0 and 1 are supported.
- Examples:
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> rule = ['title_title_lev_sim(ltuple, rtuple) > 0.7']
>>> mt.add_cond_rule(rule, match_f)
>>> mt.add_cond_status(True)
>>> mt.add_action(1)
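The interaction between the rule result, the condition status, and the action can be sketched in plain Python. This is only an illustration of the behaviour described above, not the library's implementation; the function `apply_trigger` and its parameters are hypothetical, and it assumes the action is the label value written to the prediction column.

```python
def apply_trigger(rule_result, cond_status, action, current_label):
    """Return the label of one tuple pair after the trigger is applied.

    Hypothetical helper: assumes the action is the new label value.
    """
    if action not in (0, 1):
        raise ValueError("only the actions 0 and 1 are supported")
    # The action is carried out only when the rule result equals
    # the condition status; otherwise the label is left as it was.
    if rule_result == cond_status:
        return action
    return current_label


# The rule holds and the condition status is True: the label is set to 1.
flipped = apply_trigger(rule_result=True, cond_status=True, action=1, current_label=0)
# The rule does not hold: the existing label is kept.
kept = apply_trigger(rule_result=False, cond_status=True, action=1, current_label=0)
print(flipped, kept)
```

Setting the condition status to False would invert the sketch: the action would then apply to the pairs for which the rule does not hold.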

add_cond_rule(conjunct_list, feature_table, rule_name=None)
Adds a rule to the match trigger.
- Parameters
conjunct_list (list) – A list of conjuncts specifying the rule.
feature_table (DataFrame) – A DataFrame containing all the features that are referenced by the rule. If feature_table is None, a feature table must already have been set with the set_feature_table function. Otherwise an AssertionError will be raised and the rule will not be added to the match trigger.
rule_name (string) – A string specifying the name of the rule to be added (defaults to None). If the rule_name is not specified, a name will be chosen automatically. If there is already a rule with the specified rule_name, an AssertionError will be raised and the rule will not be added to the match trigger.
- Returns
The name of the rule added (string).
- Raises
AssertionError – If rule_name already exists.
AssertionError – If feature_table is not a valid value.
Examples
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> rule = ['title_title_lev_sim(ltuple, rtuple) > 0.7']
>>> mt.add_cond_rule(rule, match_f)
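A conjunct list describes a rule that holds only when every conjunct holds. The following sketch shows that evaluation in plain Python. It does not use the library: `title_sim` is a stand-in similarity based on `difflib.SequenceMatcher` (not the Levenshtein feature named in the example above), and `make_rule` and the sample tuples are hypothetical.

```python
from difflib import SequenceMatcher


def title_sim(ltuple, rtuple):
    # Stand-in similarity in [0, 1]; not the library's Levenshtein feature.
    return SequenceMatcher(None, ltuple["title"], rtuple["title"]).ratio()


def make_rule(conjuncts):
    """Build a rule function that is True only if all conjuncts are True."""
    def rule(ltuple, rtuple):
        return all(conjunct(ltuple, rtuple) for conjunct in conjuncts)
    return rule


# Two conjuncts: similar titles AND identical years.
conjuncts = [
    lambda l, r: title_sim(l, r) > 0.7,
    lambda l, r: l["year"] == r["year"],
]
rule = make_rule(conjuncts)

a = {"title": "Data Integration", "year": 2012}
b = {"title": "Data Integration.", "year": 2012}
c = {"title": "Data Integration.", "year": 2015}

same = rule(a, b)       # both conjuncts hold
diff_year = rule(a, c)  # the year conjunct fails
print(same, diff_year)
```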

add_cond_status(status)
Adds a condition status to the match trigger. If the result of a rule is the same value as the condition status, then the action will be carried out. The action can be added with the function add_action.
- Args:
status (boolean): The condition status.
- Examples:
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> rule = ['title_title_lev_sim(ltuple, rtuple) > 0.7']
>>> mt.add_cond_rule(rule, match_f)
>>> mt.add_cond_status(True)
>>> mt.add_action(1)

delete_rule(rule_name)
Deletes a rule from the match trigger.
- Parameters
rule_name (string) – Name of the rule to be deleted.
Examples
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> rule = ['title_title_lev_sim(ltuple, rtuple) > 0.7']
>>> # Name the rule explicitly so that it can be referred to later
>>> mt.add_cond_rule(rule, match_f, rule_name='rule_1')
>>> mt.delete_rule('rule_1')

execute(input_table, label_column, inplace=True, verbose=False)
Executes the rules of the match trigger for a table of matcher results.
- Parameters
input_table (DataFrame) – The input table of type pandas DataFrame containing tuple pairs and labels from matching.
label_column (string) – The attribute name where the predictions are stored in the input table.
inplace (boolean) – A flag to indicate whether the predictions should be updated in place (defaults to True).
verbose (boolean) – A flag to indicate whether the debug information should be logged (defaults to False).
- Returns
A DataFrame with predictions updated.
Examples
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> rule = ['title_title_lev_sim(ltuple, rtuple) > 0.7']
>>> mt.add_cond_rule(rule, match_f)
>>> mt.add_cond_status(True)
>>> mt.add_action(1)
>>> # The table H is a table with prediction labels generated from matching
>>> mt.execute(input_table=H, label_column='predicted_labels', inplace=False)
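The effect of the inplace flag can be illustrated without the library. The sketch below uses a list of dictionaries in place of a DataFrame; `execute_sketch`, the `title_sim` column, and the sample rows are hypothetical, and the function only mirrors the behaviour described above under the assumption that the action value overwrites the prediction label.

```python
import copy


def execute_sketch(rows, label_column, rule, cond_status, action, inplace=True):
    """Apply one trigger rule to a list of dict rows and return the result."""
    # With inplace=False the caller's rows are copied and left untouched.
    target = rows if inplace else copy.deepcopy(rows)
    for row in target:
        # Overwrite the label when the rule result equals the condition status.
        if rule(row) == cond_status:
            row[label_column] = action
    return target


H = [
    {"title_sim": 0.9, "predicted_labels": 0},
    {"title_sim": 0.2, "predicted_labels": 0},
]


def high_title_sim(row):
    # Hypothetical rule over a precomputed similarity column.
    return row["title_sim"] > 0.7


updated = execute_sketch(
    H, "predicted_labels", high_title_sim, cond_status=True, action=1, inplace=False
)
print([row["predicted_labels"] for row in updated])
print([row["predicted_labels"] for row in H])
```

With inplace=True the same call would modify H itself and return that same object.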

get_rule(rule_name)
Returns the function corresponding to a rule.
- Parameters
rule_name (string) – Name of the rule.
- Returns
A function object corresponding to the specified rule.
Examples
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> rule = ['title_title_lev_sim(ltuple, rtuple) > 0.7']
>>> # Name the rule explicitly so that it can be looked up by name
>>> mt.add_cond_rule(rule, match_f, rule_name='rule_1')
>>> mt.get_rule('rule_1')

get_rule_names()
Returns the names of all the rules in the match trigger.
- Returns
A list of names of all the rules in the match trigger (list).
Examples
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> rule = ['title_title_lev_sim(ltuple, rtuple) > 0.7']
>>> mt.add_cond_rule(rule, match_f)
>>> mt.get_rule_names()
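The naming behaviour described for add_cond_rule, delete_rule, get_rule, and get_rule_names amounts to a small name-to-function registry. The sketch below is one way to model it; the class `RuleRegistry` and in particular the automatic naming scheme `auto_rule_<n>` are assumptions for illustration, not the names the library generates.

```python
from collections import OrderedDict


class RuleRegistry:
    """Hypothetical name-to-function registry that keeps insertion order."""

    def __init__(self):
        self._rules = OrderedDict()
        self._count = 0

    def add(self, fn, rule_name=None):
        if rule_name is None:
            # Assumed naming scheme; the library's automatic names may differ.
            rule_name = "auto_rule_%d" % self._count
        # A duplicate name is rejected and the rule is not added.
        assert rule_name not in self._rules, "rule_name already exists"
        self._rules[rule_name] = fn
        self._count += 1
        return rule_name

    def names(self):
        return list(self._rules.keys())

    def get(self, rule_name):
        return self._rules[rule_name]

    def delete(self, rule_name):
        del self._rules[rule_name]


reg = RuleRegistry()
first = reg.add(lambda l, r: True, rule_name="rule_1")
second = reg.add(lambda l, r: False)  # no name given, one is generated
names_before = reg.names()
reg.delete("rule_1")
names_after = reg.names()
print(names_before, names_after)
```

Capturing the returned name, or passing rule_name explicitly, avoids relying on whatever automatic name is chosen.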

set_feature_table(feature_table)
Sets feature table for the match trigger.
- Parameters
feature_table (DataFrame) – A DataFrame containing features.
Examples
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> mt.set_feature_table(match_f)

view_rule(rule_name)
Prints the source code of the function corresponding to a rule.
- Parameters
rule_name (string) – Name of the rule to be viewed.
Examples
>>> import py_entitymatching as em
>>> mt = em.MatchTrigger()
>>> A = em.read_csv_metadata('path_to_csv_dir/table_A.csv', key='id')
>>> B = em.read_csv_metadata('path_to_csv_dir/table_B.csv', key='id')
>>> match_f = em.get_features_for_matching(A, B)
>>> rule = ['title_title_lev_sim(ltuple, rtuple) > 0.7']
>>> # Name the rule explicitly so that it can be viewed by name
>>> mt.add_cond_rule(rule, match_f, rule_name='rule_1')
>>> mt.view_rule('rule_1')