Guides
This page gives concrete examples of using py_entitymatching. The examples use sample data that is bundled with the package, and they are provided as Jupyter notebooks.
A Quick Tour of Jupyter Notebook
This tutorial gives a quick tour of installing and using Jupyter Notebook.
End-to-End EM Workflows
EM workflow with blocking using an overlap blocker and matching using a Random Forest matcher: Jupyter notebook
EM workflow with blocking using an overlap blocker, selecting among multiple matchers, using the selected matcher to predict matches, and evaluating the predicted matches: Jupyter notebook
EM workflow with blocking using multiple blockers (overlap and attribute equivalence blockers), debugging the blocker output, selecting among multiple matchers, debugging the matcher output, using the selected matcher to predict matches, and evaluating the predicted matches: Jupyter notebook
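Each workflow above starts with a blocking step that cuts the set of tuple pairs down to plausible candidates before any matcher runs. As a rough illustration of the idea behind an overlap blocker, here is a minimal pure-Python sketch. It does not use py_entitymatching's API (the notebooks cover that); the function names `tokenize` and `overlap_block`, the `overlap_size` parameter as written here, and the sample tables are all assumptions made up for this example.

```python
def tokenize(value):
    """Lowercase a string and split it into a set of whitespace-delimited tokens."""
    return set(value.lower().split())


def overlap_block(table_a, table_b, attr, overlap_size=1):
    """Return (id_a, id_b) pairs whose `attr` values share at least
    `overlap_size` tokens. Hypothetical helper, not a library function."""
    candidates = []
    for row_a in table_a:
        tokens_a = tokenize(row_a[attr])
        for row_b in table_b:
            shared = tokens_a & tokenize(row_b[attr])
            if len(shared) >= overlap_size:
                candidates.append((row_a["id"], row_b["id"]))
    return candidates


# Hypothetical sample tables; not the data bundled with the package.
table_a = [
    {"id": "a1", "name": "Kevin Smith"},
    {"id": "a2", "name": "Maria Garcia"},
]
table_b = [
    {"id": "b1", "name": "Kevin J. Smith"},
    {"id": "b2", "name": "Mario Garcia"},
    {"id": "b3", "name": "Wei Chen"},
]

candidate_pairs = overlap_block(table_a, table_b, "name", overlap_size=1)
print(candidate_pairs)
```

Only the pairs that survive blocking would be passed on to feature generation and matching. Raising `overlap_size` makes the blocker stricter, which shrinks the candidate set at the risk of dropping true matches.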
Stepwise Guides
Reading CSV files from disk: Jupyter notebook
Down sampling: Jupyter notebook
Data profiling: Jupyter notebook
Data exploration: Jupyter notebook
Blocking:
Using overlap blocker: Jupyter notebook
Using attribute equivalence blocker: Jupyter notebook
Using rule-based blocker: Jupyter notebook
Using blackbox blocker: Jupyter notebook
Combining multiple blockers: Jupyter notebook
Debugging blocker output: Jupyter notebook
Handling features:
Generating features manually: Jupyter notebook
Editing attribute types and generating features manually: Jupyter notebook
Adding features to feature table: Jupyter notebook
Removing features from feature table: Jupyter notebook
Sampling and labeling: Jupyter notebook
Matching:
Selecting the best learning-based matcher (involves splitting the labeled data, generating features, instantiating multiple matchers, debugging the matcher output): Jupyter notebook
Performing matching using rule-based matcher: Jupyter notebook
Improving matching results using triggers: Jupyter notebook
Evaluating the predictions from a matcher: Jupyter notebook
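For readers who want a feel for what evaluating a matcher's predictions involves before opening the notebook, the sketch below computes precision, recall, and F1 from predicted and gold labels. It is plain Python rather than py_entitymatching's own evaluation routine; the function name `evaluate_matches` and the label lists are hypothetical and exist only for this example.

```python
def evaluate_matches(predicted, gold):
    """Compute precision, recall, and F1 for binary match labels (1 = match).

    Hypothetical helper for illustration; not part of py_entitymatching.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")

    pairs = list(zip(predicted, gold))
    true_pos = sum(1 for p, g in pairs if p == 1 and g == 1)
    false_pos = sum(1 for p, g in pairs if p == 1 and g == 0)
    false_neg = sum(1 for p, g in pairs if p == 0 and g == 1)

    # Guard against division by zero when nothing was predicted or labeled as a match.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels for six candidate pairs.
gold_labels = [1, 1, 0, 0, 1, 0]
predicted_labels = [1, 0, 0, 1, 1, 0]

metrics = evaluate_matches(predicted_labels, gold_labels)
print(metrics)
```

Precision falls when the matcher predicts matches that are not real, and recall falls when it misses real matches, so the two are usually reported together.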