Steps of Supported EM Workflows
- Reading the CSV Files from Disk
- Down Sampling
- Profiling Data
- Data Exploration
- Specifying Blockers and Performing Blocking
- Creating Features for Blocking
- Available Tokenizers and Similarity Functions
- Obtaining Tokenizers and Similarity Functions
- Obtaining Attribute Types and Correspondences
- Getting a Set of Features
- Adding/Removing Features
- Summary of the Manual Feature Generation Process
- Ways to Edit the Manual Feature Generation Process
- Generating Features Automatically
- Debugging Blocking
- Sampling
- Labeling
- Splitting Labeled Data into Training and Testing Sets
- Creating Features for Matching
- Extracting Feature Vectors
- Imputing Missing Values
- Specifying Matchers and Performing Matching
- Selecting a ML-Matcher
- Debugging ML-Matchers
- Combining Predictions from Multiple Matchers
- Using Triggers to Update Matching Results
- Evaluating the Matching Output