Issues with Zingg AI fuzzy matching and training data generation despite exact match config on province field
Hi Sonal G.,

I'm currently exploring Zingg AI and have set it up on a single machine. I'm following the same process you demonstrated in your video, and my config file includes the fields shown below.

Using an exact match on name, address lines, and pincode in Python, I identified around 1.73% of the records as duplicates, and I expected Zingg's fuzzy matching to surface more. However, at the very first step, while generating training data for labeling, I’m not seeing any valid matches. Even though "province" is set to "exact" in the training config, the pairs returned for labeling are completely unrelated and clearly not matches.

Here’s a snapshot of my current config file for reference:

{
  "fieldDefinition": [
    { "fieldName": "id", "matchType": "dont_use", "fields": "id", "dataType": "int" },
    { "fieldName": "email", "matchType": "fuzzy", "fields": "email", "dataType": "string" },
    { "fieldName": "name", "matchType": "fuzzy", "fields": "name", "dataType": "string" },
    { "fieldName": "phone", "matchType": "dont_use", "fields": "phone", "dataType": "string" },
    { "fieldName": "country", "matchType": "exact", "fields": "country", "dataType": "string" },
    { "fieldName": "province", "matchType": "exact", "fields": "province", "dataType": "string" },
    { "fieldName": "zip", "matchType": "fuzzy", "fields": "zip", "dataType": "string" },
    { "fieldName": "city", "matchType": "fuzzy", "fields": "city", "dataType": "string" },
    { "fieldName": "address", "matchType": "fuzzy", "fields": "address", "dataType": "string" }
  ],

I’ve retried the training data generation and labeling steps multiple times (rerunning the same commands each time) with no success. Could you help me understand what I might be missing?

Thanks.
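
P.S. For context, the exact-match baseline mentioned above can be sketched roughly as follows. This is a minimal illustration, not the exact script I ran: the key columns (name, address, zip), the light normalization (lowercasing, collapsing whitespace), the sample rows, and the definition of the rate (share of records whose key occurs more than once) are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical key columns; the real dataset's column names may differ.
KEY_FIELDS = ("name", "address", "zip")


def normalize(value):
    """Lowercase and collapse whitespace (an assumed, minimal normalization)."""
    return " ".join(str(value or "").lower().split())


def exact_duplicate_rate(records, key_fields=KEY_FIELDS):
    """Return the fraction of records whose normalized key occurs more than once."""
    if not records:
        return 0.0
    keys = [tuple(normalize(r.get(f)) for f in key_fields) for r in records]
    counts = Counter(keys)
    duplicated = sum(1 for k in keys if counts[k] > 1)
    return duplicated / len(records)


# Made-up rows for illustration only.
sample = [
    {"id": 1, "name": "Asha Rao", "address": "12 MG Road", "zip": "560001"},
    {"id": 2, "name": "asha  rao", "address": "12 mg road", "zip": "560001"},
    {"id": 3, "name": "Asha Rao", "address": "12 M.G. Road", "zip": "560001"},
    {"id": 4, "name": "John Smith", "address": "5 High Street", "zip": "110001"},
]

rate = exact_duplicate_rate(sample)
print(f"{rate:.2%} of records share an exact key with another record")
```

In this made-up sample, records 1 and 2 are caught, but record 3 ("12 M.G. Road") is missed because its address differs by punctuation. That is the kind of near-duplicate I was hoping the fuzzy matching would surface.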