Could you please check that each cluster in the training data you are adding has its own distinct z_cluster value, whether the cluster is a match or a non-match?
Also, could you cross-verify that the pair count for each cluster in your training data is accurate, using n*(n-1)/2? For example, a cluster with 3 records should have 3*2/2 = 3 pairs, and a cluster with 4 records should have 4*3/2 = 6 pairs.
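If it helps, here is a rough sketch of how that pair-count check could be scripted. It is only an illustration: the input shapes (a list of `(record_id, cluster_id)` tuples and a list of `(cluster_id, id_a, id_b)` pair tuples) and the function names `expected_pair_count` and `check_pair_counts` are assumptions of mine, not the actual layout of your training data, so you would need to adapt the loading step to your files.

```python
from collections import defaultdict


def expected_pair_count(n):
    """Number of unordered pairs among n records: n*(n-1)/2."""
    return n * (n - 1) // 2


def check_pair_counts(records, pairs):
    """Compare expected and observed pair counts per cluster.

    records: iterable of (record_id, cluster_id)      -- assumed shape
    pairs:   iterable of (cluster_id, id_a, id_b)     -- assumed shape
    Returns {cluster_id: (expected, observed)} for clusters that differ.
    """
    # Count how many records each cluster contains.
    sizes = defaultdict(int)
    for _record_id, cluster_id in records:
        sizes[cluster_id] += 1

    # Collect the distinct unordered pairs seen for each cluster.
    observed = defaultdict(set)
    for cluster_id, id_a, id_b in pairs:
        observed[cluster_id].add(frozenset((id_a, id_b)))

    mismatches = {}
    for cluster_id, n in sizes.items():
        expected = expected_pair_count(n)
        seen = len(observed[cluster_id])
        if expected != seen:
            mismatches[cluster_id] = (expected, seen)
    return mismatches


# Hypothetical data: cluster "c1" has 3 records, "c2" has 4.
records = [("r1", "c1"), ("r2", "c1"), ("r3", "c1"),
           ("r4", "c2"), ("r5", "c2"), ("r6", "c2"), ("r7", "c2")]
# "c1" has all 3 pairs; "c2" is missing one of its 6 pairs.
pairs = [("c1", "r1", "r2"), ("c1", "r1", "r3"), ("c1", "r2", "r3"),
         ("c2", "r4", "r5"), ("c2", "r4", "r6"), ("c2", "r4", "r7"),
         ("c2", "r5", "r6"), ("c2", "r5", "r7")]
print(check_pair_counts(records, pairs))
```

An empty result means every cluster has exactly n*(n-1)/2 pairs; any cluster listed in the result is worth a manual look.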