Thanks! We tried running with the 10M labels in Glue on 64 G.8X workers, but it failed after 10 min with:
An error occurred while calling o166.execute. Job aborted due to stage failure: ResultStage 51 (collectAsList at BlockingTreeUtil.java:52) has failed the maximum allowable number of times: 4
We then tried a much smaller set of labels, around 70k, on 32 G.4X workers. It was still running after 11 h, so we stopped it.
Our label dataset contains multiple labels per cluster. Would that be an issue? Should we have our labels organized in pairs?
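
To make the question concrete, here is a minimal sketch of what "organized in pairs" could mean for our data. It is only an illustration, not our actual pipeline: the `clusters_to_pairs` helper, the cluster ids, and the record ids are all hypothetical, and it assumes every record in a labeled cluster matches every other record in that cluster.

```python
from itertools import combinations


def clusters_to_pairs(clusters):
    """Expand {cluster_id: [record_id, ...]} into (cluster_id, id_a, id_b) rows.

    Hypothetical helper: assumes all records in a cluster match each other.
    """
    pairs = []
    for cluster_id, members in sorted(clusters.items()):
        # Every unordered pair of records within one cluster becomes one row.
        for id_a, id_b in combinations(sorted(members), 2):
            pairs.append((cluster_id, id_a, id_b))
    return pairs


# Hypothetical labels: one cluster with three records, one with two.
example = {"c1": ["r1", "r2", "r3"], "c2": ["r4", "r5"]}
pairs = clusters_to_pairs(example)
for row in pairs:
    print(row)
```

A cluster with n records expands to n(n-1)/2 pairs, so the pairwise form can be much larger than the per-cluster form when clusters are big.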