I'm seeing severe data skew when trying to scale up to 20 million records. I've tried:
1. Scaling up the partitions
2. Scaling down the partitions
3. Scaling up the number of workers
4. Using spark.sql.adaptive.forceOptimizeSkewedJoin
Is this expected? Are there any ideas for what we could do to reduce this skew?
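
For context, here is a minimal pure-Python sketch of why changing the partition count alone might not help. It assumes the skew comes from a single hot key in a hash-partitioned shuffle, which may or may not match the real data. The dataset, the `partition_sizes` and `largest_partition` helpers, and the use of `zlib.crc32` as a stand-in for Spark's actual hash function are all hypothetical and for illustration only.

```python
import zlib
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count rows per partition under simple hash partitioning."""
    sizes = Counter()
    for key in keys:
        # crc32 is a deterministic stand-in for Spark's real hash (assumption).
        sizes[zlib.crc32(key.encode("utf-8")) % num_partitions] += 1
    return sizes


def largest_partition(keys, num_partitions):
    """Return the row count of the most loaded partition."""
    return max(partition_sizes(keys, num_partitions).values())


# Hypothetical data: one hot key holds 9,000 of 10,000 rows.
keys = ["hot"] * 9000 + [f"k{i}" for i in range(1000)]

for n in (8, 64, 512):
    # Every row with the same key lands in the same partition, so the
    # largest partition never drops below the hot key's row count.
    print(n, largest_partition(keys, n))
```

If the real data looks like this, all rows for the hot key end up in one partition no matter how many partitions or workers there are, which would be consistent with attempts 1 to 3 having no effect.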