I believe this is more of a Spark/distributed-processing thing than something specific to Zingg.
If Zingg were to provide sequential IDs, it would have to collect all of the records back to a single process and increment the IDs there. Otherwise, how could 10 independent processes each know what count the others had reached?
Collecting everything back to a single process doesn't scale at large volumes. So in Spark, as in other parallel processing engines, we fall back to a "not-sequential-but-still-unique" strategy, something like Spark's monotonically_increasing_id() function.
I think Zingg uses GraphFrames connected components in its last step, which uses a technique like this.
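
To make the "not-sequential-but-still-unique" idea concrete, here is a rough pure-Python sketch of the bit layout that Spark documents for monotonically_increasing_id(): the partition index goes in the upper 31 bits and the record number within the partition goes in the lower 33 bits. It is not Spark or Zingg code. The helper name `partition_local_ids` and the partition sizes are made up for illustration.

```python
# Sketch only: mimics the ID layout documented for Spark's
# monotonically_increasing_id(). Names and sizes here are hypothetical.
RECORD_BITS = 33  # lower 33 bits hold the record number within a partition


def partition_local_ids(partition_index, num_records):
    """Build IDs using only what a single task knows: its own partition
    index and its own running record count. No coordination needed."""
    base = partition_index << RECORD_BITS
    return [base + offset for offset in range(num_records)]


# Pretend we have 3 partitions holding 2, 3 and 1 records respectively.
partition_sizes = [2, 3, 1]
ids = [
    record_id
    for index, size in enumerate(partition_sizes)
    for record_id in partition_local_ids(index, size)
]
print(ids)
# [0, 1, 8589934592, 8589934593, 8589934594, 17179869184]
```

Each partition gets its own non-overlapping range, so the IDs are unique and increasing, but there are large gaps between partitions. That is why the IDs you see are not 1, 2, 3, ...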