Troubleshooting Match Phase Job Stuck on 1.2M Dataset in Docker with Executor Memory Increased to 16G

·Jun 22, 2026 04:53 AM·

Hi Zingg Team, I am trying to run the match phase on a 1.2M-record dataset on my local machine using Docker, but the job gets stuck after this log line:

26/06/22 04:44:33 WARN InstanceBuilder: Failed to load implementation from:dev.ludovic.netlib.blas.JNIBLAS

I tried increasing the executor memory to 16G while keeping the driver memory at 4G, but it did not help much. The other phases (label and train) have completed. Any help would be appreciated.

26/06/22 04:56:12 INFO Client:
26/06/22 04:56:12 INFO Trainer: Reading inputs for training phase ...
26/06/22 04:56:12 INFO Trainer: Initializing learning similarity rules
26/06/22 04:56:12 WARN PipeUtilReader: Reading Pipe [name=null, format=parquet, preprocessors=null, props={path=/tmp/zingg_dir/app_models/100/trainingData//marked/}]
26/06/22 04:56:14 WARN DSUtil: Read marked training samples
26/06/22 04:56:14 WARN DSUtil: No configured training samples
26/06/22 04:56:15 WARN BlockManager: Block rdd_12_0 already exists on this machine; not re-adding it
26/06/22 04:56:15 WARN Trainer: Training on positive pairs - 23
26/06/22 04:56:15 WARN Trainer: Training on negative pairs - 58
26/06/22 04:56:15 WARN PipeUtilReader: Reading Pipe [name=app, format=csv, preprocessors=null, props={path=/tmp/zingg_dir/nobids_apps.csv, header=false, delimiter=,}]
26/06/22 04:56:19 INFO Heuristics: Block size 100 and total count was 279411
26/06/22 04:56:19 INFO Heuristics: Heuristics suggest 100
26/06/22 04:56:19 INFO BlockingTreeUtil: Learning indexing rules for block size 100
26/06/22 04:56:20 WARN PipeUtilWriter: Writing output Pipe [name=null, format=parquet, preprocessors=null, props={path=/tmp/zingg_dir/app_models/100/model/block/zingg.block}]
26/06/22 04:56:20 WARN TaskSetManager: Stage 26 contains a task of very large size (1193 KiB). The maximum recommended task size is 1000 KiB.
26/06/22 04:56:20 INFO Trainer: Learnt indexing rules and saved output at /tmp/zingg_dir/app_models
26/06/22 04:56:20 INFO ModelUtil: Learning similarity rules
26/06/22 04:56:20 INFO ModelUtil: Start reading internal configurations and functions
26/06/22 04:56:20 INFO ModelUtil: Finished reading internal configurations and functions
26/06/22 04:56:22 WARN InstanceBuilder: Failed to load implementation from:dev.ludovic.netlib.blas.JNIBLAS
26/06/22 04:56:43 INFO Trainer: Learnt similarity rules and saved output at /tmp/zingg_dir/app_models
26/06/22 04:56:43 INFO Trainer: Finished Learning phase
26/06/22 04:56:43 WARN PipeUtilReader: Reading Pipe [name=app, format=csv, preprocessors=null, props={path=/tmp/zingg_dir/nobids_apps.csv, header=false, delimiter=,}]
26/06/22 04:56:47 INFO Matcher: Read 1116587
26/06/22 04:56:47 WARN Blocker: Blocking model location is Pipe [name=null, format=parquet, preprocessors=null, props={path=/tmp/zingg_dir/app_models/100/model/block/zingg.block}]
26/06/22 04:56:47 WARN PipeUtilReader: Reading Pipe [name=null, format=parquet, preprocessors=null, props={path=/tmp/zingg_dir/app_models/100/model/block/zingg.block}]
26/06/22 04:56:47 INFO Matcher: Blocked
26/06/22 04:56:48 INFO SparkModel: threshold while predicting is 0.5
26/06/22 04:56:48 WARN CacheManager: Asked to cache already cached data.
26/06/22 04:56:48 WARN DAGScheduler: Broadcasting large task binary with size 1229.9 KiB
