Hi, I have created a new lakehouse, and while creating it I selected the "Lakehouse schema" option. Here is the environment used:
Fabric Spark Runtime: 1.3
Zingg 0.5.0.jar
Zingg 0.5.0
Tabulate 0.9
I am still getting an error in this section:
options = ClientOptions([ClientOptions.PHASE, "findTrainingData"])
# Zingg execution for the given phase
zingg = ZinggWithSpark(args, options)
print(args)
print(options)
print(zingg)
zingg.initAndExecute()
You have to uncheck that box.
But the link says to not select the Lakehouse Schema option.
In the previous one, I had unchecked the box.
I have created 2 Fabric lakehouses: 1) with the box unchecked, 2) with the box checked. Neither of them is working.
Can you please follow the exact steps as per the link in a new session? 3 of us had this issue, and following the steps helped us. If you still face an issue, maybe there is something else going on which we will have to investigate. So please log an issue with the details on how to reproduce it.
The previous issue is resolved. Now I am getting a different error.
Is it possible to give a demo on identity resolution?
I am sorry, the shared image does not have the error log.
['--phase', 'trainMatch']
arguments for client options are ['--phase', 'trainMatch', '--license', 'zinggLic.txt', '--email', 'zingg@zingg.ai', '--conf', 'dummyConf.json']
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
Cell In[70], line 5
      3 #Zingg execution for the given phase
      4 zingg = ZinggWithSpark(args, options)
----> 5 zingg.initAndExecute()

File ~/cluster-env/trident_env/lib/python3.11/site-packages/zingg/client.py:280, in Zingg.initAndExecute(self)
    278     self.client.execute()
    279 else:
--> 280     self.client.execute()

File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/java_gateway.py:1322, in __call__(self, *args)
   1317 def __init__(self, target_id, gateway_client):
   1318     """
   1319     :param target_id: the identifier of the object on the JVM side. Given
   1320     by the JVM.
   1321
-> 1322     :param gateway_client: the gateway client used to communicate with
   1323     the JVM.
   1324     """
   1325     self._target_id = target_id
   1326     self._gateway_client = gateway_client

File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:179, in capture_sql_exception.<locals>.deco(*a, **kw)
    177 def deco(*a: Any, **kw: Any) -> Any:
    178     try:
--> 179         return f(*a, **kw)
    180     except Py4JJavaError as e:
    181         converted = convert_exception(e.java_exception)

File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--> 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))

Py4JJavaError: An error occurred while calling o9122.execute.
: zingg.common.client.ZinggClientException: Unable to train as insufficient training data found.
Training data has 0 matches and 0 non matches. Please run findTrainingData and label till you have sufficient labelled data to build the models
    at zingg.common.core.executor.Trainer.verifyTraining(Trainer.java:78)
    at zingg.common.core.executor.Trainer.execute(Trainer.java:37)
    at zingg.common.core.executor.TrainMatcher.execute(TrainMatcher.java:33)
    at zingg.common.client.Client.execute(Client.java:281)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.base/java.lang.Thread.run(Thread.java:829)
"Training data has 0 matches and 0 non matches. Please run findTrainingData and label till you have sufficient labelled data to build the models"
You need to label some pairs as match or non match. Please run the findTrainingData cell and then the labelling cell before running trainMatch.
You have accumulated 0 pairs labeled as positive matches. You have accumulated 0 pairs labeled as not matches. If you need more pairs to label, re-run the cell for 'findTrainingData'
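Since the summary above is plain text, a small check can turn it into a decision about which phase to run next. This is only a minimal sketch and not part of Zingg: `parse_label_counts`, `next_phase`, and the `min_per_class` threshold are hypothetical names and values picked for illustration, and the regular expressions assume the summary wording stays exactly as printed above.

```python
import re


def parse_label_counts(summary: str) -> tuple[int, int]:
    """Extract (matches, non_matches) from the labelling summary text."""
    pos = re.search(r"(\d+) pairs labeled as positive matches", summary)
    neg = re.search(r"(\d+) pairs labeled as not matches", summary)
    if pos is None or neg is None:
        raise ValueError("summary text is not in the expected format")
    return int(pos.group(1)), int(neg.group(1))


def next_phase(matches: int, non_matches: int, min_per_class: int = 10) -> str:
    """Return the phase to run next.

    min_per_class is an arbitrary placeholder, not a value taken from Zingg.
    """
    if matches < 0 or non_matches < 0:
        raise ValueError("label counts cannot be negative")
    if matches >= min_per_class and non_matches >= min_per_class:
        return "trainMatch"
    # Not enough labelled pairs yet: go back, find more pairs, and label them.
    return "findTrainingData"


summary = (
    "You have accumulated 0 pairs labeled as positive matches. "
    "You have accumulated 0 pairs labeled as not matches."
)
matches, non_matches = parse_label_counts(summary)
print(next_phase(matches, non_matches))  # prints findTrainingData
```

With 0 matches and 0 non matches, the sketch points back to findTrainingData, which is the same advice the error message gives.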
I am seeing the report: match / no match / uncertain.
It is not a report. There you have to click one of those buttons for each pair of records, to mark the pair as match / no match / uncertain.
I did that. What next?
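To make the labelling step concrete, here is a rough sketch of how one decision per pair adds up to the totals that the labelling cell reports. It is purely illustrative: the `Label` enum and `tally` function are hypothetical and do not come from Zingg, and the idea that "uncertain" pairs add to neither total is an assumption that this thread does not confirm.

```python
from collections import Counter
from enum import Enum


class Label(Enum):
    """The three buttons shown for each pair of records."""
    MATCH = "match"
    NO_MATCH = "no match"
    UNCERTAIN = "uncertain"


def tally(decisions: list[Label]) -> dict[Label, int]:
    """Count how many pairs received each label."""
    counts = Counter(decisions)
    # Counter returns 0 for labels that were never clicked.
    return {label: counts[label] for label in Label}


# One decision per displayed pair of records.
decisions = [Label.MATCH, Label.NO_MATCH, Label.UNCERTAIN, Label.MATCH]
totals = tally(decisions)
print(totals[Label.MATCH], totals[Label.NO_MATCH], totals[Label.UNCERTAIN])
```

Every pair needs its own click, so the totals only grow as more pairs are marked.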