Python commands fail on high concurrency clusters

Problem

You are attempting to run Python commands on a high concurrency cluster.

All Python commands fail with a WARN error message.

WARN PythonDriverWrapper: Failed to start repl ReplId-61bef-9fc33-1f8f6-2
ExitCodeException exitCode=1: chown: invalid user: ‘spark-9fcdf4d2-045d-4f3b-9293-0f’

Cause

Both spark.databricks.pyspark.enableProcessIsolation true and spark.databricks.session.share true are set in the Apache Spark configuration on the cluster.

These two Spark properties conflict with each other and prevent the cluster from running Python commands.
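
As a rough illustration, the conflict can be detected by inspecting the cluster's Spark configuration as key/value pairs. This is a minimal sketch: the two property names come from this article, but the helper name has_conflict and the assumption that the configuration is available as a plain Python dict are hypothetical.

```python
# Property names are taken from this article.
CONFLICTING_KEYS = (
    "spark.databricks.pyspark.enableProcessIsolation",
    "spark.databricks.session.share",
)


def has_conflict(spark_conf):
    """Return True when both properties are set to "true" in spark_conf.

    spark_conf is assumed to be a dict of Spark property names to values
    (hypothetical layout, for illustration only).
    """
    return all(
        str(spark_conf.get(key, "false")).strip().lower() == "true"
        for key in CONFLICTING_KEYS
    )


example_conf = {
    "spark.databricks.pyspark.enableProcessIsolation": "true",
    "spark.databricks.session.share": "true",
}
print(has_conflict(example_conf))  # True
```

A configuration that sets only one of the two properties, or neither, is reported as having no conflict.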

Solution

Only one of these two Spark properties can be enabled on your cluster at a time.

You must choose process isolation or a Spark shared session based on your needs. Disable the other option.
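
The choice can be sketched as a small helper that keeps one property enabled and disables the other. This is an illustration only: the helper name keep_only and the dict representation are hypothetical, and setting the other property to "false" (rather than removing it) is an assumption about what "disable" means here. Applying the result to the cluster's Spark configuration is outside this sketch.

```python
# Property names are taken from this article.
PROCESS_ISOLATION = "spark.databricks.pyspark.enableProcessIsolation"
SESSION_SHARE = "spark.databricks.session.share"


def keep_only(spark_conf, keep):
    """Return a copy of spark_conf with `keep` enabled and the other property disabled.

    Assumption: disabling is expressed by setting the property to "false".
    """
    if keep not in (PROCESS_ISOLATION, SESSION_SHARE):
        raise ValueError(f"unexpected property: {keep}")
    other = SESSION_SHARE if keep == PROCESS_ISOLATION else PROCESS_ISOLATION
    fixed = dict(spark_conf)  # leave the caller's dict untouched
    fixed[keep] = "true"
    fixed[other] = "false"
    return fixed


conflicting = {PROCESS_ISOLATION: "true", SESSION_SHARE: "true"}
# Example: keep process isolation and turn off the shared session.
print(keep_only(conflicting, PROCESS_ISOLATION))
```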