Cluster cancels Python command execution due to library conflict

Problem

The cluster returns Cancelled for every command in a Python notebook. Notebooks in all other languages execute successfully on the same cluster.

Cause

When you install a conflicting version of a library such as ipython, ipywidgets, numpy, scipy, or pandas on the PYTHONPATH, the Python REPL can break, causing all commands to return Cancelled after 30 seconds. This also breaks %sh, the notebook magic command that lets you enter shell scripts in Python notebook cells.
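To see which copy of a library Python would actually pick up, you can ask the import system where each module resolves from. The following is a minimal sketch, not part of the original article. The helper names `module_origin` and `pythonpath_entries` are hypothetical, and it assumes you run it from a Python interpreter that still responds and shares the cluster's environment.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def pythonpath_entries():
    """Return the directories listed in the PYTHONPATH environment variable."""
    raw = os.environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    print("PYTHONPATH:", pythonpath_entries())
    # Import names of the libraries mentioned above (ipython imports as IPython).
    for name in ("IPython", "ipywidgets", "numpy", "scipy", "pandas"):
        print(name, "->", module_origin(name))
```

A module whose origin sits under one of the PYTHONPATH directories, rather than in the runtime's own package directory, is a likely candidate for the conflict.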

Solution

To solve this problem, do the following:

  1. Identify the conflicting library and uninstall it.
  2. Install the correct version of the library in a notebook or with a cluster-scoped init script.

Identify the conflicting library

  1. Uninstall the libraries one at a time, and after each removal check whether the Python REPL still breaks.
  2. If the REPL still breaks, reinstall the library you removed and remove the next one.
  3. When you find the library that causes the REPL to break, install the correct version of that library using one of the two methods below.
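Before uninstalling libraries one by one, it can help to compare installed versions against a cluster that works. The sketch below is an illustration only: it uses the standard-library `importlib.metadata` module (Python 3.8 or later), the helper names are hypothetical, and the version numbers in the usage example are invented.

```python
from importlib import metadata

# Distribution names of the libraries named in the Cause section.
SUSPECTS = ["ipython", "ipywidgets", "numpy", "scipy", "pandas"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(healthy, broken):
    """Return {name: (healthy_version, broken_version)} for every mismatch."""
    names = sorted(set(healthy) | set(broken))
    return {
        name: (healthy.get(name), broken.get(name))
        for name in names
        if healthy.get(name) != broken.get(name)
    }


if __name__ == "__main__":
    print(installed_versions(SUSPECTS))
    # Invented snapshots, for illustration only.
    healthy = {"numpy": "1.19.2", "pandas": "1.1.3"}
    broken = {"numpy": "1.21.0", "pandas": "1.1.3"}
    print(diff_versions(healthy, broken))
```

Libraries that show up in the diff are the ones to uninstall first.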

You can also inspect the driver log (stderr) for the cluster, available from the cluster configuration page, for a stack trace and error message that can help identify the library conflict.
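If you have copied the driver's stderr output into a string or file, a small filter can surface the lines worth reading. This is a rough sketch under stated assumptions: the function name `find_error_lines` is hypothetical, the exception names it looks for are ordinary Python exceptions rather than a documented log format, and the sample log in the usage example is invented.

```python
import re

# Common Python exceptions raised when an incompatible library version is loaded.
ERROR_PATTERN = re.compile(r"\b(ImportError|ModuleNotFoundError|AttributeError)\b")


def find_error_lines(log_text, libraries):
    """Return (line_number, line, mentioned_libraries) for each error line."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if ERROR_PATTERN.search(line):
            mentioned = [lib for lib in libraries if lib.lower() in line.lower()]
            hits.append((number, line.strip(), mentioned))
    return hits


if __name__ == "__main__":
    # Invented sample, not real driver output.
    sample = (
        "Traceback (most recent call last):\n"
        '  File "x.py", line 1, in <module>\n'
        "ImportError: cannot import name 'foo' from 'numpy'\n"
    )
    for hit in find_error_lines(sample, ["ipython", "numpy", "pandas"]):
        print(hit)
```

Treat the matches as hints only; the surrounding stack trace usually shows which package triggered the failure.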

Install the correct library

Do one of the following.

Option 1: Install in a notebook using pip3

    %sh
    sudo apt-get -y install python3-pip
    pip3 install <library-name>

Option 2: Install using a cluster-scoped init script

Follow the steps below to create a cluster-scoped init script that installs the correct version of the library. Replace <library-name> in the examples with the name of the library to install.

  1. If the init script does not already exist, create a base directory to store it:

    dbutils.fs.mkdirs("dbfs:/databricks/<directory>/")
    
  2. Create the following script:

    # Write the init script into the directory created in step 1; True overwrites an existing file.
    dbutils.fs.put("dbfs:/databricks/<directory>/<library-name>.sh", """#!/bin/bash
    sudo apt-get -y install python3-pip
    sudo pip3 install <library-name>
    """, True)
    
  3. Confirm that the script exists:

    display(dbutils.fs.ls("dbfs:/databricks/<directory>/<library-name>.sh"))
    
  4. Go to the cluster configuration page and click the Advanced Options toggle.

  5. At the bottom of the page, click the Init Scripts tab.

  6. In the Destination drop-down, select DBFS, provide the file path to the script, and click Add.

  7. Restart the cluster.
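Steps 1 and 2 above can be wrapped in a small helper so the directory and script path always agree. This is a sketch under stated assumptions, not the article's own method: `build_init_script` and `write_init_script` are hypothetical names, `dbutils` is passed in explicitly so the function can be exercised outside a notebook, and only the `dbutils.fs.mkdirs` and `dbutils.fs.put` calls shown in the steps are used.

```python
def build_init_script(library_spec):
    """Return the text of a bash init script that installs one library with pip3."""
    return "\n".join([
        "#!/bin/bash",  # the shebang must be the first line of the file
        "sudo apt-get -y install python3-pip",
        f"sudo pip3 install {library_spec}",
        "",
    ])


def write_init_script(dbutils, directory, script_name, library_spec):
    """Create the base directory, write the script, and return its DBFS path."""
    base = f"dbfs:/databricks/{directory}/"
    path = f"{base}{script_name}.sh"
    dbutils.fs.mkdirs(base)
    # The third argument overwrites an existing script of the same name.
    dbutils.fs.put(path, build_init_script(library_spec), True)
    return path
```

In a notebook you would call `write_init_script(dbutils, "<directory>", "<library-name>", "<library-name>")` and then paste the returned path into the Init Scripts tab. The `library_spec` argument can also carry a pip version pin such as `<library-name>==<version>` if you need an exact version.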