Libraries

To make third-party or custom code available to notebooks and jobs running on your clusters, you can install a library. Libraries can be written in Python, Java, Scala, and R. You can upload Java, Scala, and Python libraries and point to external packages in PyPI, Maven, and CRAN repositories.

This article focuses on performing library tasks in the workspace UI. You can also manage libraries using the Libraries CLI or the Libraries API.
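As a rough sketch of the API route, the snippet below builds the JSON body for a Libraries API install request. It assumes the request is sent to `POST /api/2.0/libraries/install` and takes `cluster_id` and `libraries` fields; the cluster ID and package names are placeholders, `build_install_request` is a hypothetical helper, and nothing is sent over the network.

```python
import json


def build_install_request(cluster_id, libraries):
    """Build the JSON body for a Libraries API install request (assumed shape)."""
    if not libraries:
        raise ValueError("at least one library is required")
    return {"cluster_id": cluster_id, "libraries": list(libraries)}


# Placeholder values for illustration only.
request_body = build_install_request(
    cluster_id="1234-567890-abcde123",
    libraries=[
        {"pypi": {"package": "simplejson"}},                  # Python package from PyPI
        {"maven": {"coordinates": "org.jsoup:jsoup:1.7.2"}},  # Java/Scala package from Maven
        {"cran": {"package": "ada"}},                         # R package from CRAN
    ],
)

print(json.dumps(request_body, indent=2))
```

Sending the body would additionally need the workspace URL and an access token, which are left out here.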

Tip

Databricks includes many common libraries in Databricks Runtime. To see which libraries are included in Databricks Runtime, look at the System Environment subsection of the Databricks Runtime release notes for your Databricks Runtime version.

Note

Microsoft Support helps isolate and resolve issues related to libraries installed and maintained by Azure Databricks. For third-party components, including libraries, Microsoft provides commercially reasonable support to help you further troubleshoot issues. Microsoft Support assists on a best-effort basis and might be able to resolve the issue. For open source connectors and projects hosted on GitHub, we recommend that you file issues on GitHub and follow up on them. Development efforts such as shading jars or building Python libraries are not supported through the standard support case submission process: they require a consulting engagement for faster resolution. Support might ask you to engage other channels for open-source technologies where you can find deep expertise for that technology. There are several community sites; two examples are the Microsoft Q&A page for Azure Databricks and Stack Overflow.

You can install libraries in three modes: workspace, cluster-installed, and notebook-scoped.

  • Workspace libraries serve as a local repository from which you create cluster-installed libraries. A workspace library might be custom code created by your organization, or might be a particular version of an open-source library that your organization has standardized on.

  • Cluster libraries can be used by all notebooks running on a cluster. You can install a cluster library directly from a public repository such as PyPI or Maven, or create one from a previously installed workspace library.

  • Notebook-scoped Python libraries allow you to install Python libraries and create an environment scoped to a notebook session. Notebook-scoped libraries do not affect other notebooks running on the same cluster. These libraries do not persist and must be re-installed for each session.

    Use notebook-scoped libraries when you need a custom Python environment for a specific notebook. With notebook-scoped libraries, you can create, modify, save, reuse, and share Python environments.

    • Notebook-scoped libraries are available using %pip and %conda magic commands in Databricks Runtime ML 6.4 and above, and using %pip magic commands in Databricks Runtime 7.1 and above. See Notebook-scoped Python libraries.
    • Notebook-scoped libraries are also available using library utilities, although they are incompatible with %pip (%pip is recommended for all new workloads). See Library utilities.
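To make the %pip form concrete, here is a minimal sketch that assembles the text of a notebook cell. The helper `pip_install_cell` is hypothetical, the package names and version are placeholders, and the index URL is an invented example; only the `install` subcommand and the `--index-url` option come from pip itself.

```python
def pip_install_cell(package, index_url=None):
    """Return the text of a notebook cell that runs %pip install."""
    parts = ["%pip", "install"]
    if index_url is not None:
        # --index-url points pip at a package index other than PyPI.
        parts += ["--index-url", index_url]
    parts.append(package)
    return " ".join(parts)


# A package from PyPI, pinned to a version (placeholder version number).
print(pip_install_cell("simplejson==3.17.2"))

# A package from a private index (hypothetical URL).
print(pip_install_cell("my-package", index_url="https://pypi.example.com/simple"))
```

Because notebook-scoped libraries do not persist, a cell like this has to run again in each new session.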

This section covers:

Python environment management

The following table provides an overview of options you can use to install Python libraries in Azure Databricks. Installing notebook-scoped libraries with %conda is similar to %pip, except that you must replace the pip option --index-url with the conda option --channel.

Note

  • Notebook-scoped libraries using magic commands are enabled by default in Databricks Runtime 7.1 and above, Databricks Runtime 7.1 ML and above, and Databricks Runtime 7.1 for Genomics and above. They are also available using a configuration setting in Databricks Runtime 6.4 ML to 7.0 ML and Databricks Runtime 6.4 for Genomics to Databricks Runtime 7.0 for Genomics. See Requirements for details.
  • Notebook-scoped libraries with library utilities are available in Databricks Runtime only. They are not available on Databricks Runtime ML or Databricks Runtime for Genomics.

| Python package source | Notebook-scoped libraries with %pip | Notebook-scoped libraries with library utilities | Cluster libraries | Job libraries with Jobs API |
| --- | --- | --- | --- | --- |
| PyPI | Use %pip install. See example. | Use dbutils.library.installPyPI. | Select PyPI as the source. See documentation. | Add a new pypi object to the job libraries and specify the package field. See documentation. |
| Private PyPI mirror, such as Nexus or Artifactory | Use %pip install with the --index-url option. Secret management is available. See example. | Use dbutils.library.installPyPI and specify the repo argument. | Not supported. | Not supported. |
| VCS, such as GitHub, with raw source | Use %pip install and specify the repository URL as the package name. See example. | Not supported. | Select PyPI as the source and specify the repository URL as the package name. See documentation. | Add a new pypi object to the job libraries and specify the repository URL as the package field. See documentation. |
| Private VCS with raw source | Use %pip install and specify the repository URL with basic authentication as the package name. Secret management is available. See example. | Not supported. | Not supported. | Not supported. |
| DBFS | Use %pip install. See example. | Use dbutils.library.install(dbfs_path). | Select DBFS as the source. See documentation. | Add a new egg or whl object to the job libraries and specify the DBFS path as the package field. See documentation. |
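
The Jobs API column can be sketched as follows. This is an illustration under assumptions, not the definitive request format: it assumes a job's settings carry a `libraries` list, that a `pypi` entry holds a `package` field, and that an `egg` or `whl` entry holds the file path directly as its value. The helper names, repository URL, and DBFS path are all hypothetical.

```python
import json


def pypi_library(package):
    """Job library entry for a PyPI package name or a repository URL."""
    return {"pypi": {"package": package}}


def dbfs_library(path):
    """Job library entry for a wheel or egg file, chosen by file extension."""
    if path.endswith(".whl"):
        return {"whl": path}
    if path.endswith(".egg"):
        return {"egg": path}
    raise ValueError("expected a .whl or .egg file: " + path)


# One entry per supported row of the table (placeholder names and paths).
libraries = [
    pypi_library("simplejson"),                                           # PyPI
    pypi_library("git+https://github.com/example-org/example-repo.git"),  # VCS with raw source
    dbfs_library("dbfs:/libraries/my_package-0.1-py3-none-any.whl"),      # DBFS
]

print(json.dumps({"libraries": libraries}, indent=2))
```

Private PyPI mirrors and private VCS repositories are left out because the table marks them as not supported for job libraries.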