Manage notebooks

You can manage notebooks using the UI, the CLI, and by invoking the Workspace API. This article focuses on performing notebook tasks using the UI. For the other methods, see Databricks CLI and Workspace API.
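As a minimal sketch of the API route, the helper below builds an authenticated request against the Workspace API's list endpoint. The host, token, and user path are placeholders, and the endpoint path and bearer-token header reflect the Workspace API as commonly documented rather than anything stated in this article.

```python
# Sketch: list workspace objects through the Workspace API.
# DATABRICKS_HOST / DATABRICKS_TOKEN are assumed environment variables;
# the fallback values below are placeholders, not real endpoints.
import os
import urllib.request


def build_list_request(host: str, token: str, path: str) -> urllib.request.Request:
    """Build an authenticated GET request for /api/2.0/workspace/list."""
    url = f"{host}/api/2.0/workspace/list?path={path}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


req = build_list_request(
    os.environ.get("DATABRICKS_HOST", "https://adb-example.azuredatabricks.net"),
    os.environ.get("DATABRICKS_TOKEN", "dapi-placeholder"),
    "/Users/someone@example.com",
)
print(req.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) would return a JSON listing of notebooks and folders under the given path.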

Create a notebook

  1. Click the Workspace button Workspace Icon or the Home button Home Icon in the sidebar. Do one of the following:
    • Next to any folder, click the Menu Dropdown on the right side of the text and select Create > Notebook.

      Create notebook

    • In the Workspace or a user folder, click Down Caret and select Create > Notebook.

  2. In the Create Notebook dialog, enter a name and select the notebook's default language.
  3. If there are running clusters, the Cluster drop-down displays. Select the cluster you want to attach the notebook to.
  4. Click Create.

Open a notebook

In your workspace, click a Notebook Icon. The notebook path displays when you hover over the notebook title.

Delete a notebook

See Folders and Workspace object operations for information about how to access the workspace menu and delete notebooks or other items in the workspace.

Copy notebook path

To copy a notebook file path without opening the notebook, right-click the notebook name or click the Menu Dropdown to the right of the notebook name and select Copy File Path.

Copy notebook path

Rename a notebook

To change the title of an open notebook, click the title and edit inline, or click File > Rename.

Control access to a notebook

If your Azure Databricks account has the Azure Databricks Premium Plan, you can use workspace access control to control who has access to a notebook.

Notebook external formats

Azure Databricks supports several notebook external formats:

  • Source file: A file containing only source code statements with the extension .scala, .py, .sql, or .r.
  • HTML: An Azure Databricks notebook with the extension .html.
  • DBC archive: A Databricks archive.
  • IPython notebook: A Jupyter notebook with the extension .ipynb.
  • RMarkdown: An R Markdown document with the extension .Rmd.
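The format/extension pairs above can be collected into a small lookup helper. The uppercase identifiers below are an assumption about how these formats might be named programmatically (they are not taken from this article); the extensions come directly from the list.

```python
# Hypothetical format identifiers mapped to the extensions listed above.
NOTEBOOK_FORMATS = {
    "SOURCE": (".scala", ".py", ".sql", ".r"),
    "HTML": (".html",),
    "DBC": (".dbc",),
    "JUPYTER": (".ipynb",),
    "RMARKDOWN": (".Rmd",),
}


def formats_for_extension(ext: str) -> list:
    """Return the external format names that use the given file extension."""
    return [name for name, exts in NOTEBOOK_FORMATS.items() if ext in exts]


print(formats_for_extension(".py"))  # ['SOURCE']
```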

In this section:

Import a notebook

You can import an external notebook from a URL or a file.

  1. Click the Workspace button Workspace Icon or the Home button Home Icon in the sidebar. Do one of the following:

    • Next to any folder, click the Menu Dropdown on the right side of the text and select Import.

    • In the Workspace or a user folder, click Down Caret and select Import.

      Import notebook

  2. Specify the URL or browse to a file containing a supported external format.

  3. Click Import.
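The same import can be done programmatically. The sketch below builds the JSON body for a Workspace API notebook import, base64-encoding the source as that API expects; the target path and source string are placeholders, and the exact field names are an assumption based on common Workspace API documentation rather than this article.

```python
# Sketch: build the body for POST /api/2.0/workspace/import.
# Field names ("path", "format", "language", "content", "overwrite")
# are assumptions for illustration.
import base64


def build_import_payload(target_path: str, source: str, language: str) -> dict:
    """Build a JSON-serializable body for a notebook import as SOURCE format."""
    return {
        "path": target_path,
        "format": "SOURCE",
        "language": language,
        "content": base64.b64encode(source.encode("utf-8")).decode("ascii"),
        "overwrite": False,
    }


payload = build_import_payload(
    "/Users/someone@example.com/demo", "print('hello')", "PYTHON"
)
print(payload["path"])
```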

Export a notebook

In the notebook toolbar, select File > Export and a format.

Note

When you export a notebook as HTML, IPython notebook, or archive (DBC), and you have not cleared the results, the results of running the notebook are included.

Notebooks and clusters

Before you can do any work in a notebook, you must first attach the notebook to a cluster. This section describes how to attach and detach notebooks to and from clusters, and what happens behind the scenes when you perform these actions.

In this section:

Execution contexts

When you attach a notebook to a cluster, Azure Databricks creates an execution context. An execution context contains the state for a REPL environment for each supported programming language: Python, R, Scala, and SQL. When you run a cell in a notebook, the command is dispatched to the appropriate language REPL environment and run.

You can also use the REST 1.2 API to create an execution context and send a command to run in the execution context. Similarly, the command is dispatched to the language REPL environment and run.
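The REST 1.2 flow described above has two steps: create an execution context, then execute a command in it. The helpers below sketch the request bodies for those two calls; the endpoint paths in the comments and the field names (`clusterId`, `contextId`, `language`, `command`) reflect the 1.2 API as commonly documented and should be treated as assumptions here.

```python
# Sketch of the two request bodies used by the REST 1.2 API flow.


def context_create_body(cluster_id: str, language: str) -> dict:
    """Body for POST /api/1.2/contexts/create (assumed endpoint)."""
    return {"clusterId": cluster_id, "language": language}


def command_execute_body(
    cluster_id: str, context_id: str, language: str, command: str
) -> dict:
    """Body for POST /api/1.2/commands/execute (assumed endpoint)."""
    return {
        "clusterId": cluster_id,
        "contextId": context_id,
        "language": language,
        "command": command,
    }


# The contextId would come from the response to the first call.
body = command_execute_body("0123-456789-example", "ctx-1", "python", "1 + 1")
print(body["command"])
```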

A cluster has a maximum number of execution contexts (145). Once the number of execution contexts has reached this threshold, you cannot attach a notebook to the cluster or create a new execution context.

Idle execution contexts

An execution context is considered idle when its last completed execution occurred past a set idle threshold. The last completed execution is the last time the notebook finished running commands. The idle threshold is the amount of time that must pass between the last completed execution and any attempt to automatically detach the notebook. The default idle threshold is 24 hours.

When a cluster has reached the maximum context limit, Azure Databricks removes (evicts) idle execution contexts as needed, starting with the least recently used. Even when a context is removed, the notebook using the context is still attached to the cluster and appears in the cluster's notebook list. Streaming notebooks are considered actively running, and their context is never evicted until their execution has been stopped. If an idle context is evicted, the UI displays a message indicating that the notebook using the context was detached due to being idle.

Notebook context evicted

If you attempt to attach a notebook to a cluster that has the maximum number of execution contexts and there are no idle contexts (or if auto-eviction is disabled), the UI displays a message saying that the current maximum execution contexts threshold has been reached and the notebook will remain in the detached state.

Notebook detached

If you fork a process, an idle execution context is still considered idle once execution of the request that forked the process returns. Forking separate processes is not recommended with Spark.

Configure context auto-eviction

You can configure context auto-eviction by setting the Spark property spark.databricks.chauffeur.enableIdleContextTracking.

  • In Databricks 5.0 and above, auto-eviction is enabled by default. You disable auto-eviction for a cluster by setting spark.databricks.chauffeur.enableIdleContextTracking false.
  • In Databricks 4.3, auto-eviction is disabled by default. You enable auto-eviction for a cluster by setting spark.databricks.chauffeur.enableIdleContextTracking true.
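For example, to disable auto-eviction on Databricks 5.0 and above, add the following line to the cluster's Spark configuration:

```
spark.databricks.chauffeur.enableIdleContextTracking false
```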

Attach a notebook to a cluster

To attach a notebook to a cluster:

  1. In the notebook toolbar, click Clusters Icon Detached Cluster Dropdown.
  2. From the drop-down, select a cluster.

Important

An attached notebook has the following Apache Spark variables defined.

Class                        Variable Name
SparkContext                 sc
SQLContext/HiveContext       sqlContext
SparkSession (Spark 2.x)     spark

Do not create a SparkSession, SparkContext, or SQLContext. Doing so will lead to inconsistent behavior.

Determine Spark and Databricks Runtime version

To determine the Spark version of the cluster your notebook is attached to, run:

spark.version

To determine the Databricks Runtime version of the cluster your notebook is attached to, run:

Scala
dbutils.notebook.getContext.tags("sparkVersion")
Python
spark.conf.get("spark.databricks.clusterUsageTags.sparkVersion")

Note

Both this sparkVersion tag and the spark_version property required by the endpoints in the Clusters API and Jobs API refer to the Databricks Runtime version, not the Spark version.

Detach a notebook from a cluster

  1. In the notebook toolbar, click Clusters Icon Attached Cluster Dropdown.

  2. Select Detach.

    Detach notebook

You can also detach notebooks from a cluster using the Notebooks tab on the cluster details page.

When you detach a notebook from a cluster, the execution context is removed and all computed variable values are cleared from the notebook.

Tip

Azure Databricks recommends that you detach unused notebooks from a cluster. This frees up memory space on the driver.

View all notebooks attached to a cluster

The Notebooks tab on the cluster details page displays all of the notebooks that are attached to the cluster. The tab also displays the status of each attached notebook, along with the last time a command was run from the notebook.

Cluster details attached notebooks

Schedule a notebook

To schedule a notebook job to run periodically:

  1. In the notebook toolbar, click the Schedule Icon button at the top right.
  2. Click + New.
  3. Choose the schedule.
  4. Click OK.
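A schedule created this way corresponds roughly to a job definition with a cron expression. The sketch below shows what such a definition might look like as a JSON body; the field names (`notebook_task`, `quartz_cron_expression`, `timezone_id`) are assumptions modeled on the Jobs API, not something stated in this article, and the path and schedule are placeholders.

```python
# Sketch: a scheduled notebook job definition as a JSON-serializable dict.
# All field names here are illustrative assumptions.


def notebook_schedule_payload(notebook_path: str, cron: str, timezone: str) -> dict:
    """Build a hypothetical scheduled-job definition for a notebook."""
    return {
        "name": f"Scheduled run of {notebook_path}",
        "notebook_task": {"notebook_path": notebook_path},
        "schedule": {
            "quartz_cron_expression": cron,  # Quartz syntax, e.g. daily at 09:00
            "timezone_id": timezone,
        },
    }


payload = notebook_schedule_payload(
    "/Users/someone@example.com/demo", "0 0 9 * * ?", "UTC"
)
print(payload["schedule"]["quartz_cron_expression"])
```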

Distribute notebooks

To allow you to easily distribute Azure Databricks notebooks, Azure Databricks supports the Databricks archive, a package that can contain a folder of notebooks or a single notebook. A Databricks archive is a JAR file with extra metadata and has the extension .dbc. The notebooks contained in the archive are in an Azure Databricks internal format.

Import an archive

  1. Click the Down Caret or Menu Dropdown to the right of a folder or notebook and select Import.
  2. Choose File or URL.
  3. Go to or drop a Databricks archive in the dropzone.
  4. Click Import. The archive is imported into Azure Databricks. If the archive contains a folder, Azure Databricks recreates that folder.

Export an archive

Click the Down Caret or Menu Dropdown to the right of a folder or notebook and select Export > DBC Archive. Azure Databricks downloads a file named <[folder|notebook]-name>.dbc.