April 2019

These features and Azure Databricks platform improvements were released in April 2019.

Note

Releases are staged. Your Azure Databricks account may not be updated until up to a week after the initial release date.

MLflow on Azure Databricks (GA)

April 25, 2019

Managed MLflow on Azure Databricks is now generally available. MLflow on Azure Databricks offers a hosted version of MLflow fully integrated with the Databricks security model and interactive workspace. See MLflow guide.

Delta Lake on Azure Databricks

April 24, 2019

Databricks has open sourced the Delta Lake project. Delta Lake is a storage layer that brings reliability to data lakes built on HDFS and cloud storage. It provides ACID transactions through optimistic concurrency control between writes, and snapshot isolation so that reads stay consistent while writes are in progress. Delta Lake also provides built-in data versioning for easy rollbacks and reproducing reports.
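To make the terms above concrete, here is a minimal toy sketch in plain Python of optimistic concurrency control, snapshot reads, and versioned rollback. It is not Delta Lake's implementation or API: `ToyVersionedTable`, `ConflictError`, and every method name are hypothetical, and the sketch rejects any commit based on a stale version, which is an assumption made for brevity and is likely stricter than a real conflict check.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class ConflictError(Exception):
    """Raised when another writer has committed since this writer read."""


@dataclass
class ToyVersionedTable:
    """Hypothetical in-memory table; each list entry is one committed version."""

    versions: List[Dict[str, int]] = field(default_factory=lambda: [{}])

    @property
    def latest_version(self) -> int:
        return len(self.versions) - 1

    def snapshot(self, version: Optional[int] = None) -> Dict[str, int]:
        """Return a copy of the rows as of `version` (latest if omitted)."""
        if version is None:
            version = self.latest_version
        return dict(self.versions[version])

    def commit(self, read_version: int, updates: Dict[str, int]) -> int:
        """Commit only if no other commit landed after `read_version`."""
        if read_version != self.latest_version:
            raise ConflictError(
                f"read version {read_version}, latest is {self.latest_version}"
            )
        new_rows = dict(self.versions[read_version])
        new_rows.update(updates)
        self.versions.append(new_rows)  # earlier versions are never mutated
        return self.latest_version


table = ToyVersionedTable()
stale_version = table.latest_version        # a slow writer reads version 0
table.commit(0, {"a": 1})                   # another writer commits version 1
reader_view = table.snapshot()              # a reader pins version 1
table.commit(1, {"a": 2})                   # version 2; reader_view is unaffected

try:
    table.commit(stale_version, {"b": 3})   # slow writer is rejected and must retry
    conflict = False
except ConflictError:
    conflict = True

rolled_back = table.snapshot(version=1)     # read an older version for a rollback
```

The reader's copy stays consistent while later writes land, the stale writer is forced to retry instead of silently overwriting, and older versions remain readable, which is the behavior the paragraph above describes at a much larger scale.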

Note

What was previously called Databricks Delta is now the Delta Lake open source project plus optimizations available on Azure Databricks. See Delta Lake and Delta Engine guide.

MLflow runs sidebar

April 9 - 16, 2019: Version 2.95

You can now view the MLflow runs and the notebook revisions that produced these runs in a sidebar next to your notebook.

MLflow runs in notebook sidebar

See Notebook experiments.

Access Azure Data Lake Storage Gen1 and Gen2 automatically with your Azure AD credentials (GA)

April 9 - 16, 2019: Version 2.95

We are pleased to announce the general availability of automatic authentication to Azure Data Lake Storage Gen1 and Gen2 from Azure Databricks clusters using the same Azure Active Directory (Azure AD) identity that you use to log in to Azure Databricks.

Simply enable your cluster for Azure AD credential passthrough, and commands that you run on that cluster will be able to read and write your data in Azure Data Lake Storage Gen1 and Gen2 without requiring you to configure service principal credentials for access to storage.

For more information, see Secure access to Azure Data Lake Storage using Azure Active Directory credential passthrough.
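As a rough illustration of what "without configuring service principal credentials" looks like in a notebook, the sketch below builds the storage URIs that a read or write command would reference. The helper names `adls_gen2_uri` and `adls_gen1_uri`, and the container, account, and path values, are hypothetical and introduced only for this example; the commented-out `spark.read` line assumes a notebook attached to a cluster with credential passthrough enabled.

```python
def adls_gen2_uri(container: str, account: str, path: str = "") -> str:
    """Build an abfss:// URI for Azure Data Lake Storage Gen2."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def adls_gen1_uri(account: str, path: str = "") -> str:
    """Build an adl:// URI for Azure Data Lake Storage Gen1."""
    return f"adl://{account}.azuredatalakestore.net/{path.lstrip('/')}"


# Hypothetical container, account, and file names.
gen2_path = adls_gen2_uri("raw", "examplestore", "/sales/2019.csv")
gen1_path = adls_gen1_uri("examplelake", "/sales/2019.csv")

# On a passthrough-enabled cluster, the read would carry no storage keys or
# service principal settings; access is checked against your Azure AD identity:
# df = spark.read.csv(gen2_path, header=True)
```

Note that the code contains no secrets or Spark configuration for storage access; with passthrough, whether the read succeeds depends on the permissions your Azure AD identity has on the storage account.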

Databricks Runtime 5.3 (GA)

April 3, 2019

Databricks Runtime 5.3 is now generally available. Databricks Runtime 5.3 includes new Delta Lake features and upgrades, and upgraded Python, R, Java, and Scala libraries.

Major upgrades include:

  • Databricks Delta time travel GA
  • MySQL table replication to Delta, Public Preview
  • Optimized DBFS FUSE folder for deep learning workloads
  • Notebook-scoped library improvements
  • New Databricks Advisor hints

For details, see Databricks Runtime 5.3 (Unsupported).

Databricks Runtime 5.3 ML (GA)

April 3, 2019

With Databricks Runtime 5.3 for Machine Learning, we have achieved our first GA of Databricks Runtime ML! Databricks Runtime ML provides a ready-to-go environment for machine learning and data science. It builds on Databricks Runtime and adds many popular machine learning libraries, including TensorFlow, PyTorch, Keras, and XGBoost. It also supports distributed training using Horovod.

This version is built on Databricks Runtime 5.3, with additional libraries, some different library versions, and Conda package management for Python libraries. Major new features since Databricks Runtime 5.2 ML Beta include:

  • MLlib integration with MLflow (Private Preview), which provides automatic logging of MLflow runs for models fit using the PySpark tuning algorithms CrossValidator and TrainValidationSplit.

    If you want to participate in the preview, contact your Databricks account representative.

  • Upgrades to the PyArrow, Horovod, and TensorboardX libraries.

    The PyArrow update adds the ability to use BinaryType when you perform Arrow-based conversion and makes it available in pandas UDFs.

For more information, see Databricks Runtime 5.3 ML (Unsupported). For instructions on creating a Databricks Runtime ML cluster, see Databricks Runtime for Machine Learning.