June 2019

These features and Azure Databricks platform improvements were released in June 2019.

Note

Releases are staged. Your Azure Databricks account may not be updated until up to a week after the initial release date.

Lsv2 instance support is generally available

June 24-26, 2019: Version 2.100

Azure Databricks now provides full support for the Lsv2 VM series for high-throughput and high-IOPS workloads.

RStudio integration no longer limited to high concurrency clusters

June 6-11, 2019: Version 2.99

You can now enable RStudio Server on standard clusters in Azure Databricks, in addition to the high concurrency clusters that were already supported. Regardless of cluster mode, RStudio Server integration still requires that you disable the automatic termination option for your cluster. See RStudio on Azure Databricks.

MLflow 1.0

June 3, 2019

MLflow is an open source platform for managing the complete machine learning lifecycle. With MLflow, data scientists can track and share experiments locally or in the cloud, package and share models across frameworks, and deploy models virtually anywhere.

We are excited to announce the release of MLflow 1.0 today. The 1.0 release not only marks the maturity and stability of the APIs, but also adds a number of frequently requested features and improvements:

  • The CLI was reorganized and now has dedicated commands for artifacts, models, db (the tracking database), and server (the tracking server).
  • Tracking server search supports a simplified version of the SQL WHERE clause. In addition to run metrics and params, search now also supports some run attributes and user and system tags.
  • Adds support for x coordinates in the Tracking API. The MLflow UI visualization components now also support plotting metrics against provided x-coordinate values.
  • Adds a runs/log-batch REST API endpoint as well as Python, R, and Java methods for logging multiple metrics, parameters, and tags with a single API request.
  • For tracking, the MLflow 1.0 client is now supported on Windows.
  • Adds support for HDFS as an artifact store backend.
  • Adds a command to build a Docker container whose default entry point serves the specified MLflow Python function model at port 8080 within the container.
  • Adds an experimental ONNX model flavor.

You can view the full list of changes in the MLflow Change log.
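To make the batch-logging and search items more concrete, here is a minimal sketch that only builds the JSON request bodies, without contacting a tracking server. The helper names (build_log_batch_payload, build_search_payload), the run ID, and the metric, parameter, and tag names are hypothetical. The field names (run_id, metrics, params, tags, step, experiment_ids, filter) are assumed to follow the MLflow REST API and should be checked against the MLflow documentation for your version.

```python
import json
import time


def build_log_batch_payload(run_id, metrics, params, tags, step=0):
    """Build the JSON body for one runs/log-batch request (assumed field names)."""
    now_ms = int(time.time() * 1000)  # metric timestamps in milliseconds
    return {
        "run_id": run_id,
        # "step" carries the x coordinate that metrics can be plotted against.
        "metrics": [
            {"key": k, "value": float(v), "timestamp": now_ms, "step": step}
            for k, v in metrics.items()
        ],
        "params": [{"key": k, "value": str(v)} for k, v in params.items()],
        "tags": [{"key": k, "value": str(v)} for k, v in tags.items()],
    }


def build_search_payload(experiment_ids, filter_string):
    """Build the JSON body for a run search using a WHERE-style filter."""
    return {
        "experiment_ids": [str(e) for e in experiment_ids],
        "filter": filter_string,
    }


# Hypothetical values, for illustration only.
payload = build_log_batch_payload(
    run_id="example-run-id",
    metrics={"rmse": 0.82, "mae": 0.61},
    params={"max_depth": 5},
    tags={"team": "forecasting"},
    step=3,
)
search = build_search_payload(
    experiment_ids=[1],
    filter_string="metrics.rmse < 1 and params.max_depth = '5'",
)

print(json.dumps(payload, indent=2))
print(json.dumps(search, indent=2))
```

Sending all metrics, parameters, and tags in one body is what replaces the separate request per value that earlier clients needed. The filter string shows the simplified WHERE syntax applied to a metric and a parameter.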

Databricks Runtime 5.4 with Conda (Beta)

June 3, 2019

Important

Databricks Runtime with Conda is in Beta. The contents of the supported environments may change in upcoming Beta releases. Changes can include the list of packages or the versions of installed packages. Databricks Runtime 5.4 with Conda is built on top of Databricks Runtime 5.4 (Unsupported).

We're excited to introduce Databricks Runtime 5.4 with Conda, which lets you take advantage of Conda to manage Python libraries and environments. This runtime offers two root Conda environment options at cluster creation:

  • The Databricks Standard environment includes updated versions of many popular Python packages. This environment is intended as a drop-in replacement for existing notebooks that run on Databricks Runtime. This is the default Databricks Conda-based runtime environment.
  • The Databricks Minimal environment contains the minimum packages required for PySpark and Databricks Python notebook functionality. This environment is ideal if you want to customize the runtime with various Python packages.

See the complete release notes at Databricks Runtime 5.4 with Conda (Beta).

Databricks Runtime 5.4 for Machine Learning

June 3, 2019

Databricks Runtime 5.4 ML is built on top of Databricks Runtime 5.4 (Unsupported). It contains many popular machine learning libraries, including TensorFlow, PyTorch, Keras, and XGBoost, and provides distributed TensorFlow training using Horovod.

It includes the following new features:

  • MLlib integration with MLflow (Public Preview).
  • Hyperopt pre-installed, with the new SparkTrials class (Public Preview).
  • HorovodRunner output sent from Horovod to the Spark driver node is now visible in notebook cells.
  • XGBoost Python package pre-installed.

For details, see Databricks Runtime 5.4 for Machine Learning (Unsupported).

Databricks Runtime 5.4

June 3, 2019

Databricks Runtime 5.4 is now available. Databricks Runtime 5.4 includes Apache Spark 2.4.2, upgraded Python, R, Java, and Scala libraries, and the following new features:

  • Delta Lake on Databricks adds Auto Optimize (Public Preview)
  • Use your favorite IDE and notebook server with Databricks Connect
  • Library utilities generally available
  • Binary file data source

For details, see Databricks Runtime 5.4 (Unsupported).