What is Azure Databricks?

Azure Databricks is a data analytics platform optimized for the Microsoft Azure cloud services platform. It offers two environments for developing data-intensive applications: Azure Databricks SQL Analytics and Azure Databricks Workspace.

Azure Databricks SQL Analytics provides an easy-to-use platform for analysts who want to run SQL queries on their data lake, create multiple visualization types to explore query results from different perspectives, and build and share dashboards.

Azure Databricks Workspace provides an interactive workspace that enables collaboration between data engineers, data scientists, and machine learning engineers. In a big data pipeline, the data (raw or structured) is ingested into Azure in batches through Azure Data Factory, or streamed in near real time using Apache Kafka, Event Hubs, or IoT Hub. This data lands in a data lake for long-term persisted storage, in Azure Blob Storage or Azure Data Lake Storage. As part of your analytics workflow, use Azure Databricks to read data from multiple data sources and turn it into breakthrough insights using Spark.
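As a rough illustration of the last step, the following Python sketch reads landed data from Azure Data Lake Storage with Spark and aggregates it. It is a minimal example under stated assumptions, not a prescribed pattern: the container name, storage account name, folder path, and the `device_id` column are all hypothetical, and the workspace is assumed to already have access to the storage account.

```python
def adls_uri(container: str, account: str, path: str) -> str:
    """Build an abfss:// URI for a path in Azure Data Lake Storage Gen2.

    The container, account, and path values are supplied by the caller;
    the ones used in the usage note are hypothetical placeholders.
    """
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"


def count_events_per_device(spark, source_uri: str):
    """Read landed JSON events and count them per device.

    `spark` is a SparkSession (predefined in Databricks notebooks).
    Assumes each JSON record carries a `device_id` field (hypothetical).
    Returns a Spark DataFrame with columns `device_id` and `count`.
    """
    events = spark.read.json(source_uri)
    return events.groupBy("device_id").count()
```

In a notebook, where a `spark` session is already available, this might be called as `count_events_per_device(spark, adls_uri("raw", "examplelake", "events/")).show()`. Keeping the URI construction separate from the Spark read makes the storage location easy to change without touching the transformation logic.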

To select an environment, launch an Azure Databricks workspace and click the app switcher icon at the bottom of the sidebar.

Next steps