Big data analytics with Azure Data Explorer

Data Explorer
Event Hubs
IoT Hub
Storage
Synapse Analytics

Solution Idea

If you'd like to see us expand this article with more information, implementation details, pricing guidance, or code examples, let us know with GitHub Feedback!

This solution idea demonstrates big data analytics over large volumes of high velocity data from various sources. This solution illustrates how Azure Data Explorer and Azure Synapse Analytics complement each other for near real-time analytics and modern data warehousing use cases.

This solution is already being used by Microsoft customers. For example, the Singapore-based ride-hailing company, Grab, implemented real-time analytics over a huge amount of data collected from their taxi and food delivery services as well as merchant partner apps. The team from Grab presented their solution at MS Ignite in this video (20:30 onwards). Using this pattern, Grab processed more than a trillion events per day.

Big data analytics with Azure Data Explorer

Data flow

  1. Raw structured, semi-structured, and unstructured (free text) data such as any type of logs, business events, and user activities can be ingested into Azure Data Explorer from various sources.
  2. Ingest data into Azure Data Explorer with low-latency and high throughput using its connectors for Azure Data Factory, Azure Event Hub, Azure IoT Hub, Kafka, and so on. Alternatively, ingest data through Azure Storage (Blob or ADLS Gen2), which uses Azure Event Grid and triggers the ingestion pipeline to Azure Data Explorer. You can also continuously export data to Azure Storage in compressed, partitioned parquet format and seamlessly query that data as detailed in the Continuous data export overview.
  3. Export pre-aggregated data from Azure Data Explorer to Azure Storage, and then ingest the data into Synapse Analytics to build data models and reports.
  4. Use Azure Data Explorer’s native capabilities to process, aggregate, and analyze data. To get insights at a lightning speed, build near real-time analytics dashboards using Azure Data Explorer dashboards, Power BI, Grafana, or other tools. Use Azure Synapse Analytics to build a modern data warehouse and combine it with the Azure Data Explorer data to generate BI reports on curated and aggregated data models.
  5. Azure Data Explorer provides native advanced analytics capabilities for time series analysis, pattern recognition, anomaly detection and forecasting, and machine learning. Azure Data Explorer is also well integrated with ML services such as Databricks and Azure Machine Learning. This integration allows you to build models using other tools and services and export ML models to Azure Data Explorer for scoring data.

Components

  • Azure Event Hub: Fully managed, real-time data ingestion service that’s simple, trusted, and scalable.
  • Azure IoT Hub: Managed service to enable bi-directional communication between IoT devices and Azure.
  • Kafka on HDInsight: Easy, cost-effective, enterprise-grade service for open source analytics with Apache Kafka.
  • Azure Data Explorer: Fast, fully managed and highly scalable data analytics service for real-time analysis on large volumes of data streaming from applications, websites, IoT devices, and more.
  • Azure Data Explorer Dashboards: Natively export Kusto queries that were explored in the Web UI to optimized dashboards.
  • Azure Synapse Analytics: Analytics service that brings together enterprise data warehousing and Big Data analytics.

Next steps

For more information, see Azure Data Explorer documentation.