Advanced analytics architecture

Azure Analysis Services
Azure Blob Storage
Azure Cosmos DB
Azure Synapse Analytics
Power BI

Solution ideas

This article is a solution idea. If you'd like us to expand the content with more information, such as potential use cases, alternative services, implementation considerations, or pricing guidance, let us know by providing GitHub feedback.

This architecture allows you to combine any data at any scale with custom machine learning and get near real-time data analytics on streaming services.

Architecture

Diagram of an advanced analytics architecture using Azure Synapse Analytics with Azure Data Lake Storage, Azure Analysis Services, Azure Cosmos DB, and Power BI.

Download a Visio file of this architecture.

Dataflow

  1. Use Synapse Pipelines to bring together all your structured, unstructured, and semi-structured data (logs, files, and media) in Azure Data Lake Storage.
  2. Use Apache Spark pools to clean and transform the unstructured datasets and combine them with structured data from operational databases or data warehouses.
  3. Use scalable machine learning and deep learning techniques to derive deeper insights from this data using Python, Scala, or .NET, with notebook experiences in Apache Spark pools.
  4. Use Apache Spark pools and Synapse Pipelines in Azure Synapse Analytics to access and move data at scale.
  5. Query and report on data in Power BI.
  6. Take the insights from Apache Spark pools to Azure Cosmos DB to make them accessible through web and mobile apps.
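
The clean, combine, and publish steps above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the Spark code you would run in a Synapse notebook, so that the logic is easy to follow outside a Spark pool. The log fields (`customer_id`, `event`, `ms`), the customer table, and the function names `clean_logs` and `combine` are all hypothetical. The output is shaped as JSON-style documents with an `id` field, which Azure Cosmos DB requires on every item.

```python
import json
from collections import defaultdict

# Hypothetical inputs: semi-structured log lines and a structured customer table.
RAW_LOG_LINES = [
    '{"customer_id": "c1", "event": "view", "ms": 120}',
    '{"customer_id": "c2", "event": "purchase", "ms": 340}',
    'not valid json',                              # malformed, dropped during cleaning
    '{"customer_id": "c1", "event": "purchase"}',  # missing "ms", defaulted to 0
    '{"event": "view", "ms": 50}',                 # missing customer_id, dropped
]
CUSTOMERS = [
    {"customer_id": "c1", "segment": "retail"},
    {"customer_id": "c2", "segment": "enterprise"},
]


def clean_logs(lines):
    """Parse JSON log lines; drop malformed rows and rows without a customer_id."""
    cleaned = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict) or "customer_id" not in record:
            continue
        record.setdefault("ms", 0)
        cleaned.append(record)
    return cleaned


def combine(events, customers):
    """Inner-join cleaned events with customer rows and aggregate per customer."""
    segments = {c["customer_id"]: c["segment"] for c in customers}
    totals = defaultdict(lambda: {"events": 0, "purchases": 0, "total_ms": 0})
    for event in events:
        customer_id = event["customer_id"]
        if customer_id not in segments:
            continue  # no matching structured row
        stats = totals[customer_id]
        stats["events"] += 1
        if event.get("event") == "purchase":
            stats["purchases"] += 1
        stats["total_ms"] += event["ms"]
    # One document per customer, keyed by "id".
    return [
        {"id": customer_id, "segment": segments[customer_id], **stats}
        for customer_id, stats in sorted(totals.items())
    ]


documents = combine(clean_logs(RAW_LOG_LINES), CUSTOMERS)
for doc in documents:
    print(json.dumps(doc))
```

On a Spark pool, the same shape of logic would typically be expressed with DataFrame reads, filters, joins, and aggregations so that it scales beyond a single machine.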

Components

  • Azure Synapse Analytics is a fast, flexible, and trusted cloud data warehouse that lets you scale compute and storage elastically and independently, with a massively parallel processing architecture.
  • Synapse Pipelines allows you to create, schedule, and orchestrate your ETL/ELT workflows.
  • Azure Blob Storage is massively scalable, cost-effective object storage for any type of unstructured data, such as images, videos, audio, and documents.
  • Apache Spark pools in Azure Synapse Analytics provide a fast, easy, and collaborative Apache Spark-based analytics platform.
  • Azure Cosmos DB is a globally distributed, multi-model database service. You can replicate your data across any number of Azure regions and scale throughput independently of storage.
  • Azure Synapse Link for Azure Cosmos DB enables you to run near real-time analytics over operational data in Azure Cosmos DB, without any performance or cost impact on your transactional workload, by using the two analytics engines available from your Azure Synapse workspace: serverless SQL pool and Apache Spark pools.
  • Azure Analysis Services is an enterprise-grade analytics as a service that lets you govern, deploy, test, and deliver your BI solution with confidence.
  • Power BI is a suite of business analytics tools that deliver insights throughout your organization. Connect to hundreds of data sources, simplify data prep, and drive ad hoc analysis. Produce beautiful reports, then publish them for your organization to consume on the web and across mobile devices.
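
As an illustration of the Azure Synapse Link component, the following is a minimal sketch of reading the Azure Cosmos DB analytical store from a Synapse Apache Spark notebook. It assumes that the notebook provides the usual `spark` session, that a linked service to the Azure Cosmos DB account exists in the workspace, and that the analytical store is enabled on the container. The function name `read_analytical_store` and the linked service and container names in the usage comment are hypothetical.

```python
def read_analytical_store(spark, linked_service, container):
    """Load a Cosmos DB container's analytical store as a Spark DataFrame.

    `spark` is the SparkSession that a Synapse notebook provides.
    `linked_service` and `container` are names from your own workspace
    (assumptions here, not values from this article).
    """
    return (
        spark.read.format("cosmos.olap")  # analytical store, not the transactional store
        .option("spark.synapse.linkedService", linked_service)
        .option("spark.cosmos.container", container)
        .load()
    )


# Usage inside a Synapse notebook (hypothetical names):
# insights_df = read_analytical_store(spark, "CosmosDbLinkedService", "insights")
# insights_df.printSchema()
```

Because the read targets the analytical store, the query does not consume the throughput provisioned for the transactional workload, which is the property the Synapse Link description above relies on.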

Alternatives

  • Azure Synapse Link is the Microsoft-preferred solution for analytics on top of Azure Cosmos DB data.

Scenario details

Transform your data into actionable insights using best-in-class machine learning tools. This solution allows you to combine any data at any scale, and to build and deploy custom machine learning models at scale. To learn how enterprise-scale data platforms are designed as part of an enterprise landing zone, refer to the Cloud Adoption Framework Data landing zone documentation.

Potential use cases

Organizations can access more data than ever before. Advanced analytics helps them take advantage of the insights in that data. Areas include:

  • Customer service.
  • Predictive maintenance.
  • Recommending products or services.
  • System optimization of everything from supply chains to data center operations.
  • Product and services development.

Considerations

Cost optimization

Cost optimization is about looking at ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Overview of the cost optimization pillar.

Next steps

See the following documentation about the services featured in this architecture: