How to replicate data from different backends that are using SQL Servers (all hosted on Azure) to do Reports. What Azure Products do I need?

Question

Suppose I have hosted on the same Resource on Azure 4 Web APIS on .NET Core with SQL Server. Each system has their own SQL Server Database on Azure. The systems are communicated using Azure Service Bus.

My objetive is to use Power BI or a similar software to create dynamic reports without impacting the performance of each application.

But the issue that I am facing is that I dont know where and how to store that (replicated) data that I want to view in a future with the previously mentioned Software.

I dont need all the data of all databases, only a few tables of each system.

For example: From System 1, I want the Data of the table Clients, from System 2 I want the data of the table Transactions and so on.

I dont want to affect the performance of the Databases of each system when I generate the reports, so I thought that maybe Azure has a way to replicate that tables in some place automatically.

To do this, what Azure Service do I need? Azure Synapse Analytic (DW)? Also do I need another product? Or maybe is there a better approach?

I will really appreciate some info to achieve what I want using Azure

Thanks in advance,

Pablo.

Answer

This isn’t a question to answer in few sentences. It depend how much data you’ll load to Power BI, should it be basically transferred and visualized? How much data you load in every cycle how much data you’ll store in all? There are many questions to answer before giving a architectural recommendation.

Basic approach could be to load data directly to the Power BI service. Or use Azure Data Factory or Azure Synapse Studio (Currently in preview) to ELT/ETL the data to another Data store. If you need the DWH features of Synapse use them.

Sorry but with the given information I can’t give a better recommendation. May the first approach sufficient? Go for it!

Answer

If your azure sql databases are on premium you have read scale out. See the main man james serra: https://www.jamesserra.com/archive/2018/09/azure-sql-database-read-scale-out/

You can then use synapse polybase (data virtualization) to hit these or power bi directly or import as your use cases require

Answer

Hi!

I ca try to make a suggestion for your goal.

If you have a small/medium size date you can do the following stuff.

Replicate your data into a datalake storage account using databricks and delta lake format for persistence (so you can do merge and other nice transactional operations).
From delta table you can build a more "cleaned" layer using parquet files so can other tools can read them.
Use Snapse Analytics SQL on demand tools for reading the whole parquet data when you need it (for example in a PowerBI Table refresh) or a direct query.

Synapse Analytcs now is in preview again but I hope this product will reach the stable status asap.
If you have money to spend, consider also an ever running databricks cluster that offers to you also a "spark queryable endpoint" if you want reach your data using jdbc/odbc connector.

Feel free to ask for clearifications.

Ra

How to replicate data from different backends that are using SQL Servers (all hosted on Azure) to do Reports. What Azure Products do I need?

3 answers