question

james-h-robinson asked · ThomasBoersma commented

Delta Lake with Databricks and Synapse

What advantages are there to using Databricks and Synapse with Delta Lake? I understand that AAS can connect to Synapse but not Databricks. If that's the case, why not just use Synapse with Delta Lake?

azure-synapse-analytics azure-databricks

1 Answer

PRADEEPCHEEKATLA-MSFT answered · ThomasBoersma commented

Hello @james-h-robinson,

Thanks for the question and for using the Microsoft Q&A platform.

What is Delta Lake?

Delta Lake is an open-source data format that enables you to update your big data sets with guaranteed ACID transaction behavior.

Delta Lake is a layer placed on top of your existing Azure Data Lake data that can be fully managed using Apache Spark APIs available in both Azure Synapse and Azure Databricks.

127866-image.png

Delta Lake is one of the most popular updateable formats in big data solutions. It is frequently used by data engineers who need to prepare, clean, or update data stored in a data lake, or run machine learning experiments on it.

Yes, you are correct: Azure Databricks is not supported as a data source in Azure Analysis Services.

For the reason, see Supported Data Sources in Azure Databricks and Data Sources in Azure Analysis Services.

If you are using Delta Lake with Azure Synapse Analytics, you can connect to it from Azure Analysis Services.

Data sharing without copy, load, or transformation of Delta Lake files is the main benefit of serverless SQL pools. The serverless endpoint in Azure Synapse represents a bridge between a reporting/analytics layer where you use Power BI or Azure Analysis Services, and your data stored in Delta Lake format.

127807-image.png

For more information, refer to Query Delta Lake files using T-SQL language in Azure Synapse Analytics.
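
As a rough illustration of the serverless approach described above, here is a minimal Python sketch that only builds the T-SQL text for reading a Delta Lake folder with OPENROWSET and FORMAT = 'DELTA'. The helper name `delta_openrowset_query`, the storage account, the container, and the folder are all hypothetical; you would run the resulting query yourself against your serverless SQL endpoint.

```python
def delta_openrowset_query(folder_url: str, top: int = 10) -> str:
    """Build a T-SQL query that reads a Delta Lake folder from a serverless SQL pool.

    folder_url is assumed to point at the root folder of the Delta table
    (the folder that contains _delta_log), not at individual files.
    """
    if "'" in folder_url:
        # Keep the sketch safe: refuse values that would break the string literal.
        raise ValueError("folder_url must not contain single quotes")
    return (
        f"SELECT TOP {int(top)} *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{folder_url}',\n"
        "    FORMAT = 'DELTA'\n"
        ") AS [result];"
    )


# Hypothetical storage account, container, and folder names.
query = delta_openrowset_query(
    "https://examplestorage.dfs.core.windows.net/lake/sales_delta/"
)
print(query)
```

The same query text can be wrapped in a view in the serverless database, so that Power BI or Azure Analysis Services connects to the view instead of to the files.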

Hope this helps. Do let us know if you have any further queries.


Please "Accept the answer" if the information helped you. This will help us and others in the community as well.



In addition to @PRADEEPCHEEKATLA-MSFT's answer:

Delta Lake also makes it easier to work with partitioned tables or views in an Azure Synapse serverless SQL pool. When partitions are used correctly, queries often run faster and cost less, because not all of the data is read. With raw Parquet files (without Delta Lake), you need to use the filename and filepath functions to read only the relevant partitions (see link). In my opinion, this makes the queries more complex.
With Delta Lake, Azure Synapse automatically detects the partitions you created, so you no longer need the filename or filepath functions (see link).
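
To make the difference concrete, here is a small Python sketch that builds both query shapes as T-SQL text. It assumes a hypothetical folder layout partitioned by an integer `year` column (folders named `year=2021/` and so on); the helper names and the storage URL are made up for illustration, and the column type in your own table may differ.

```python
def _check_root(root_url: str) -> None:
    # The sketch assumes a folder URL ending in "/" and no quote characters.
    if "'" in root_url or not root_url.endswith("/"):
        raise ValueError("root_url must end with '/' and contain no single quotes")


def parquet_partition_query(root_url: str, year: int) -> str:
    """Raw Parquet: the partition value exists only in the folder name,
    so the first wildcard is filtered through filepath(1)."""
    _check_root(root_url)
    return (
        "SELECT COUNT(*) AS row_count\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{root_url}year=*/*.parquet',\n"
        "    FORMAT = 'PARQUET'\n"
        ") AS [result]\n"
        f"WHERE [result].filepath(1) = '{int(year)}';"
    )


def delta_partition_query(root_url: str, year: int) -> str:
    """Delta Lake: the partition column is exposed as a regular column,
    so a plain WHERE clause is enough."""
    _check_root(root_url)
    return (
        "SELECT COUNT(*) AS row_count\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{root_url}',\n"
        "    FORMAT = 'DELTA'\n"
        ") AS [result]\n"
        f"WHERE [result].[year] = {int(year)};"
    )


# Hypothetical storage locations for the two layouts.
parquet_sql = parquet_partition_query(
    "https://examplestorage.dfs.core.windows.net/lake/sales_parquet/", 2021
)
delta_sql = delta_partition_query(
    "https://examplestorage.dfs.core.windows.net/lake/sales_delta/", 2021
)
print(parquet_sql)
print(delta_sql)
```

In the Parquet version the filter depends on the position of the wildcard in the path, while the Delta version reads like a query against an ordinary table.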
