question

LuukvanVliet-7731 asked:

Pull data from REST API, transform in Databricks, and insert into Azure SQL Database

Hello,

We have a scenario that we cannot solve with Azure Data Factory, so we need to schedule a Python script that pulls data from multiple REST APIs, transforms the data, and inserts it into an Azure SQL database.

What is a best practice for inserting a pandas DataFrame into Azure SQL? Currently I am using SQLAlchemy (after installing ODBC drivers in Azure Databricks), but it is very slow. I have heard of Spark SQL and PySpark, but do they really speed up insertion of records into an Azure SQL database? Any references to a solution that fits our needs?
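For context on the slowness: by default pyodbc sends one INSERT statement per row. A minimal sketch of turning on pyodbc's `fast_executemany` batching through SQLAlchemy follows; the server, database, and credential values are made-up placeholders, and the actual insert is shown in comments since it needs a live database:

```python
import urllib.parse

def odbc_connect_url(server: str, database: str, user: str, password: str) -> str:
    """Build a SQLAlchemy URL that routes through pyodbc.

    URL-quoting the raw ODBC string keeps special characters in the
    password from breaking the URL.
    """
    odbc = (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={server}.database.windows.net,1433;"
        f"DATABASE={database};UID={user};PWD={password}"
    )
    return "mssql+pyodbc:///?odbc_connect=" + urllib.parse.quote_plus(odbc)

# Placeholder connection details (assumptions, not from the thread):
url = odbc_connect_url("myserver", "mydb", "sqladmin", "p@ssw0rd")

# Against a live database, the batched insert would then look like:
# from sqlalchemy import create_engine
# engine = create_engine(url, fast_executemany=True)  # batch the INSERTs
# df.to_sql("target_table", engine, if_exists="append",
#           index=False, chunksize=10_000)
```

In informal testing this kind of batching is often an order of magnitude faster than the default row-by-row behavior, though your mileage will vary with row width and network latency.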

azure-databricks

PRADEEPCHEEKATLA-MSFT answered:

Hello @LuukvanVliet-7731,

Welcome to the Microsoft Q&A platform.

You can connect to Azure SQL Database using JDBC or the Apache Spark connector.

The Spark connector for Azure SQL Database and SQL Server enables these databases to act as input data sources and output data sinks for Apache Spark jobs. It allows you to use real-time transactional data in big data analytics and persist results for ad-hoc queries or reporting.

  • Compared to the built-in JDBC connector, this connector provides the ability to bulk insert data into SQL databases.

  • It can be 10x to 20x faster than row-by-row insertion. The Spark connector for SQL Server and Azure SQL Database also supports Azure Active Directory (AAD) authentication.

  • It allows you to securely connect to your Azure SQL databases from Azure Databricks using your AAD account. It provides interfaces that are similar to the built-in JDBC connector.

  • It is easy to migrate your existing Spark jobs to use this connector.

    Note: A Spark connector that supports Databricks Runtime 7.x is not yet available. Databricks recommends that you use the JDBC connector or Databricks Runtime 6.6 or below.
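To make the connector path above concrete, here is a minimal sketch of bulk-writing a Spark DataFrame through the Spark connector. The server, database, and table names are placeholders, and the write itself is shown in comments because it requires a Databricks cluster with the connector library attached and a reachable database:

```python
def jdbc_url(server: str, database: str) -> str:
    """Build the JDBC URL the Spark connector expects (placeholder names)."""
    return (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"databaseName={database}"
    )

url = jdbc_url("myserver", "mydb")

# On a Databricks Runtime 6.x cluster with the connector installed,
# the bulk write would look like:
# (df.write
#    .format("com.microsoft.sqlserver.jdbc.spark")
#    .mode("append")
#    .option("url", url)
#    .option("dbtable", "dbo.target_table")
#    .option("user", user)          # or AAD auth via the connector's options
#    .option("password", password)
#    .save())
```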


For more details, refer to the following links:

SQL Databases using JDBC

SQL Databases using the Apache Spark connector
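Since the note above recommends the built-in JDBC connector on Databricks Runtime 7.x, here is a minimal sketch of that path as well. All names and credentials are placeholders, and the write is commented out because it needs a live cluster and database:

```python
# Sketch: Spark's built-in JDBC write to Azure SQL (placeholder names).
url = "jdbc:sqlserver://myserver.database.windows.net:1433;databaseName=mydb"

properties = {
    "user": "sqladmin",
    "password": "<password>",  # in practice, read from a Databricks secret scope
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}

# On a cluster, the write itself would be:
# df.write.jdbc(url=url, table="dbo.target_table", mode="append",
#               properties=properties)
```

The built-in JDBC path inserts row batches rather than using bulk copy, so it is typically slower than the Spark connector, but it works on Runtime 7.x today.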

Hope this helps. Do let us know if you have any further queries.


  • Please accept the answer if it is correct. Original posters help the community find answers faster by identifying the correct answer. Here is how.

  • Want a reminder to come back and check responses? Here is how to subscribe to a notification.



Hello @LuukvanVliet-7731,

Just checking in to see if the above answer helped. If this answers your query, do click “Accept Answer” and Up-Vote for the same. And, if you have any further query do let us know.

AnmolGanju-9801 answered:

Why can't you use ADF for this scenario via its HTTP/REST API support? @PRADEEPCHEEKATLA-MSFT @LuukvanVliet-7731 ADF covers the vast majority of REST API integrations for loading data directly into Azure SQL Data Warehouse or Azure SQL Database.


We have tried ADF, but the JSON is deeply nested and we only need part of it.
Now that I have explored Databricks and Azure Functions, we have two options that fit!
