question

JainavSurana-4280 asked HimanshuSinha-MSFT answered

Integration - Azure Data Factory - How to copy all records from CSV kept in data lake to on-premises SQL database

I am trying to pass all records of a CSV from the data lake to an on-premises SQL database using Azure Data Factory. I know how to pass records one by one using Lookup and Copy activities, but I don't know how to pass all records of the CSV at once.
Before inserting into the database I want to validate that all records are correct and not corrupted, basically as per our logic, so I was thinking of passing all records and validating them. If all records are correct, then load them into multiple tables of the database; otherwise reject the CSV.
Can you please share your inputs if anyone has any idea?

azure-data-factory, azure-data-lake-storage

HimanshuSinha-MSFT answered

Hello @JainavSurana-4280,

Just to add to what @NandanHegde-7720 said above, you can also use the following logic:


  1. Copy all the data into a staging table using a Copy activity.

  2. Use a Lookup activity to call a stored procedure on the SQL side that runs all the validations. The procedure must return a dataset.

  3. Loop over the Lookup output (e.g., with a ForEach) into Copy activities that move the data to the right SQL tables.

  4. Delete the staging table.
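The steps above can be sketched end to end in plain Python, using an in-memory sqlite3 database as a local stand-in for the SQL database (the table names, column names, and validation rules are hypothetical; in ADF these steps would be a Copy activity, a Lookup activity calling the proc, and a cleanup step):

```python
import csv
import io
import sqlite3

# Hypothetical CSV content; in ADF this would come from the data lake.
CSV_DATA = "id,name,amount\n1,alice,10\n2,bob,20\n"

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1. Copy all records into a staging table (the Copy activity's job).
cur.execute("CREATE TABLE staging (id INTEGER, name TEXT, amount INTEGER)")
rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
cur.executemany(
    "INSERT INTO staging VALUES (?, ?, ?)",
    [(r["id"], r["name"], r["amount"]) for r in rows],
)

# 2. Validate the staging data (what the stored procedure would do);
#    here the "proc" returns the count of invalid rows.
cur.execute("SELECT COUNT(*) FROM staging WHERE id IS NULL OR amount < 0")
invalid = cur.fetchone()[0]

if invalid == 0:
    # 3. Move validated rows into the final table(s).
    cur.execute("CREATE TABLE target (id INTEGER, name TEXT, amount INTEGER)")
    cur.execute("INSERT INTO target SELECT * FROM staging")
else:
    print("Rejecting file: staging data failed validation")

# 4. Clean up the staging table.
cur.execute("DELETE FROM staging")
conn.commit()
```

The point of the staging table is that validation runs as set-based SQL over the whole file at once, so the accept/reject decision covers every record before anything touches the final tables.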

Please do let me know how it goes.
Thanks
Himanshu
Please do consider clicking on "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members


NandanHegde-7720 answered

Hey,
You can use a Copy activity with your CSV file as the source and the Azure SQL DB table as the sink.
The Copy activity will copy all records from the file into the table.

There is no need for a Lookup activity.

There is an open-source tool, Great Expectations (https://greatexpectations.io/), that scans files and applies proper data validations before the data is ingested into the system.

You can leverage it before the Copy activity: run all your expectation cases and proceed to the Copy activity only if all of them succeed.
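As a rough illustration of that pre-copy gate (in plain Python rather than the Great Expectations API, whose exact calls vary by version), each "expectation" below is a hypothetical predicate over a row, and the file is accepted only if every row passes every expectation:

```python
import csv
import io

# Hypothetical expectations: (name, row-predicate) pairs.
EXPECTATIONS = [
    ("id is not empty",      lambda r: r["id"].strip() != ""),
    ("amount is an integer", lambda r: r["amount"].lstrip("-").isdigit()),
    ("amount is >= 0",       lambda r: not r["amount"].startswith("-")),
]

def validate(csv_text: str) -> list[str]:
    """Return the names of expectations that failed on any row."""
    failures = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name, check in EXPECTATIONS:
            if not check(row):
                failures.add(name)
    return sorted(failures)

good = "id,amount\n1,10\n2,20\n"
bad = "id,amount\n1,-5\n,20\n"
```

If `validate` returns an empty list, the pipeline proceeds to the Copy activity; otherwise the whole CSV is rejected, which matches the all-or-nothing behaviour the question asks for.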


Alternatively, you can load the data into a staging table in the Azure database, validate all the data in the staging table via a Stored Procedure activity, and then move the data into the other tables within the database once all cases are verified.


@NandanHegde-7720
I modified my question a little bit. Can you please check again?


Hey, I have updated my answer.

Please accept it as the answer if it was helpful, as that helps other community members as well.
