question

Vinay5-0499 asked HimanshuSinha-MSFT commented

Copying data from Amazon S3 to Azure blob storage.

Hello,

I have a requirement to copy S3 data to Azure Blob Storage.
The folder structure in Amazon S3 is as follows:
Year folder - Month folder - Day folder - Date folder (e.g. 2018-03-10). There is data from 2017 to 2021 for almost every day. I need to create an ADF pipeline that copies the data to Azure based on the scenarios below:
1. Loading data between two dates (e.g. 2018-10-10 to 2018-12-10).
2. Once the historical data is loaded, implement logic for delta loading. The data arrives in S3 daily.


Could someone please assist with achieving the above?

Thank you in advance.



azure-data-factory, azure-blob-storage

HimanshuSinha-MSFT answered Vinay5-0499 commented

Hello @Vinay5-0499,
Thanks for the question and for using the Microsoft Q&A platform.

1. Loading data between two dates (e.g. 2018-10-10 to 2018-12-10).

Since the dates in your case are known, I would create an array containing those dates (I used Excel, but you can use any text editor), then use a ForEach loop (FE loop), and inside it add a Copy activity with a parameterized dataset. The parameter we pass is the date element from the array (see the JSON sketch after the snapshots below).
The animation and snapshot below should help.

79257-s3-issue.png



79318-image.png
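
For reference, here is a minimal sketch of what the pipeline JSON for this approach could look like. The pipeline, dataset, and parameter names (dateList, folderDate, S3SourceDataset, BlobSinkDataset) are placeholders rather than the names in the screenshots, so adjust them to your environment:

    {
      "name": "CopyS3DateRange",
      "properties": {
        "parameters": {
          "dateList": {
            "type": "Array",
            "defaultValue": [ "2018-10-10", "2018-10-11", "2018-10-12" ]
          }
        },
        "activities": [
          {
            "name": "FE loop over dates",
            "type": "ForEach",
            "typeProperties": {
              "items": { "value": "@pipeline().parameters.dateList", "type": "Expression" },
              "activities": [
                {
                  "name": "Copy one day folder",
                  "type": "Copy",
                  "inputs": [ {
                    "referenceName": "S3SourceDataset",
                    "type": "DatasetReference",
                    "parameters": { "folderDate": "@item()" }
                  } ],
                  "outputs": [ {
                    "referenceName": "BlobSinkDataset",
                    "type": "DatasetReference",
                    "parameters": { "folderDate": "@item()" }
                  } ],
                  "typeProperties": {
                    "source": {
                      "type": "BinarySource",
                      "storeSettings": { "type": "AmazonS3ReadSettings", "recursive": true }
                    },
                    "sink": {
                      "type": "BinarySink",
                      "storeSettings": { "type": "AzureBlobStorageWriteSettings" }
                    }
                  }
                }
              ]
            }
          }
        ]
      }
    }

The S3 dataset would declare a string parameter (folderDate here) and use it inside its folder path, for example building the year/month/day segments and the date folder from @dataset().folderDate with formatDateTime(), assuming your folders follow the structure described in the question.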



2. When you say "delta loading", I am assuming that you want to copy the files every day on a fixed schedule. If that is the case, you can use the same pipeline, remove the array and the ForEach loop, and build the date value (e.g. 2021-03-18) with a dynamic expression. You will still need the parameterized dataset (a sketch of such an expression follows).
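
As a rough sketch of that dynamic expression: in the Copy activity, instead of "@item()", the dataset parameter could be set to the run date. This assumes a daily schedule trigger that picks up the previous day's folder; the -1 offset and the folderDate name are assumptions, so adjust them to when your files actually land:

    "parameters": {
      "folderDate": "@formatDateTime(addDays(utcnow(), -1), 'yyyy-MM-dd')"
    }

utcnow() can also be replaced with trigger().scheduledTime if you want the date tied to the trigger's scheduled time rather than the actual execution time.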



Please do let me know how it goes.
Thanks
Himanshu
Please consider clicking "Accept Answer" and "Up-Vote" on the post that helps you, as it can be beneficial to other community members.





@HimanshuSinha-MSFT

Hi Himanshu,

Thanks for your answer. While I was waiting for a reply to my question, I tried the logic given in the link below and it worked.

https://docs.microsoft.com/en-us/answers/questions/162895/adf-pipeline-to-increment-the-date-back-to-six-mon.html

However, I got stuck at delta loading.
The files in the S3 bucket get added randomly every day, and I have to copy them to Azure as and when they arrive.
Could you please elaborate on how to achieve this requirement?
