question

Vinay5-0499 asked · HimanshuSinha-MSFT commented

Copying data to Azure Blob from Amazon S3 bucket.

Hello,

I have a requirement to copy data from Amazon S3 to Azure Blob. There are 10 objects in this bucket, and each object holds files in the following folder structure:
Year Folder > Month Folder > Day Folder > Hour Folder. For example, the files for 2021-03-23 at 17:00 are in the folder path 2021 > 03 > 23 > 17.

Files are placed in the S3 bucket daily, and I have to copy only the latest files to Azure. So, as and when files land in S3 within a particular time window, only the latest delta files from that particular object should be copied. In this way, I should be able to copy files from all the objects. Could you please let me know how to achieve this?

Thank you.

azure-data-factory azure-databricks

Hello @Vinay5-0499 ,
Thanks for the ask and for using the Microsoft Q&A platform. My apologies for not replying sooner.
Just a quick clarification. You wrote:
"The files are placed in S3 bucket daily. I have to copy only the latest files placed in S3 to Azure." and the folder structure you mentioned is "2021 > 03 > 23 > 17".

So the folder with the highest hour value should hold the latest files, correct? Also, how many files are in each folder: one, or more than one? Do the files follow a particular naming convention?

Please reply and I will try to post a solution. I will wait to hear from you on this.

Thanks
Himanshu

Vinay5-0499 HimanshuSinha-MSFT ·

@HimanshuSinha-MSFT

Hi Himanshu,

Please find the below.

It's not only the max hour folder; it's all the files that belong to yesterday. For instance, if we run the pipeline today, all the files from yesterday should be picked up. So there might be several hour folders, and we need to copy files from all the hour folders for a given date.

There is more than one file, and they don't follow a naming convention. There are also different file formats, such as .csv, .json, .xml, and .dat. Can I use a single copy activity to copy files of different formats (.csv, .json, .xml, .dat) into Azure?
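
The selection rule described above (everything under yesterday's date folder, across all hour folders and file extensions) could be sketched roughly as follows. This is only an illustration in plain Python, not a Data Factory pipeline. It assumes the object keys look like `<object>/<YYYY>/<MM>/<DD>/<HH>/<file>` with `/` as the separator. The object folder name `orders` and the sample keys are hypothetical.

```python
from datetime import date, timedelta


def day_prefix(object_folder: str, run_date: date) -> str:
    """Build the key prefix for the day before run_date.

    Assumes keys are laid out as <object>/<YYYY>/<MM>/<DD>/<HH>/<file>.
    """
    previous_day = run_date - timedelta(days=1)
    return f"{object_folder}/{previous_day.strftime('%Y/%m/%d')}/"


def select_keys_for_yesterday(keys, object_folder, run_date):
    """Keep every key under yesterday's date folder.

    The hour folder and the file extension are deliberately ignored,
    so .csv, .json, .xml and .dat files are all selected.
    """
    prefix = day_prefix(object_folder, run_date)
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical keys, for illustration only.
sample_keys = [
    "orders/2021/03/23/09/a.csv",
    "orders/2021/03/23/17/b.json",
    "orders/2021/03/24/01/c.xml",
]

# A run on 2021-03-24 selects the two keys under 2021/03/23.
selected = select_keys_for_yesterday(sample_keys, "orders", date(2021, 3, 24))
```

Because the filter matches only on the date prefix, it does not depend on file names or formats. The same prefix could be computed once per object folder to cover all 10 objects.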


Hello @Vinay5-0499 ,

Please send an email to azcommunity@microsoft.com with the subject "Attn Himanshu". I will try to help you with this.

Thanks
Himanshu


0 Answers