Azure Data Factory - Create Dynamic Destination Folders for Historical Load

Kia 31 Reputation points
2020-10-09T21:12:27.557+00:00

We wish to ingest data from an on-prem SQL DB into ADLS Gen 2 using ADF.
The desired folder structure will be -
<Object>
YYYY
MM
DD
<file><datetime>

How do I go about creating a pipeline that will load historical data into the ADLS folder structure above.
I am clear on how to do this for an ingestion point forward, but would like guidance for historical load.

I would go with either create date/update date of record from source table for organizing the files in the ADLS.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,858 questions
{count} votes

Accepted answer
  1. HarithaMaddi-MSFT 10,136 Reputation points
    2020-10-12T09:21:50.677+00:00

    Hi @Kia ,

    Welcome to Microsoft Q&A Platform. Thanks for posting the query.

    I have implemented it till year and month level and same can be extended to dates in the data. I have used couple of stored procedures in on-premise SQL as below

    • dbo.getmonths with input year parameter to retrieve months for the specific year from the table
    • dbo.getdata with input year and month parameters to retrieve data for specific month and year from the table

    Design includes following components

    • Lookup - to retrieve list of years and months from the table, this can be added for days as well
    • Foreach activity - to loop through years/months based on lookup output
    • Execute Pipeline - Since foreach inside foreach is not supported in azure data factory, implemented the copy activity in child pipeline with input parameter as year. If extended to days, the input parameters can be both year and month
    • Copy Activity - This activity is used to load the data from the SQL to blob with dynamic folders using month and year

    Below GIF shares the details on implementation. Also, attached the json of pipelines to this.
    31625-parentpipelinejson.txt
    31626-childpipelinejson.txt
    31568-dynamicfolderdataadf.gif

    Hope this helps! Please let us know if it does not align with the requirement or for further queries and we will be glad to assist.

    1 person found this answer helpful.

1 additional answer

Sort by: Most helpful
  1. Chandu, Anil K 0 Reputation points
    2021-05-26T05:22:15.403+00:00

    hi,

    Daily I am getting N no.of files with the file name as 'ABC_YYYYMMDD.csv' format. How can I create the Year & Month folders dynamically in adls gen2 and how can I load the files based on the year and month?

    Ex: All the files coming in Jan should be loaded in Jan folder, Feb files should be loaded in Feb folder.. etc