
0 Votes
Vinay5-0499 asked Vinay5-0499 commented

Accessing file name from nested folder structure and passing it to Notebook activity as a parameter in ADF.

Hello,

I have around 6 files in a nested folder structure as below.

data/year/month/day/time/
    File1.xml
    File2.xml
    ...
    File6.xml

I have to pass each file as a parameter to 6 different notebook activities.

So I started by passing File1.xml as a parameter to a notebook activity. I used a Get Metadata activity and selected Child items in the field list, but it returns only the year folder as output. I am unable to access the file names from the nested folders.

Could you please let me know how to achieve this?

Thank you.

azure-data-factory


1 Vote
KranthiPakala-MSFT answered

Hi @Vinay5-0499 ,

Here is the GIF of my pipeline. See if that helps:

93729-nestedfolderdata.gif


Below are screenshots of the Get Metadata activity from the second (child) pipeline, in which we get the file names from the time folder:

  1. Dataset parameters:
    93677-image.png

  2. Dataset connection dynamic expression:
    93697-image.png

  3. Get Metadata settings:
    93659-image.png
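
For reference, the dataset connection in the child pipeline could use a dynamic path expression along these lines (a sketch only; the parameter names p_year, p_month, p_day and p_time are illustrative, not taken from the screenshots):

```
@concat('data/', dataset().p_year, '/', dataset().p_month, '/', dataset().p_day, '/', dataset().p_time)
```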

Hope this helps.



Please don’t forget to Accept Answer and Up-Vote wherever the information provided helps you, this can be beneficial to other community members.







1 Vote
KranthiPakala-MSFT answered KranthiPakala-MSFT commented

Hi @Vinay5-0499,

Thanks for reaching out. As I understand it, your dataset file path probably points to '/data/', which is why you are receiving only the year folder name.

In order to get the file name list, you will have to point your dataset file path to the sub-folder where the files are actually located.

For example:

93395-image.png

This will return all the files inside the subfolder that the dataset file path of your Get Metadata activity points to.

93370-image.png
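
For reference, the childItems output of the Get Metadata activity has roughly this shape (file names here follow the example in the question; the exact list depends on your folder):

```
{
    "childItems": [
        { "name": "File1.xml", "type": "File" },
        { "name": "File2.xml", "type": "File" }
    ]
}
```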


If you need to pass the year/month/date dynamically, declare a dataset parameter for each (year, month, and date) and use a concatenation function to form the file path in the dataset connection settings. At the Get Metadata activity level, you can then use a dynamic expression to supply the year, month, and date values based on your requirement.

Concatenation expression used in the sample below: @concat('input/',dataset().ds_yearFolder,'/',dataset().ds_monthFolder,'/',dataset().ds_dateFolder)
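
If you prefer to derive today's year, month and date instead of passing static values, one possible approach (a sketch, not from the original sample) is to build the dataset parameter values from utcnow() at the Get Metadata activity:

```
ds_yearFolder:  @formatDateTime(utcnow(), 'yyyy')
ds_monthFolder: @formatDateTime(utcnow(), 'MM')
ds_dateFolder:  @formatDateTime(utcnow(), 'dd')
```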

Below is a sample GIF (note that in this sample I am passing static values for year, month & date to the dataset parameters):

93326-dynamicdatefolder.gif

Hope this helps. Do let us know if you have further queries.








Hi Kranti,

Thanks for your reply.

Actually, my files will be present in the time folder.
For instance, a file will be placed in the source path at 17:00, and sometimes between 10:00 and 14:00. So in such cases there will be one more subfolder after the day folder: a TIME folder. For example, if the file is placed at 17:00, there will be a folder named 17. This TIME folder changes every day and is not constant. So how can I pass this time folder dynamically, so that my pipeline picks the file from the TIME folder?

0 Votes

Hi @Vinay5-0499,

Thanks for your reply and the additional clarification. To get the time-folder list for a particular day, you can have a parent pipeline with a Get Metadata activity whose dataset points to the date folder, returning the childItems (i.e., the time folder names). Then add a subsequent ForEach activity, pass it the output of the Get Metadata activity (the list of time folders), and iterate through each folder. Inside the ForEach activity, use an Execute Pipeline activity to pass the time folder name as a parameter to the child pipeline, which contains the flow described above. This should help achieve your requirement.
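
The parent/child flow described above can be sketched with these dynamic expressions (the activity and parameter names are illustrative):

```
ForEach / Items:                  @activity('Get Time Folders').output.childItems
Execute Pipeline / p_timeFolder:  @item().name
```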


Hope this input helps. Do let us know if you have further queries.

0 Votes

Hi Kranthi,

Thanks for your reply.

So, should I implement solution 2 in conjunction with solution 1 provided above, or should solution 2 be implemented separately? Which pipeline will the Execute Pipeline activity trigger in this case?
The time folder varies every day, and the pipeline should automatically search for the folder in which files are available.

0 Votes
0 Votes
Vinay5-0499 answered Vinay5-0499 commented

Hi Kranthi,

Thanks for the detailed explanation. I got the output as required.



@KranthiPakala-MSFT

Hello Kranthi,

Further to the above requirement, I need to pass all the files present in the time folder as parameters to a Databricks notebook, recursively. This notebook is going to parse all the files present in the time folder.
My ADF pipeline should be able to pass each file from the time folder, one after the other, as a parameter to this notebook. I have a dbutils.widgets.text command in the notebook. So basically, for a given day there will be 24 time folders, and each time folder has different files, so I should pass each file from each time folder as a parameter to my Databricks notebook.
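
For reference, the notebook side of this could be sketched as below (the widget name fileName is an assumption; dbutils exists only inside a Databricks notebook, so the helper falls back to a default elsewhere):

```python
# Sketch: read the file name that the ADF Notebook activity passes in as a
# base parameter. Inside Databricks, dbutils.widgets exposes the parameter;
# outside Databricks, dbutils is undefined and we fall back to a default.
def get_file_param(name="fileName", default=""):
    try:
        return dbutils.widgets.get(name)  # dbutils is injected by Databricks
    except NameError:                     # running outside Databricks
        return default

file_to_parse = get_file_param("fileName", default="File1.xml")
```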
Please let me know how this can be achieved.
Thank you.

0 Votes