Hello,
I have n files in a nested folder structure, as below.
data/Year/Month/Day/Time
File1.xml
File2.xml
...
File_n.xml
I have to pass each file as a parameter to a Databricks notebook, one after the other.
For instance, once File1 has been passed as a parameter to the notebook and that notebook run completes, we need to pass File2 as a parameter to the same notebook from ADF, dynamically.
Example:
For a given day, let's say 2021-05-13, the folder structure will be Sourcename/Year(2021)/Month(05)/Day(13)/TimeFolder(1).
TimeFolder(1) will contain 7 files, and I have to pass each file, one after the other, as a parameter to the Databricks notebook.
Once TimeFolder(1) is completed, there will be a TimeFolder(2) containing some files, and those files have to be passed as parameters to the Databricks notebook as well. In this way, for the given date, each file in each time folder has to be passed as a parameter to the notebook dynamically. Note: the notebook and the logic in the notebook are constant.
I used the Get Metadata activity and was able to achieve part of this requirement for the case of one time folder containing one file, but I am unable to design a pipeline for n time folders with n files each.
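To make the intended behavior concrete: the control flow I'm after is an outer sequential loop over the time folders for the day, with an inner sequential loop over the files in each time folder, and one notebook run per file. In ADF this would presumably be a Get Metadata activity (childItems) feeding nested sequential ForEach activities. A minimal sketch of that control flow in plain Python (`run_notebook` and `process_day` are hypothetical stand-ins, not ADF APIs):

```python
from pathlib import Path

def run_notebook(file_path: str) -> None:
    """Hypothetical stand-in for the Databricks Notebook activity:
    in ADF this would be one notebook run with file_path as its parameter."""
    print(f"running notebook with parameter: {file_path}")

def process_day(day_folder: str) -> list:
    """Walk each time folder under a day folder (e.g. .../Year/Month/Day)
    and invoke the notebook once per XML file, strictly one after another."""
    processed = []
    day = Path(day_folder)
    # Outer loop: one iteration per time folder, in order (sequential)
    for time_folder in sorted(p for p in day.iterdir() if p.is_dir()):
        # Inner loop: one notebook run per file in that time folder
        for xml_file in sorted(time_folder.glob("*.xml")):
            run_notebook(str(xml_file))
            processed.append(str(xml_file))
    return processed
```

The key constraint this sketch captures is that the next file is only submitted after the previous notebook run returns, which in ADF terms would mean both ForEach loops execute sequentially rather than in parallel.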
Could someone please assist?