How to get only the file names and last modified dates of files from the Get Metadata activity when the folder structure is hybrid

Krishnamohan Nadimpalli 401 Reputation points
2022-04-20T20:44:08.67+00:00

Hi All

My ADLS folder structure is hybrid (files sit at different folder depths), like the following:

container/business1/2020/20/01/business_file1.csv
container/business2/2020/business_file2.csv
container/business3/business_file3.csv
container/raw/business/sub-business/bus-sub-file.csv

My output should look something like this:
filename,lastmodifieddate
business_file1.csv,2020-12-20
business_file2.csv,2019-12-10
business_file3.csv,2021-12-12
bus-sub-file.csv,2020-12-13

How do I loop through the metadata recursively when the folder structure is hybrid?

Azure Data Factory

Accepted answer
  1. AnnuKumari-MSFT 30,676 Reputation points Microsoft Employee
    2022-04-21T18:46:43.533+00:00

    Hi @Krishnamohan Nadimpalli ,
    Welcome to Microsoft Q&A platform and thanks for posting your query.
    As I understand it, you want to fetch the file name and last modified date of every file in any folder or subfolder inside an ADLS container. Please let me know if my understanding is incorrect.

    For this purpose, you need to use nested loops. Taking the example you provided:

    • Use a Get Metadata activity, point its dataset at the container, and select Child items in the field list.
    • Use a ForEach activity with Items set to @activity('Get Metadata1').output.childItems to loop through each folder.
    • Use an If Condition activity with the expression @equals(item().type,'Folder'). If the item is a folder, call another pipeline via an Execute Pipeline activity.
    • Inside the If Condition's true branch, add another Get Metadata activity to get the Child items and Last modified fields.

    At this stage, you will be able to get the desired output for container/business3/business_file3.csv. For the deeper paths, however, you need to repeat this logic once for every additional folder level.
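To make the repetition concrete, here is a minimal Python sketch of the same logic, not an ADF pipeline. It assumes a hypothetical in-memory stand-in for the storage account (`TREE`, `LAST_MODIFIED`) and a hypothetical `get_metadata` helper that returns data shaped like the activity's `childItems` and `lastModified` output. The recursive call stands in for applying the ForEach / If Condition steps again at each folder level.

```python
# Hypothetical stand-in for the storage account: folder path -> child items,
# shaped like the Get Metadata activity's childItems output (name, type).
TREE = {
    "container": [
        {"name": "business3", "type": "Folder"},
        {"name": "raw", "type": "Folder"},
    ],
    "container/business3": [{"name": "business_file3.csv", "type": "File"}],
    "container/raw": [{"name": "business", "type": "Folder"}],
    "container/raw/business": [{"name": "sub-business", "type": "Folder"}],
    "container/raw/business/sub-business": [
        {"name": "bus-sub-file.csv", "type": "File"}
    ],
}

# Hypothetical last modified dates, taken from the question's expected output.
LAST_MODIFIED = {
    "container/business3/business_file3.csv": "2021-12-12",
    "container/raw/business/sub-business/bus-sub-file.csv": "2020-12-13",
}


def get_metadata(path):
    """Mimic Get Metadata: childItems for a folder, lastModified for a file."""
    if path in TREE:
        return {"childItems": TREE[path]}
    return {"lastModified": LAST_MODIFIED[path]}


def walk(path):
    """Collect (file name, last modified) pairs below `path`, at any depth."""
    results = []
    for item in get_metadata(path)["childItems"]:  # the ForEach step
        child = f"{path}/{item['name']}"
        if item["type"] == "Folder":  # the If Condition step
            results.extend(walk(child))  # apply the same logic one level down
        else:
            results.append((item["name"], get_metadata(child)["lastModified"]))
    return results


for name, modified in walk("container"):
    print(f"{name},{modified}")
```

In the sketch, one level of recursion is enough for business3, while bus-sub-file.csv is reached only after three further levels, which is the part the pipeline has to repeat by hand.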

    An easier approach would be to write custom code in Python or C# and run it in an Azure Function or Azure Databricks.
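As a rough illustration of the custom-code route, the Python sketch below flattens a recursive listing into the CSV layout from the question. It assumes the `azure-storage-file-datalake` package, whose `FileSystemClient.get_paths(recursive=True)` yields entries with `name`, `is_directory`, and `last_modified`; it also assumes `last_modified` arrives as a `datetime`, which may depend on the SDK version. The helper names (`paths_from_adls`, `list_files`, `to_csv`) and the sample entries are hypothetical, and the account URL and credential are left for you to supply.

```python
import csv
import io
import posixpath
from datetime import datetime, timezone
from types import SimpleNamespace


def paths_from_adls(account_url, container, credential):
    """List every path in the container, at any depth (needs the Azure SDK)."""
    # Imported lazily so the rest of the sketch runs without the package.
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(account_url=account_url, credential=credential)
    file_system = service.get_file_system_client(container)
    return file_system.get_paths(recursive=True)


def list_files(paths):
    """Return (file name, last modified date) for each file, skipping folders."""
    rows = []
    for p in paths:
        if p.is_directory:
            continue  # folders are listed too, but only files are wanted
        # Assumption: last_modified is a datetime object.
        rows.append((posixpath.basename(p.name), p.last_modified.date().isoformat()))
    return rows


def to_csv(rows):
    """Render the rows as CSV text with the header used in the question."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["filename", "lastmodifieddate"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical entries standing in for what get_paths would return.
sample = [
    SimpleNamespace(name="business1/2020/20/01", is_directory=True,
                    last_modified=datetime(2020, 12, 20, tzinfo=timezone.utc)),
    SimpleNamespace(name="business1/2020/20/01/business_file1.csv", is_directory=False,
                    last_modified=datetime(2020, 12, 20, tzinfo=timezone.utc)),
    SimpleNamespace(name="business3/business_file3.csv", is_directory=False,
                    last_modified=datetime(2021, 12, 12, tzinfo=timezone.utc)),
]

print(to_csv(list_files(sample)), end="")
```

Because the listing is already recursive, the folder depth no longer matters: against a real account you would pass `paths_from_adls(...)` to `list_files` instead of `sample`.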
