ADF: copy last modified blob

Tal Zadok 21 Reputation points
2021-04-08T13:41:32.26+00:00

Hi,
I'd like to copy the last modified blob from an Azure container using the copy activity (to Azure Data Explorer, but that is not relevant to the question :)).
Note: it is possible that N>1 blobs were added since the last pipeline run, but I am only interested in the last modified one.
How can I achieve this?
I was thinking about one of the 2 directions below:
1 - Is it possible to configure copy activity to retrieve last modified in "Source" linked service?
2 - If using "Get Metadata" activity that outputs blob name & modification date, how can I configure Filter activity to filter by modification date and output blob name?

Other suggestions are welcome.

Thanks,
Tal

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. KranthiPakala-MSFT 46,422 Reputation points Microsoft Employee
    2021-04-09T07:32:48.503+00:00

    Hi @Tal Zadok ,

    Thanks for reaching out. You will need a Get Metadata activity to get the list of files, then loop through each file to get its name and last modified date, and store those values with two Set Variable activities (one for the file's modified date, the other for the file name, which the actual Copy activity then uses to process the file).

    To do this, first declare two variables. varReferenceDateTime gets the default value 1900-01-01 00:00:00; each file's modified date is compared against it, and if the file's date is greater, the file's modified date is assigned to this variable. varLatestFileName starts empty; whenever the condition passes, the file name is assigned to it inside the If Condition activity. The ForEach activity iterates through all the files, and after the last iteration those two variables hold the last modified date and the name of the latest file, which the Copy activity outside the ForEach then uses.
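    The variable logic described above is a running maximum over the file list. Here is a minimal Python sketch of it, not ADF code: the `files` list is a made-up stand-in for what the Get Metadata activities return, and `parse_utc` and `find_latest` are hypothetical helper names used only for illustration.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for the Get Metadata output (name + lastModified per blob).
files = [
    {"itemName": "a.csv", "lastModified": "2021-04-07T10:00:00Z"},
    {"itemName": "b.csv", "lastModified": "2021-04-08T09:30:00Z"},
    {"itemName": "c.csv", "lastModified": "2021-04-06T23:15:00Z"},
]


def parse_utc(ts: str) -> datetime:
    """Parse an ISO-like UTC timestamp such as 2021-04-08T09:30:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def find_latest(files):
    # Mirrors the two pipeline variables and their default values.
    var_reference_date_time = datetime(1900, 1, 1, tzinfo=timezone.utc)
    var_latest_file_name = ""
    # Sequential loop, like a ForEach with the Sequential box checked.
    for f in files:
        modified = parse_utc(f["lastModified"])
        # Same role as the If Condition: is this file newer than the best so far?
        if modified > var_reference_date_time:
            var_reference_date_time = modified      # first Set Variable
            var_latest_file_name = f["itemName"]    # second Set Variable
    return var_latest_file_name, var_reference_date_time


latest_name, latest_modified = find_latest(files)
print(latest_name)  # b.csv
```

    With an empty file list the function returns the defaults unchanged (an empty name and 1900-01-01), which is the case the pipeline would also need to handle before running the Copy activity.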

    86146-image.png

    1. Declare variables -> varReferenceDateTime = 1900-01-01 00:00:00 and varLatestFileName (left empty)
    2. getListOfFileNames -> Get Metadata activity with the Child items argument, which returns the list of file names
    3. loopThroughAllTheFiles -> ForEach activity to loop through each file -> items = @activity('getListOfFileNames').output.childItems; make sure the Sequential box is checked (parallel iterations would overwrite the variables in an unpredictable order)
    4. Inside the ForEach -> getLastModifiedDateOfTheCurrentIterationFile -> Get Metadata activity that returns the current file's name and modified date (use the Item name and Last modified arguments)
    5. conditionToCheckIfFileDateGreaterThanSetDate -> If Condition activity that checks whether the file's modified date is greater than varReferenceDateTime. Here is the condition: @greater(ticks(activity('getLastModifiedDateOfTheCurrentIterationFile').output.lastModified),ticks(formatDateTime(variables('varReferenceDateTime'))))
    6. If the condition passes -> setFileLastModifiedDate -> Set Variable activity that stores the last modified value of the current file: varReferenceDateTime = @activity('getLastModifiedDateOfTheCurrentIterationFile').output.lastModified
    7. Next, another Set Variable activity stores the current file name -> setLatestFileName -> varLatestFileName = @activity('getLastModifiedDateOfTheCurrentIterationFile').output.itemName
    8. Once all the ForEach iterations are completed, the two variables hold the latest file's name and last modified date
    9. Then, outside the ForEach, add a subsequent Copy activity -> copyLatestFileToDestination. In the source settings, pass the value of varLatestFileName to the file name field of your dataset.
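
    The condition in step 5 compares two timestamps by converting both to ticks. As far as I understand the documented behavior, ticks() counts 100-nanosecond intervals since 0001-01-01 00:00:00, so a larger value means a later timestamp. Here is a rough Python sketch of that comparison; the `ticks` function below is my own approximation of the ADF function, and the sample timestamps are made up.

```python
from datetime import datetime

TICKS_PER_SECOND = 10_000_000  # one tick = 100 nanoseconds


def ticks(ts: datetime) -> int:
    """Approximate ADF ticks(): 100-ns intervals since 0001-01-01 00:00:00."""
    delta = ts - datetime(1, 1, 1)
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


# Default value of varReferenceDateTime and a sample file timestamp (made up).
reference = datetime(1900, 1, 1)
last_modified = datetime(2021, 4, 8, 13, 41, 32)

# Equivalent of @greater(ticks(lastModified), ticks(varReferenceDateTime)).
is_newer = ticks(last_modified) > ticks(reference)
print(is_newer)  # True
```

    Because any real blob was modified long after 1900-01-01, the first file in the loop always passes the condition, which is why that default works as a starting point.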

    Here is the demonstration GIF:

    86149-getlastmodifiedfile.gif

    Hope this helps. Do let us know if you have any further questions.
