
Vinay5-0499 asked MartinJaffer-MSFT commented

How to process only the failed files in a ForEach activity when the master pipeline is retriggered

Hello,



There are two pipelines, Master and Child.
The child pipeline contains a ForEach activity that takes files as input and processes them in parallel.
For instance, suppose there are 4 files: files 1 and 2 are processed successfully and their data is loaded into a table, file 3 fails, and file 4 succeeds. When I retrigger the Master pipeline, I want only the 3rd file to be processed, not all 4 files.
How can we achieve this?

Could someone please assist?


Thank you.

azure-data-factory

@KranthiPakala-MSFT Could you please assist?


Hello @Vinay5-0499 and welcome to Microsoft Q&A.

I can't think of a way to restrict which files are processed when rerunning a failed pipeline.

However, I can think of ways to record and persist a list of the failed files so they can be run again later.

The difference here is whether we are re-running a failed pipeline run, or running a new separate pipeline run.

Is this an acceptable alternative?


Actually, I had another thought, though I am not 100% sure it would work when rerunning failed runs.

Suppose that when the copy activity fails, it is followed by another activity that records the name of the failed file.

Also suppose that at the beginning of the pipeline we have a Lookup that searches for records of failed files (mentioned above). If any are found, it then tries copying those. If none are found, the pipeline proceeds as usual, copying everything it usually would.

So, in this way, the first try finds no failed records, so it does the normal copy. When a copy fails, it writes down the failure.
On the second try, it finds the recorded failures and copies those instead of doing the normal copy.

For this to work, the pipeline would need to run from the beginning, not resume from the middle.

I have not done a detailed study on what gets persisted during re-runs, that is why I am uncertain.
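
To make the idea above concrete, here is a minimal sketch of the control flow in plain Python. It is not Data Factory pipeline JSON and does not call any Data Factory API: `run_pipeline`, `failed_store`, and `process` are hypothetical names standing in for a full pipeline run, the persisted record of failed files (for example a table), and the copy activity. It also loops sequentially, whereas the real ForEach runs in parallel, and it assumes a file's record is cleared once that file succeeds.

```python
from typing import Callable, Iterable, List, Set


def run_pipeline(
    all_files: Iterable[str],
    failed_store: Set[str],
    process: Callable[[str], None],
) -> List[str]:
    """Simulate one full run (from the beginning); return the files attempted."""
    # "Lookup" step: if earlier failures were recorded, retry only those.
    # Otherwise fall back to the normal list of files.
    to_process = sorted(failed_store) if failed_store else list(all_files)

    for name in to_process:
        try:
            process(name)  # stand-in for the copy activity
        except Exception:
            # "On failure" step: record the file name for the next run.
            failed_store.add(name)
        else:
            # Assumption: a successful file is removed from the record.
            failed_store.discard(name)

    return to_process
```

With four files where only the third fails, the first call attempts all four and leaves the third in `failed_store`; a second call (after the cause of the failure is fixed) attempts only the third and empties the record, so a later call goes back to the normal full list.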


I have not heard back from you, @Vinay5-0499. If you found your own solution, could you please share it here with the community?


0 Answers