question

Pm-3513 asked MartinJaffer-MSFT commented

How to Use Wildcard in Exists Data Flow Activity

My objective is to use the 'Exists' data flow activity to check whether the data I'm processing already exists in a directory in Azure Data Lake Storage. The issue is that I need to access data within subdirectories. In the past, I've used a double wildcard (**) to reach data in all subdirectories, but it doesn't seem to be working in this case.

All of my images will be provided below. I've provided a screenshot of my current data flow, source activity, and exists activity, and the error I receive.

My top directory is 2021. Subdirectories include months and days, where all data is stored. In my source activity, I placed the double wildcard, but when I previewed the data, I received the error you see below, and it appears as though it is not seeing any of the data in the subdirectories.

If anyone has any thoughts/ideas on this, it would be much appreciated. Thank you.



98300-pipeline.jpg

98309-exists.jpg

98285-source.jpg

98250-error.jpg


Tags: azure-data-factory, azure-synapse-analytics, azure-data-lake-storage

Note: The screenshot of my data flow shows that I'm using SQL as a datasource; this is a different screenshot. In reality I'm using a dataset from ADLS as my datasource.


1 Answer

OmarCSiado answered MartinJaffer-MSFT commented

Hi,

I think the following link could give you some insight into your requirement:
https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage#source-transformation

Please take a look at the examples:
Wildcard examples:

  • * Represents any set of characters.

  • ** Represents recursive directory nesting.

  • ? Replaces one character.

  • [] Matches one of the characters listed in the brackets.

  • /data/sales/**/*.csv Gets all .csv files under /data/sales.

  • /data/sales/20??/**/ Gets all files under folders whose four-character names start with 20, such as the years 2000 to 2099.

  • /data/sales/*/*/*.csv Gets .csv files two levels under /data/sales.

  • /data/sales/2004/*/12/[XY]1?.csv Gets all .csv files in December 2004 whose names start with X or Y, followed by 1 and one more character (for example X12.csv).
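
To make the difference between * and ** concrete, here is a rough Python sketch that translates a wildcard path into a regular expression and tests it against a few paths. This is only an approximation for illustration, not how Data Factory evaluates wildcards: the function name wildcard_to_regex, the sample paths, and the .csv extension are all made up, and it assumes that **/ can also match zero folder levels. Character ranges inside [] and patterns that end in a folder slash are not handled.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard path into a regex (illustrative approximation only)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            # '**/' spans zero or more whole folder levels (assumption)
            parts.append("(?:[^/]+/)*")
            i += 3
        elif pattern.startswith("**", i):
            # a bare '**' matches anything, including slashes
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            # '*' matches any characters within a single path segment
            parts.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            # '?' matches exactly one character within a segment
            parts.append("[^/]")
            i += 1
        elif pattern[i] == "[" and "]" in pattern[i:]:
            # '[XY]' matches one of the listed characters (ranges not supported)
            end = pattern.index("]", i)
            parts.append("[" + re.escape(pattern[i + 1:end]) + "]")
            i = end + 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


# Hypothetical paths that mimic a year/month/day folder layout
paths = [
    "2021/03/14/data.csv",
    "2021/04/01/data.csv",
    "2021/summary.csv",
    "2020/12/31/data.csv",
]

recursive = wildcard_to_regex("2021/**/*.csv")
single_level = wildcard_to_regex("2021/*.csv")

for p in paths:
    print(p, bool(recursive.match(p)), bool(single_level.match(p)))
```

In this sketch the pattern with ** reaches the files in the month and day folders, while the pattern with a single * only sees files directly under 2021.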


Please let us know if the suggestion works for you.

Regards,


This worked perfectly. Thank you for your input!


Thank you for helping, @OmarCSiado. Please keep up the good work!
