question

ShivendooKumar-8455 avatar image
0 Votes"
ShivendooKumar-8455 asked SaurabhSharma-msft commented

While Creating External Table in Synapse Can I use wild card in Location?

Hi All,

Working on a project where I have to bring data from ADLS to Synapse. I am planning to use Polybase so that I can create external tables based on CSV files from ADLS and avoid physical data movement to save time and money.

On ADLS, I may have multiple files with the same name but with a date suffix like Sales_01_Oct_2021.CSV, Sales_02_Oct_2021.CSV etc. These files are having one-day data in them. I want to create an external table called External.Sales and this table should have data from all the files starting with Sales. Is this possible..?

https://towardsdatascience.com/loading-csv-data-into-azure-synapse-analytics-by-using-polybase-5ae942ce3059
-- Create a temp table to hold the imported data
CREATE EXTERNAL TABLE dbo.FIPSLOOKUP_EXT (
UID INT NOT NULL,
iso2 VARCHAR(2) NULL,
iso3 VARCHAR(3) NULL,
code3 INT NULL,
FIPS INT NULL,
Admin2 VARCHAR(255) NULL,
provincestate VARCHAR(255) NULL,
countryregion VARCHAR(255) NULL,
latitude DECIMAL(12,9) NULL,
longitude DECIMAL(12,9) NULL,
combined_key VARCHAR(255) NULL,
population INT NULL
)
WITH (
LOCATION='../Sales_',*
DATA_SOURCE=AzureStorage,
FILE_FORMAT=csvFile
);


azure-synapse-analytics
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

1 Answer

ShivendooKumar-8455 avatar image
0 Votes"
ShivendooKumar-8455 answered SaurabhSharma-msft commented

I found the answer here:
https://stackoverflow.com/questions/54543064/create-view-in-polybase

Since they are all targeting the same table, you don't need to use this file-by-file method. Polybase will load a wildcard-like set of files. If you put all your data files of the same type in a folder and set the FOLDER as the location, all the files in the folder will be loaded in parallel. You will find your loads hugely faster … at the moment you're processing files in series, using the method I suggest will process them in parallel as fast as readers are available.

The only extra step I need to do here is to create a separate folder for each table.

· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

@shivendookumar-8455 Good to hear and thanks for sharing your solution to the community.

Thanks
Saurabh

0 Votes 0 ·