Quickstart: Bulk loading with Synapse SQL
Loading data has never been easier when using the Bulk Load wizard in the Synapse Studio. This wizard will guide you through creating a T-SQL script with the COPY statement to bulk load data.
Entry points to the Bulk Load wizard
You can now easily bulk load data using dedicated SQL pools with a simple right-click on the following areas within the Synapse Studio:
- A file or folder from an Azure storage account attached to your workspace
This wizard generates a COPY statement, which uses Azure AD pass-through for authentication. Your Azure AD user must have access to the workspace with at least the Storage Blob Data Contributor Azure role to the ADLS Gen2 Account.
You must have the required permissions to use the COPY statement and Create table permissions if you're creating a new table to load to.
The linked service associated with the ADLS Gen2 Account must have access to the file/folder to load. For example, if the linked service authentication mechanism is Managed Identity, the workspace managed identity must have at least Storage blob reader permission on the storage account.
If VNet is enabled on your workspace, make sure the integrated runtime associated to the ADLS Gen2 Account linked services for the source data and error file location has interactive authoring enabled. Interactive authoring is required for autoschema detection, previewing the source file contents, and browsing ADLS Gen2 storage accounts within the wizard.
Select the storage account and the file or folder you're loading from on the Source storage location panel. The wizard will automatically attempt to detect Parquet files. If the Parquet file type cannot be confirmed, delimited text (CSV) will be used by default.
Select the file format settings including the storage account where you want to write rejected rows (error file). Currently only CSV and Parquet files are supported.
You can select "Preview data" to see how the COPY statement will parse the file to help you configure the file format settings. Select "Preview data" every time you change a file format setting to see how the COPY statement will parse the file with the updated setting:
- Previewing the data with multi-character field terminators is not support in the bulk load wizard. The bulk load wizard will preview the data within a single column when a multi-character field terminator is specified.
- Specifying multi-character row terminators is supported in the COPY statement; however, this is not supported in the bulk load wizard where an error willl be thrown.
Select the dedicated SQL pool you're using to load including whether the load will be for an existing table or new table:
Select "Configure column mapping" to make sure you have the appropriate column mapping. Note columns names will automatically be detected if "Infer column names" was enabled. For new tables, configuring the column mapping is critical for updating the target column data types:
Select "Open script" and a T-SQL script will be generated with the COPY statement to load from your data lake: