批量访问 Azure Blob 存储中数据的示例Examples of Bulk Access to Data in Azure Blob Storage

适用对象:是SQL Server 是Azure SQL 数据库 否Azure Synapse Analytics (SQL DW) 否并行数据仓库 APPLIES TO: yesSQL Server yesAzure SQL Database noAzure Synapse Analytics (SQL DW) noParallel Data Warehouse

BULK INSERTOPENROWSET 语句可以直接访问 Azure Blob 存储中的文件。The BULK INSERT and OPENROWSET statements can directly access a file in Azure blob storage. 下面的示例使用 CSV(逗号分隔值)文件(名为 inv-2017-01-19.csv)中的数据,该文件存储在容器(名为 Week3)中,该容器存储在存储帐户(名为 newinvoices)中。The following examples use data from a CSV (comma separated value) file (named inv-2017-01-19.csv), stored in a container (named Week3), stored in a storage account (named newinvoices). 可以使用格式文件的路径,但这些示例不包含该路径。The path to format file can be used, but is not included in these examples.

从 SQL Server 批量访问 Azure Blob 存储至少需要 SQL Server 2017 (14.x)SQL Server 2017 (14.x) CTP 版本 1.1。Bulk access to Azure blob storage from SQL Server, requires at least SQL Server 2017 (14.x)SQL Server 2017 (14.x) CTP 1.1.

重要

blob 上容器和文件的所有路径都是 CASE SENSITIVEAll the paths to the container and to the files on blob are CASE SENSITIVE. 如果不正确,它可能会返回错误,如“无法进行大容量加载。If not correct, it might return error like "Cannot bulk load. 文件 "file.csv" 不存在或你没有文件访问权限。”The file "file.csv" does not exist or you don't have file access rights."

创建凭据Create the credential

下面的所有示例都需要一个引用共享访问签名的数据库范围凭据。All of the examples below require a database scoped credential referencing a shared access signature.

重要

必须借助一个使用 SHARED ACCESS SIGNATURE 标识的数据库范围凭据创建外部数据源。The external data source must be created with a database scoped credential that uses the SHARED ACCESS SIGNATURE identity. 若要为存储帐户创建共享访问签名,请查看 Azure 门户中存储帐户属性页上的“共享访问签名”属性。 To create a shared access signature for your storage account, see the Shared access signature property on the storage account property page, in the Azure portal. 有关共享访问签名的详细信息,请参阅使用共享访问签名 (SAS)For more information on shared access signatures, see Using Shared Access Signatures (SAS). 有关凭据的详细信息,请参阅 CREATE DATABASE SCOPED CREDENTIALFor more information on credentials, see CREATE DATABASE SCOPED CREDENTIAL.

使用必须属于 SHARED ACCESS SIGNATUREIDENTITY 创建数据库范围凭据。Create a database scoped credential using the IDENTITY which must be SHARED ACCESS SIGNATURE. 使用为 blob 存储帐户生成的 SAS 令牌。Use the SAS token generated for the blob storage account. 验证 SAS 令牌是否没有前导 ?、你是否至少对应加载的对象拥有读取权限以及有效期是否有效(所有日期都采用 UTC 时间)。Verify that your SAS token does not have a leading ?, that you have at least read permission on the object that should be loaded, and that the expiration period is valid (all dates are in UTC time).

例如:For example:

CREATE DATABASE SCOPED CREDENTIAL UploadInvoices
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2018-03-28&ss=b&srt=sco&sp=rwdlac&se=2019-08-31T02:25:19Z&st=2019-07-30T18:25:19Z&spr=https&sig=KS51p%2BVnfUtLjMZtUTW1siyuyd2nlx294tL0mnmFsOk%3D';

访问引用 Azure Blob 存储位置的 CSV 文件中的数据Accessing data in a CSV file referencing an Azure blob storage location

下面的示例使用指向名为 MyAzureInvoices 的 Azure 存储帐户的外部数据源。The following example uses an external data source pointing to an Azure storage account, named MyAzureInvoices.

CREATE EXTERNAL DATA SOURCE MyAzureInvoices
    WITH (
        TYPE = BLOB_STORAGE,
        LOCATION = 'https://newinvoices.blob.core.windows.net',
        CREDENTIAL = UploadInvoices
    );

然后,OPENROWSET 语句将容器名称 (week3) 添加到文件说明。Then the OPENROWSET statement adds the container name (week3) to the file description. 该文件命名为 inv-2017-01-19.csvThe file is named inv-2017-01-19.csv.

SELECT * FROM OPENROWSET(
   BULK 'week3/inv-2017-01-19.csv',
   DATA_SOURCE = 'MyAzureInvoices',
   FORMAT = 'CSV',
   FORMATFILE='invoices.fmt',
   FORMATFILE_DATA_SOURCE = 'MyAzureInvoices'
   ) AS DataFile;   

使用 BULK INSERT 时,请使用容器和文件说明:Using BULK INSERT, use the container and file description:

BULK INSERT Colors2
FROM 'week3/inv-2017-01-19.csv'
WITH (DATA_SOURCE = 'MyAzureInvoices',
      FORMAT = 'CSV');

访问引用 Azure Blob 存储位置中某个容器的 CSV 文件中的数据Accessing data in a CSV file referencing a container in an Azure blob storage location

下面的示例使用指向 Azure 存储帐户中某个容器(名为 week3)的外部数据源。The following example uses an external data source pointing to a container (named week3) in an Azure storage account.

CREATE EXTERNAL DATA SOURCE MyAzureInvoicesContainer
    WITH (
        TYPE = BLOB_STORAGE,
        LOCATION = 'https://newinvoices.blob.core.windows.net/week3',
        CREDENTIAL = UploadInvoices
    );

OPENROWSET 语句不会在文件说明中包含该容器名称:Then the OPENROWSET statement does not include the container name in the file description:

SELECT * FROM OPENROWSET(
   BULK 'inv-2017-01-19.csv',
   DATA_SOURCE = 'MyAzureInvoicesContainer',
   FORMAT = 'CSV',
   FORMATFILE='invoices.fmt',
   FORMATFILE_DATA_SOURCE = 'MyAzureInvoices'
   ) AS DataFile;

使用 BULK INSERT 时,请不要在文件说明中使用该容器名称:Using BULK INSERT, do not use the container name in the file description:

BULK INSERT Colors2
FROM 'inv-2017-01-19.csv'
WITH (DATA_SOURCE = 'MyAzureInvoicesContainer',
      FORMAT = 'CSV');

另请参阅See Also