CREATE EXTERNAL DATA SOURCE (Transact-SQL)

Creates an external data source for querying using SQL Server, SQL Database, Azure Synapse Analytics, or Analytics Platform System (Parallel Data Warehouse or PDW).

This article provides the syntax, arguments, remarks, permissions, and examples for whichever SQL product you choose.

For more information about the syntax conventions, see Transact-SQL Syntax Conventions.

Select a product

In the following row, select the product name you're interested in, and only that product's information is displayed.

* SQL Server *

 

Overview: SQL Server

Creates an external data source for PolyBase queries. External data sources are used to establish connectivity and support these primary use cases:

  • Data virtualization and data load using PolyBase
  • Bulk load operations using BULK INSERT or OPENROWSET

Applies to: SQL Server 2016 (13.x) and later

Syntax

CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:<port>]' ]
    [ [ , ] CONNECTION_OPTIONS = '<name_value_pairs>']
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] PUSHDOWN = { ON | OFF } ]
    [ [ , ] TYPE = { HADOOP | BLOB_STORAGE } ]
    [ [ , ] RESOURCE_MANAGER_LOCATION = '<resource_manager>[:<port>]' ] )
[ ; ]

Arguments

data_source_name

Specifies the user-defined name for the data source. The name must be unique within the database in SQL Server.

LOCATION = '<prefix>://<path[:port]>'

Provides the connectivity protocol and path to the external data source.

External Data Source | Location prefix | Location path | Supported locations by product / service
Cloudera or Hortonworks | hdfs | <Namenode>[:port] | Starting with SQL Server 2016 (13.x)
Azure Storage account (V2) | wasb[s] | <container>@<storage_account>.blob.core.windows.net | Starting with SQL Server 2016 (13.x); Hierarchical Namespace not supported
SQL Server | sqlserver | <server_name>[\<instance_name>][:port] | Starting with SQL Server 2019 (15.x)
Oracle | oracle | <server_name>[:port] | Starting with SQL Server 2019 (15.x)
Teradata | teradata | <server_name>[:port] | Starting with SQL Server 2019 (15.x)
MongoDB or CosmosDB | mongodb | <server_name>[:port] | Starting with SQL Server 2019 (15.x)
ODBC | odbc | <server_name>[:port] | Starting with SQL Server 2019 (15.x), Windows only
Bulk Operations | https | <storage_account>.blob.core.windows.net/<container> | Starting with SQL Server 2017 (14.x)
Edge Hub | edgehub | Not applicable | EdgeHub is always local to the instance of Azure SQL Edge, so no path or port value is needed. Only available in Azure SQL Edge.
Kafka | kafka | <Kafka IP Address>[:port] | Only available in Azure SQL Edge.

Location path:

  • <Namenode> = the machine name, name service URI, or IP address of the Namenode in the Hadoop cluster. PolyBase must resolve any DNS names used by the Hadoop cluster.
  • port = the port that the external data source is listening on. In Hadoop, the port can be found using the fs.defaultFS configuration parameter. The default is 8020.
  • <container> = the container of the storage account holding the data. Root containers are read-only, so data can't be written back to the container.
  • <storage_account> = the storage account name of the Azure resource.
  • <server_name> = the host name.
  • <instance_name> = the name of the SQL Server named instance. Used if you have the SQL Server Browser service running on the target instance.

Additional notes and guidance when setting the location:

  • The SQL Server Database Engine doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source (see the sketch after this list).
  • Use the same external data source for all tables when querying Hadoop to ensure consistent querying semantics.
  • You can use the sqlserver location prefix to connect SQL Server 2019 (15.x) to another SQL Server, to Azure SQL Database, or to Azure Synapse Analytics.
  • Specify the Driver={<Name of Driver>} when connecting via ODBC.
  • wasbs is optional but recommended for accessing Azure Storage accounts, because data is sent using a secure TLS/SSL connection.
  • The abfs and abfss APIs are not supported when accessing Azure Storage accounts.
  • The Hierarchical Namespace option for Azure Storage accounts (V2) is not supported. Make sure this option remains disabled.
  • To ensure successful PolyBase queries during a Hadoop Namenode fail-over, consider using a virtual IP address for the Namenode of the Hadoop cluster. If you don't, run an ALTER EXTERNAL DATA SOURCE command to point to the new location.
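
As a quick validation sketch, the following creates an external table over a Hadoop external data source. It assumes the MyHadoopCluster data source defined in the examples later in this article; the file format name, folder path, table name, and columns are hypothetical.

-- Hypothetical delimited-text file format and external table, used only to confirm connectivity.
CREATE EXTERNAL FILE FORMAT TextFileFormat
WITH
  ( FORMAT_TYPE = DELIMITEDTEXT ,
    FORMAT_OPTIONS ( FIELD_TERMINATOR = '|' )
  ) ;

CREATE EXTERNAL TABLE dbo.SensorReadings
( SensorId INT ,
  Reading FLOAT )
WITH
  ( LOCATION = '/sensor-data/' ,
    DATA_SOURCE = MyHadoopCluster ,
    FILE_FORMAT = TextFileFormat
  ) ;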

CONNECTION_OPTIONS = key_value_pair

Specifies additional options when connecting over ODBC to an external data source. To use multiple connection options, separate them with a semicolon.

The name of the driver is required as a minimum, but other options such as APP='<your_application_name>' or ApplicationIntent=ReadOnly|ReadWrite are also useful to set and can assist with troubleshooting.

Refer to the ODBC product documentation for a list of permitted CONNECTION_OPTIONS.
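
For illustration, a generic ODBC data source might be declared as in the sketch below. The host, port, driver string, and credential name are placeholders; the exact driver name depends on the ODBC driver installed on the server.

-- Hypothetical ODBC data source; replace the driver string with the driver installed on the machine.
CREATE EXTERNAL DATA SOURCE MyGenericOdbcSource
WITH
  ( LOCATION = 'odbc://remote.example.com:4444' ,
    CONNECTION_OPTIONS = 'Driver={Some ODBC Driver};APP=PolyBaseReports' ,
    CREDENTIAL = OdbcCredential ,
    PUSHDOWN = ON
  ) ;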

PUSHDOWN = ON | OFF

States whether computation can be pushed down to the external data source. It is on by default.

PUSHDOWN is supported when connecting to SQL Server, Oracle, Teradata, MongoDB, or ODBC at the external data source level.

Enabling or disabling push-down at the query level is achieved through a hint.
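
As a sketch, assuming a hypothetical external table named dbo.RemoteOrders, the EXTERNALPUSHDOWN query hints override the data source setting for a single statement:

-- Force push-down for this query only (dbo.RemoteOrders is a hypothetical external table).
SELECT COUNT(*) FROM dbo.RemoteOrders OPTION ( FORCE EXTERNALPUSHDOWN ) ;

-- Disable push-down for this query only.
SELECT COUNT(*) FROM dbo.RemoteOrders OPTION ( DISABLE EXTERNALPUSHDOWN ) ;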

CREDENTIAL = credential_name

Specifies a database-scoped credential for authenticating to the external data source.

Additional notes and guidance when creating a credential:

  • CREDENTIAL is only required if the data has been secured. CREDENTIAL isn't required for data sets that allow anonymous access.
  • When TYPE = BLOB_STORAGE, the credential must be created using SHARED ACCESS SIGNATURE as the identity. Furthermore, the SAS token should be configured as follows:
    • Exclude the leading ? when configured as the secret
    • Have at least read permission on the file that should be loaded (for example, srt=o&sp=r)
    • Use a valid expiration period (all dates are in UTC time).

For an example of using a CREDENTIAL with SHARED ACCESS SIGNATURE and TYPE = BLOB_STORAGE, see Create an external data source to execute bulk operations and retrieve data from Azure Storage into SQL Database.

To create a database scoped credential, see CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL).

TYPE = [ HADOOP | BLOB_STORAGE ]

Specifies the type of the external data source being configured. This parameter isn't always required.

  • Use HADOOP when the external data source is Cloudera, Hortonworks, or an Azure Storage account.
  • Use BLOB_STORAGE when executing bulk operations from an Azure Storage account using BULK INSERT or OPENROWSET with SQL Server 2017 (14.x).

Important

Do not set TYPE if using any other external data source.

For an example of using TYPE = HADOOP to load data from an Azure Storage account, see Create external data source to access data in Azure Storage using the wasb:// interface.

RESOURCE_MANAGER_LOCATION = 'ResourceManager_URI[:port]'

Configure this optional value when connecting to Hortonworks or Cloudera.

When the RESOURCE_MANAGER_LOCATION is defined, the query optimizer makes a cost-based decision to improve performance. A MapReduce job can be used to push down the computation to Hadoop. Specifying the RESOURCE_MANAGER_LOCATION can significantly reduce the volume of data transferred between Hadoop and SQL Server, which can lead to improved query performance.

If the Resource Manager isn't specified, pushing compute to Hadoop is disabled for PolyBase queries.

If the port isn't specified, the default value is chosen using the current setting for the 'hadoop connectivity' configuration; a sketch of checking this setting with sp_configure follows the table below.

Hadoop Connectivity | Default Resource Manager Port
1 | 50300
2 | 50300
3 | 8021
4 | 8032
5 | 8050
6 | 8032
7 | 8050

For a complete list of supported Hadoop versions, see PolyBase Connectivity Configuration (Transact-SQL).
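
A minimal sketch of inspecting or changing the 'hadoop connectivity' setting with sp_configure is shown below; the value 7 is only an example, and a restart of SQL Server (and the PolyBase services) is typically required for the change to take effect.

-- Set 'hadoop connectivity' to an example value, then reconfigure.
EXEC sp_configure @configname = 'hadoop connectivity', @configvalue = 7 ;
RECONFIGURE ;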

Important

The RESOURCE_MANAGER_LOCATION value is not validated when you create the external data source. Entering an incorrect value may cause a query failure at execution time whenever push-down is attempted, because the provided value cannot be resolved.

Create external data source to reference Hadoop with push-down enabled provides a concrete example and further guidance.

Permissions

Requires CONTROL permission on the database in SQL Server.

Locking

Takes a shared lock on the EXTERNAL DATA SOURCE object.

Security

PolyBase supports proxy-based authentication for most external data sources. Create a database scoped credential to create the proxy account.

When you connect to the storage or data pool in a SQL Server big data cluster, the user's credentials are passed through to the back-end system. Create logins in the data pool itself to enable pass-through authentication.

Currently, a SAS token with type HADOOP is unsupported. It's only supported with a storage account access key. Attempting to create an external data source with type HADOOP and a SAS credential fails with the following error:

Msg 105019, Level 16, State 1 - EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect. Java exception message: Parameters provided to connect to the Azure storage account are not valid.: Error [Parameters provided to connect to the Azure storage account are not valid.] occurred while accessing external file.'

Examples (Starting with SQL Server 2016 (13.x))

Important

For information on how to install and enable PolyBase, see Install PolyBase on Windows.

A. Create external data source in SQL Server 2019 to reference Oracle

To create an external data source that references Oracle, ensure that you have a database scoped credential. You may optionally also enable or disable push-down of computation against this data source.

-- Create a database master key if one does not already exist, using your own password. This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '!MyC0mpl3xP@ssw0rd!' ;

-- Create a database scoped credential with the Oracle username and password as the identity and secret.
CREATE DATABASE SCOPED CREDENTIAL OracleProxyAccount
WITH
     IDENTITY = 'oracle_username',
     SECRET = 'oracle_password' ;

CREATE EXTERNAL DATA SOURCE MyOracleServer
WITH
  ( LOCATION = 'oracle://145.145.145.145:1521',
    CREDENTIAL = OracleProxyAccount,
    PUSHDOWN = ON
  ) ;

For additional examples for other data sources such as MongoDB, see Configure PolyBase to access external data in MongoDB.

B. Create external data source to reference Hadoop

To create an external data source to reference your Hortonworks or Cloudera Hadoop cluster, specify the machine name or IP address of the Hadoop Namenode and the port.

CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH
  ( LOCATION = 'hdfs://10.10.10.10:8050' ,
    TYPE = HADOOP
  ) ;

C. Create external data source to reference Hadoop with push-down enabled

Specify the RESOURCE_MANAGER_LOCATION option to enable push-down computation to Hadoop for PolyBase queries. Once enabled, PolyBase makes a cost-based decision to determine whether the query computation should be pushed to Hadoop.

CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH
  ( LOCATION = 'hdfs://10.10.10.10:8020' ,
    TYPE = HADOOP ,
    RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
  ) ;

D. Create external data source to reference Kerberos-secured Hadoop

To verify whether the Hadoop cluster is Kerberos-secured, check the value of the hadoop.security.authentication property in Hadoop core-site.xml. To reference a Kerberos-secured Hadoop cluster, you must specify a database scoped credential that contains your Kerberos username and password. The database master key is used to encrypt the database scoped credential secret.

-- Create a database master key if one does not already exist, using your own password. This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'S0me!nfo' ;

-- Create a database scoped credential with Kerberos user name and password.
CREATE DATABASE SCOPED CREDENTIAL HadoopUser1
WITH
     IDENTITY = '<hadoop_user_name>',
     SECRET = '<hadoop_password>' ;

-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH
  ( LOCATION = 'hdfs://10.10.10.10:8050' ,
    CREDENTIAL = HadoopUser1 ,
    TYPE = HADOOP ,
    RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
  );

E. Create external data source to access data in Azure Storage using the wasb:// interface

In this example, the external data source is an Azure V2 Storage account named logs. The container is called daily. The Azure Storage external data source is for data transfer only; it doesn't support predicate push-down. Hierarchical namespaces are not supported when accessing data via the wasb:// interface.

This example shows how to create the database scoped credential for authentication to an Azure V2 Storage account. Specify the Azure Storage account key in the database credential secret. You can specify any string in the database scoped credential identity, because it isn't used during authentication to Azure Storage.

-- Create a database master key if one does not already exist, using your own password. This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'S0me!nfo' ;

-- Create a database scoped credential with Azure storage account key as the secret.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH
  IDENTITY = '<my_account>' ,
  SECRET = '<azure_storage_account_key>' ;

-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH
  ( LOCATION = 'wasbs://daily@logs.blob.core.windows.net/' ,
    CREDENTIAL = AzureStorageCredential ,
    TYPE = HADOOP
  ) ;

F. Create external data source to reference a SQL Server named instance via PolyBase connectivity (SQL Server 2019 (15.x))

To create an external data source that references a named instance of SQL Server, you can use CONNECTION_OPTIONS to specify the instance name. In the example below, WINSQL2019 is the host name and SQL2019 is the instance name.

CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
  LOCATION = 'sqlserver://WINSQL2019' ,
  CONNECTION_OPTIONS = 'Server=%s\SQL2019' ,
  CREDENTIAL = SQLServerCredentials
) ;

Alternatively, you can use a port to connect to a SQL Server instance.

CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
  LOCATION = 'sqlserver://WINSQL2019:58137' ,
  CREDENTIAL = SQLServerCredentials
) ;

G. Create external data source to reference Kafka

In this example, the external data source is a Kafka server with IP address xxx.xxx.xxx.xxx, listening on port 1900. The Kafka external data source is only for data streaming and does not support predicate push-down.

-- Create an External Data Source for Kafka
CREATE EXTERNAL DATA SOURCE MyKafkaServer WITH (
    LOCATION = 'kafka://xxx.xxx.xxx.xxx:1900'
)
go

H. Create external data source to reference EdgeHub

In this example, the external data source is an EdgeHub running on the same edge device as Azure SQL Edge. The EdgeHub external data source is only for data streaming and does not support predicate push-down.

-- Create an External Data Source for EdgeHub
CREATE EXTERNAL DATA SOURCE MyEdgeHub WITH (
    LOCATION = 'edgehub://'
)
go

Examples: Bulk Operations

Important

Do not add a trailing /, file name, or shared access signature parameters at the end of the LOCATION URL when configuring an external data source for bulk operations.

I. Create an external data source for bulk operations retrieving data from Azure Storage

Applies to: SQL Server 2017 (14.x). Use the following data source for bulk operations using BULK INSERT or OPENROWSET. The credential must set SHARED ACCESS SIGNATURE as the identity, mustn't have the leading ? in the SAS token, must have at least read permission on the file that should be loaded (for example, srt=o&sp=r), and the expiration period should be valid (all dates are in UTC time). For more information on shared access signatures, see Using Shared Access Signatures (SAS).

CREATE DATABASE SCOPED CREDENTIAL AccessAzureInvoices
WITH
  IDENTITY = 'SHARED ACCESS SIGNATURE',
  -- Remove ? from the beginning of the SAS token
  SECRET = '******srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z***************' ;

CREATE EXTERNAL DATA SOURCE MyAzureInvoices
WITH
  ( LOCATION = 'https://newinvoices.blob.core.windows.net/week3' ,
    CREDENTIAL = AccessAzureInvoices ,
    TYPE = BLOB_STORAGE
  ) ;

To see this example in use, see the BULK INSERT example.
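
For illustration, a load through this data source might look like the following sketch. The target table and the blob name inside the week3 container are hypothetical.

-- Hypothetical target table and blob name; DATA_SOURCE points at the external data source defined above.
BULK INSERT Sales.Invoices
FROM 'inv-2017-12-08.csv'
WITH ( DATA_SOURCE = 'MyAzureInvoices' ,
       FORMAT = 'CSV' ) ;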

See Also

* SQL Database *

 

Overview: Azure SQL Database

Creates an external data source for elastic queries. External data sources are used to establish connectivity and support these primary use cases:

  • Bulk load operations using BULK INSERT or OPENROWSET
  • Query remote SQL Database or Azure Synapse instances using SQL Database with elastic query
  • Query a sharded Azure SQL Database using elastic query

Syntax

CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:<port>]' ]
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] TYPE = { BLOB_STORAGE | RDBMS | SHARD_MAP_MANAGER } ]
    [ [ , ] DATABASE_NAME = '<database_name>' ]
    [ [ , ] SHARD_MAP_NAME = '<shard_map_manager>' ] )
[ ; ]

Arguments

data_source_name

Specifies the user-defined name for the data source. The name must be unique within the database in SQL Database.

LOCATION = '<prefix>://<path[:port]>'

Provides the connectivity protocol and path to the external data source.

External Data Source | Location prefix | Location path
Bulk Operations | https | <storage_account>.blob.core.windows.net/<container>
Elastic Query (shard) | Not required | <shard_map_server_name>.database.windows.net
Elastic Query (remote) | Not required | <remote_server_name>.database.windows.net

Location path:

  • <shard_map_server_name> = The logical server name in Azure that hosts the shard map manager. The DATABASE_NAME argument provides the database used to host the shard map, and SHARD_MAP_NAME is used for the shard map itself.
  • <remote_server_name> = The target logical server name for the elastic query. The database name is specified using the DATABASE_NAME argument.

Additional notes and guidance when setting the location:

  • The Database Engine doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source.

CREDENTIAL = credential_name

Specifies a database-scoped credential for authenticating to the external data source.

Additional notes and guidance when creating a credential:

  • To load data from Azure Storage into Azure SQL Database, use a Shared Access Signature (SAS token).
  • CREDENTIAL is only required if the data has been secured. CREDENTIAL isn't required for data sets that allow anonymous access.
  • When TYPE = BLOB_STORAGE, the credential must be created using SHARED ACCESS SIGNATURE as the identity. Furthermore, the SAS token should be configured as follows:
    • Exclude the leading ? when configured as the secret
    • Have at least read permission on the file that should be loaded (for example, srt=o&sp=r)
    • Use a valid expiration period (all dates are in UTC time).

For an example of using a CREDENTIAL with SHARED ACCESS SIGNATURE and TYPE = BLOB_STORAGE, see Create an external data source to execute bulk operations and retrieve data from Azure Storage into SQL Database.

To create a database scoped credential, see CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL).

TYPE = [ BLOB_STORAGE | RDBMS | SHARD_MAP_MANAGER ]

Specifies the type of the external data source being configured. This parameter isn't always required.

  • Use RDBMS for cross-database queries using elastic query from SQL Database.
  • Use SHARD_MAP_MANAGER when creating an external data source for connecting to a sharded SQL Database.
  • Use BLOB_STORAGE when executing bulk operations with BULK INSERT or OPENROWSET.

Important

Do not set TYPE if using any other external data source.

DATABASE_NAME = database_name

Configure this argument when TYPE is set to RDBMS or SHARD_MAP_MANAGER.

TYPE | Value of DATABASE_NAME
RDBMS | The name of the remote database on the server provided using LOCATION
SHARD_MAP_MANAGER | Name of the database operating as the shard map manager

For an example showing how to create an external data source where TYPE = RDBMS, refer to Create an RDBMS external data source.

SHARD_MAP_NAME = shard_map_name

Used only when the TYPE argument is set to SHARD_MAP_MANAGER, to set the name of the shard map.

For an example showing how to create an external data source where TYPE = SHARD_MAP_MANAGER, refer to Create a shard map manager external data source.

Permissions

Requires CONTROL permission on the database in Azure SQL Database.

Locking

Takes a shared lock on the EXTERNAL DATA SOURCE object.

Examples:

A. Create a shard map manager external data source

To create an external data source to reference a SHARD_MAP_MANAGER, specify the SQL Database server name that hosts the shard map manager in SQL Database or a SQL Server database on a virtual machine.

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>' ;

CREATE DATABASE SCOPED CREDENTIAL ElasticDBQueryCred
WITH
  IDENTITY = '<username>',
  SECRET = '<password>' ;

CREATE EXTERNAL DATA SOURCE MyElasticDBQueryDataSrc
WITH
  ( TYPE = SHARD_MAP_MANAGER ,
    LOCATION = '<server_name>.database.windows.net' ,
    DATABASE_NAME = 'ElasticScaleStarterKit_ShardMapManagerDb' ,
    CREDENTIAL = ElasticDBQueryCred ,
    SHARD_MAP_NAME = 'CustomerIDShardMap'
  ) ;

For a step-by-step tutorial, see Getting started with elastic queries for sharding (horizontal partitioning).
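
To give a sense of how the data source is consumed, the sketch below declares an external table over the sharded data. The table and column names are hypothetical; depending on how the data is laid out across shards, the elastic query DISTRIBUTION option can also be SHARDED(<column>) or REPLICATED.

-- Hypothetical external table over data spread across the shards.
CREATE EXTERNAL TABLE [dbo].[Customers]
( [CustomerId] INT NOT NULL ,
  [Name] NVARCHAR(256) NOT NULL )
WITH
  ( DATA_SOURCE = MyElasticDBQueryDataSrc ,
    DISTRIBUTION = ROUND_ROBIN
  ) ;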

B. Create an RDBMS external data source

To create an external data source to reference an RDBMS, specify the SQL Database server name of the remote database in SQL Database.

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>' ;

CREATE DATABASE SCOPED CREDENTIAL SQL_Credential
WITH
  IDENTITY = '<username>' ,
  SECRET = '<password>' ;

CREATE EXTERNAL DATA SOURCE MyElasticDBQueryDataSrc
WITH
  ( TYPE = RDBMS ,
    LOCATION = '<server_name>.database.windows.net' ,
    DATABASE_NAME = 'Customers' ,
    CREDENTIAL = SQL_Credential
  ) ;

For a step-by-step tutorial on RDBMS, see Getting started with cross-database queries (vertical partitioning).
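
As with the sharded case, a hypothetical external table can then expose a table from the remote Customers database; the table and column names below are placeholders.

-- Hypothetical external table mapped to a table in the remote Customers database.
CREATE EXTERNAL TABLE [dbo].[CustomerInformation]
( [CustomerID] INT NOT NULL ,
  [CustomerName] VARCHAR(50) NOT NULL ,
  [Company] VARCHAR(50) NOT NULL )
WITH
  ( DATA_SOURCE = MyElasticDBQueryDataSrc ) ;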

Examples: Bulk Operations

Important

Do not add a trailing /, file name, or shared access signature parameters at the end of the LOCATION URL when configuring an external data source for bulk operations.

C. Create an external data source for bulk operations retrieving data from Azure Storage

Use the following data source for bulk operations using BULK INSERT or OPENROWSET. The credential must set SHARED ACCESS SIGNATURE as the identity, mustn't have the leading ? in the SAS token, must have at least read permission on the file that should be loaded (for example, srt=o&sp=r), and the expiration period should be valid (all dates are in UTC time). For more information on shared access signatures, see Using Shared Access Signatures (SAS).

CREATE DATABASE SCOPED CREDENTIAL AccessAzureInvoices
WITH
  IDENTITY = 'SHARED ACCESS SIGNATURE',
  -- Remove ? from the beginning of the SAS token
  SECRET = '******srt=sco&sp=rwac&se=2017-02-01T00:55:34Z&st=2016-12-29T16:55:34Z***************' ;

CREATE EXTERNAL DATA SOURCE MyAzureInvoices
WITH
  ( LOCATION = 'https://newinvoices.blob.core.windows.net/week3' ,
    CREDENTIAL = AccessAzureInvoices ,
    TYPE = BLOB_STORAGE
  ) ;

To see this example in use, see BULK INSERT.
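
A minimal sketch of reading through this data source with OPENROWSET is shown below; the blob name is hypothetical, and SINGLE_CLOB simply returns the file contents as a single value.

-- Hypothetical blob name inside the week3 container defined above.
SELECT * FROM OPENROWSET(
   BULK 'inv-2017-12-08.csv',
   DATA_SOURCE = 'MyAzureInvoices',
   SINGLE_CLOB) AS DataFile ;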

See Also

* Azure Synapse Analytics *

 

Overview: Azure Synapse Analytics

Creates an external data source for PolyBase. External data sources are used to establish connectivity and support the following primary use case: data virtualization and data load using PolyBase.

Important

To create an external data source to query a SQL Analytics resource using Azure SQL Database with elastic query, see SQL Database.

Syntax

CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:<port>]' ]
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] TYPE = HADOOP ]
  )
[ ; ]

Arguments

data_source_name

Specifies the user-defined name for the data source. The name must be unique within the database in Azure Synapse Analytics.

LOCATION = '<prefix>://<path[:port]>'

Provides the connectivity protocol and path to the external data source.

External Data Source | Location prefix | Location path
Azure Data Lake Store Gen 1 | adl | <storage_account>.azuredatalake.net
Azure Data Lake Store Gen 2 | abfs[s] | <container>@<storage_account>.dfs.core.windows.net
Azure V2 Storage account | wasb[s] | <container>@<storage_account>.blob.core.windows.net

Location path:

  • <container> = the container of the storage account holding the data. Root containers are read-only, so data can't be written back to the container.
  • <storage_account> = the storage account name of the Azure resource.

Additional notes and guidance when setting the location:

  • The default option when provisioning Azure Data Lake Storage Gen 2 is to enable secure SSL connections. When this is enabled, you must use abfss when a secure TLS/SSL connection is selected. Note that abfss works for unsecure TLS connections as well.
  • Azure Synapse doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source.
  • Use the same external data source for all tables when querying Hadoop to ensure consistent querying semantics.
  • wasbs is recommended because data will be sent using a secure TLS connection.
  • Hierarchical Namespaces aren't supported with Azure V2 Storage accounts when accessing data via PolyBase using the wasb:// interface.

CREDENTIAL = credential_name

Specifies a database-scoped credential for authenticating to the external data source.

Additional notes and guidance when creating a credential:

  • To load data from Azure Storage or Azure Data Lake Store (ADLS) Gen 2 into Azure Synapse Analytics, use an Azure Storage Key.
  • CREDENTIAL is only required if the data has been secured. CREDENTIAL isn't required for data sets that allow anonymous access.

To create a database scoped credential, see CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL).

TYPE = HADOOP

Specifies the type of the external data source being configured. This parameter isn't always required.

  • Use HADOOP when the external data source is Azure Storage, ADLS Gen 1, or ADLS Gen 2.

For an example of using TYPE = HADOOP to load data from Azure Storage, see Create external data source to reference Azure Data Lake Store Gen 1 or 2 using a service principal.

Permissions

Requires CONTROL permission on the database.

Locking

Takes a shared lock on the EXTERNAL DATA SOURCE object.

Security

PolyBase supports proxy-based authentication for most external data sources. Create a database scoped credential to create the proxy account.

When you connect to the storage or data pool in a SQL Server big data cluster, the user's credentials are passed through to the back-end system. Create logins in the data pool itself to enable pass-through authentication.

Currently, a SAS token with type HADOOP is unsupported. It's only supported with a storage account access key. Attempting to create an external data source with type HADOOP and a SAS credential fails with the following error:

Msg 105019, Level 16, State 1 - EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect. Java exception message: Parameters provided to connect to the Azure storage account are not valid.: Error [Parameters provided to connect to the Azure storage account are not valid.] occurred while accessing external file.'

Examples:

A. Create external data source to access data in Azure Storage using the wasb:// interface

In this example, the external data source is an Azure V2 Storage account named logs. The container is called daily. The Azure Storage external data source is for data transfer only; it doesn't support predicate push-down. Hierarchical namespaces are not supported when accessing data via the wasb:// interface.

This example shows how to create the database scoped credential for authentication to Azure Storage. Specify the Azure Storage account key in the database credential secret. You can specify any string in the database scoped credential identity, because it isn't used during authentication to Azure Storage.

-- Create a database master key if one does not already exist, using your own password. This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'S0me!nfo' ;

-- Create a database scoped credential with Azure storage account key as the secret.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH
  IDENTITY = '<my_account>',
  SECRET = '<azure_storage_account_key>' ;

-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH
  ( LOCATION = 'wasbs://daily@logs.blob.core.windows.net/' ,
    CREDENTIAL = AzureStorageCredential ,
    TYPE = HADOOP
  ) ;

B. Create external data source to reference Azure Data Lake Store Gen 1 or 2 using a service principal

Azure Data Lake Store connectivity can be based on your ADLS URI and your Azure Active Directory application's service principal. Documentation for creating this application can be found at Data Lake Store authentication using Azure Active Directory.

-- If you do not have a Master Key on your DW you will need to create one.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>' ;

-- These values come from your Azure Active Directory Application used to authenticate to ADLS
CREATE DATABASE SCOPED CREDENTIAL ADLS_credential
WITH
  -- IDENTITY = '<clientID>@<OAuth2.0TokenEndPoint>' ,
  IDENTITY = '536540b4-4239-45fe-b9a3-629f97591c0c@https://login.microsoftonline.com/42f988bf-85f1-41af-91ab-2d2cd011da47/oauth2/token' ,
  -- SECRET = '<KEY>'
  SECRET = 'BjdIlmtKp4Fpyh9hIvr8HJlUida/seM5kQ3EpLAmeDI=' 
;

-- For Gen 1 - Create an external data source
-- TYPE: HADOOP - PolyBase uses Hadoop APIs to access data in Azure Data Lake Storage.
-- LOCATION: Provide Data Lake Storage Gen 1 account name and URI
-- CREDENTIAL: Provide the credential created in the previous step
CREATE EXTERNAL DATA SOURCE AzureDataLakeStore
WITH
  ( LOCATION = 'adl://newyorktaxidataset.azuredatalakestore.net' ,
    CREDENTIAL = ADLS_credential ,
    TYPE = HADOOP
  ) ;

-- For Gen 2 - Create an external data source
-- TYPE: HADOOP - PolyBase uses Hadoop APIs to access data in Azure Data Lake Storage.
-- LOCATION: Provide Data Lake Storage Gen 2 account name and URI
-- CREDENTIAL: Provide the credential created in the previous step
CREATE EXTERNAL DATA SOURCE AzureDataLakeStore
WITH
  -- Please note the abfss endpoint when your account has secure transfer enabled
  ( LOCATION = 'abfss://data@newyorktaxidataset.dfs.core.windows.net' , 
    CREDENTIAL = ADLS_credential ,
    TYPE = HADOOP
  ) ;

C. Create external data source to reference Azure Data Lake Store Gen 2 using the storage account key

-- If you do not have a Master Key on your DW you will need to create one.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>' ;

CREATE DATABASE SCOPED CREDENTIAL ADLS_credential
WITH
-- IDENTITY = '<storage_account_name>' ,
  IDENTITY = 'newyorktaxidata' ,
-- SECRET = '<storage_account_key>'
  SECRET = 'yz5N4+bxSb89McdiysJAzo+9hgEHcJRJuXbF/uC3mhbezES/oe00vXnZEl14U0lN3vxrFKsphKov16C0w6aiTQ=='
;

-- Note this example uses a Gen 2 secured endpoint (abfss)
CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( LOCATION = 'abfss://2013@newyorktaxidataset.dfs.core.windows.net' ,
    CREDENTIAL = ADLS_credential ,
    TYPE = HADOOP
  ) ;

D. Create external data source to reference PolyBase connectivity to Azure Data Lake Store Gen 2 using abfs://

There is no need to specify SECRET when connecting to an Azure Data Lake Store Gen 2 account with the Managed Identity mechanism.

-- If you do not have a Master Key on your DW you will need to create one
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>' ;

--Create database scoped credential with **IDENTITY = 'Managed Service Identity'**

CREATE DATABASE SCOPED CREDENTIAL msi_cred 
WITH IDENTITY = 'Managed Service Identity' ;

--Create external data source with abfss:// scheme for connecting to your Azure Data Lake Store Gen2 account

CREATE EXTERNAL DATA SOURCE ext_datasource_with_abfss 
WITH 
  ( TYPE = HADOOP , 
    LOCATION = 'abfss://myfile@mystorageaccount.dfs.core.windows.net' , 
    CREDENTIAL = msi_cred
  ) ;

See Also

* Analytics Platform System (PDW) *

 

Overview: Analytics Platform System

Creates an external data source for PolyBase queries. External data sources are used to establish connectivity and support the following use case: data virtualization and data load using PolyBase.

Syntax

CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:<port>]' ]
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] TYPE = HADOOP ]
    [ [ , ] RESOURCE_MANAGER_LOCATION = '<resource_manager>[:<port>]' ] )
[ ; ]

Arguments

data_source_name

Specifies the user-defined name for the data source. The name must be unique within the server in Parallel Data Warehouse.

LOCATION = '<prefix>://<path[:port]>'

Provides the connectivity protocol and path to the external data source.

External Data Source | Location prefix | Location path
Cloudera or Hortonworks | hdfs | <Namenode>[:port]
Azure Storage account | wasb[s] | <container>@<storage_account>.blob.core.windows.net

Location path:

  • <Namenode> = the machine name, name service URI, or IP address of the Namenode in the Hadoop cluster. PolyBase must resolve any DNS names used by the Hadoop cluster.
  • port = the port that the external data source is listening on. In Hadoop, the port can be found using the fs.defaultFS configuration parameter. The default is 8020.
  • <container> = the container of the storage account holding the data. Root containers are read-only, so data can't be written back to the container.
  • <storage_account> = the storage account name of the Azure resource.

Additional notes and guidance when setting the location:

  • The PDW engine doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source.
  • Use the same external data source for all tables when querying Hadoop to ensure consistent querying semantics.
  • wasbs is recommended because data will be sent using a secure TLS connection.
  • Hierarchical Namespaces are not supported when used with Azure Storage accounts over wasb://.
  • To ensure successful PolyBase queries during a Hadoop Namenode fail-over, consider using a virtual IP address for the Namenode of the Hadoop cluster. If you don't, run an ALTER EXTERNAL DATA SOURCE command to point to the new location.

CREDENTIAL = credential_name

Specifies a database-scoped credential for authenticating to the external data source.

Additional notes and guidance when creating a credential:

  • To load data from Azure Storage into Azure Synapse or PDW, use an Azure Storage Key.
  • CREDENTIAL is only required if the data has been secured. CREDENTIAL isn't required for data sets that allow anonymous access.

TYPE = [ HADOOP ]

Specifies the type of the external data source being configured. This parameter isn't always required.

  • Use HADOOP when the external data source is Cloudera, Hortonworks, or Azure Storage.

For an example of using TYPE = HADOOP to load data from Azure Storage, see Create external data source to reference Hadoop.

RESOURCE_MANAGER_LOCATION = 'ResourceManager_URI[:port]'

Configure this optional value when connecting to Hortonworks or Cloudera.

When the RESOURCE_MANAGER_LOCATION is defined, the query optimizer makes a cost-based decision to improve performance. A MapReduce job can be used to push down the computation to Hadoop. Specifying the RESOURCE_MANAGER_LOCATION can significantly reduce the volume of data transferred between Hadoop and SQL, which can lead to improved query performance.

If the Resource Manager isn't specified, pushing compute to Hadoop is disabled for PolyBase queries.

If the port isn't specified, the default value is chosen using the current setting for the 'hadoop connectivity' configuration.

Hadoop Connectivity | Default Resource Manager Port
1 | 50300
2 | 50300
3 | 8021
4 | 8032
5 | 8050
6 | 8032
7 | 8050

For a complete list of supported Hadoop versions, see PolyBase Connectivity Configuration (Transact-SQL).
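
As in the SQL Server section above, a sketch of adjusting the 'hadoop connectivity' setting is shown below; this assumes the appliance exposes the option through sp_configure as described in PolyBase Connectivity Configuration, and the value 7 is only an example.

-- Set 'hadoop connectivity' to an example value, then reconfigure.
EXEC sp_configure @configname = 'hadoop connectivity', @configvalue = 7 ;
RECONFIGURE ;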

Important

The RESOURCE_MANAGER_LOCATION value is not validated when you create the external data source. Entering an incorrect value may cause a query failure at execution time whenever push-down is attempted, because the provided value cannot be resolved.

Create external data source to reference Hadoop with push-down enabled provides a concrete example and further guidance.

Permissions

Requires CONTROL permission on the database in Parallel Data Warehouse.

Note

In previous releases of PDW, creating an external data source required ALTER ANY EXTERNAL DATA SOURCE permissions.

Locking

Takes a shared lock on the EXTERNAL DATA SOURCE object.

Security

PolyBase supports proxy-based authentication for most external data sources. Create a database scoped credential to create the proxy account.

Currently, a SAS token with type HADOOP is unsupported. It's only supported with a storage account access key. Attempting to create an external data source with type HADOOP and a SAS credential fails with the following error:

Msg 105019, Level 16, State 1 - EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect. Java exception message: Parameters provided to connect to the Azure storage account are not valid.: Error [Parameters provided to connect to the Azure storage account are not valid.] occurred while accessing external file.'

Examples:

A. Create external data source to reference Hadoop

To create an external data source to reference your Hortonworks or Cloudera Hadoop cluster, specify the machine name or IP address of the Hadoop Namenode and the port.

CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH
  ( LOCATION = 'hdfs://10.10.10.10:8050' ,
    TYPE = HADOOP
  ) ;

B. Create external data source to reference Hadoop with push-down enabled

Specify the RESOURCE_MANAGER_LOCATION option to enable push-down computation to Hadoop for PolyBase queries. Once enabled, PolyBase makes a cost-based decision to determine whether the query computation should be pushed to Hadoop.

CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH
  ( LOCATION = 'hdfs://10.10.10.10:8020' ,
    TYPE = HADOOP ,
    RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
  ) ;

C. Create external data source to reference Kerberos-secured Hadoop

To verify whether the Hadoop cluster is Kerberos-secured, check the value of the hadoop.security.authentication property in Hadoop core-site.xml. To reference a Kerberos-secured Hadoop cluster, you must specify a database scoped credential that contains your Kerberos username and password. The database master key is used to encrypt the database scoped credential secret.

-- Create a database master key if one does not already exist, using your own password. This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'S0me!nfo' ;

-- Create a database scoped credential with Kerberos user name and password.
CREATE DATABASE SCOPED CREDENTIAL HadoopUser1
WITH
  IDENTITY = '<hadoop_user_name>' ,
  SECRET = '<hadoop_password>' ;

-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH
  ( LOCATION = 'hdfs://10.10.10.10:8050' ,
    CREDENTIAL = HadoopUser1 ,
    TYPE = HADOOP ,
    RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
  ) ;

D. Create external data source to access data in Azure Storage using the wasb:// interface

In this example, the external data source is an Azure V2 Storage account named logs. The container is called daily. The Azure Storage external data source is for data transfer only; it doesn't support predicate push-down. Hierarchical namespaces are not supported when accessing data via the wasb:// interface.

This example shows how to create the database scoped credential for authentication to Azure Storage. Specify the Azure Storage account key in the database credential secret. You can specify any string in the database scoped credential identity, because it isn't used during authentication to Azure Storage.

-- Create a database master key if one does not already exist, using your own password. This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'S0me!nfo' ;

-- Create a database scoped credential with Azure storage account key as the secret.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH
  IDENTITY = '<my_account>' ,
  SECRET = '<azure_storage_account_key>' ;

-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH
  ( LOCATION = 'wasbs://daily@logs.blob.core.windows.net/' ,
    CREDENTIAL = AzureStorageCredential ,
    TYPE = HADOOP
  ) ;

See Also