合併至 (Azure Databricks 上的 Delta Lake)Merge Into (Delta Lake on Azure Databricks)
根據來源資料表將一組更新、插入和刪除合併至目標 Delta 資料表。Merge a set of updates, insertions, and deletions based on a source table into a target Delta table.
MERGE INTO [db_name.]target_table [AS target_alias]
USING [db_name.]source_table [<time_travel_version>] [AS source_alias]
ON <merge_condition>
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ]
wherewhere
<matched_action> =
DELETE |
UPDATE SET * |
UPDATE SET column1 = value1 [, column2 = value2 ...]
<not_matched_action> =
INSERT * |
INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])
<time_travel_version> =
TIMESTAMP AS OF timestamp_expression |
VERSION AS OF version
- 在 Databricks Runtime 5.5 LTS 和6.x 中,最
MERGE
多可以有2個WHEN MATCHED
子句以及最多1個WHEN NOT MATCHED
子句。In Databricks Runtime 5.5 LTS and 6.x,MERGE
can have at most 2WHEN MATCHED
clauses and at most 1WHEN NOT MATCHED
clause. WHEN MATCHED
當來源資料列符合以比對條件為基礎的目標資料表資料列時,就會執行子句。WHEN MATCHED
clauses are executed when a source row matches a target table row based on the match condition. 這些子句具有下列語義。These clauses have the following semantics.WHEN MATCHED
子句最多可有UPDATE
一項DELETE
動作。WHEN MATCHED
clauses can have at most onUPDATE
and oneDELETE
action.UPDATE
中的動作merge
只會更新相符目標資料列的指定資料行。TheUPDATE
action inmerge
only updates the specified columns of the matched target row. 此DELETE
動作將會刪除相符的資料列。TheDELETE
action will delete the matched row.- 每個
WHEN MATCHED
子句都可以有選擇性的條件。EachWHEN MATCHED
clause can have an optional condition. 如果這個子句條件存在,則UPDATE
DELETE
只有當子句條件為 true 時,才會針對任何相符的來源目標資料列組資料列執行或動作。If this clause condition exists, theUPDATE
orDELETE
action is executed for any matching source-target row pair row only when when the clause condition is true. - 如果有多個
WHEN MATCHED
子句,則會依指定的順序來評估它們 (也就是,子句的順序) 。If there are multipleWHEN MATCHED
clauses, then they are evaluated in order they are specified (that is, the order of the clauses matter).WHEN MATCHED
除了最後一個子句以外,所有子句都必須有條件。AllWHEN MATCHED
clauses, except the last one, must have conditions. - 如果兩個
WHEN MATCHED
子句都有條件,而且兩個條件都不符合相符的來源目標資料列組,則相符的目標資料列將保持不變。If bothWHEN MATCHED
clauses have conditions and neither of the conditions are true for a matching source-target row pair, then the matched target row is left unchanged. - 若要使用源資料集的對應資料行來更新目標 Delta 資料表的所有資料行,請使用
UPDATE SET *
。To update all the columns of the target Delta table with the corresponding columns of the source dataset, useUPDATE SET *
. 這相當於UPDATE SET col1 = source.col1 [, col2 = source.col2 ...]
目標 Delta 資料表的所有資料行。This is equivalent toUPDATE SET col1 = source.col1 [, col2 = source.col2 ...]
for all the columns of the target Delta table. 因此,此動作會假設來源資料表的資料行與目標資料表中的資料行相同,否則查詢將會擲回分析錯誤。Therefore, this action assumes that the source table has the same columns as those in the target table, otherwise the query will throw an analysis error.- 啟用自動架構遷移時,會變更這個行為。This behavior changes when automatic schema migration is enabled. 如需詳細資料,請參閱 自動架構演進 。See Automatic schema evolution for details.
WHEN NOT MATCHED
當來源資料列不符合任何以相符條件為基礎的目標資料列時,就會執行子句。WHEN NOT MATCHED
clauses are executed when a source row does not match any target row based on the match condition. 這些子句具有下列語義。These clauses have the following semantics.WHEN NOT MATCHED
子句只能有INSERT
動作。WHEN NOT MATCHED
clauses can only have theINSERT
action. 新的資料列會根據指定的資料行和對應的運算式產生。The new row is generated based on the specified column and corresponding expressions. 不需要指定目標資料表中的所有資料行。All the columns in the target table do not need to be specified. 針對未指定的目標資料行,NULL
將會插入。For unspecified target columns,NULL
will be inserted.注意
在 Databricks Runtime 6.5 和以下的,您必須在目標資料表中提供動作的所有資料行
INSERT
。In Databricks Runtime 6.5 and below, you must provide all columns in the target table for theINSERT
action.每個
WHEN NOT MATCHED
子句都可以有選擇性的條件。EachWHEN NOT MATCHED
clause can have an optional condition. 如果有子句條件,則只有在該資料列的條件為 true 時,才會插入來源資料列。If the clause condition is present, a source row is inserted only if that condition is true for that row. 否則會忽略來源資料行。Otherwise, the source column is ignored.如果有多個
WHEN NOT MATCHED
子句,則會依指定的順序來評估它們 (也就是,子句的順序) 。If there are multipleWHEN NOT MATCHED
clauses, then they are evaluated in order they are specified (that is, the order of the clauses matter).WHEN NOT MATCHED
除了最後一個子句以外,所有子句都必須有條件。AllWHEN NOT MATCHED
clauses, except the last one, must have conditions.若要使用源資料集的對應資料行插入目標 Delta 資料表的所有資料行,請使用
INSERT *
。To insert all the columns of the target Delta table with the corresponding columns of the source dataset, useINSERT *
. 這相當於INSERT (col1 [, col2 ...]) VALUES (source.col1 [, source.col2 ...])
目標 Delta 資料表的所有資料行。This is equivalent toINSERT (col1 [, col2 ...]) VALUES (source.col1 [, source.col2 ...])
for all the columns of the target Delta table. 因此,此動作會假設來源資料表的資料行與目標資料表中的資料行相同,否則查詢將會擲回分析錯誤。Therefore, this action assumes that the source table has the same columns as those in the target table, otherwise the query will throw an analysis error.注意
啟用自動架構遷移時,會變更這個行為。This behavior changes when automatic schema migration is enabled. 如需詳細資料,請參閱 自動架構演進 。See Automatic schema evolution for details.
重要
MERGE
如果源資料集的多個資料列相符,並嘗試更新目標 Delta 資料表的相同資料列,則作業可能會失敗。A MERGE
operation can fail if multiple rows of the source dataset match and attempt to update the same rows of the target Delta table. 根據合併的 SQL 語法,這類更新作業是不明確的,因為不清楚應該使用哪一個來源資料列來更新相符的目標資料列。According to the SQL semantics of merge, such an update operation is ambiguous as it is unclear which source row should be used to update the matched target row. 您可以前置處理來源資料表,以消除多個相符專案的可能性。You can preprocess the source table to eliminate the possibility of multiple matches. 請參閱 變更資料捕獲範例-它會將變更資料集前置 (也就是源資料集) 在將變更套用至目標 Delta 資料表之前,只保留每個索引鍵的最新變更。See the Change data capture example—it preprocesses the change dataset (that is, the source dataset) to retain only the latest change for each key before applying that change into the target Delta table.
範例Examples
您可以使用 MERGE
來執行複雜的作業,例如刪除重復資料資料、更新插入變更資料、套用 SCD 類型2作業。You can use MERGE
for complex operations such as deduplicating data, upserting change data, applying SCD Type 2 operations. 如需一些範例,請參閱 合併範例 。See Merge examples for a few examples.