合併至 (Azure Databricks 上的 Delta Lake)Merge Into (Delta Lake on Azure Databricks)

根據來源資料表將一組更新、插入和刪除合併至目標 Delta 資料表。Merge a set of updates, insertions, and deletions based on a source table into a target Delta table.

MERGE INTO [db_name.]target_table [AS target_alias]
USING [db_name.]source_table [<time_travel_version>] [AS source_alias]
ON <merge_condition>
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN NOT MATCHED [ AND <condition> ]  THEN <not_matched_action> ]

wherewhere

<matched_action>  =
  DELETE  |
  UPDATE SET *  |
  UPDATE SET column1 = value1 [, column2 = value2 ...]

<not_matched_action>  =
  INSERT *  |
  INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...])

<time_travel_version>  =
  TIMESTAMP AS OF timestamp_expression |
  VERSION AS OF version
  • 在 Databricks Runtime 5.5 LTS 和6.x 中,最 MERGE 多可以有2個 WHEN MATCHED 子句以及最多1個 WHEN NOT MATCHED 子句。In Databricks Runtime 5.5 LTS and 6.x, MERGE can have at most 2 WHEN MATCHED clauses and at most 1 WHEN NOT MATCHED clause.
  • WHEN MATCHED 當來源資料列符合以比對條件為基礎的目標資料表資料列時,就會執行子句。WHEN MATCHED clauses are executed when a source row matches a target table row based on the match condition. 這些子句具有下列語義。These clauses have the following semantics.
    • WHEN MATCHED 子句最多可有 UPDATE 一項 DELETE 動作。WHEN MATCHED clauses can have at most on UPDATE and one DELETE action. UPDATE中的動作 merge 只會更新相符目標資料列的指定資料行。The UPDATE action in merge only updates the specified columns of the matched target row. DELETE 動作將會刪除相符的資料列。The DELETE action will delete the matched row.
    • 每個 WHEN MATCHED 子句都可以有選擇性的條件。Each WHEN MATCHED clause can have an optional condition. 如果這個子句條件存在,則 UPDATE DELETE 只有當子句條件為 true 時,才會針對任何相符的來源目標資料列組資料列執行或動作。If this clause condition exists, the UPDATE or DELETE action is executed for any matching source-target row pair row only when when the clause condition is true.
    • 如果有多個 WHEN MATCHED 子句,則會依指定的順序來評估它們 (也就是,子句的順序) 。If there are multiple WHEN MATCHED clauses, then they are evaluated in order they are specified (that is, the order of the clauses matter). WHEN MATCHED除了最後一個子句以外,所有子句都必須有條件。All WHEN MATCHED clauses, except the last one, must have conditions.
    • 如果兩個 WHEN MATCHED 子句都有條件,而且兩個條件都不符合相符的來源目標資料列組,則相符的目標資料列將保持不變。If both WHEN MATCHED clauses have conditions and neither of the conditions are true for a matching source-target row pair, then the matched target row is left unchanged.
    • 若要使用源資料集的對應資料行來更新目標 Delta 資料表的所有資料行,請使用 UPDATE SET *To update all the columns of the target Delta table with the corresponding columns of the source dataset, use UPDATE SET *. 這相當於 UPDATE SET col1 = source.col1 [, col2 = source.col2 ...] 目標 Delta 資料表的所有資料行。This is equivalent to UPDATE SET col1 = source.col1 [, col2 = source.col2 ...] for all the columns of the target Delta table. 因此,此動作會假設來源資料表的資料行與目標資料表中的資料行相同,否則查詢將會擲回分析錯誤。Therefore, this action assumes that the source table has the same columns as those in the target table, otherwise the query will throw an analysis error.
  • WHEN NOT MATCHED 當來源資料列不符合任何以相符條件為基礎的目標資料列時,就會執行子句。WHEN NOT MATCHED clauses are executed when a source row does not match any target row based on the match condition. 這些子句具有下列語義。These clauses have the following semantics.
    • WHEN NOT MATCHED 子句只能有 INSERT 動作。WHEN NOT MATCHED clauses can only have the INSERT action. 新的資料列會根據指定的資料行和對應的運算式產生。The new row is generated based on the specified column and corresponding expressions. 不需要指定目標資料表中的所有資料行。All the columns in the target table do not need to be specified. 針對未指定的目標資料行, NULL 將會插入。For unspecified target columns, NULL will be inserted.

      注意

      在 Databricks Runtime 6.5 和以下的,您必須在目標資料表中提供動作的所有資料行 INSERTIn Databricks Runtime 6.5 and below, you must provide all columns in the target table for the INSERT action.

    • 每個 WHEN NOT MATCHED 子句都可以有選擇性的條件。Each WHEN NOT MATCHED clause can have an optional condition. 如果有子句條件,則只有在該資料列的條件為 true 時,才會插入來源資料列。If the clause condition is present, a source row is inserted only if that condition is true for that row. 否則會忽略來源資料行。Otherwise, the source column is ignored.

    • 如果有多個 WHEN NOT MATCHED 子句,則會依指定的順序來評估它們 (也就是,子句的順序) 。If there are multiple WHEN NOT MATCHED clauses, then they are evaluated in order they are specified (that is, the order of the clauses matter). WHEN NOT MATCHED除了最後一個子句以外,所有子句都必須有條件。All WHEN NOT MATCHED clauses, except the last one, must have conditions.

    • 若要使用源資料集的對應資料行插入目標 Delta 資料表的所有資料行,請使用 INSERT *To insert all the columns of the target Delta table with the corresponding columns of the source dataset, use INSERT *. 這相當於 INSERT (col1 [, col2 ...]) VALUES (source.col1 [, source.col2 ...]) 目標 Delta 資料表的所有資料行。This is equivalent to INSERT (col1 [, col2 ...]) VALUES (source.col1 [, source.col2 ...]) for all the columns of the target Delta table. 因此,此動作會假設來源資料表的資料行與目標資料表中的資料行相同,否則查詢將會擲回分析錯誤。Therefore, this action assumes that the source table has the same columns as those in the target table, otherwise the query will throw an analysis error.

      注意

      啟用自動架構遷移時,會變更這個行為。This behavior changes when automatic schema migration is enabled. 如需詳細資料,請參閱 自動架構演進See Automatic schema evolution for details.

重要

MERGE如果源資料集的多個資料列相符,並嘗試更新目標 Delta 資料表的相同資料列,則作業可能會失敗。A MERGE operation can fail if multiple rows of the source dataset match and attempt to update the same rows of the target Delta table. 根據合併的 SQL 語法,這類更新作業是不明確的,因為不清楚應該使用哪一個來源資料列來更新相符的目標資料列。According to the SQL semantics of merge, such an update operation is ambiguous as it is unclear which source row should be used to update the matched target row. 您可以前置處理來源資料表,以消除多個相符專案的可能性。You can preprocess the source table to eliminate the possibility of multiple matches. 請參閱 變更資料捕獲範例-它會將變更資料集前置 (也就是源資料集) 在將變更套用至目標 Delta 資料表之前,只保留每個索引鍵的最新變更。See the Change data capture example—it preprocesses the change dataset (that is, the source dataset) to retain only the latest change for each key before applying that change into the target Delta table.

範例Examples

您可以使用 MERGE 來執行複雜的作業,例如刪除重復資料資料、更新插入變更資料、套用 SCD 類型2作業。You can use MERGE for complex operations such as deduplicating data, upserting change data, applying SCD Type 2 operations. 如需一些範例,請參閱 合併範例See Merge examples for a few examples.