Isolation levels

The isolation level of a table defines the degree to which a transaction must be isolated from modifications made by concurrent transactions. Delta Lake on Azure Databricks supports two isolation levels: Serializable and WriteSerializable.

  • Serializable: The strongest isolation level. It ensures that committed write operations and all reads are serializable. Operations are allowed as long as there exists a serial sequence of executing them one at a time that generates the same outcome as that seen in the table. For write operations, the serial sequence is exactly the same as that seen in the table’s history.

  • WriteSerializable (default): A weaker isolation level than Serializable. It ensures only that write operations (that is, not reads) are serializable. However, this is still stronger than snapshot isolation. WriteSerializable is the default isolation level because it provides a good balance of data consistency and availability for the most common operations.

    In this mode, the content of the Delta table may differ from what the sequence of operations in the table history would lead you to expect. This is because this mode allows certain pairs of concurrent writes (say, operations X and Y) to proceed such that the result is as if Y was performed before X (that is, the pair is still serializable), even though the history shows that Y was committed after X. To disallow this reordering, set the table isolation level to Serializable, which causes these transactions to fail.

Read operations always use snapshot isolation. The write isolation level determines whether it is possible for a reader to see a snapshot of a table that, according to the history, “never existed”.

For the Serializable level, a reader always sees only tables that conform to the history. For the WriteSerializable level, a reader could see a table that does not exist in the Delta log.

For example, consider txn1, a long-running delete, and txn2, which inserts data deleted by txn1. txn2 and txn1 complete and are recorded in that order in the history. According to the history, the data inserted in txn2 should not exist in the table. For the Serializable level, a reader would never see data inserted by txn2. However, for the WriteSerializable level, a reader could at some point see the data inserted by txn2.
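To compare what a reader sees against the recorded commit order, you can inspect the table history. The following is a minimal sketch, assuming the table is a Delta table and that <table-name> is a placeholder for its name. DESCRIBE HISTORY is the Delta Lake SQL command that returns one row per commit, most recent first.

```sql
-- List the commits recorded for the table, most recent first.
-- In the example above, txn2's insert would appear at a lower
-- version number than txn1's delete.
DESCRIBE HISTORY <table-name>
```

The output includes version, timestamp, and operation columns, which is enough to see the order in which txn2 and txn1 were committed.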

For more information on which types of operations can conflict with each other in each isolation level and the possible errors, see Concurrency control.

Set the isolation level

You set the isolation level using the ALTER TABLE command.

ALTER TABLE <table-name> SET TBLPROPERTIES ('delta.isolationLevel' = <level-name>)

where <level-name> is Serializable or WriteSerializable, written as a quoted string literal.

For example, to change the isolation level from the default WriteSerializable to Serializable, run:

ALTER TABLE <table-name> SET TBLPROPERTIES ('delta.isolationLevel' = 'Serializable')
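
To confirm that the change took effect, you can read the property back. This is a sketch rather than a required step, and <table-name> is the same placeholder used in the commands above. SHOW TBLPROPERTIES is the standard Spark SQL command for listing table properties, and the optional key in parentheses restricts the output to one property.

```sql
-- Show the current value of the isolation level property only.
SHOW TBLPROPERTIES <table-name> ('delta.isolationLevel')
```

If the property has never been set explicitly, the command may report that the table has no such property. In that case the default, WriteSerializable, applies.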