COPY INTO (Delta Lake on Azure Databricks)

Important

This feature is in Public Preview.

Loads data from a file location into a Delta table. This is a retriable and idempotent operation: files in the source location that have already been loaded are skipped.
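The idempotent behavior described above can be sketched by running the same statement twice. The table name `my_db.events` and the path `/mnt/raw/events` are hypothetical placeholders, and the sketch assumes the target Delta table already exists.

```sql
-- Hypothetical table and source path, for illustration only.
-- First run: loads the JSON files currently present under the source path.
COPY INTO my_db.events
  FROM '/mnt/raw/events'
  FILEFORMAT = JSON;

-- Second run against the same source: files loaded by the first run are
-- skipped, so only files that arrived in between are loaded.
COPY INTO my_db.events
  FROM '/mnt/raw/events'
  FILEFORMAT = JSON;
```

Because already-loaded files are skipped, a failed or interrupted job can simply be retried with the same statement.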

Syntax

COPY INTO table_identifier
  FROM [ file_location | (SELECT identifier_list FROM file_location) ]
  FILEFORMAT = data_source
  [FILES = (file_name, ...) | PATTERN = 'regex_pattern']
  [FORMAT_OPTIONS ('data_source_reader_option' = 'value', ...)]
  [COPY_OPTIONS ('force' = ('false'|'true'))]
  • table_identifier

    • [database_name.] table_name: A table name, optionally qualified with a database name.
    • delta.`<path-to-table>`: The location of an existing Delta table.
  • FROM file_location

    The file location to load the data from. Files in this location must have the format specified in FILEFORMAT.

  • SELECT identifier_list

    Selects the specified columns or expressions from the source data before copying into the Delta table.

  • FILEFORMAT = data_source

    The format of the source files to load. One of CSV, JSON, AVRO, ORC, PARQUET.

  • FILES

    A list of file names to load, with length up to 1000. Cannot be specified with PATTERN.

  • PATTERN

    A regex pattern that identifies the files to load from the source directory. Cannot be specified with FILES.

  • FORMAT_OPTIONS

    Options to be passed to the Apache Spark data source reader for the specified format.

  • COPY_OPTIONS

    Options to control the operation of the COPY INTO command. The only option is 'force'; if set to 'true', idempotency is disabled and files are loaded regardless of whether they have been loaded before.
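As a rough sketch of the FILES and COPY_OPTIONS clauses described above, the two statements below load an explicit list of files and then force a reload. The paths and file names are hypothetical, and the parenthesized forms `FILES = (...)` and `COPY_OPTIONS (...)` are written as I understand Databricks to accept them; check them against your runtime before relying on them.

```sql
-- Hypothetical paths and file names, for illustration only.
-- Load exactly two named files; FILES cannot be combined with PATTERN.
COPY INTO delta.`/mnt/delta/events`
  FROM '/mnt/raw/events'
  FILEFORMAT = PARQUET
  FILES = ('part-0001.parquet', 'part-0002.parquet');

-- Disable idempotency: every file in the source location is loaded again,
-- including files that an earlier COPY INTO already loaded.
COPY INTO delta.`/mnt/delta/events`
  FROM '/mnt/raw/events'
  FILEFORMAT = PARQUET
  COPY_OPTIONS ('force' = 'true');
```

Note that forcing a reload of files that are already in the table will typically produce duplicate rows, so it is best reserved for deliberate backfills.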

Examples

COPY INTO delta.`target_path`
  FROM (SELECT key, index, textData, 'constant_value' FROM 'source_path')
  FILEFORMAT = CSV
  PATTERN = 'folder1/file_[a-g].csv'
  FORMAT_OPTIONS('header' = 'true')
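A variation on the example above, sketched under assumptions: it targets a named table instead of a path, applies expressions in the SELECT list, and passes more than one reader option. The table `my_db.readings` and the path `/mnt/raw/readings` are hypothetical; `header` and `delimiter` are standard Apache Spark CSV reader options.

```sql
-- Hypothetical table and source path, for illustration only.
-- Cast and transform columns while loading semicolon-delimited CSV files.
COPY INTO my_db.readings
  FROM (
    SELECT key,
           CAST(index AS INT) AS index,
           upper(textData) AS textData
    FROM '/mnt/raw/readings'
  )
  FILEFORMAT = CSV
  FORMAT_OPTIONS ('header' = 'true', 'delimiter' = ';');
```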