XML Source

Applies to: SQL Server (all supported versions), SSIS Integration Runtime in Azure Data Factory

The XML source reads an XML data file and populates the columns in the source output with the data.

The data in XML files frequently includes hierarchical relationships. For example, an XML data file can represent catalogs and items in catalogs. Before the data can enter the data flow, the relationships among the elements in the XML data file must be determined, and an output must be generated for each element in the file.

Schemas

The XML source uses a schema to interpret the XML data. The XML source supports use of an XML Schema Definition (XSD) file or inline schemas to translate the XML data into a tabular format. If you configure the XML source by using the XML Source Editor dialog box, the user interface can generate an XSD from the specified XML data file.

Note

DTDs are not supported.

The schemas can support a single namespace only; they do not support schema collections.

Note

The XML source does not validate the data in the XML file against the XSD.

XML Source Editor

The data in XML files frequently includes hierarchical relationships. The XML Source Editor dialog box uses the specified schema to generate the XML source outputs. You can specify an XSD file, use an inline schema, or generate an XSD from the specified XML data file. The schema must be available at design time.

The XML source generates tabular structures from the XML data by creating an output for every element that contains other elements in the XML files. For example, if the XML data represents catalogs and items in catalogs, the XML source creates an output for catalogs and an output for each type of item that the catalogs contain. The output for each item contains output columns for the attributes of that item.

To provide information about the hierarchical relationship of the data in the outputs, the XML source adds a column to the outputs that identifies the parent element for each child element. Using the example of catalogs with different types of items, each item would have a column value that identifies the catalog to which it belongs.
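The parent-identifying column can be pictured with a small Python sketch. This is only an illustration of the idea, not how the XML source is implemented: the sample XML, the `flatten` function, and the `catalog_id` column name are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: catalogs that contain book items.
XML_DATA = """
<catalogs>
  <catalog name="Spring">
    <book title="Gardening" price="12.50"/>
    <book title="Hiking" price="9.99"/>
  </catalog>
  <catalog name="Winter">
    <book title="Skiing" price="15.00"/>
  </catalog>
</catalogs>
"""


def flatten(xml_text):
    """Return (catalog_rows, book_rows); each book row carries its parent's key."""
    root = ET.fromstring(xml_text.strip())
    catalog_rows, book_rows = [], []
    for catalog_id, catalog in enumerate(root.findall("catalog")):
        # One row per parent element, with a generated key (illustrative name).
        catalog_rows.append({"catalog_id": catalog_id, "name": catalog.get("name")})
        for book in catalog.findall("book"):
            # Each child row repeats the parent's key so the hierarchy
            # can be rebuilt downstream with a join.
            book_rows.append({
                "catalog_id": catalog_id,
                "title": book.get("title"),
                "price": book.get("price"),
            })
    return catalog_rows, book_rows


catalog_rows, book_rows = flatten(XML_DATA)
for row in book_rows:
    print(row)
```

Each of the two lists plays the role of one output, and the shared `catalog_id` value is what lets a downstream component join items back to their catalog.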

The XML source creates an output for every element, but you are not required to use all the outputs. You can delete any output that you do not want to use, or simply not connect it to a downstream component.

The XML source also generates the output names, to ensure that the names are unambiguous. These names may be long and may not identify the outputs in a way that is useful to you. You can rename the outputs, as long as their names remain unique. You can also modify the data type and the length of output columns.

For every output, the XML source adds an error output. By default, the columns in error outputs have the Unicode string data type (DT_WSTR) with a length of 255, but you can configure the columns in the error outputs by modifying their data type and length.

If the XML data file contains elements that are not in the XSD, these elements are ignored and no output is generated for them. On the other hand, if the XML data file is missing elements that are represented in the XSD, the output will contain columns with null values.
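The following Python sketch imitates that behavior for a flat list of child elements. It is an analogy only: `SCHEMA_COLUMNS` is a hypothetical stand-in for the elements declared in an XSD, and `project_rows` is an illustrative helper, not part of Integration Services.

```python
import xml.etree.ElementTree as ET

# Hypothetical column list standing in for the elements declared in an XSD.
SCHEMA_COLUMNS = ["title", "author", "price"]

# The first book has an extra <isbn> element and no <author> element.
XML_DATA = """
<books>
  <book><title>Gardening</title><price>12.50</price><isbn>111</isbn></book>
  <book><title>Hiking</title><author>Lee</author><price>9.99</price></book>
</books>
"""


def project_rows(xml_text, columns):
    """Keep only schema-declared elements; absent ones become None (null)."""
    rows = []
    for book in ET.fromstring(xml_text.strip()).findall("book"):
        # findtext returns None when the child element is absent.
        rows.append({col: book.findtext(col) for col in columns})
    return rows


rows = project_rows(XML_DATA, SCHEMA_COLUMNS)
for row in rows:
    print(row)
```

The `isbn` element never reaches the rows because it is not in the column list, while the missing `author` shows up as `None`.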

When the data is extracted from the XML data file, it is converted to an Integration Services data type. However, the XML source cannot convert the XML data to the DT_TIME2 or DT_DBTIMESTAMP2 data types because the source does not support these data types. For more information, see Integration Services Data Types.

The XSD or inline schema may specify the data type for elements, but if it does not, the XML Source Editor dialog box assigns the Unicode string data type (DT_WSTR) to the column in the output that contains the element, and sets the column length to 255 characters.

If the schema specifies the maximum length of an element, the length of the output column is set to this value. If the maximum length is greater than the length supported by the Integration Services data type to which the element is converted, then the data is truncated to the maximum length of the data type. For example, if a string has a length of 5000, it is truncated to 4000 characters because the maximum length of the DT_WSTR data type is 4000 characters; likewise, byte data is truncated to 8000 bytes, the maximum length of the DT_BYTES data type. If the schema specifies no maximum length, the default length of columns with either data type is set to 255. Data truncation in the XML source is handled the same way as truncation in other data flow components. For more information, see Error Handling in Data.
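The length rules above amount to a small piece of arithmetic, sketched here in Python. The function names are hypothetical, and the sketch only restates the limits quoted in the text; it does not model how a truncation is reported or redirected.

```python
DT_WSTR_MAX = 4000    # maximum DT_WSTR length, as stated in the text
DT_BYTES_MAX = 8000   # maximum DT_BYTES length, as stated in the text
DEFAULT_LENGTH = 255  # used when the schema gives no maximum length


def column_length(schema_max_length, type_max):
    """Pick the output column length from the schema's maximum, if any."""
    if schema_max_length is None:
        return DEFAULT_LENGTH
    # A schema maximum beyond the data type's limit is capped at the limit.
    return min(schema_max_length, type_max)


def truncate(value, length):
    """Cut a string or bytes value down to the column length."""
    return value[:length]


length = column_length(5000, DT_WSTR_MAX)
print(length)                              # 4000
print(len(truncate("x" * 5000, length)))   # 4000
print(column_length(None, DT_BYTES_MAX))   # 255
```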

You can modify the data type and the column length. For more information, see Integration Services Data Types.

Configuration of the XML Source

The XML source supports three different data access modes. You can specify the file location of the XML data file, the variable that contains the file location, or the variable that contains the XML data.
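As a rough analogy, the three modes differ only in where the XML text comes from. The Python sketch below uses a plain dictionary as a stand-in for package variables; the `load_xml` function and the `User::Xml` variable name are hypothetical and are not part of any Integration Services API.

```python
from pathlib import Path


def load_xml(mode, value, variables=None):
    """Return XML text according to the selected data access mode.

    `variables` is a plain dict standing in for package variables.
    """
    variables = variables or {}
    if mode == "XML file location":
        # `value` is the path of the XML data file.
        return Path(value).read_text(encoding="utf-8")
    if mode == "XML file from variable":
        # `value` names a variable that holds the path of the XML data file.
        return Path(variables[value]).read_text(encoding="utf-8")
    if mode == "XML data from variable":
        # `value` names a variable that holds the XML data itself.
        return variables[value]
    raise ValueError(f"unknown data access mode: {mode}")


xml_text = load_xml("XML data from variable", "User::Xml",
                    {"User::Xml": "<catalog/>"})
print(xml_text)
```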

The XML source includes the XMLData and XMLSchemaDefinition custom properties, which can be updated by property expressions when the package is loaded. For more information, see Integration Services (SSIS) Expressions, Use Property Expressions in Packages, and XML Source Custom Properties.

The XML source supports multiple regular outputs and multiple error outputs.

SQL Server Integration Services includes the XML Source Editor dialog box for configuring the XML source. This dialog box is available in SSIS Designer.

You can set properties through SSIS Designer or programmatically.

The Advanced Editor dialog box reflects the properties that can be set programmatically. For more information about the properties that you can set in the Advanced Editor dialog box or programmatically, click one of the following topics:

For more information about how to set the properties, click one of the following topics:

XML Source Editor (Connection Manager Page)

Use the Connection Manager page of the XML Source Editor to specify an XML file and the XSD that transforms the XML data.

Static Options

Data access mode
Specify the method for selecting data from the source.

Value                     Description
XML file location         Retrieve data from an XML file.
XML file from variable    Specify the XML file name in a variable. Related information: Use Variables in Packages
XML data from variable    Retrieve XML data from a variable.

Use inline schema
Specify whether the XML source data itself contains the XSD schema that defines and validates its structure and data.

XSD location
Type the path and file name of the XSD schema file, or locate the file by clicking Browse.

Browse
Use the Open dialog box to locate the XSD schema file.

Generate XSD
Use the Save As dialog box to select a location for the auto-generated XSD schema file. The editor infers the schema from the structure of the XML data.

Data Access Mode Dynamic Options

Data access mode = XML file location

XML location
Type the path and file name of the XML data file, or locate the file by clicking Browse.

Browse
Use the Open dialog box to locate the XML data file.

Data access mode = XML file from variable

Variable name
Select the variable that contains the path and file name of the XML file.

Data access mode = XML data from variable

Variable name
Select the variable that contains the XML data.

XML Source Editor (Columns Page)

Use the Columns node of the XML Source Editor dialog box to map an output column to an external (source) column.

Options

Available External Columns
View the list of available external columns in the data source. You cannot use this table to add or delete columns.

External Column
View external (source) columns in the order in which the task will read them. You can change this order by first clearing the selected columns in the table displayed in the editor, and then selecting external columns from the list in a different order.

Output Column
Provide a unique name for each output column. The default is the name of the selected external (source) column; however, you can choose any unique, descriptive name. The name provided is displayed in SSIS Designer.

XML Source Editor (Error Output Page)

Use the Error Output page of the XML Source Editor dialog box to select error handling options and to set properties on error output columns.

Options

Input/Output
View the name of the data source.

Column
View the external (source) columns that you selected on the Connection Manager page of the XML Source Editor dialog box.

Error
Specify what should happen when an error occurs: ignore the failure, redirect the row, or fail the component.

Related Topics: Error Handling in Data

Truncation
Specify what should happen when a truncation occurs: ignore the failure, redirect the row, or fail the component.

Description
View the description of the error.

Set this value to selected cells
Specify what should happen to all the selected cells when an error or truncation occurs: ignore the failure, redirect the row, or fail the component.

Apply
Apply the error handling option to the selected cells.
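The three choices offered for errors and truncations can be pictured with a small Python sketch. This is a conceptual model under stated assumptions, not the data flow engine's implementation: the `Disposition` enum and `convert_rows` function are hypothetical, and representing an ignored failure as a null (`None`) value is a simplification.

```python
from enum import Enum


class Disposition(Enum):
    IGNORE_FAILURE = "ignore the failure"
    REDIRECT_ROW = "redirect the row"
    FAIL_COMPONENT = "fail the component"


def convert_rows(rows, convert, disposition):
    """Apply `convert` to each row, handling failures per `disposition`."""
    output, error_output = [], []
    for row in rows:
        try:
            output.append(convert(row))
        except ValueError as exc:
            if disposition is Disposition.FAIL_COMPONENT:
                raise  # stop processing entirely
            if disposition is Disposition.REDIRECT_ROW:
                # Send the offending row to the error output instead.
                error_output.append((row, str(exc)))
            else:
                # Ignore the failure: keep the row, with a null value.
                output.append(None)
    return output, error_output


good, bad = convert_rows(["1", "x", "3"], int, Disposition.REDIRECT_ROW)
print(good)        # [1, 3]
print(bad[0][0])   # x
```

Redirecting keeps the regular output clean and collects the failing rows separately, which mirrors the reason the XML source pairs every output with an error output.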

Extract Data by Using the XML Source