使用脚本组件创建同步转换Creating a Synchronous Transformation with the Script Component

适用于:Applies to: 是SQL ServerSQL Server(所有支持的版本)yesSQL ServerSQL Server (all supported versions) 是 Azure 数据工厂中的 SSIS Integration RuntimeSSIS Integration Runtime in Azure Data Factoryyes Azure 数据工厂中的 SSIS Integration RuntimeSSIS Integration Runtime in Azure Data Factory适用于:Applies to: 是SQL ServerSQL Server(所有支持的版本)yesSQL ServerSQL Server (all supported versions) 是 Azure 数据工厂中的 SSIS Integration RuntimeSSIS Integration Runtime in Azure Data Factoryyes Azure 数据工厂中的 SSIS Integration RuntimeSSIS Integration Runtime in Azure Data Factory

Integration ServicesIntegration Services 包的数据流中使用转换组件可以在数据从源传递到目标时修改和分析该数据。You use a transformation component in the data flow of an Integration ServicesIntegration Services package to modify and analyze data as it passes from source to destination. 具有同步输出的转换在每个输入行传递给该组件时对该行进行处理。A transformation with synchronous outputs processes each input row as it passes through the component. 具有异步输出的转换在等到接收所有输入行之后才能完成处理。A transformation with asynchronous outputs waits until it has received all input rows to complete its processing. 本主题讨论同步转换。This topic discusses a synchronous transformation. 有关异步转换的信息,请参阅使用脚本组件创建异步转换For information about asynchronous transformations, see Creating an Asynchronous Transformation with the Script Component. 有关同步组件和异步组件之间的差异的详细信息,请参阅了解同步和异步转换For more information about the difference between synchronous and asynchronous components, see Understanding Synchronous and Asynchronous Transformations.

有关脚本组件的概述,请参阅使用脚本组件扩展数据流For an overview of the Script component, see Extending the Data Flow with the Script Component.

脚本组件及其生成的基础结构代码可以大大简化自定义数据流组件的开发过程。The Script component and the infrastructure code that it generates for you simplify significantly the process of developing a custom data flow component. 但是,若要了解脚本组件的工作方式,读取以下步骤很有帮助,这些步骤是在开发自定义数据流组件部分,特别是在开发具有同步输出的自定义转换组件部分开发自定义数据流组件必须遵循的。However, to understand how the Script component works, you may find it useful to read the steps that you must follow in developing a custom data flow component in the section on Developing a Custom Data Flow Component, and especially Developing a Custom Transformation Component with Synchronous Outputs.

开始一个同步转换组件Getting Started with a Synchronous Transformation Component

SSISSSIS 设计器的“数据流”窗格添加脚本组件时,“选择脚本组件类型” 对话框将打开,并提示选择源、目标或转换组件类型。When you add a Script component to the Data Flow pane of SSISSSIS Designer, the Select Script Component Type dialog box opens and prompts you to select a Source, Destination, or Transformation component type. 在此对话框中,选择“转换” 。In this dialog box, select Transformation.

在元数据设计模式下配置同步转换组件Configuring a Synchronous Transformation Component in Metadata-Design Mode

选择创建转换组件的选项后,可使用“脚本转换编辑器” 配置该组件。After you select the option to create a transformation component, you configure the component by using the Script Transformation Editor. 有关详细信息,请参阅在脚本组件编辑器中配置脚本组件For more information, see Configuring the Script Component in the Script Component Editor.

若要设置脚本组件的脚本语言,请在“脚本转换编辑器”的“脚本”页上设置“ScriptLanguage”属性。To set the script language for the Script component, you set the ScriptLanguage property on the Script page of the Script Transformation Editor.

备注

若要设置脚本组件的默认脚本语言,请使用“选项”对话框的“常规”页上的“脚本语言”选项。To set the default scripting language for the Script component, use the Scripting language option on the General page of the Options dialog box. 有关详细信息,请参阅常规页For more information, see eneral Page.

数据流转换组件有一个输入并支持一个或多个输出。A data flow transformation component has one input, and supports one or more outputs. 在写入自定义脚本之前,必须在元数据设计模式下完成的一个步骤是使用“脚本转换编辑器” 配置组件的输入和输出。Configuring the input and outputs for the component is one of the steps that you must complete in metadata design mode, by using the Script Transformation Editor, before you write your custom script.

配置输入列Configuring Input Columns

转换组件具有一个输入。A transformation component has one input.

在“脚本转换编辑器”的“输入列”页中,列列表显示数据流上游组件的输出中可用的列。On the Input Columns page of the Script Transformation Editor, the column list shows the available columns from the output of the upstream component in the data flow. 选择要转换或传递的列。Select the columns that you want to transform or pass through. 将要就地转换的所有列标记为读/写。Mark any columns that you want to transform in place as Read/Write.

有关“脚本转换编辑器”的“输入列”页的详细信息,请参阅脚本转换编辑器(“输入列”页)For more information about the Input Columns page of the Script Transformation Editor, see Script Transformation Editor (Input Columns Page).

配置输入、输出和输出列Configuring Inputs, Outputs, and Output Columns

转换组件支持一个或多个输出。A transformation component supports one or more outputs.

在“脚本转换编辑器”的“输入和输出”页中,可以看到已创建一个输出,但是还没有创建任何输出列。On the Inputs and Outputs page of the Script Transformation Editor, you can see that a single output has been created, but the output has no columns. 在编辑器的这一页,可能需要或希望配置以下各项。On this page of the editor, you may need or want to configure the following items.

  • 创建一个或多个附加输出,如包含意外值的行的模拟错误输出。Create one or more additional outputs, such as a simulated error output for rows that contain unexpected values. 使用“添加输出” 和“删除输出” 按钮以管理同步转换组件的输出。Use the Add Output and Remove Output buttons to manage the outputs of your synchronous transformation component. 所有输入行都定向到所有可用输出,除非您表示希望将每一行重定向到一个输出或其他输出。All input rows are directed to all available outputs unless you indicate that you intend to redirect each row to one output or the other. 可通过为输出上的 ExclusionGroup 属性指定一个非零的整数值来指示希望重定向行。You indicate that you intend to redirect rows by specifying a non-zero integer value for the ExclusionGroup property on the outputs. 在 ExclusionGroup 中输入的用于标识输出的特定整数值并不重要,但是必须对指定的输出组使用同一个整数。The specific integer value entered in ExclusionGroup to identify the outputs is not significant, but you must use the same integer consistently for the specified group of outputs.

    备注

    如果不希望输出所有行,还可以对单个输出使用非零 ExclusionGroup 属性值 。You can also use a non-zero ExclusionGroup property value with a single output when you do not want to output all rows. 但是,在这种情况下,你必须为想要发送给输出的每一行显式调用 DirectRowTo<outputbuffer> 方法。However, in this case, you must explicitly call the DirectRowTo<outputbuffer> method for each row that you want to send to the output.

  • 为输入和输出指定一个更具说明性的名称。Assign a more descriptive name to the input and outputs. 脚本组件可以使用这些名称来生成类型化取值函数属性,这些属性用于在脚本中引用输入和输出。The Script component uses these names to generate the typed accessor properties that you will use to refer to the input and outputs in your script.

  • 对于同步转换,将列保持原样。Leave columns as is for synchronous transformations. 通常,同步转换不会向数据流添加列。Typically a synchronous transformation does not add columns to the data flow. 会在缓冲区中就地修改数据,然后将缓冲区传递到数据流的下一个组件中。Data is modified in place in the buffer, and the buffer is passed on to the next component in the data flow. 如果是这种情况,您不需要对转换的输出显式添加和配置输出列。If this is the case, you do not have to add and configure output columns explicitly on the transformation's outputs. 输出显示在编辑器中,并且不包含任何显式定义的列。The outputs appear in the editor without any explicitly defined columns.

  • 向行级错误的模拟错误输出添加新列。Add new columns to simulated error outputs for row-level errors. 通常,同一 ExclusionGroup 中的多个输出具有相同的输出列集 。Ordinarily multiple outputs in the same ExclusionGroup have the same set of output columns. 但是,如果要创建模拟的错误输出,则可能要添加多个列来包含错误信息。However, if you are creating a simulated error output, you may want to add more columns to contain error information. 有关数据流引擎如何处理错误行的信息,请参阅在数据流组件中使用错误输出For information about how the data flow engine processes error rows, see Using Error Outputs in a Data Flow Component. 请注意,在脚本组件中,必须编写您自己的代码以便使用适当的错误信息填充这些附加列。Note that in the Script component you must write your own code to fill the additional columns with appropriate error information. 有关详细信息,请参阅模拟脚本组件的错误输出For more information, see Simulating an Error Output for the Script Component.

有关“脚本转换编辑器” 的“输入和输出” 页上的详细信息,请参阅脚本转换编辑器(“输入和输出”页)For more information about the Inputs and Outputs page of the Script Transformation Editor, see Script Transformation Editor (Inputs and Outputs Page).

添加变量Adding Variables

如果要在脚本中使用现有的变量,可以在“脚本转换编辑器”的“脚本”页上的 ReadOnlyVariables 和 ReadWriteVariables 属性字段中添加这些变量 。If you want to use existing variables in your script, you can add them in the ReadOnlyVariables and ReadWriteVariables property fields on the Script page of the Script Transformation Editor.

在属性字段中添加多个变量时,请用逗号将变量名隔开。When you add multiple variables in the property fields, separate the variable names by commas. 还可以选择多个变量,方法是单击 ReadOnlyVariables 和 ReadWriteVariables 属性字段旁的省略号 (...) 按钮,然后在“选择变量”对话框中选择变量 。You can also select multiple variables by clicking the ellipsis (...) button next to the ReadOnlyVariables and ReadWriteVariables property fields, and then selecting the variables in the Select variables dialog box.

有关如何在脚本组件中使用变量的常规信息,请参阅在脚本组件中使用变量For general information about how to use variables with the Script component, see Using Variables in the Script Component.

有关“脚本转换编辑器”的“脚本”页的详细信息,请参阅脚本转换编辑器(“脚本”页)For more information about the Script page of the Script Transformation Editor, see Script Transformation Editor (Script Page).

在代码设计模式下编写同步转换组件脚本Scripting a Synchronous Transformation Component in Code-Design Mode

为组件配置完元数据后,可以编写自定义脚本。After you have configured the metadata for your component, you can write your custom script. 在“脚本转换编辑器”的“脚本”页面中,单击“编辑脚本”打开 MicrosoftMicrosoft Visual StudioVisual Studio Tools for Applications (VSTA) IDE,可在其中添加自定义脚本 。In the Script Transformation Editor, on the Script page, click Edit Script to open the MicrosoftMicrosoft Visual StudioVisual Studio Tools for Applications (VSTA) IDE where you can add your custom script. 编写脚本所使用的语言取决于为“脚本”页上的 ScriptLanguage 属性选择 MicrosoftMicrosoft Visual Basic 还是 MicrosoftMicrosoft Visual C# 作为脚本语言。The scripting language that you use depends on whether you selected MicrosoftMicrosoft Visual Basic or MicrosoftMicrosoft Visual C# as the script language for the ScriptLanguage property on the Script page.

有关适用于使用脚本组件创建的所有组件类型的重要信息,请参阅脚本组件的编码和调试For important information that applies to all kinds of components created by using the Script component, see Coding and Debugging the Script Component.

了解自动生成的代码Understanding the Auto-generated Code

创建并配置转换组件后打开 VSTA IDE 时,可编辑的 ScriptMain 类会显示在代码编辑器中,其中有 ProcessInputRow 方法的存根。When you open the VSTA IDE after you create and configuring a transformation component, the editable ScriptMain class appears in the code editor with a stub for the ProcessInputRow method. 在 ScriptMain 类中可编写自定义代码,ProcessInputRow 是转换组件中最重要的方法 。The ScriptMain class is where you will write your custom code, and ProcessInputRow is the most important method in a transformation component.

如果在 VSTA 中打开“项目资源管理器” 窗口,可以看到脚本组件还生成了只读的 BufferWrapperComponentWrapper 项目项。If you open the Project Explorer window in VSTA, you can see that the Script component has also generated read-only BufferWrapper and ComponentWrapper project items. ScriptMain 类继承自 ComponentWrapper 项目项中的 UserComponent 类。The ScriptMain class inherits from the UserComponent class in the ComponentWrapper project item.

在运行时,数据流引擎调用 UserComponent 类中的 ProcessInput 方法,该方法替代 ScriptComponent 父类的 ProcessInput 方法。At run time, the data flow engine invokes the ProcessInput method in the UserComponent class, which overrides the ProcessInput method of the ScriptComponent parent class. 而 ProcessInput 方法遍历输入缓冲区中的所有行并为每一行调用一次 ProcessInputRow 方法。The ProcessInput method in turn loops through the rows in the input buffer and calls the ProcessInputRow method one time for each row.

编写自定义代码Writing Your Custom Code

具有同步传输的转换组件是要编写的所有数据流组件中最简单的部分。A transformation component with synchronous outputs is the simplest of all data flow components to write. 例如,本主题中稍后演示的单个输出示例包含以下自定义代码:For example, the single-output example shown later in this topic consists of the following custom code:

Row.City = UCase(Row.City)  
Row.City = (Row.City).ToUpper();  
  

若要完成创建自定义同步转换组件,使用已重写的 ProcessInputRow 方法来转换输入缓冲区的每行中的数据。To finish creating a custom synchronous transformation component, you use the overridden ProcessInputRow method to transform the data in each row of the input buffer. 缓冲区写满后,数据流引擎会将此缓冲区传递给数据流中的下一个组件。The data flow engine passes this buffer, when full, to the next component in the data flow.

根据用户的要求,可能还需要在 ScriptMain 类中可用的 PreExecute 和 PostExecute 方法中编写脚本来执行预处理或最终处理 。Depending on your requirements, you may also want to write script in the PreExecute and PostExecute methods, available in the ScriptMain class, to perform preliminary or final processing.

使用多个输出Working with Multiple Outputs

将输入行定向到两个或多个可能的输出中的一个输出时所需的自定义代码比之前讨论的单个输出方案所需的代码要少得多。Directing input rows to one of two or more possible outputs does not require much more custom code than the single-output scenario discussed earlier. 例如,本主题中稍后演示的两个输出示例包含以下自定义代码:For example, the two-output example shown later in this topic consists of the following custom code:

Row.City = UCase(Row.City)  
If Row.City = "REDMOND" Then  
    Row.DirectRowToMyRedmondAddresses()  
Else  
    Row.DirectRowToMyOtherAddresses()  
End If  
Row.City = (Row.City).ToUpper();  
  
if (Row.City=="REDMOND")  
{  
    Row.DirectRowToMyRedmondAddresses();  
}  
else  
{  
    Row.DirectRowToMyOtherAddresses();  
}  

在此示例中,脚本组件将基于已配置的输出名称生成 DirectRowTo<OutputBufferX> 方法。In this example, the Script component generates the DirectRowTo<OutputBufferX> methods for you, based on the names of the outputs that you configured. 您可以使用类似的代码将错误行定向到模拟的错误输出。You can use similar code to direct error rows to a simulated error output.

示例Examples

此示例演示在 ScriptMain 类中创建同步转换组件所需的自定义代码。The examples here demonstrate the custom code that is required in the ScriptMain class to create a synchronous transformation component.

备注

这些示例使用 AdventureWorks 示例数据库中的 Person.Address 表并在数据流中传递它的第一列和第四列,即 intAddressID 和 nvarchar(30)City 列 。These examples use the Person.Address table in the AdventureWorks sample database and pass its first and fourth columns, the intAddressID and nvarchar(30)City columns, through the data flow. 在本节中,在源、转换和目标示例中使用相同的数据。The same data is used in the source, transformation, and destination samples in this section. 每个示例的其他前提条件和假设都记录在文档中。Additional prerequisites and assumptions are documented for each example.

单个输出同步转换示例Single Output Synchronous Transformation Example

此示例演示具有单个输出的同步转换组件。This example demonstrates a synchronous transformation component with a single output. 此转换将传递 AddressID 列,并将 City 列转换为大写。This transformation passes through the AddressID column and converts the City column to uppercase.

如果要运行此示例代码,必须按照如下方式配置包和组件:If you want to run this sample code, you must configure the package and the component as follows:

  1. 向数据流设计器图面添加新的脚本组件并将其配置为转换。Add a new Script component to the Data Flow designer surface and configure it as a transformation.

  2. SSISSSIS 设计器中将源或另一转换的输出连接到新转换组件。Connect the output of a source or of another transformation to the new transformation component in SSISSSIS Designer. 此输出应提供 AdventureWorks 示例数据库的 Person.Address 表中的数据,其中包含 AddressID 和 City 列。This output should provide data from the Person.Address table of the AdventureWorks sample database that contains the AddressID and City columns.

  3. 打开“脚本转换编辑器” 。Open the Script Transformation Editor. 在“输入列” 页中,选择 AddressID 和 City 列。On the Input Columns page, select the AddressID and City columns. 将 City 列标记为读取/写入。Mark the City column as Read/Write.

  4. 在“输入和输出” 页上,使用更具说明性的名称(如 MyAddressInput 和 MyAddressOutput 重命名输入和输出。On the Inputs and Outputs page, rename the input and output with more descriptive names, such as MyAddressInput and MyAddressOutput. 请注意,输出的 SynchronousInputID 与输入的 ID 相对应。Notice that the SynchronousInputID of the output corresponds to the ID of the input. 因此,您不需要添加和配置输出列。Therefore you do not have to add and configure output columns.

  5. 在“脚本”页中,单击“编辑脚本”并输入下面的脚本 。On the Script page, click Edit Script and enter the script that follows. 然后关闭脚本开发环境和“脚本转换编辑器” 。Then close the script development environment and the Script Transformation Editor.

  6. 创建并配置目标组件,如 SQL ServerSQL Server 目标或 使用脚本组件创建目标 中演示的示例目标组件,该目标组件需要 AddressID 和 City 列。Create and configure a destination component that expects the AddressID and City columns, such as a SQL ServerSQL Server destination, or the sample destination component demonstrated in Creating a Destination with the Script Component. 然后,将转换的输出连接到目标组件。Then connect the output of the transformation to the destination component. 在 AdventureWorks 数据库中运行以下 Transact-SQLTransact-SQL 命令,以创建目标表:You can create a destination table by running the following Transact-SQLTransact-SQL command in the AdventureWorks database:

    CREATE TABLE [Person].[Address2]([AddressID] [int] NOT NULL,  
        [City] [nvarchar](30) NOT NULL)  
    
  7. 运行该示例。Run the sample.

Public Class ScriptMain  
    Inherits UserComponent  
  
    Public Overrides Sub MyAddressInput_ProcessInputRow(ByVal Row As MyAddressInputBuffer)  
  
        Row.City = UCase(Row.City)  
  
    End Sub  
  
End Class  
public class ScriptMain:  
    UserComponent  
  
{  
    public override void MyAddressInput_ProcessInputRow(MyAddressInputBuffer Row)  
    {  
  
        Row.City = (Row.City).ToUpper();  
  
    }  
  
}  

两个输出同步转换示例Two-Output Synchronous Transformation Example

此示例演示具有两个输出的同步转换组件。This example demonstrates a synchronous transformation component with two outputs. 此转换将传递 AddressID 列,并将 City 列转换为大写。This transformation passes through the AddressID column and converts the City column to uppercase. 如果城市名为 Redmond,则将行定向到一个输出;将所有其他行定向到另一个输出。If the city name is Redmond, it directs the row to one output; it directs all other rows to another output.

如果要运行此示例代码,必须按照如下方式配置包和组件:If you want to run this sample code, you must configure the package and the component as follows:

  1. 向数据流设计器图面添加新的脚本组件并将其配置为转换。Add a new Script component to the Data Flow designer surface and configure it as a transformation.

  2. SSISSSIS 设计器中将源或另一转换的输出连接到新转换组件。Connect the output of a source or of another transformation to the new transformation component in SSISSSIS Designer. 此输出应提供 AdventureWorks 示例数据库的 Person.Address 表中的数据,其中至少包含 AddressID 和 City 列。This output should provide data from the Person.Address table of the AdventureWorks sample database that contains at least the AddressID and City columns.

  3. 打开“脚本转换编辑器” 。Open the Script Transformation Editor. 在“输入列” 页中,选择 AddressID 和 City 列。On the Input Columns page, select the AddressID and City columns. 将 City 列标记为读取/写入。Mark the City column as Read/Write.

  4. 在“输入和输出” 页中,创建第二个输出。On the Inputs and Outputs page, create a second output. 添加新的输出后,请确保将该输出的 SynchronousInputID 设置为输入的 ID 。After you add the new output, make sure that you set its SynchronousInputID to the ID of the input. 已对第一个输出设置此属性,该输出是默认创建的。This property is already set on the first output, which is created by default. 对于每个输出,将 ExclusionGroup 属性设置为同一非零值,以指示将在两个互相排斥的输出间拆分输入行。For each output, set the ExclusionGroup property to the same non-zero value to indicate that you will split the input rows between two mutually exclusive outputs. 无需向输出添加任何输出列。You do not have to add any output columns to the outputs.

  5. 使用更具说明性的名称(如 MyAddressInput 、MyRedmondAddresses 和 MyOtherAddresses )重命名输入和输出。Rename the input and outputs with more descriptive names, such as MyAddressInput, MyRedmondAddresses, and MyOtherAddresses.

  6. 在“脚本”页中,单击“编辑脚本”并输入下面的脚本 。On the Script page, click Edit Script and enter the script that follows. 然后关闭脚本开发环境和“脚本转换编辑器” 。Then close the script development environment and the Script Transformation Editor.

  7. 创建并配置两个目标组件,如 SQL ServerSQL Server 目标、平面文件目标或在使用脚本组件创建目标中演示的示例目标组件,这两个目标组件需要 AddressID 和 City 列。Create and configure two destination components that expect the AddressID and City columns, such as a SQL ServerSQL Server destination, a Flat File destination, or the sample destination component demonstrated in Creating a Destination with the Script Component. 然后,将转换的每个输出连接到任一目标组件。Then connect each of the outputs of the transformation to one of the destination components. 可在 AdventureWorks 数据库中运行与下列命令(具有唯一表名)类似的 Transact-SQLTransact-SQL 命令来创建目标表:You can create destination tables by running a Transact-SQLTransact-SQL command similar to the following (with unique table names) in the AdventureWorks database:

    CREATE TABLE [Person].[Address2](  
        [AddressID] [int] NOT NULL,  
        [City] [nvarchar](30) NOT NULL  
    
  8. 运行该示例。Run the sample.

Public Class ScriptMain  
    Inherits UserComponent  
  
    Public Overrides Sub MyAddressInput_ProcessInputRow(ByVal Row As MyAddressInputBuffer)  
  
        Row.City = UCase(Row.City)  
  
        If Row.City = "REDMOND" Then  
            Row.DirectRowToMyRedmondAddresses()  
        Else  
            Row.DirectRowToMyOtherAddresses()  
        End If  
  
    End Sub  
  
End Class  
public class ScriptMain:  
    UserComponent  
  
public override void MyAddressInput_ProcessInputRow(MyAddressInputBuffer Row)  
    {  
  
        Row.City = (Row.City).ToUpper();  
  
        if (Row.City == "REDMOND")  
        {  
            Row.DirectRowToMyRedmondAddresses();  
        }  
        else  
        {  
            Row.DirectRowToMyOtherAddresses();  
        }  
  
    }  
}  

另请参阅See Also

了解同步和异步转换Understanding Synchronous and Asynchronous Transformations
使用脚本组件创建异步转换Creating an Asynchronous Transformation with the Script Component
开发具有同步输出的自定义转换组件Developing a Custom Transformation Component with Synchronous Outputs