PREDICT (Transact-SQL)PREDICT (Transact-SQL)

适用于:Applies to: 是SQL Server 2017 (14.x)SQL Server 2017 (14.x)yesSQL Server 2017 (14.x)SQL Server 2017 (14.x) 及更高版本 是Azure SQL 数据库Azure SQL DatabaseYesAzure SQL 数据库Azure SQL Database 是Azure SQL 托管实例Azure SQL Managed InstanceYesAzure SQL 托管实例Azure SQL Managed Instance 是Azure Synapse AnalyticsAzure Synapse AnalyticsyesAzure Synapse AnalyticsAzure Synapse Analytics适用于:Applies to: 是SQL Server 2017 (14.x)SQL Server 2017 (14.x)yesSQL Server 2017 (14.x)SQL Server 2017 (14.x) and later 是Azure SQL 数据库Azure SQL DatabaseYesAzure SQL 数据库Azure SQL Database 是Azure SQL 托管实例Azure SQL Managed InstanceYesAzure SQL 托管实例Azure SQL Managed Instance 是Azure Synapse AnalyticsAzure Synapse AnalyticsyesAzure Synapse AnalyticsAzure Synapse Analytics

基于存储模型生成预测值或评分。Generates a predicted value or scores based on a stored model. 有关详细信息,请参阅使用 PREDICT T-SQL 函数本机计分For more information, see Native scoring using the PREDICT T-SQL function.

语法Syntax

PREDICT  
(  
  MODEL = @model | model_literal,  
  DATA = object AS <table_alias>
  [, RUNTIME = ONNX ]
)  
WITH ( <result_set_definition> )  

<result_set_definition> ::=  
  {  
    { column_name  
      data_type  
      [ COLLATE collation_name ]  
      [ NULL | NOT NULL ]  
    }  
      [,...n ]  
  }  

MODEL = @model | model_literal  
PREDICT  
(  
  MODEL = <model_object>,
  DATA = object AS <table_alias>
  [, RUNTIME = ONNX ]
)  
WITH ( <result_set_definition> )  

<result_set_definition> ::=  
  {  
    { column_name  
      data_type  
      [ COLLATE collation_name ]  
      [ NULL | NOT NULL ]  
    }  
      [,...n ]  
  }  

<model_object> ::=
  {
    model_literal
    | model_variable
    | ( scalar_subquery )
  }

参数Arguments

MODELMODEL

MODEL 参数用于指定用于评分或预测的模型。The MODEL parameter is used to specify the model used for scoring or prediction. 将模型指定为变量或文字或标量表达式。The model is specified as a variable or a literal or a scalar expression.

PREDICT 支持使用 RevoScaleRrevoscalepy 包训练的模型。PREDICT supports models trained using the RevoScaleR and revoscalepy packages.

MODEL 参数用于指定用于评分或预测的模型。The MODEL parameter is used to specify the model used for scoring or prediction. 将模型指定为变量或文字或标量表达式。The model is specified as a variable or a literal or a scalar expression.

在 Azure SQL 托管实例中,PREDICT 支持 Open Neural Network Exchange (ONNX) 格式的模型,或使用 RevoScaleRrevoscalepy 包训练的模型。In Azure SQL Managed Instance, PREDICT supports models in Open Neural Network Exchange (ONNX) format or models trained using the RevoScaleR and revoscalepy packages.

MODEL 参数用于指定用于评分或预测的模型。The MODEL parameter is used to specify the model used for scoring or prediction. 将模型指定为变量或文字或标量表达式或标量子查询。The model is specified as a variable or a literal or a scalar expression or a scalar subquery.

在 Azure Synapse Analytics 中,PREDICT 支持 Open Neural Network Exchange (ONNX) 格式的模型。In Azure Synapse Analytics, PREDICT supports models in Open Neural Network Exchange (ONNX) format.

数据DATA

DATA 参数用于指定用于评分或预测的数据。The DATA parameter is used to specify the data used for scoring or prediction. 在查询中以表源的形式指定数据。Data is specified in the form of a table source in the query. 表源可以是表、表别名、CTE 别名、视图或表值函数.Table source can be a table, table alias, CTE alias, view, or table-valued function.

RUNTIME = ONNXRUNTIME = ONNX

重要

RUNTIME = ONNX 参数仅在 Azure SQL 托管实例Azure SQL EdgeAzure Synapse Analytics 中可用。The RUNTIME = ONNX argument is only available in Azure SQL Managed Instance, Azure SQL Edge, and Azure Synapse Analytics.

指示用于执行模型的机器学习引擎。Indicates the machine learning engine used for model execution. RUNTIME 参数值始终为 ONNXThe RUNTIME parameter value is always ONNX. 对于 Azure SQL Edge 和 Azure Synapse Analytics,此参数是必需的。The parameter is required for Azure SQL Edge and Azure Synapse Analytics. 在 Azure SQL 托管实例上,此参数是可选的,仅在使用 ONNX 模型时使用。On Azure SQL Managed Instance, the parameter is optional and only used when using ONNX models.

WITH ( <result_set_definition> )WITH ( <result_set_definition> )

WITH 子句用于指定 PREDICT 函数返回的输出的架构。The WITH clause is used to specify the schema of the output returned by the PREDICT function.

除了 PREDICT 函数本身返回的列之外,所有在数据输入中包含的列都可以在查询中使用。In addition to the columns returned by the PREDICT function itself, all the columns that are part of the data input are available for use in the query.

返回值Return values

没有可用的预定义架构;不验证模型的内容,也不验证返回的列值。No predefined schema is available; the contents of the model is not validated and the returned column values are not validated either.

  • PREDICT 函数作为输入通过列传递。The PREDICT function passes through columns as input.
  • PREDICT 函数还生成新列,但列数以及其数据类型取决于用于预测的模型类型。The PREDICT function also generates new columns, but the number of columns and their data types depends on the type of model that was used for prediction.

任何与数据、模型或列格式相关的错误消息都由与模型关联的基础预测函数返回。Any error messages related to the data, the model, or the column format are returned by the underlying prediction function associated with the model.

备注Remarks

Windows 和 Linux 上的所有版本的 SQL Server 2017 或更高版本都支持 PREDICT 功能。The PREDICT function is supported in all editions of SQL Server 2017 or later, on Windows and Linux. 无需启用机器学习服务即可使用 PREDICTMachine Learning Services does not need to be enabled to use PREDICT.

支持的算法Supported algorithms

使用的模型必须是使用 RevoScaleRrevoscalepy 包中支持的算法之一创建的。The model that you use must have been created using one of the supported algorithms from the RevoScaleR or revoscalepy packages. 若要查看当前支持的模型列表,请参阅使用 PREDICT T-SQL 函数本机评分For a list of currently supported models, see Native scoring using the PREDICT T-SQL function.

支持可转换为 ONNX 模型格式的算法。Algorithms that can be converted to ONNX model format are supported.

支持可转换为 ONNX 模型格式的算法,以及使用 RevoScaleRrevoscalepy 包中支持的算法之一创建的模型。Algorithms that can be converted to ONNX model format and models that you have created using one of the supported algorithms from the from the RevoScaleR or revoscalepy packages are supported. 若要查看 RevoScaleR 和 revoscalepy 中当前支持的算法的列表,请参阅使用 PREDICT T-SQL 函数本机评分For a list of currently supported algorithms in RevoScaleR and revoscalepy, see Native scoring using the PREDICT T-SQL function.

权限Permissions

PREDICT 不需要任何权限;但是,用户需要数据库 EXECUTE 权限,以及查询用作输入的任何数据的权限。No permissions are required for PREDICT; however, the user needs EXECUTE permission on the database, and permission to query any data that is used as inputs. 如果模型已存储在表中,则用户还须能够从表中读取模型。The user must also be able to read the model from a table, if the model has been stored in a table.

示例Examples

以下示例展示了调用 PREDICT 的语法。The following examples demonstrate the syntax for calling PREDICT.

在 FROM 子句中使用 PREDICTUsing PREDICT in a FROM clause

此示例引用 SELECT 语句的 FROM 子句中的 PREDICT 函数:This example references the PREDICT function in the FROM clause of a SELECT statement:

SELECT d.*, p.Score
FROM PREDICT(MODEL = @model,
    DATA = dbo.mytable AS d) WITH (Score FLOAT) AS p;
DECLARE @model VARBINARY(max) = (SELECT test_model FROM scoring_model WHERE model_id = 1);

SELECT d.*, p.Score
FROM PREDICT(MODEL = @model,
    DATA = dbo.mytable AS d, RUNTIME = ONNX) WITH (Score FLOAT) AS p;

DATA 参数中为表源指定的别名 d 用于引用属于 dbo.mytable 的列。The alias d specified for table source in the DATA parameter is used to reference the columns belonging to dbo.mytable. PREDICT 函数指定的别名 p 用于引用 PREDICT 函数返回的列。The alias p specified for the PREDICT function is used to reference the columns returned by the PREDICT function.

  • 模型存储为“模型”表中的 varbinary(max) 列。The model is stored as varbinary(max) column in table call Models. ID 和说明等其他信息保存在表中以标识模式 。Additional information such as ID and description is saved in the table to identify the mode.
  • DATA 参数中为表源指定的别名 d 用于引用属于 dbo.mytable 的列。The alias d specified for table source in the DATA parameter is used to reference the columns belonging to dbo.mytable. 输入数据列的名称应与模型的输入名称匹配。The input data column names should match the name of inputs for the model.
  • PREDICT 函数指定的别名 p 用于引用 PREDICT 函数返回的预测列。The alias p specified for the PREDICT function is used to reference the predicted column returned by the PREDICT function. 列名应与模型的输出名称相同。The column name should have the same name as the output name for the model.
  • 所有输入数据列和预测列都可显示在 SELECT 语句中。All input data columns and the predicted columns are available to display in the SELECT statement.

可以重写前面的示例查询,通过将 MODEL 指定为标量子查询来创建视图:The preceding example query can be rewritten to create a view by specifying MODEL as a scalar subquery:

CREATE VIEW predictions
AS
SELECT d.*, p.Score
FROM PREDICT(MODEL = (SELECT test_model FROM scoring_model WHERE model_id = 1),
             DATA = dbo.mytable AS d, RUNTIME = ONNX) WITH (Score FLOAT) AS p;

将 PREDICT 与 INSERT 语句相结合Combining PREDICT with an INSERT statement

一个常见的预测用例是生成输入数据的评分,然后将预测值插入到表中。A common use case for prediction is to generate a score for input data, and then insert the predicted values into a table. 下面的示例假定,应用程序调用操作会使用存储过程将包含预测值的行插入到表中:The following example assumes the calling application uses a stored procedure to insert a row containing the predicted value into a table:

DECLARE @model VARBINARY(max) = (SELECT model FROM scoring_model WHERE model_name = 'ScoringModelV1');

INSERT INTO loan_applications (c1, c2, c3, c4, score)
SELECT d.c1, d.c2, d.c3, d.c4, p.score
FROM PREDICT(MODEL = @model, DATA = dbo.mytable AS d) WITH(score FLOAT) AS p;
DECLARE @model VARBINARY(max) = (SELECT model FROM scoring_model WHERE model_name = 'ScoringModelV1');

INSERT INTO loan_applications (c1, c2, c3, c4, score)
SELECT d.c1, d.c2, d.c3, d.c4, p.score
FROM PREDICT(MODEL = @model, DATA = dbo.mytable AS d, RUNTIME = ONNX) WITH(score FLOAT) AS p;
  • PREDICT 的结果存储在名为 PredictionResults 的表中。The results of PREDICT are stored in a table called PredictionResults.
  • 模型存储为“模型”表中的 varbinary(max) 列。The model is stored as varbinary(max) column in table call Models. ID 和说明等其他信息可保存在表中以标识模式。Additional information such as ID and description can be saved in the table to identify the model.
  • DATA 参数中的表源指定的别名 d 用于引用 dbo.mytable 中的列。输入数据列的名称应与模型的输入名称匹配。The alias d specified for table source in the DATA parameter is used to reference the columns in dbo.mytable.The input data column names should match the name of inputs for the model.
  • PREDICT 函数指定的别名 p 用于引用 PREDICT 函数返回的预测列。The alias p specified for the PREDICT function is used to reference the predicted column returned by the PREDICT function. 列名应与模型的输出名称相同。The column name should have the same name as the output name for the model.
  • 所有输入列和预测列都可显示在 SELECT 语句中。All input columns and the predicted column are available to display in the SELECT statement.

后续步骤Next steps