关联模型查询示例Association Model Query Examples

适用于: 是SQL Server Analysis Services没有Azure Analysis ServicesAPPLIES TO: yesSQL Server Analysis Services noAzure Analysis Services

在对数据挖掘模型创建查询时,可以创建内容查询,也可创建预测查询。内容查询提供有关在分析过程中发现的规则和项集的详细信息,预测查询使用在数据中发现的关联来做出预测。When you create a query against a data mining model, you can create either a content query, which provides details about the rules and itemsets discovered during analysis, or you can create a prediction query, which uses the associations discovered in the data to make predictions. 对于关联模型来说,预测通常基于规则且可用来给出建议,而内容查询通常用于浏览项集之间的关系。For an association model, predictions typically are based on rules, and can be used to make recommendations, whereas queries on content typically explore the relationship among itemsets. 此外,还可检索有关模型的元数据。You can also retrieve metadata about the model.

本节讲述如何为基于 MicrosoftMicrosoft 关联规则的算法的模型创建这些类型的查询。This section explains how to create these kinds of queries for models that are based on the MicrosoftMicrosoft Association Rules algorithm.

内容查询Content Queries

使用 DMX 获取模型元数据Getting model metadata data by using DMX

从架构行集中获取元数据Getting metadata from the schema rowset

检索模型的原始参数Retrieving the original parameters for the model

检索项集和产品列表Retrieving a list of itemsets and products

返回排在前 10 位的项集Returning the top 10 itemsets

预测查询Prediction Queries

预测关联项Predicting associated items

确定相关项集的置信度Determining confidence for related itemsets

查找有关模型的信息Finding Information about the Model

所有挖掘模型都公开算法根据标准化架构(即挖掘模型架构行集)所了解的内容。All mining models expose the content learned by the algorithm according to a standardized schema, which is named the mining model schema rowset. 可以使用建数据挖掘扩展插件 (DMX) 语句或 Analysis ServicesAnalysis Services 存储过程来对挖掘模型架构行集创建查询。You can create queries against the mining model schema rowset either by using Data Mining Extensions (DMX) statements, or by using Analysis ServicesAnalysis Services stored procedures. SQL Server 2017SQL Server 2017中,还可使用类似 SQL 的语法,直接将架构行集作为系统表来查询。In SQL Server 2017SQL Server 2017, you can also query the schema rowsets directly as system tables, by using a SQL-like syntax.

示例查询 1:使用 DMX 获取模型元数据Sample Query 1: Getting Model Metadata by Using DMX

以下查询返回有关关联模型 Association的基本元数据,例如模型名称、存储模型的数据库以及模型中子节点的数目。The following query returns basic metadata about the association model, Association, such as the name of the model, the database where the model is stored, and the number of child nodes in the model. 此查询使用 DMX 内容查询从模型的父节点中检索元数据:This query uses a DMX content query to retrieve the metadata from the parent node of the model:

SELECT MODEL_CATALOG, MODEL_NAME, NODE_CAPTION,   
NODE_SUPPORT, [CHILDREN_CARDINALITY], NODE_DESCRIPTION  
FROM Association.CONTENT  
WHERE NODE_TYPE = 1  

备注

必须将列名 CHILDREN_CARDINALITY 括在括号中,以便将它与同名的 MDX 保留关键字区分开来。You must enclose the name of the column, CHILDREN_CARDINALITY, in brackets to distinguish it from the MDX reserved keyword of the same name.

示例结果:Example results:

MODEL_CATALOGMODEL_CATALOG Association TestAssociation Test
MODEL_NAMEMODEL_NAME 关联Association
NODE_CAPTIONNODE_CAPTION Association Rules ModelAssociation Rules Model
NODE_SUPPORTNODE_SUPPORT 1487914879
CHILDREN_CARDINALITYCHILDREN_CARDINALITY 942942
NODE_DESCRIPTIONNODE_DESCRIPTION Association Rules Model; ITEMSET_COUNT=679; RULE_COUNT=263; MIN_SUPPORT=14; MAX_SUPPORT=4334; MIN_ITEMSET_SIZE=0; MAX_ITEMSET_SIZE=3; MIN_PROBABILITY=0.400390625; MAX_PROBABILITY=1; MIN_LIFT=0.14309369632511; MAX_LIFT=1.95758227647523Association Rules Model; ITEMSET_COUNT=679; RULE_COUNT=263; MIN_SUPPORT=14; MAX_SUPPORT=4334; MIN_ITEMSET_SIZE=0; MAX_ITEMSET_SIZE=3; MIN_PROBABILITY=0.400390625; MAX_PROBABILITY=1; MIN_LIFT=0.14309369632511; MAX_LIFT=1.95758227647523

有关这些列在关联模型中含义的定义,请参阅 关联模型的挖掘模型内容(Analysis Services - 数据挖掘)For a definition of what these columns mean in an association model, see Mining Model Content for Association Models (Analysis Services - Data Mining).

返回页首Return to Top

示例查询 2:从架构行集中获取其他元数据Sample Query 2: Getting Additional Metadata from the Schema Rowset

通过查询数据挖掘架构行集,可以找到在 DMX 内容查询中返回的相同信息。By querying the data mining schema rowset, you can find the same information that is returned in a DMX content query. 不过,架构行集还提供其他一些列,例如上次处理模型的日期、挖掘结构和用作可预测属性的列的名称。However, the schema rowset provides some additional columns, such as the date the model was last processed, the mining structure, and the name of the column used as the predictable attribute.

SELECT MODEL_CATALOG, MODEL_NAME, SERVICE_NAME, PREDICTION_ENTITY,   
MINING_STRUCTURE, LAST_PROCESSED  
FROM $system.DMSCHEMA_MINING_MODELS  
WHERE MODEL_NAME = 'Association'  

示例结果:Example results:

MODEL_CATALOGMODEL_CATALOG Adventure Works DW Multidimensional 2012Adventure Works DW Multidimensional 2012
MODEL_NAMEMODEL_NAME 关联Association
SERVICE_NAMESERVICE_NAME Association Rules ModelAssociation Rules Model
PREDICTION_ENTITYPREDICTION_ENTITY v Assoc Seq Line Itemsv Assoc Seq Line Items
MINING_STRUCTUREMINING_STRUCTURE 关联Association
LAST_PROCESSEDLAST_PROCESSED 9/29/2007 10:21:24 PM9/29/2007 10:21:24 PM

返回页首Return to Top

示例查询 3:检索模型的原始参数Sample Query 3: Retrieving Original Parameters for Model

以下查询返回一个列,该列包含有关创建模型时使用的参数设置的详细信息。The following query returns a single column that contains details about the parameter settings that were used when the model was created.

SELECT MINING_PARAMETERS   
from $system.DMSCHEMA_MINING_MODELS  
WHERE MODEL_NAME = 'Association'  

示例结果:Example results:

MAXIMUM_ITEMSET_COUNT=200000,MAXIMUM_ITEMSET_SIZE=3,MAXIMUM_SUPPORT=1,MINIMUM_SUPPORT=9.40923449156529E-04,MINIMUM_IMPORTANCE=-999999999,MINIMUM_ITEMSET_SIZE=0,MINIMUM_PROBABILITY=0.4MAXIMUM_ITEMSET_COUNT=200000,MAXIMUM_ITEMSET_SIZE=3,MAXIMUM_SUPPORT=1,MINIMUM_SUPPORT=9.40923449156529E-04,MINIMUM_IMPORTANCE=-999999999,MINIMUM_ITEMSET_SIZE=0,MINIMUM_PROBABILITY=0.4

返回页首Return to Top

查找有关规则和项集的信息Finding Information about Rules and Itemsets

关联模型有两个常见用途:查找有关常见项集的信息以及提取有关特定规则和项集的详细信息。There are two common uses of an association model: to discover information about frequent itemsets, and to extract details about particular rules and itemsets. 例如,您可能希望提取评为当前特别受关注的规则的列表,或创建最常见项集的列表。For example, you might want to extract a list of rules that were scored as being especially interesting, or create a list of the most common itemsets. 您可以使用 DMX 内容查询来检索此类信息,You retrieve such information by using a DMX content query. 也可使用 “Microsoft 关联查看器” 浏览该信息。You can also browse this information by using the Microsoft Association Viewer.

示例查询 4:检索项集和产品列表Sample Query 4: Retrieving List of Itemsets and Products

以下查询检索全部项集,同时还将检索列出每个项集中包含的产品的嵌套表。The following query retrieves all of the itemsets, together with a nested table that lists the products included in each itemset. NODE_NAME 列包含模型内项集的唯一 ID,而 NODE_CAPTION 给出项目的文本说明。The NODE_NAME column contains the unique ID of the itemset within the model, whereas the NODE_CAPTION provides a text description of the items. 本例中对嵌套表进行了平展处理,这样,包含两个产品的项集在结果中生成了两行。In this example, the nested table is flattened, so that an itemset that contains two products generates two rows in the results. 如果客户端支持分层数据,则可以忽略 FLATTENED 关键字。You can omit the FLATTENED keyword if your client supports hierarchical data.

SELECT FLATTENED NODE_NAME, NODE_CAPTION,  
NODE_PROBABILITY, NODE_SUPPORT,  
(SELECT ATTRIBUTE_NAME FROM NODE_DISTRIBUTION) as PurchasedProducts  
FROM Association.CONTENT  
WHERE NODE_TYPE = 7  

示例结果:Example results:

NODE_NAMENODE_NAME 3737
NODE_CAPTIONNODE_CAPTION Sport-100 = ExistingSport-100 = Existing
NODE_PROBABILITYNODE_PROBABILITY 0.2912830163317430.291283016331743
NODE_SUPPORTNODE_SUPPORT 43344334
PURCHASEDPRODUCTS.ATTRIBUTE_NAMEPURCHASEDPRODUCTS.ATTRIBUTE_NAME v Assoc Seq Line Items(Sport-100)v Assoc Seq Line Items(Sport-100)

返回页首Return to Top

示例查询 5:返回排在前 10 位的项集Sample Query 5: Returning Top 10 Itemsets

本例演示如何使用 DMX 在默认情况下提供的某些分组和排序函数。This example demonstrates how to use some of the grouping and ordering functions that DMX provides by default. 当按照每个节点的支持对项集排序时,该查询返回排在前 10 位的项集。The query returns the top 10 itemsets when ordered by the support for each node. 请注意,无需对结果进行显式分组,这与 Transact-SQL 中不同。不过,只能在每个查询中使用一个聚合函数。Note that you do not need to explicitly group the results, as you would in Transact-SQL; however, you can use only one aggregate function in each query.

SELECT TOP 10 (NODE_SUPPORT),NODE_NAME, NODE_CAPTION  
FROM Association.CONTENT  
WHERE NODE_TYPE = 7  

示例结果:Example results:

NODE_SUPPORTNODE_SUPPORT 43344334
NODE_NAMENODE_NAME 3737
NODE_CAPTIONNODE_CAPTION Sport-100 = ExistingSport-100 = Existing

返回页首Return to Top

使用模型进行预测Making Predictions using the Model

关联规则模型通常用于生成建议,这些建议基于在项集中发现的相关性。An association rules model is often used to generate recommendations, which are based on correlations discovered in the itemsets. 因此,基于关联规则模型创建预测查询时,您通常使用模型中的规则基于新数据进行猜测。Therefore, when you create a prediction query based on an association rules model, you are typically using the rules in the model to make guesses based on new data. PredictAssociation (DMX) 是返回建议的函数,它包含几个可用于自定义查询结果的参数。PredictAssociation (DMX) is the function that returns recommendations, and has several arguments that you can use to customize the query results.

另一个说明对关联模型的查询非常有用的示例是返回各种规则和项集的置信度,以便可以比较不同跨区销售策略的有效性。Another example of where queries on an association model might be useful is to return the confidence for various rules and itemsets so that you can compare the effectiveness of different cross-sell strategies. 以下示例说明如何创建这些查询。The following examples illustrate how to create such queries.

示例查询 6:预测关联项目Sample Query 6: Predicting Associated Items

此示例使用在数据挖掘中级教程(Analysis Services - 数据挖掘)中创建的关联模型。This example uses the Association model created in the Intermediate Data Mining Tutorial (Analysis Services - Data Mining). 它演示如何创建一个预测查询,该查询告诉您应向已购买某种特定产品的客户推荐哪些产品。It demonstrates how to create a prediction query that tells you what products to recommend to a customer who has purchased a particular product. 此种类型的查询称为单独查询,在该查询中,使用 SELECT…UNION 语句向模型提供所需的值。This type of query, where you provide values to the model in a SELECT…UNION statement, is called a singleton query. 由于对应于新值的可预测模型列为嵌套表,因此必须使用某一 SELECT 子句将新值映射到嵌套表列 [Model],再使用另一 SELECT 子句将嵌套表列映射到事例级别列 [v Assoc Seq Line Items]Because the predictable model column that corresponds to the new values is a nested table, you must use one SELECT clause to map the new value to the nested table column, [Model], and another SELECT clause to map the nested table column to the case-level column, [v Assoc Seq Line Items]. 如果在该查询中添加 INCLUDE-STATISTICS 关键字,则可看到推荐的概率和支持。Adding the keyword INCLUDE-STATISTICS to the query lets you see the probability and support for the recommendations.

SELECT PredictAssociation([Association].[vAssocSeqLineItems],INCLUDE_STATISTICS, 3)  
FROM [Association]  
NATURAL PREDICTION JOIN   
(SELECT  
(SELECT 'Classic Vest' as [Model])  
AS [v Assoc Seq Line Items])  
AS t  

示例结果:Example results:

ModelModel $SUPPORT$SUPPORT $PROBABILITY$PROBABILITY $ADJUSTEDPROBABILITY$ADJUSTEDPROBABILITY
Sport-100Sport-100 43344334 0.2912830.291283 0.2526960.252696
Water BottleWater Bottle 28662866 0.192620.19262 0.1752050.175205
Patch kitPatch kit 21132113 0.1420120.142012 0.1323890.132389

返回页首Return to Top

示例查询 7:确定相关项集的置信度Sample Query 7: Determining Confidence for Related Itemsets

尽管规则可用于生成建议,但在对数据集内的模式的更深入分析中,项集作用更大。Whereas rules are useful for generating recommendations, itemsets are more interesting for deeper analysis of the patterns in the data set. 例如,如果对前面示例查询返回的建议不满意,则可以检查包含产品 A 的其他项集,以更好地了解是否产品 A 是人们在购买各种产品时倾向于购买的附件,或者是否产品 A 与购买特定产品有很强的关联性。For example, if you were not satisfied with the recommendation that are returned by the previous sample query, you could examine other itemsets that contain Product A, to can get a better idea of whether Product A is an accessory that people tend to buy with all kinds of products, or whether A is strongly correlated with purchases of particular products. 浏览这些关系的最简单方法是在 MicrosoftMicrosoft 关联查看器中筛选项集,但也可使用查询检索这些信息。The easiest way to explore these relationships is by filtering the itemsets in the MicrosoftMicrosoft Association Viewer; however, you can retrieve the same information with a query.

以下示例查询返回包含 Water Bottle 项目(包括单项 Water bottle)的所有项集。The following sample query returns all itemsets that include the Water Bottle item, including the single item Water bottle.

SELECT TOP 100 FROM   
(  
SELECT FLATTENED NODE_CAPTION, NODE_SUPPORT,   
(SELECT ATTRIBUTE_NAME from NODE_DISTRIBUTION  
WHERE ATTRIBUTE_NAME = 'v Assoc Seq Line Items(Water Bottle)') as D  
FROM Association.CONTENT  
WHERE NODE_TYPE = 7  
) AS Items  
WHERE [D.ATTRIBUTE_NAME] <> NULL  
ORDER BY NODE_SUPPORT DESC  

示例结果:Example results:

NODE_CAPTIONNODE_CAPTION NODE_SUPPORTNODE_SUPPORT D.ATTRIBUTE_NAMED.ATTRIBUTE_NAME
Water Bottle = ExistingWater Bottle = Existing 28662866 v Assoc Seq Line Items(Water Bottle)v Assoc Seq Line Items(Water Bottle)
Mountain Bottle Cage = Existing, Water Bottle = ExistingMountain Bottle Cage = Existing, Water Bottle = Existing 11361136 v Assoc Seq Line Items(Water Bottle)v Assoc Seq Line Items(Water Bottle)
Road Bottle Cage = Existing, Water Bottle = ExistingRoad Bottle Cage = Existing, Water Bottle = Existing 10681068 v Assoc Seq Line Items(Water Bottle)v Assoc Seq Line Items(Water Bottle)
Water Bottle = Existing, Sport-100 = ExistingWater Bottle = Existing, Sport-100 = Existing 734734 v Assoc Seq Line Items(Water Bottle)v Assoc Seq Line Items(Water Bottle)

该查询不仅返回嵌套表中符合此条件的行,还返回嵌套表外部或事例表中的所有行。This query returns both the rows from the nested table that match the criteria, and all the rows from the outside or case table. 因此,必须添加一个条件来消除对目标属性名称具有 Null 值的事例表行。Therefore, you must add a condition that eliminates the case table rows that have a null value for the target attribute name.

返回页首Return to Top

函数列表Function List

所有 MicrosoftMicrosoft 算法均支持一组通用的函数。All MicrosoftMicrosoft algorithms support a common set of functions. MicrosoftMicrosoft 关联算法还支持下表中列出的函数。However, the MicrosoftMicrosoft Association algorithm supports the additional functions listed in the following table.

预测函数Prediction Function 用法Usage
IsDescendant & #40; DMX & #41;IsDescendant (DMX) 确定一个节点是否是神经网络图中另一个节点的子节点。Determines whether one node is a child of another node in the neural network graph.
IsInNode & #40; DMX & #41;IsInNode (DMX) 指示指定的节点是否包含当前事例。Indicates whether the specified node contains the current case.
PredictAdjustedProbability & #40; DMX & #41;PredictAdjustedProbability (DMX) 返回加权的概率。Returns the weighted probability.
PredictAssociation & #40; DMX & #41;PredictAssociation (DMX) 预测关联数据集中的成员身份。Predicts membership in an associative dataset.
PredictHistogram & #40; DMX & #41;PredictHistogram (DMX) 返回与当前预测值相关的值的表。Returns a table of values related to the current predicted value.
PredictNodeId & #40; DMX & #41;PredictNodeId (DMX) 返回每个事例的 Node_ID。Returns the Node_ID for each case.
PredictProbability & #40; DMX & #41;PredictProbability (DMX) 返回预测值的概率。Returns probability for the predicted value.
PredictSupport & #40; DMX & #41;PredictSupport (DMX) 返回指定状态的支持值。Returns the support value for a specified state.
PredictVariance & #40; DMX & #41;PredictVariance (DMX) 返回预测值的方差。Returns variance for the predicted value.

另请参阅See Also

Microsoft 关联算法 Microsoft Association Algorithm
Microsoft 关联算法技术参考 Microsoft Association Algorithm Technical Reference
关联模型 & #40; 的挖掘模型内容Analysis Services-数据挖掘 & #41;Mining Model Content for Association Models (Analysis Services - Data Mining)