Mining Structures (Analysis Services)
Data mining in Microsoft SQL Server 2005 Analysis Services (SSAS) involves several objects. The two primary objects are:
- Data mining structure
- Data mining model
Data Mining Structure
The mining structure is a data structure that defines the data domain from which mining models are built. A single mining structure can contain multiple mining models that share the same domain.
The building blocks of the mining structure are the mining structure columns, which describe the data that the data source contains. These columns contain information such as data type, content type, and how the data is distributed.
A mining structure can also contain nested tables. A nested table represents a one-to-many relationship between the entity of a case and its related attributes. For example, if the information that describes the customer resides in one table, and the customer's purchases reside in another table, you can use nested tables to combine the information into a single case. The customer identifier is the entity, and the purchases are the related attributes. For more information about when to use nested tables, see Nested Tables.
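The customer and purchases example above can be sketched in a few lines of Python. This is only a conceptual illustration of how a one-to-many relationship collapses into a single case with a nested table; it is not the Analysis Services object model, and the sample rows, the column names, and the build_cases helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows; table and column names are illustrative only.
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
]
purchases = [
    {"customer_id": 1, "product": "Helmet"},
    {"customer_id": 1, "product": "Gloves"},
    {"customer_id": 2, "product": "Tire"},
]


def build_cases(customers, purchases):
    """Combine each customer row with its purchase rows into one case."""
    # Group the many-side rows by the key of the case entity.
    by_customer = defaultdict(list)
    for row in purchases:
        by_customer[row["customer_id"]].append({"product": row["product"]})

    # Each case carries the customer's attributes plus a nested table.
    return [
        {**customer, "purchases": by_customer.get(customer["customer_id"], [])}
        for customer in customers
    ]


cases = build_cases(customers, purchases)
```

Each element of cases represents one customer, and its "purchases" entry plays the role of the nested table.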
The mining structure does not contain information about how columns are used for a specific mining model, or about the type of algorithm that is used to build a model; this information is defined in the mining model itself.
Data Mining Model
A data mining model applies a mining model algorithm to the data that is represented by a mining structure. Like the mining structure, the mining model contains columns. A mining model is contained within the mining structure, and inherits all the values of the properties that are defined by the mining structure. The model can use all the columns that the mining structure contains or a subset of the columns.
In addition to the properties that are defined on the mining structure, the mining model contains two properties: Algorithm and Usage. The Algorithm property is defined on the mining model, and the Usage property is defined on the mining model column. These properties are described below.
- Algorithm: A model property that defines the algorithm that is used to create the model.
- Usage: A model column property that defines how a column is used by the model. You can define columns to be input columns, key columns, or predictable columns.
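The relationship between a structure and its models can be pictured with a small Python sketch. The class names, the usage strings, and the algorithm string below are assumptions made for illustration; they do not correspond to the actual Analysis Services programming interfaces. The sketch shows only the two points made above: the algorithm belongs to the model, and usage is set per model column on columns that the structure already defines.

```python
from dataclasses import dataclass, field

# Hypothetical usage labels, mirroring key, input, and predictable columns.
VALID_USAGES = {"key", "input", "predictable"}


@dataclass
class StructureColumn:
    name: str
    data_type: str      # for example "long" or "text"
    content_type: str   # for example "key", "continuous", or "discrete"


@dataclass
class MiningModel:
    name: str
    algorithm: str      # model-level property
    usage: dict         # column name -> usage (column-level property)


@dataclass
class MiningStructure:
    name: str
    columns: list
    models: list = field(default_factory=list)

    def add_model(self, name, algorithm, usage):
        """Attach a model that uses all or a subset of the structure's columns."""
        known = {column.name for column in self.columns}
        unknown = set(usage) - known
        if unknown:
            raise ValueError(f"columns not in structure: {sorted(unknown)}")
        invalid = set(usage.values()) - VALID_USAGES
        if invalid:
            raise ValueError(f"invalid usage values: {sorted(invalid)}")
        model = MiningModel(name, algorithm, dict(usage))
        self.models.append(model)
        return model


structure = MiningStructure(
    "Customers",
    [
        StructureColumn("customer_id", "long", "key"),
        StructureColumn("age", "long", "continuous"),
        StructureColumn("gender", "text", "discrete"),
        StructureColumn("bike_buyer", "text", "discrete"),
    ],
)

# This model uses a subset of the structure's columns; "gender" is left out.
tree = structure.add_model(
    "Buyer Tree",
    "decision_trees",  # hypothetical algorithm identifier
    {"customer_id": "key", "age": "input", "bike_buyer": "predictable"},
)
```

The structure itself records nothing about the algorithm or about column usage; both live on the model, which is why several models with different settings can share one structure.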
A data mining model is just an empty object until it is processed. When you process a model, the data that is defined by the structure is passed through the algorithm. The algorithm identifies rules and patterns within the data, and then uses these rules and patterns to populate the model. For more information about how algorithms are used to create mining models, see Data Mining Algorithms.
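The idea that a model is empty until it is processed can be illustrated with a toy Python class. The ToyModel class and its frequency-counting "algorithm" are hypothetical stand-ins chosen for brevity; real Analysis Services algorithms are far more involved, and this sketch does not use any Analysis Services interface.

```python
from collections import Counter


class ToyModel:
    """A stand-in for a mining model; not the Analysis Services object model."""

    def __init__(self, predictable):
        self.predictable = predictable
        self.patterns = None  # empty until the model is processed

    @property
    def is_processed(self):
        return self.patterns is not None

    def process(self, cases):
        """Pass the cases through a trivial 'algorithm': count outcome frequencies."""
        self.patterns = Counter(case[self.predictable] for case in cases)

    def predict(self):
        """Return the most frequent outcome seen during processing."""
        if not self.is_processed:
            raise RuntimeError("model must be processed before it can be queried")
        return self.patterns.most_common(1)[0][0]


# Hypothetical cases supplied by the structure.
cases = [
    {"age": 34, "bike_buyer": "yes"},
    {"age": 51, "bike_buyer": "no"},
    {"age": 29, "bike_buyer": "yes"},
]

model = ToyModel(predictable="bike_buyer")
was_empty = not model.is_processed   # no patterns exist before processing
model.process(cases)                 # the "algorithm" populates the model
prediction = model.predict()
```

Before process is called the model holds no patterns and cannot be queried; afterward it holds whatever the algorithm extracted from the cases.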
After you have processed a model, you can explore it by using the custom viewers that are provided in Business Intelligence Development Studio and SQL Server Management Studio, or by querying the model to perform predictions. For more information about the custom viewers in Analysis Services, see Viewing a Data Mining Model.
You can create multiple models that are based on the same structure. All models that are built from the same structure must use the same data source. However, the models can differ in which columns of the structure are used, how the columns are used, the type of algorithm that is used to create each model, and the parameter settings for each algorithm. For example, you can build separate decision tree and clustering models, each containing different columns from the structure and each used to accomplish a different business task.