Intermediate Data Mining Tutorial (Analysis Services - Data Mining)
Microsoft Analysis Services provides an integrated environment for creating and working with data mining models. You can easily bind to data sources, create and test multiple models on the same data, and deploy models for use in predictive analysis.
In the Basic Data Mining Tutorial, you learned how to use Business Intelligence Development Studio to create a data mining solution, and you built three models to support a targeted mailing campaign by analyzing customer purchasing behavior and identifying potential buyers.
To complete the following tutorial, you should be familiar with the data mining tools and with the mining model viewers that were introduced in the Basic Data Mining Tutorial. This intermediate tutorial builds on that experience and introduces several new scenarios, including forecasting and market basket analysis. You will learn how to create a time series model, an association model, and a sequence clustering model. You will also learn how to use nested tables in a model, and how to create filters on nested tables.
All scenarios use the AdventureWorksDW2008R2 data source, but you will create different data source views for different scenarios. The lessons are independent: once you have created the data source, you can complete them separately and in any order.
After your success with the targeted mailing campaign, you have been asked to apply your knowledge of data mining to develop several new models for use in business planning. These include the following new model types:
Time series models, to forecast the sales of products in different regions around the world. You will develop individual models for each region and also a general model that can be used for cross-prediction.
Association model, to analyze groupings of products that are purchased during visits to the Adventure Works Cycles e-commerce site. Based on this market basket model, you might recommend products to customers.
Sequence clustering model, to analyze the order in which customers buy products. Based on this model, you can plan changes in Web site design or new product offerings.
Neural network and logistic regression models, to perform exploratory analysis of call center data. Based on the insights from the preliminary model, you will create a model to identify possible strategies for improving customer experience with the call center.
What You Will Learn
This tutorial teaches you how to create and work with mining models that use several types of data mining algorithms. It also introduces the following concepts:
Using nested tables to build models
Choosing a nested table key, time series key, or sequence key
Filtering nested tables when creating models or making predictions
Determining whether you have enough data to support a model
Creating a general model and applying it to multiple data sets
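To make the nested table idea concrete before you start the lessons, the following Python sketch shows the shape of the data only. It is not Analysis Services code, and the order numbers, product names, and function names are hypothetical. Each order is one case, identified by a case key. The products on that order form the nested table, where the product name acts as the nested table key and the line number plays the role of a sequence key.

```python
from collections import defaultdict

# Hypothetical flat order lines: (order_number, line_number, product_model).
ORDER_LINES = [
    ("SO1001", 1, "Mountain-200"),
    ("SO1001", 2, "Sport-100"),
    ("SO1002", 1, "Road-250"),
    ("SO1002", 2, "Water Bottle"),
    ("SO1002", 3, "Sport-100"),
    ("SO1003", 1, "Water Bottle"),
]


def build_cases(rows):
    """Group flat rows into cases: case key -> list of nested rows."""
    cases = defaultdict(list)
    for order_number, line_number, model in rows:
        cases[order_number].append({"line": line_number, "model": model})
    # Order each nested table by line number (the sequence key in this sketch).
    for nested in cases.values():
        nested.sort(key=lambda row: row["line"])
    return dict(cases)


def filter_cases(cases, model):
    """Keep only the cases whose nested table contains the given product."""
    return {
        key: nested
        for key, nested in cases.items()
        if any(row["model"] == model for row in nested)
    }


cases = build_cases(ORDER_LINES)
print(sorted(filter_cases(cases, "Sport-100")))  # ['SO1001', 'SO1002']
```

The filter works on the nested rows but returns whole cases, which is the general idea behind filtering on a nested table: the condition is tested inside each case, and the case as a whole is kept or dropped.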
This tutorial is divided into the following lessons:
Lesson 1: Creating the Intermediate Data Mining Solution (Intermediate Data Mining Tutorial)
In this lesson, you will create a new project based on the AdventureWorksDW2008R2 database to support several new data source views and many more mining models.
Lesson 2: Building a Forecasting Scenario (Intermediate Data Mining Tutorial)
In this lesson, you will create a mining model that can be used as part of a forecasting scenario. You will also explore mining models that are built with the Microsoft Time Series algorithm.
You will build models for individual regions, and then build a general model that can be used for cross-prediction.
Lesson 3: Building a Market Basket Scenario (Intermediate Data Mining Tutorial)
In this lesson, you will add a new data source view and learn how to work with nested tables and keys. Based on this data, you will create a mining model that can be used as part of a market basket scenario. You will also explore mining models that are built with the Microsoft Association algorithm.
Lesson 4: Building a Sequence Clustering Scenario (Intermediate Data Mining Tutorial)
In this lesson, you will create a mining model that can be used as part of a sequence clustering scenario. You will also learn how to explore mining models that are built with the Microsoft Sequence Clustering algorithm.
Lesson 5: Building Neural Network and Logistic Regression Models (Intermediate Data Mining Tutorial)
In this lesson, you will create several related mining models, using the Microsoft Neural Network and Microsoft Logistic Regression algorithms. You will also learn to work with data source views to explore data underlying the models.
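As a rough picture of the data preparation behind the forecasting lesson, the sketch below groups sales rows into one series per region, ordered by a time key, and also builds a single combined series. This is only an illustration in Python with made-up figures. The choice of averaging across regions for the combined series is an assumption of this sketch, not a statement of how the tutorial builds its general model.

```python
from collections import defaultdict

# Hypothetical monthly sales rows: (region, month, quantity).
SALES = [
    ("Europe", "2008-01", 10),
    ("Europe", "2008-02", 14),
    ("Pacific", "2008-01", 6),
    ("Pacific", "2008-02", 8),
]


def series_by_region(rows):
    """Return region -> list of (month, quantity), sorted by the time key."""
    series = defaultdict(dict)
    for region, month, quantity in rows:
        series[region][month] = quantity
    return {region: sorted(points.items()) for region, points in series.items()}


def general_series(rows):
    """Return one combined series: the average quantity per month (assumed)."""
    by_month = defaultdict(list)
    for _region, month, quantity in rows:
        by_month[month].append(quantity)
    return [(month, sum(v) / len(v)) for month, v in sorted(by_month.items())]


print(series_by_region(SALES)["Europe"])  # [('2008-01', 10), ('2008-02', 14)]
print(general_series(SALES))              # [('2008-01', 8.0), ('2008-02', 11.0)]
```

Here the month is the time key and the region identifies the series. Counting the points in each per-region series is also a quick way to see whether a region has enough history to support its own model.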
Make sure that the following are installed:
Microsoft SQL Server 2008 R2
Microsoft SQL Server Analysis Services
The AdventureWorksDW2008R2 sample database, attached to your instance of SQL Server
To enhance security, the sample databases are not installed by default. To install the official sample databases for Microsoft SQL Server, visit the Microsoft SQL Sample Databases page and select SQL Server 2008R2.
When you are working through a tutorial, you might find it easier to move back and forth between the steps if you add the Next topic and Previous topic buttons to the document viewer toolbar. For more information, see Adding Next and Previous Buttons to Help.