Analyze Key Influencers Video Tutorial (Data Mining Table Analysis Tool)

Video Summary

In this tutorial we will learn how to use the Analyze Key Influencers Table Analysis Tool for Excel 2007.

Video Transcript


Hi, my name is Mary Brennan. I’m a technical writer for Microsoft SQL Server.

The Analyze Key Influencers tool uses the Microsoft Clustering algorithm to enable you to select a column that contains a desired outcome or target value, and then analyze the patterns in your data to determine which factors had the strongest influence on that outcome. For example, if you have a customer list that includes a column that shows the total purchases for each customer over the past year, you could analyze the table to determine the common characteristics of your top purchasers.

The tool also enables you to select a pair of possible outcomes and compare them. For example, if the column that you are predicting contains discrete values such as Purchase = Yes, Purchase = No, or Purchase = Yes but not now, you can compare the customers who purchase immediately against those who purchase later, and those who purchase later against those who never purchase. This helps determine the important factors that influence purchasing.

In this tutorial we’ll discover the factors that influence the decision to purchase a bike. We can then use this information to identify customers who are most likely to purchase our products.

The Wizard

  1. To begin, select the Table Analysis Tools Sample tab and click anywhere inside the table to activate the Table Analysis Tools.

  2. Under the Table Tools menu select the Analyze tab to open the Table Analysis Tools ribbon.

  3. Double-click Analyze Key Influencers to launch the wizard.

  4. In the wizard dialog, select the column to analyze. The wizard automatically defaults to the first column in the worksheet but we want to look at bike purchase. If we click Run now, the analysis will be performed on all of the columns. Instead, we will…

  5. Click Choose columns to be used for analysis. In the Advanced Columns Selection dialog box, choose the columns that are most likely to influence bike sales. We’d like to find out how Marital Status, Number of Cars Owned, Region, and Gender affect bike sales. By streamlining our analysis we improve performance and accuracy.

  6. Click Run. If patterns are detected, the wizard creates a report on a new worksheet that shows those patterns. The report is named Key Influencers for <Purchased Bike>.If no patterns are detected, the wizard creates a new worksheet that contains a description of the problem.

The Reports

Open the Key Influencers for Purchased Bike report. The table reports the factors associated with each outcome and graphically displays the probability of the relationship.

  1. Looking at our results, the first thing I notice is that Gender is not listed which means it doesn’t have a significant influence on bike buying.

  2. The results are listed from most influential to least influential for each possible Purchased Bike value. The most significant influencer for NOT purchasing a bike is having 2 cars. The most significant influencer for purchasing a bike is having zero cars.

  3. Now let’s customize this report even more and explore the results in more depth. In the Discrimination based on key influencers dialog, we can create a new report that compares each of the possible outcomes. What we are actually doing is filtering the results and drilling down into the model to discover the specific attributes and to highlight differences between two key influencers. In this case we only have two – Yes and No.

  4. Click Add Report.

    The wizard creates a new table for each pair of factor comparisons. Because we only have 2 values, this table is a different view of the same data.

  5. When you are done making comparisons, click Close.

In this tutorial we looked at a column that had only 2 values – yes and no. In more complex examples, you can analyze influencers on an outcome that has many values. For example, Number of Bikes Purchased could have values of: 0, 1, 2, 3, 4, and so on. Or, using the same database we could look at key influencers for Region and then drill down and compare Pacific and North America.


This concludes the Analyze Key Influencers video tutorial. For additional help with the Table Analysis Tools, I recommend viewing the other Table Analysis Tools video tutorials and the Help documentation included with the Data Mining Add-ins for Excel. Thank you for viewing this tutorial.