Sample 5 - Classification: Predict churn, appetency, and up-selling

Learn how to use the visual interface to build a complex machine learning experiment without writing a single line of code.

This experiment trains three two-class boosted decision tree classifiers to predict three common customer relationship management (CRM) tasks: churn, appetency, and up-selling. The data values and labels are split across multiple data sources and scrambled to anonymize customer information. You can still use the visual interface to combine the datasets and train a model on the scrambled values.

Because you're trying to answer the question "Which one?", this is called a classification problem. You can apply the same logic used in this experiment to any type of machine learning problem, whether regression, classification, or clustering.

Here's the completed graph for this experiment:

Experiment graph

Prerequisites

  1. Create an Azure Machine Learning service workspace if you don't have one.

  2. Open your workspace in the Azure portal. If you're not sure how to locate your workspace in the portal, see how to find your workspace.

  3. In your workspace, select Visual interface. Then select Launch visual interface.

    Launch visual interface

    The interface webpage opens in a new browser page.

You can also access the visual interface from your workspace landing page (preview).

  4. Select the Open button for the Sample 5 experiment.

    Open the experiment

Data

The data for this experiment is from KDD Cup 2009. It has 50,000 rows and 230 feature columns. The task is to use these features to predict churn, appetency, and up-selling for customers. For more information about the data and the task, see the KDD website.

Experiment summary

This visual interface sample experiment shows how to use binary classifiers to predict churn, appetency, and up-selling, which are common tasks in customer relationship management (CRM).

First, do some simple data processing.

  • The raw dataset contains many missing values. Use the Clean Missing Data module to replace the missing values with 0.

    Clean the dataset

  • The features and the corresponding churn, appetency, and up-selling labels are in different datasets. Use the Add Columns module to append the label columns to the feature columns. The first column, Col1, is the label column. The rest of the columns, Var1, Var2, and so on, are the feature columns.

    Add the column dataset

  • Use the Split Data module to split the dataset into train and test sets.

Then use the Boosted Decision Tree binary classifier with the default parameters to build the prediction models. Build one model per task: one model each to predict up-selling, appetency, and churn.
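For readers who want to see what the data processing steps above amount to outside the visual interface, here is a minimal sketch in plain Python. It is not how the modules are implemented. The function names, the toy values, the use of None for a missing value, and the 75/25 split ratio are all assumptions made for illustration; the sample itself does not state a split ratio.

```python
import random

def clean_missing(rows, fill=0.0):
    """Replace missing values (None here, by assumption) with a constant."""
    return [[fill if value is None else value for value in row] for row in rows]

def add_columns(labels, features):
    """Put the label in the first column, followed by the feature columns."""
    if len(labels) != len(features):
        raise ValueError("label and feature datasets must have the same row count")
    return [[label] + list(row) for label, row in zip(labels, features)]

def split_data(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of the rows, then cut it into train and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy stand-ins for the feature dataset and one label dataset (hypothetical values).
features = [[1.5, None], [None, 3.0], [2.0, 4.0], [0.5, None]]
churn_labels = [1, 0, 0, 1]

cleaned = clean_missing(features)               # missing values become 0.0
combined = add_columns(churn_labels, cleaned)   # label first, then Var1, Var2, ...
train, test = split_data(combined, train_fraction=0.75, seed=42)

print(len(train), "training rows,", len(test), "test rows")
```

Repeating the add_columns and split_data steps with the appetency and up-selling labels would give one train/test pair per task, matching the one-model-per-task layout of the experiment.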

Results

Visualize the output of the Evaluate Model module to see the performance of the model on the test set. For the up-selling task, the ROC curve shows that the model does better than a random model. The area under the curve (AUC) is 0.857. At threshold 0.5, the precision is 0.7, the recall is 0.463, and the F1 score is 0.545.

Evaluate the results

You can move the Threshold slider and see the metrics change for the binary classification task.
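To make the effect of the threshold concrete, the following sketch recomputes precision, recall, and F1 score for a handful of made-up scores at several thresholds. The labels and scores are hypothetical and are not taken from the experiment. Treating a score greater than or equal to the threshold as a positive prediction is also an assumption; the visual interface may handle ties differently.

```python
def metrics_at_threshold(labels, scores, threshold=0.5):
    """Return (precision, recall, f1) for binary labels (1 or 0) and scores."""
    tp = fp = fn = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold  # assumed tie-handling rule
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1

# Hypothetical test-set labels and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1]

for threshold in (0.25, 0.5, 0.75):
    precision, recall, f1 = metrics_at_threshold(labels, scores, threshold)
    print(f"threshold={threshold:.2f} precision={precision:.3f} "
          f"recall={recall:.3f} f1={f1:.3f}")
```

On this toy data, raising the threshold trades recall for precision, which is the same kind of change you see when you move the slider.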

Clean up resources

Important

You can use the resources that you created as prerequisites for other Azure Machine Learning service tutorials and how-to articles.

Delete everything

If you don't plan to use anything that you created, delete the entire resource group so you don't incur any charges:

  1. In the Azure portal, select Resource groups on the left side of the window.

    Delete resource group in the Azure portal

  2. In the list, select the resource group that you created.

  3. On the right side of the window, select the ellipsis button (...).

  4. Select Delete resource group.

Deleting the resource group also deletes all resources that you created in the visual interface.

Delete only the compute target

The compute target that you created here automatically scales down to zero nodes when it's not being used, which minimizes charges. If you want to delete the compute target, take these steps:

  1. In the Azure portal, open your workspace.

    Delete the compute target

  2. In the Compute section of your workspace, select the resource.

  3. Select Delete.

Delete individual assets

In the visual interface where you created your experiment, delete individual assets by selecting them and then selecting the Delete button.

Delete experiments

Next steps

Explore the other samples available for the visual interface: