This article introduces the modules provided in Azure Machine Learning Studio for anomaly detection. Anomaly detection encompasses many important tasks in machine learning:
- Identifying transactions that are potentially fraudulent.
- Learning patterns that indicate that a network intrusion has occurred.
- Finding abnormal clusters of patients.
- Checking values entered into a system.
Because anomalies are rare events by definition, it can be difficult to collect a representative sample of data to use for modeling. The algorithms included in this category have been especially designed to address the core challenges of building and training models by using imbalanced data sets.
Anomaly detection modules
Machine Learning Studio provides the modules listed under List of modules, which you can use to create an anomaly detection model. To begin working with a model, drag its module into your experiment.
After setting model parameters, you must train the model by using a labeled data set and the Train Anomaly Detection Model training module. The result is a trained model that you can use to test new data. To do this, use the all-purpose Score Model module.
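The three stages described above (set parameters, train, score) can be pictured with a small sketch. This is not Studio code: the Studio modules are configured in the visual designer. The names `DetectorConfig`, `train_anomaly_model`, and `score_model` are hypothetical, the detector is a simple z-score rule in plain Python chosen only for illustration, and the sketch assumes the training values are all normal examples.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class DetectorConfig:
    """Untrained model: parameters only (hypothetical stand-in for a model module)."""
    threshold: float = 3.0  # flag values this many standard deviations from the mean


@dataclass
class TrainedDetector:
    """Trained model: the parameters plus statistics learned from training data."""
    config: DetectorConfig
    center: float
    spread: float


def train_anomaly_model(config, training_values):
    # Training stage: learn what "normal" looks like from the training set.
    return TrainedDetector(config, mean(training_values), stdev(training_values))


def score_model(model, new_values):
    # Scoring stage: apply the trained model to data it has not seen.
    results = []
    for value in new_values:
        z = abs(value - model.center) / model.spread
        results.append((value, z, z > model.config.threshold))
    return results


# Illustrative data only: training values assumed to be normal.
training = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0]
model = train_anomaly_model(DetectorConfig(threshold=3.0), training)
scored = score_model(model, [10.1, 14.5])
for value, z, is_anomaly in scored:
    print(f"value={value} z={z:.1f} anomaly={is_anomaly}")
```

Keeping the untrained configuration, the trained model, and the scoring step as separate objects mirrors the way the untrained model, the training module, and Score Model are separate pieces of an experiment.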
For an example of how these modules work together, see the Anomaly Detection: Credit Risk experiment in the Cortana Intelligence Gallery.
Time Series Anomaly Detection is a new module that differs from the other anomaly detection models. It is designed for time series data and is intended for analyzing trends over time. The algorithm identifies potentially anomalous trends in the time series data by flagging deviations from the trend's direction or magnitude.
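To give a rough feel for flagging deviations in a time series, here is a minimal sketch in plain Python. It is not the algorithm used by the Time Series Anomaly Detection module, which this article does not describe. It assumes a much simpler rule: compare each point with the mean and standard deviation of a trailing window. The function name `flag_trend_deviations`, the `window` and `threshold` parameters, and the readings are all hypothetical.

```python
from statistics import mean, stdev


def flag_trend_deviations(series, window=5, threshold=3.0):
    """Flag points that differ sharply from the preceding `window` values."""
    flags = []
    for i, value in enumerate(series):
        history = series[max(0, i - window):i]
        if len(history) < window:
            flags.append(False)  # not enough history to judge this point
            continue
        baseline = mean(history)
        spread = stdev(history)
        # A zero spread means the history is constant; skip flagging in that case.
        flags.append(spread > 0 and abs(value - baseline) > threshold * spread)
    return flags


# Illustrative readings with one obvious spike.
readings = [20.0, 20.5, 19.8, 20.2, 20.1, 20.3, 19.9, 35.0, 20.0, 20.2]
flags = flag_trend_deviations(readings)
anomalous_indices = [i for i, flagged in enumerate(flags) if flagged]
print(anomalous_indices)
```

A trailing window reacts only to recent behavior, so a spike also inflates the spread used for the next few points. That is one reason dedicated time series methods are preferred over a rule this simple.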
Azure also provides the Machine Learning Anomaly Detection API, which you can call as a web service.
If you're not sure whether anomaly detection is the right approach for your data, see these guides:
- Machine learning algorithm cheat sheet for Azure Machine Learning provides a graphical decision chart to guide you through the selection process.
- Choose Azure Machine Learning algorithms for clustering, classification, or regression.
List of modules
The Anomaly Detection category includes the following modules:
- One-Class Support Vector Machine: Creates a one-class support vector machine model for anomaly detection.
- PCA-Based Anomaly Detection: Creates an anomaly detection model by using Principal Component Analysis.
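The two techniques behind these modules can be tried outside Studio. The sketch below uses scikit-learn as a stand-in, not the Studio modules themselves, and the parameter values (`nu`, `gamma`, `n_components`) and the synthetic data are assumptions made for illustration. It trains a one-class SVM on normal points, then uses PCA reconstruction error as a second anomaly score, with the largest training error as a simple, assumed threshold.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic "normal" data lying close to the line y = x (illustration only).
t = rng.normal(0.0, 1.0, size=200)
normal = np.column_stack([t, t + rng.normal(0.0, 0.05, size=200)])

# Two new points: the first follows the pattern, the second is far off the line.
new_points = np.array([[0.5, 0.5], [2.0, -2.0]])

# One-class SVM: learns a boundary around the normal data.
# predict() returns +1 for inliers and -1 for outliers.
svm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(normal)
svm_labels = svm.predict(new_points)


def reconstruction_error(model, X):
    """Distance between each row and its projection onto the principal components."""
    projected = model.inverse_transform(model.transform(X))
    return np.linalg.norm(X - projected, axis=1)


# PCA: keep one component, then measure how poorly each point is reconstructed.
pca = PCA(n_components=1).fit(normal)
threshold = reconstruction_error(pca, normal).max()  # assumed cutoff
pca_flags = reconstruction_error(pca, new_points) > threshold

print("One-class SVM labels:", svm_labels)
print("PCA anomaly flags:", pca_flags)
```

The one-class SVM needs no anomalous examples at all, and the PCA approach flags points that do not fit the dominant structure of the training data. Both properties matter when anomalies are too rare to sample well.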