# SsaSpikeEstimator Class

## Definition

Detect spikes in time series using Singular Spectrum Analysis.

public sealed class SsaSpikeEstimator : Microsoft.ML.IEstimator<Microsoft.ML.Transforms.TimeSeries.SsaSpikeDetector>
type SsaSpikeEstimator = class
interface IEstimator<SsaSpikeDetector>
Public NotInheritable Class SsaSpikeEstimator
Implements IEstimator(Of SsaSpikeDetector)
Inheritance
Object → SsaSpikeEstimator
Implements
IEstimator<SsaSpikeDetector>

## Remarks

To create this estimator, use DetectSpikeBySsa.

### Input and Output Columns

There is only one input column. It must be of type Single, where each value is the observation at one timestamp in the time series.

The estimator produces one output column: a vector with 3 elements. The vector contains, in order, the alert level (a non-zero value means a spike), the score, and the p-value.
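As an illustration of how a consumer might read this vector, here is a minimal Python sketch. The `SpikePrediction` type and the `unpack_prediction` helper are hypothetical names introduced for this example and are not part of ML.NET. The only thing taken from the description above is the element order: alert level, score, p-value.

```python
from typing import NamedTuple, Sequence


class SpikePrediction(NamedTuple):
    """Named view over the 3-element output vector (hypothetical type)."""
    alert: float
    score: float
    p_value: float

    @property
    def is_alert(self) -> bool:
        # Any non-zero alert level marks a detection.
        return self.alert != 0


def unpack_prediction(vector: Sequence[float]) -> SpikePrediction:
    """Map [alert, score, p-value] to named fields, checking the length."""
    if len(vector) != 3:
        raise ValueError("expected a 3-element vector: alert, score, p-value")
    alert, score, p_value = vector
    return SpikePrediction(float(alert), float(score), float(p_value))
```

Naming the fields avoids relying on positional indexes such as `vector[2]` when filtering rows by p-value.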

### Estimator Characteristics

| Characteristic | Value |
| --- | --- |
| Does this estimator need to look at the data to train its parameters? | Yes |
| Input column data type | Single |
| Output column data type | 3-element vector of Double |
| Exportable to ONNX | No |
| Is normalization required? | No |
| Is caching required? | No |
| Required NuGet in addition to Microsoft.ML | Microsoft.ML.TimeSeries |

### Training Algorithm Details

This class implements a general anomaly detection transform based on Singular Spectrum Analysis (SSA). SSA is a framework for decomposing a time series into trend, seasonality, and noise components, and for forecasting its future values. In principle, SSA performs spectral analysis on the input time series, where each component in the spectrum corresponds to a trend, seasonal, or noise component of the series. For details on SSA, refer to this document.

### Anomaly Scorer

Once the raw score at a timestamp is computed, it is fed to the anomaly scorer component to calculate the final anomaly score at that timestamp.

#### Spike detection based on p-value

The p-value score indicates whether the current point is an outlier (also known as a spike). The lower its value, the more likely it is a spike. The p-value score is always in $[0, 1]$.

This score is the p-value of the current raw score under a distribution of raw scores. The distribution is estimated from the most recent raw scores, up to a certain depth back in the history. More specifically, it is estimated using kernel density estimation with Gaussian kernels of adaptive bandwidth.
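The idea of scoring the current raw score against a kernel density estimate of recent raw scores can be sketched in Python as follows. This is an illustration under stated assumptions, not the ML.NET implementation: it computes an upper-tail p-value only, and it swaps the adaptive bandwidth for a fixed normal-reference rule-of-thumb bandwidth. The function names and the `history_length` parameter are hypothetical.

```python
import math
from statistics import stdev


def gaussian_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rule_of_thumb_bandwidth(scores):
    """Fixed normal-reference bandwidth (a simplification, not adaptive)."""
    sigma = stdev(scores)
    # Guard against a zero bandwidth when all scores are identical.
    return max(1.06 * sigma * len(scores) ** -0.2, 1e-12)


def kde_p_value(raw_score, history, history_length=None):
    """Upper-tail p-value of raw_score under a Gaussian KDE of recent scores."""
    recent = list(history)
    if history_length is not None:
        if history_length < 2:
            raise ValueError("history_length must be at least 2")
        # Keep only the most recent raw scores, up to the given depth.
        recent = recent[-history_length:]
    if len(recent) < 2:
        raise ValueError("need at least two past raw scores")
    h = rule_of_thumb_bandwidth(recent)
    # The CDF of the KDE is the average of one Gaussian CDF per past score.
    cdf = sum(gaussian_cdf((raw_score - s) / h) for s in recent) / len(recent)
    # Clamp to [0, 1] to absorb floating-point rounding.
    return min(1.0, max(0.0, 1.0 - cdf))
```

With this sketch, a raw score far above the recent history gets a p-value close to 0, while a score in the middle of the history gets a p-value near 0.5.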

If the p-value score falls below $1 - \frac{\text{confidence}}{100}$, the associated timestamp may get a non-zero alert value in spike detection, which means a spike is detected. Note that $\text{confidence}$ is defined in the signatures of DetectIidSpike and DetectSpikeBySsa.
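A small Python sketch of the thresholding step follows. It reads the rule consistently with the statement that lower p-values indicate spikes, so an alert is raised when the p-value is below $1 - \frac{\text{confidence}}{100}$. The function names are hypothetical, and the sketch assumes $\text{confidence}$ is expressed on a 0 to 100 scale, as the formula implies.

```python
def alert_threshold(confidence):
    """Convert a confidence in [0, 100] to a p-value threshold."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be in [0, 100]")
    return 1.0 - confidence / 100.0


def is_spike(p_value, confidence):
    """Flag a spike when the p-value falls below the threshold."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be in [0, 1]")
    return p_value < alert_threshold(confidence)
```

For example, a confidence of 95 gives a threshold of about 0.05, so a p-value of 0.01 is flagged and a p-value of 0.2 is not. Raising the confidence lowers the threshold and makes alerts rarer.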