series_outliers()

Scores anomaly points in a series.

The function takes an expression with a dynamic numerical array as input, and generates a dynamic numeric array of the same length. Each value of the array indicates a score of a possible anomaly, using "Tukey's test". A value greater than 1.5 in the same element of the input indicates a rise or decline anomaly. A value less than -1.5, indicates a decline anomaly.

Syntax

series_outliers(x, kind, ignore_val, min_percentile, max_percentile)

Arguments

  • x: Dynamic array cell that is an array of numeric values
  • kind: Algorithm of outlier detection. Currently supports "tukey" (traditional "Tukey") and "ctukey" (custom "Tukey"). Default is "ctukey"
  • ignore_val: Numeric value indicating missing values in the series. Default is double(null). The score of nulls and ignore values is set to 0
  • min_percentile: For calculating the normal inter-quantile range. Default is 10, custom values supported are in range [2.0, 98.0] (ctukey only)
  • max_percentile: same, default is 90, custom values supported are in range [2.0, 98.0] (ctukey only)

The following table describes differences between "tukey" and "ctukey":

Algorithm Default quantile range Supports custom quantile range
"tukey" 25% / 75% No
"ctukey" 10% / 90% Yes

Tip

The best way to use this function is to apply it to the results of the make-series operator.

Example

A time series with some noise creates outliers. If you would like to replace those outliers (noise) with the average value, use series_outliers() to detect the outliers, and then replace them.

range x from 1 to 100 step 1 
| extend y=iff(x==20 or x==80, 10*rand()+10+(50-x)/2, 10*rand()+10) // generate a sample series with outliers at x=20 and x=80
| summarize x=make_list(x),series=make_list(y)
| extend series_stats(series), outliers=series_outliers(series)
| mv-expand x to typeof(long), series to typeof(double), outliers to typeof(double)
| project x, series , outliers_removed=iff(outliers > 1.5 or outliers < -1.5, series_stats_series_avg , series ) // replace outliers with the average
| render linechart

Series outliers