Forecasting Anomaly Model

A ForecastingAnomalyModel wraps a Darts forecasting model and one or several anomaly scorers, and computes anomaly scores by comparing the actual values with the model’s forecasts.

class darts.ad.anomaly_model.forecasting_am.ForecastingAnomalyModel(model, scorer)

Bases: AnomalyModel

Forecasting-based Anomaly Detection Model

The forecasting model must be a GlobalForecastingModel that may or may not be already fitted. The underlying assumption is that the model should be able to accurately forecast the series in the absence of anomalies. For this reason, it is recommended to either provide a model that has already been fitted and evaluated to work appropriately on a series without anomalies, or to ensure that a simple call to the model’s fit() method is sufficient to train it to satisfactory performance on a series without anomalies. The fitted model is used to generate forecasts when calling score().

Calling fit() on the anomaly model will fit the underlying forecasting model only if allow_model_training is set to True upon calling fit(). In addition, calling fit() will also fit the fittable scorers, if any.

Parameters
  • model (GlobalForecastingModel) – An instance of a Darts forecasting model.

  • scorer (Union[AnomalyScorer, Sequence[AnomalyScorer]]) – One or multiple scorer(s) that will be used to compare the actual and predicted time series in order to obtain an anomaly score TimeSeries. If a list of N scorers is given, the anomaly model will call each of the scorers and output a list of N anomaly score TimeSeries.
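As an illustration of the idea (not of the Darts implementation), the minimal pure-Python sketch below compares actual values with forecasts and returns one score series per scorer. The naive last-value forecaster and the two scorer functions are hypothetical stand-ins for a GlobalForecastingModel and AnomalyScorer instances.

```python
from typing import Callable, List, Sequence

# Hypothetical scorer type: maps (actual, predicted) to a per-point anomaly score.
Scorer = Callable[[Sequence[float], Sequence[float]], List[float]]


def naive_forecasts(series: Sequence[float]) -> List[float]:
    """One-step-ahead forecasts: each point is predicted as the previous value."""
    return list(series[:-1])


def abs_diff_scorer(actual: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Absolute deviation between actual and predicted values."""
    return [abs(a - p) for a, p in zip(actual, predicted)]


def squared_diff_scorer(actual: Sequence[float], predicted: Sequence[float]) -> List[float]:
    """Squared deviation between actual and predicted values."""
    return [(a - p) ** 2 for a, p in zip(actual, predicted)]


def score_series(series: Sequence[float], scorers: Sequence[Scorer]) -> List[List[float]]:
    """Return one list of anomaly scores per scorer (N scorers give N outputs)."""
    predicted = naive_forecasts(series)
    actual = series[1:]  # the first point has no forecast to compare against
    return [scorer(actual, predicted) for scorer in scorers]


series = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0]  # the value 9.0 plays the role of an anomaly
scores = score_series(series, [abs_diff_scorer, squared_diff_scorer])
print(scores[0])  # scores are high around the spike and zero elsewhere
```

With this naive forecaster the score is high both at the spike and at the point right after it, because the spike itself becomes the next forecast. A model that forecasts anomaly-free data well would confine the high scores to the anomaly.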

Attributes

scorers_are_trainable

Whether any of the Scorers is trainable.

scorers_are_univariate

Whether any of the Scorers is univariate.

Methods

eval_metric(anomalies, series[, ...])

Compute the accuracy of the anomaly scores computed by the model.

fit(series[, past_covariates, ...])

Fit the underlying forecasting model (if applicable) and the fittable scorers, if any.

predict_series(series[, past_covariates, ...])

Computes the historical forecasts that would have been obtained by the underlying forecasting model on series.

score(series[, past_covariates, ...])

Compute anomaly score(s) for the given series.

show_anomalies(series[, past_covariates, ...])

Plot the results of the anomaly model.

eval_metric(anomalies, series, past_covariates=None, future_covariates=None, forecast_horizon=1, start=None, start_format='value', num_samples=1, verbose=False, show_warnings=True, enable_optimization=True, metric='AUC_ROC')

Compute the accuracy of the anomaly scores computed by the model.

Predicts the series with the forecasting model, and applies the scorer(s) to the predicted time series and the given target time series. Returns the value(s) of a threshold-agnostic metric, computed from the anomaly score(s) given by the scorer(s).

Parameters
  • anomalies (Union[TimeSeries, Sequence[TimeSeries]]) – The (sequence of) ground truth binary anomaly series (1 if it is an anomaly and 0 if not).

  • series (Union[TimeSeries, Sequence[TimeSeries]]) – The (sequence of) series to predict anomalies on.

  • past_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, a (sequence of) past-observed covariate series. This applies only to models that support past covariates.

  • future_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, a (sequence of) future-known covariate series. This applies only to models that support future covariates.

  • forecast_horizon (int) – The forecast horizon for the predictions.

  • start (Union[Timestamp, float, int, None]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.

  • start_format (Literal[‘position’, ‘value’]) – Defines the start format. Only effective when start is an integer and series is indexed with a pd.RangeIndex. If set to ‘position’, start corresponds to the index position of the first predicted point and can range from (-len(series), len(series) - 1). If set to ‘value’, start corresponds to the index value/label of the first predicted point. Will raise an error if the value is not in series’ index. Default: ‘value’

  • num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • verbose (bool) – Whether to print progress.

  • show_warnings (bool) – Whether to show warnings related to historical forecasts optimization, or parameters start and train_length.

  • enable_optimization (bool) – Whether to use the optimized version of historical_forecasts when supported and available. Default: True.

  • metric (Literal[‘AUC_ROC’, ‘AUC_PR’]) – The name of the metric function to use. Must be either “AUC_ROC” (Area Under the Receiver Operating Characteristic Curve) or “AUC_PR” (Average Precision from scores). Default: “AUC_ROC”.
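The interpretation of start and start_format described in the parameter list above can be sketched as follows. This is a simplified illustration on a plain list of integer index values standing in for a pd.RangeIndex. resolve_start is a hypothetical helper, pandas.Timestamp handling is omitted, and both the rounding used for float values and the inclusive bounds for 'position' are assumptions that may differ from Darts.

```python
from typing import List, Union


def resolve_start(index: List[int], start: Union[int, float], start_format: str = "value") -> int:
    """Return the position in `index` of the first predicted point (hypothetical helper)."""
    n = len(index)
    if isinstance(start, float):
        # float: proportion of the series that lies before the first prediction point
        if not 0.0 <= start < 1.0:
            raise ValueError("a float start must be in [0, 1)")
        return int(n * start)  # rounding convention is an assumption of this sketch
    if start_format == "position":
        # int + 'position': index position, negative values count from the end
        if not -n <= start <= n - 1:
            raise ValueError("position is out of range")
        return start % n
    if start_format == "value":
        # int + 'value': label that must be present in the index
        if start not in index:
            raise ValueError(f"{start} is not in the index")
        return index.index(start)
    raise ValueError("start_format must be 'position' or 'value'")


index = list(range(10, 20))  # labels 10..19, as in a RangeIndex starting at 10
print(resolve_start(index, 0.5))             # half of the points lie before the start
print(resolve_start(index, 12, "value"))     # label 12 sits at position 2
print(resolve_start(index, -1, "position"))  # last position
```

The difference between the two formats only matters when the index does not start at 0: here the label 12 and the position 12 refer to different points, and position 12 does not exist at all.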

Return type

Union[Dict[str, float], Dict[str, Sequence[float]], Sequence[Dict[str, float]], Sequence[Dict[str, Sequence[float]]]]

Returns

  • Dict[str, float] – A dictionary with the resulting metrics for single univariate series, with keys representing the anomaly scorer(s), and values representing the metric values.

  • Dict[str, Sequence[float]] – Same as for Dict[str, float] but for multivariate series, and anomaly scorers that treat series components/columns independently (by nature of the scorer or if component_wise=True).

  • Sequence[Dict[str, float]] – Same as for Dict[str, float] but for a sequence of univariate series.

  • Sequence[Dict[str, Sequence[float]]] – Same as for Dict[str, float] but for a sequence of multivariate series.
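To make the notion of a threshold-agnostic metric concrete, the sketch below computes both metrics from a binary ground truth and an anomaly score series in plain Python. The functions auc_roc and average_precision are hypothetical helpers written for illustration, not the routines Darts calls internally, and the average precision version ignores tie handling.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random anomaly gets a higher score than a random normal point."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal point")
    # Count pairwise wins of anomalies over normal points; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at each rank where a true anomaly is found."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one anomalous point")
    return total / hits


labels = [0, 0, 1, 0, 1, 0]                   # ground truth: 1 marks an anomaly
scores = [0.1, 0.2, 0.9, 0.3, 0.25, 0.05]     # anomaly scores from some scorer
print(auc_roc(labels, scores))
print(average_precision(labels, scores))
```

Neither function needs a decision threshold: both depend only on how the scores rank anomalous points relative to normal ones.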

fit(series, past_covariates=None, future_covariates=None, allow_model_training=False, forecast_horizon=1, start=None, start_format='value', num_samples=1, verbose=False, show_warnings=True, enable_optimization=True, **model_fit_kwargs)

Fit the underlying forecasting model (if applicable) and the fittable scorers, if any.

Train the forecasting model (if not already fitted and allow_model_training is True) and the fittable scorer(s) on the given time series.

We use the trained forecasting model to compute historical forecasts for the input series. The scorer(s) are then trained on these forecasts along with the input series.

Parameters
  • series (Union[TimeSeries, Sequence[TimeSeries]]) – The (sequence of) series to train on (generally assumed to be anomaly-free).

  • past_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, a (sequence of) past-observed covariate series. This applies only to models that support past covariates.

  • future_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, a (sequence of) future-known covariate series. This applies only to models that support future covariates.

  • allow_model_training (bool) – Whether the forecasting model should be fitted on the given series. If False, the model must already be fitted.

  • forecast_horizon (int) – The forecast horizon for the predictions.

  • start (Union[Timestamp, float, int, None]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.

  • start_format (Literal[‘position’, ‘value’]) – Defines the start format. Only effective when start is an integer and series is indexed with a pd.RangeIndex. If set to ‘position’, start corresponds to the index position of the first predicted point and can range from (-len(series), len(series) - 1). If set to ‘value’, start corresponds to the index value/label of the first predicted point. Will raise an error if the value is not in series’ index. Default: ‘value’

  • num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • verbose (bool) – Whether to print progress.

  • show_warnings (bool) – Whether to show warnings related to historical forecasts optimization, or parameters start and train_length.

  • enable_optimization (bool) – Whether to use the optimized version of historical_forecasts when supported and available. Default: True.

  • model_fit_kwargs – Parameters to be passed on to the forecast model fit() method.

Returns

Fitted model

Return type

self
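The fitting flow described above (optionally train the forecasting model, compute forecasts on the training series, then train the scorer on the forecasts and the series) can be sketched in plain Python. All three classes are hypothetical toys, not Darts classes. Whether an already fitted model is refitted when allow_model_training is True is a choice made in this sketch.

```python
from statistics import fmean, pstdev


class MeanForecaster:
    """Hypothetical toy model: always forecasts the mean of its training data."""

    def __init__(self):
        self.mean = None

    @property
    def fitted(self):
        return self.mean is not None

    def fit(self, series):
        self.mean = fmean(series)
        return self

    def historical_forecasts(self, series):
        return [self.mean] * len(series)


class ZScoreScorer:
    """Hypothetical fittable scorer: learns the spread of residuals on clean data."""

    def fit(self, actual, predicted):
        residuals = [a - p for a, p in zip(actual, predicted)]
        self.mu = fmean(residuals)
        self.sigma = pstdev(residuals) or 1.0  # avoid division by zero
        return self

    def score(self, actual, predicted):
        return [abs((a - p) - self.mu) / self.sigma for a, p in zip(actual, predicted)]


class ToyAnomalyModel:
    def __init__(self, model, scorer):
        self.model, self.scorer = model, scorer

    def fit(self, series, allow_model_training=False):
        if allow_model_training:
            self.model.fit(series)
        elif not self.model.fitted:
            raise ValueError("model must already be fitted when allow_model_training=False")
        # The scorer is trained on the forecasts together with the input series.
        self.scorer.fit(series, self.model.historical_forecasts(series))
        return self

    def score(self, series):
        return self.scorer.score(series, self.model.historical_forecasts(series))


train = [10.0, 12.0, 8.0, 10.0]  # assumed anomaly-free
anomaly_model = ToyAnomalyModel(MeanForecaster(), ZScoreScorer()).fit(train, allow_model_training=True)
scores = anomaly_model.score([10.0, 20.0])
print(scores)  # the second point deviates strongly from what was seen in training
```

Returning self from fit() mirrors the return value documented above and allows the construction and fitting to be chained in one expression.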

predict_series(series, past_covariates=None, future_covariates=None, forecast_horizon=1, start=None, start_format='value', num_samples=1, verbose=False, show_warnings=True, enable_optimization=True)

Computes the historical forecasts that would have been obtained by the underlying forecasting model on series.

The forecasts are computed with retrain set to False where possible (not all models support this). If retrain is set to True, the model is always re-trained on the entire available history.

Parameters
  • series (Sequence[TimeSeries]) – The sequence of series to score on.

  • past_covariates (Optional[Sequence[TimeSeries]]) – Optionally, a sequence of past-observed covariate series. This applies only to models that support past covariates.

  • future_covariates (Optional[Sequence[TimeSeries]]) – Optionally, a sequence of future-known covariate series. This applies only to models that support future covariates.

  • forecast_horizon (int) – The forecast horizon for the predictions.

  • start (Union[Timestamp, float, int, None]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.

  • start_format (Literal[‘position’, ‘value’]) – Defines the start format. Only effective when start is an integer and series is indexed with a pd.RangeIndex. If set to ‘position’, start corresponds to the index position of the first predicted point and can range from (-len(series), len(series) - 1). If set to ‘value’, start corresponds to the index value/label of the first predicted point. Will raise an error if the value is not in series’ index. Default: ‘value’

  • num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • verbose (bool) – Whether to print progress.

  • show_warnings (bool) – Whether to show warnings related to historical forecasts optimization, or parameters start and train_length.

  • enable_optimization (bool) – Whether to use the optimized version of historical_forecasts when supported and available. Default: True.

Returns

A sequence of TimeSeries with the historical forecasts for each series (with last_points_only=True).

Return type

Sequence[TimeSeries]
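The following sketch illustrates what historical forecasts with last_points_only=True look like for a given forecast_horizon. It uses plain lists and a naive last-value forecaster as a stand-in for the real model. historical_last_points is a hypothetical helper, and the way start and the horizon interact here is an assumption of the sketch rather than a statement about Darts internals.

```python
from typing import List, Sequence, Tuple


def historical_last_points(
    series: Sequence[float], forecast_horizon: int = 1, start: int = 1
) -> List[Tuple[int, float]]:
    """Roll through `series` without retraining and keep the last point of each forecast.

    Returns (time position, forecast value) pairs.
    """
    kept = []
    # `t` is the position of the first forecasted point of each rolling forecast.
    for t in range(start, len(series) - forecast_horizon + 1):
        history = series[:t]                          # only data before t is visible
        forecast = [history[-1]] * forecast_horizon   # naive model: repeat the last value
        kept.append((t + forecast_horizon - 1, forecast[-1]))
    return kept


series = [1, 2, 3, 4, 5]
print(historical_last_points(series, forecast_horizon=1))
print(historical_last_points(series, forecast_horizon=2))
```

With a horizon of 2, each kept value was predicted two steps ahead, so the result starts one step later and is based on older information than with a horizon of 1.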

score(series, past_covariates=None, future_covariates=None, forecast_horizon=1, start=None, start_format='value', num_samples=1, verbose=False, show_warnings=True, enable_optimization=True, return_model_prediction=False)

Compute anomaly score(s) for the given series.

Predicts the given target time series with the forecasting model, and applies the scorer(s) on the prediction and the target input time series.

Parameters
  • series (Union[TimeSeries, Sequence[TimeSeries]]) – The (sequence of) series to score on.

  • past_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, a (sequence of) past-observed covariate series. This applies only to models that support past covariates.

  • future_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, a (sequence of) future-known covariate series. This applies only to models that support future covariates.

  • forecast_horizon (int) – The forecast horizon for the predictions.

  • start (Union[Timestamp, float, int, None]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.

  • start_format (Literal[‘position’, ‘value’]) – Defines the start format. Only effective when start is an integer and series is indexed with a pd.RangeIndex. If set to ‘position’, start corresponds to the index position of the first predicted point and can range from (-len(series), len(series) - 1). If set to ‘value’, start corresponds to the index value/label of the first predicted point. Will raise an error if the value is not in series’ index. Default: ‘value’

  • num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • verbose (bool) – Whether to print progress.

  • show_warnings (bool) – Whether to show warnings related to historical forecasts optimization, or parameters start and train_length.

  • enable_optimization (bool) – Whether to use the optimized version of historical_forecasts when supported and available. Default: True.

  • return_model_prediction (bool) – Whether to return the forecasting model prediction along with the anomaly scores.

Return type

Union[TimeSeries, Sequence[TimeSeries], Sequence[Sequence[TimeSeries]]]

Returns

  • TimeSeries – A single TimeSeries for a single series with a single anomaly scorer.

  • Sequence[TimeSeries] – A sequence of TimeSeries for:

    • a single series with multiple anomaly scorers.

    • a sequence of series with a single anomaly scorer.

  • Sequence[Sequence[TimeSeries]] – A sequence of sequences of TimeSeries for a sequence of series and multiple anomaly scorers. The outer sequence is over the series, and inner sequence is over the scorers.
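The nesting rules above can be sketched with plain lists in place of TimeSeries. The score function below is a hypothetical toy that uses a naive last-value forecast and function-based scorers. It only illustrates how the output shape follows the shape of the inputs.

```python
def abs_diff(actual, predicted):
    return [abs(a - p) for a, p in zip(actual, predicted)]


def sq_diff(actual, predicted):
    return [(a - p) ** 2 for a, p in zip(actual, predicted)]


def score(series, scorers):
    """Toy score(): the output nesting mirrors the single/sequence inputs."""
    single_series = not isinstance(series[0], (list, tuple))
    single_scorer = callable(scorers)
    series_list = [series] if single_series else list(series)
    scorer_list = [scorers] if single_scorer else list(scorers)

    nested = []  # outer level: series, inner level: scorers
    for s in series_list:
        predicted, actual = s[:-1], s[1:]  # naive one-step-ahead forecast
        nested.append([scorer(actual, predicted) for scorer in scorer_list])

    if single_scorer:
        nested = [per_series[0] for per_series in nested]  # drop the scorer level
    return nested[0] if single_series else nested          # drop the series level


a = [1.0, 3.0, 2.0]
b = [0.0, 0.0, 5.0]
print(score(a, abs_diff))                    # one score series
print(score(a, [abs_diff, sq_diff]))         # one entry per scorer
print(score([a, b], abs_diff))               # one entry per series
print(score([a, b], [abs_diff, sq_diff]))    # outer: series, inner: scorers
```

Note that a single series with two scorers and two series with a single scorer both produce a flat sequence, so the caller has to know which input was a sequence in order to interpret the result.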

property scorers_are_trainable

Whether any of the Scorers is trainable.

property scorers_are_univariate

Whether any of the Scorers is univariate.

show_anomalies(series, past_covariates=None, future_covariates=None, forecast_horizon=1, start=None, start_format='value', num_samples=1, verbose=False, show_warnings=True, enable_optimization=True, anomalies=None, names_of_scorers=None, title=None, metric=None, **score_kwargs)

Plot the results of the anomaly model.

Computes the score on the given series input and shows the different anomaly scores with respect to time.

The plot will be composed of the following:

  • the series itself with the output of the forecasting model.

  • the anomaly score for each scorer. The scorers with different windows will be separated.

  • the actual anomalies, if given.

It is possible to:

  • add a title to the figure with the parameter title

  • give personalized names for the scorers with names_of_scorers

  • show the results of a metric for each anomaly score (AUC_ROC or AUC_PR), if the actual anomalies are provided.

Parameters
  • series (TimeSeries) – The series to visualize anomalies from.

  • past_covariates (Optional[TimeSeries]) – Optionally, a past-observed covariate series. This applies only to models that support past covariates.

  • future_covariates (Optional[TimeSeries]) – Optionally, a future-known covariate series. This applies only to models that support future covariates.

  • forecast_horizon (int) – The forecast horizon for the predictions.

  • start (Union[Timestamp, float, int, None]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.

  • start_format (Literal[‘position’, ‘value’]) – Defines the start format. Only effective when start is an integer and series is indexed with a pd.RangeIndex. If set to ‘position’, start corresponds to the index position of the first predicted point and can range from (-len(series), len(series) - 1). If set to ‘value’, start corresponds to the index value/label of the first predicted point. Will raise an error if the value is not in series’ index. Default: ‘value’

  • num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • verbose (bool) – Whether to print progress.

  • show_warnings (bool) – Whether to show warnings related to historical forecasts optimization, or parameters start and train_length.

  • enable_optimization (bool) – Whether to use the optimized version of historical_forecasts when supported and available. Default: True.

  • anomalies (Optional[TimeSeries]) – The ground truth of the anomalies (1 if it is an anomaly and 0 if not).

  • names_of_scorers (Union[str, Sequence[str], None]) – Name(s) of the scorers. Must be a list of length equal to the number of scorers in the anomaly model.

  • title (Optional[str]) – Title of the figure.

  • metric (Optional[Literal[‘AUC_ROC’, ‘AUC_PR’]]) – Optionally, the name of the metric function to use. Must be either “AUC_ROC” (Area Under the Receiver Operating Characteristic Curve) or “AUC_PR” (Average Precision from scores). Default: None.

  • score_kwargs – Parameters to be passed on to the score() method.