Forecasting Anomaly Model
A ForecastingAnomalyModel wraps around a Darts forecasting model and one or several anomaly scorer(s) to compute anomaly scores by measuring how the actual values deviate from the model’s forecasts.
- class darts.ad.anomaly_model.forecasting_am.ForecastingAnomalyModel(model, scorer)
Bases: darts.ad.anomaly_model.anomaly_model.AnomalyModel
Forecasting-based Anomaly Detection Model
The forecasting model may or may not be already fitted. The underlying assumption is that the model should be able to accurately forecast the series in the absence of anomalies. For this reason, it is recommended to either provide a model that has already been fitted and evaluated to work appropriately on a series without anomalies, or to ensure that a simple call to the fit() method of the model will be sufficient to train it to satisfactory performance on a series without anomalies.
Calling fit() on the anomaly model will fit the underlying forecasting model only if allow_model_training is set to True upon calling fit(). In addition, calling fit() will also fit the fittable scorers, if any.
- Parameters
model (ForecastingModel) – An instance of a Darts forecasting model.
scorer (Union[AnomalyScorer, Sequence[AnomalyScorer]]) – One or multiple scorer(s) that will be used to compare the actual and predicted time series in order to obtain an anomaly score TimeSeries. If a list of N scorers is given, the anomaly model will call each one of the scorers and output a list of N anomaly score TimeSeries.
Methods
eval_accuracy(actual_anomalies, series[, ...]) – Compute the accuracy of the anomaly scores computed by the model.
fit(series[, past_covariates, ...]) – Fit the underlying forecasting model (if applicable) and the fittable scorers, if any.
score(series[, past_covariates, ...]) – Compute anomaly score(s) for the given series.
show_anomalies(series[, past_covariates, ...]) – Plot the results of the anomaly model.
- eval_accuracy(actual_anomalies, series, past_covariates=None, future_covariates=None, forecast_horizon=1, start=0.5, num_samples=1, metric='AUC_ROC')
Compute the accuracy of the anomaly scores computed by the model.
Predicts the series with the forecasting model, and applies the scorer(s) to the predicted time series and the given target time series. Returns the score(s) of a threshold-agnostic metric, computed from the anomaly score given by the scorer(s).
- Parameters
actual_anomalies (Union[TimeSeries, Sequence[TimeSeries]]) – The (sequence of) ground truth of the anomalies (1 if it is an anomaly and 0 if not).
series (Union[TimeSeries, Sequence[TimeSeries]]) – The (sequence of) series to predict anomalies on.
past_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – An optional past-observed covariate series or sequence of series. This applies only if the model supports past covariates.
future_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – An optional future-known covariate series or sequence of series. This applies only if the model supports future covariates.
forecast_horizon (int) – The forecast horizon for the predictions.
start (Union[Timestamp, float, int]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In the case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.
num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.
metric (str) – The scoring function to use. Must be one of “AUC_ROC” and “AUC_PR”. Default: “AUC_ROC”
- Return type
Union[Dict[str, float], Dict[str, Sequence[float]], Sequence[Dict[str, float]], Sequence[Dict[str, Sequence[float]]]]
- Returns
Score for the time series. A (sequence of) dictionary with the keys being the name of the scorers, and the values being the metric results on the (sequence of) series. If the scorer treats every dimension independently (by nature of the scorer or if its component_wise is set to True), the values of the dictionary will be a Sequence containing the score for each dimension.
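To make the idea of a threshold-agnostic metric concrete, the sketch below computes AUC-ROC from binary ground truth and anomaly scores using its rank interpretation: the probability that a randomly chosen anomalous point receives a higher score than a randomly chosen normal point. The function auc_roc and the sample values are hypothetical and only illustrate the metric; this is not how the library computes it internally.

```python
def auc_roc(labels, scores):
    """Rank-based AUC-ROC for binary labels (1 = anomaly, 0 = normal)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one anomalous and one normal point")

    # Count (anomaly, normal) pairs where the anomaly scores higher; ties count half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical ground truth and anomaly scores for eight time steps.
actual_anomalies = [0, 0, 0, 1, 0, 0, 1, 0]
anomaly_scores = [0.1, 0.3, 0.2, 0.9, 0.4, 0.1, 0.35, 0.2]
result = auc_roc(actual_anomalies, anomaly_scores)
print(result)
```

No threshold is chosen anywhere: the metric depends only on how the scores rank anomalous points relative to normal ones.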
- fit(series, past_covariates=None, future_covariates=None, allow_model_training=False, forecast_horizon=1, start=0.5, num_samples=1, **model_fit_kwargs)
Fit the underlying forecasting model (if applicable) and the fittable scorers, if any.
Train the model (if not already fitted and allow_model_training is set to True) and the scorer(s) (if fittable) on the given time series.
Once the model is fitted, the series’ historical forecasts are computed, representing what would have been forecasted by this model on the series.
The prediction and the series are then used to train the scorer(s).
- Parameters
series (Union[TimeSeries, Sequence[TimeSeries]]) – One or multiple (if the model supports it) target series to be trained on (generally assumed to be anomaly-free).
past_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optional past-observed covariate series or sequence of series. This applies only if the model supports past covariates.
future_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optional future-known covariate series or sequence of series. This applies only if the model supports future covariates.
allow_model_training (bool) – Boolean value that indicates if the forecasting model needs to be fitted on the given series. If set to False, the model needs to be already fitted. Default: False
forecast_horizon (int) – The forecast horizon for the predictions.
start (Union[Timestamp, float, int]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In the case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly. Default: 0.5
num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.
model_fit_kwargs – Parameters to be passed on to the forecast model fit() method.
- Returns
Fitted model
- Return type
self
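The three accepted types of the start parameter can be illustrated with a small, library-independent sketch. The helper first_prediction_index is hypothetical, uses plain datetime objects in place of pandas.Timestamp, and simplifies the float case to int(len * start); the exact rounding used by the library may differ.

```python
from datetime import datetime, timedelta


def first_prediction_index(time_index, start):
    """Resolve `start` to a position in `time_index` (simplified illustration)."""
    if isinstance(start, float):
        # Proportion of the series that should lie before the first prediction.
        if not 0.0 <= start <= 1.0:
            raise ValueError("a float start must lie between 0 and 1")
        return int(len(time_index) * start)
    if isinstance(start, int):
        # Integer position in the time index.
        return start
    # Otherwise treat it as a time stamp and locate it directly.
    return time_index.index(start)


# Hypothetical daily time index with ten entries.
dates = [datetime(2023, 1, 1) + timedelta(days=i) for i in range(10)]

print(first_prediction_index(dates, 0.5))
print(first_prediction_index(dates, 3))
print(first_prediction_index(dates, datetime(2023, 1, 8)))
```

With the default start=0.5, roughly the first half of the series is therefore used only as history and receives no forecast.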
- score(series, past_covariates=None, future_covariates=None, forecast_horizon=1, start=0.5, num_samples=1, return_model_prediction=False)
Compute anomaly score(s) for the given series.
Predicts the given target time series with the forecasting model, and applies the scorer(s) on the prediction and the target input time series. Outputs the anomaly score of the given input time series.
- Parameters
series (Union[TimeSeries, Sequence[TimeSeries]]) – The (sequence of) series to score on.
past_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – An optional past-observed covariate series or sequence of series. This applies only if the model supports past covariates.
future_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – An optional future-known covariate series or sequence of series. This applies only if the model supports future covariates.
forecast_horizon (int) – The forecast horizon for the predictions.
start (Union[Timestamp, float, int]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In the case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly. Default: 0.5
num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.
return_model_prediction (bool) – Boolean value indicating if the prediction of the model should be returned along with the anomaly score. Default: False
- Returns
Anomaly score series generated by the anomaly model scorers:
- TimeSeries if series is a series, and the anomaly model contains one scorer.
- Sequence[TimeSeries] if series is a series, and the anomaly model contains multiple scorers (one series per scorer), or if series is a sequence, and the anomaly model contains one scorer (one series per series in the sequence).
- Sequence[Sequence[TimeSeries]] if series is a sequence, and the anomaly model contains multiple scorers. The outer sequence is over the series, and the inner sequence is over the scorers.
- Return type
Union[TimeSeries, Sequence[TimeSeries], Sequence[Sequence[TimeSeries]]]
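The nesting rules for the return value can be summarized in a short sketch. The helper arrange_scores is hypothetical and uses strings as stand-ins for anomaly score TimeSeries; it only mirrors the shapes described above and is not the library's code.

```python
def arrange_scores(scores_per_series, series_is_sequence, n_scorers):
    """Shape raw scores (outer list over series, inner list over scorers)."""
    if n_scorers == 1:
        # Drop the scorer dimension when there is a single scorer.
        flat = [per_scorer[0] for per_scorer in scores_per_series]
        return flat if series_is_sequence else flat[0]
    # Several scorers: keep the inner list, drop the outer one for a single series.
    return scores_per_series if series_is_sequence else scores_per_series[0]


# Placeholders named "<series>_<scorer>" stand in for TimeSeries objects.
one_series_one_scorer = arrange_scores([["s0_a"]], False, 1)
one_series_two_scorers = arrange_scores([["s0_a", "s0_b"]], False, 2)
two_series_one_scorer = arrange_scores([["s0_a"], ["s1_a"]], True, 1)
two_series_two_scorers = arrange_scores(
    [["s0_a", "s0_b"], ["s1_a", "s1_b"]], True, 2
)
print(one_series_one_scorer, one_series_two_scorers)
print(two_series_one_scorer, two_series_two_scorers)
```

Note that a flat Sequence[TimeSeries] is ambiguous on its own: whether it runs over scorers or over series depends on what was passed in.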
- show_anomalies(series, past_covariates=None, future_covariates=None, forecast_horizon=1, start=0.5, num_samples=1, actual_anomalies=None, names_of_scorers=None, title=None, metric=None)
Plot the results of the anomaly model.
Computes the score on the given series input and shows the different anomaly scores with respect to time.
The plot will be composed of the following:
- the series itself with the output of the forecasting model.
- the anomaly score for each scorer. The scorers with different windows will be separated.
- the actual anomalies, if given.
It is possible to:
- add a title to the figure with the parameter title
- give personalized names for the scorers with names_of_scorers
- show the results of a metric for each anomaly score (AUC_ROC or AUC_PR), if the actual anomalies are provided.
- Parameters
series (TimeSeries) – The series to visualize anomalies from.
past_covariates (Optional[TimeSeries]) – An optional past-observed covariate series or sequence of series. This applies only if the model supports past covariates.
future_covariates (Optional[TimeSeries]) – An optional future-known covariate series or sequence of series. This applies only if the model supports future covariates.
forecast_horizon (int) – The forecast horizon for the predictions.
start (Union[Timestamp, float, int]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In the case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.
num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.
actual_anomalies (Optional[TimeSeries]) – The ground truth of the anomalies (1 if it is an anomaly and 0 if not).
names_of_scorers (Union[str, Sequence[str], None]) – Names of the scores. Must be a list of length equal to the number of scorers in the anomaly model.
title (Optional[str]) – Title of the figure.
metric (Optional[str]) – Optionally, the scoring function to use. Must be one of “AUC_ROC” and “AUC_PR”. Default: None