Temporal Fusion Transformer (TFT)

class darts.models.forecasting.tft_model.TFTModel(input_chunk_length, output_chunk_length, hidden_size=16, lstm_layers=1, num_attention_heads=4, full_attention=False, feed_forward='GatedResidualNetwork', dropout=0.1, hidden_continuous_size=8, add_relative_index=False, loss_fn=None, likelihood=None, **kwargs)[source]

Bases: darts.models.forecasting.torch_forecasting_model.MixedCovariatesTorchModel

Temporal Fusion Transformers (TFT) for Interpretable Time Series Forecasting.

This is an implementation of the TFT architecture, as outlined in [1].

The internal sub models are adopted from pytorch-forecasting’s TemporalFusionTransformer implementation.

This model supports mixed covariates: past covariates known for input_chunk_length points before prediction time, and future covariates known for output_chunk_length points after prediction time.

The TFT applies multi-head attention queries on future inputs from mandatory future_covariates. Specifying future encoders with add_encoders (read below) can automatically generate future covariates, which lets you use the model without passing any future_covariates to fit() and predict().

By default, this model uses the QuantileRegression likelihood, which means that its forecasts are probabilistic; it is recommended to call predict() with num_samples >> 1 to get meaningful results.
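As a minimal sketch of the setup described above, the dictionaries below collect constructor and prediction arguments for a probabilistic TFT that relies on add_relative_index instead of user-supplied future covariates. The numeric values (24, 12, 200) are illustrative assumptions, not recommendations from the Darts documentation.

```python
# Constructor arguments, using parameter names from the TFTModel signature.
# The chunk lengths are made-up values for illustration.
tft_kwargs = {
    "input_chunk_length": 24,    # encoder length (assumed value)
    "output_chunk_length": 12,   # decoder length (assumed value)
    "hidden_size": 16,
    "lstm_layers": 1,
    "num_attention_heads": 4,
    "dropout": 0.1,
    "add_relative_index": True,  # no future_covariates needed in fit()/predict()
}

# Arguments for predict(): sample many times because the default
# QuantileRegression likelihood makes the forecasts probabilistic.
predict_kwargs = {"n": 12, "num_samples": 200}
```

These would be passed as TFTModel(**tft_kwargs) and model.predict(**predict_kwargs) once a model has been fitted.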

Parameters
  • input_chunk_length (int) – Encoder length; number of past time steps that are fed to the forecasting module at prediction time.

  • output_chunk_length (int) – Decoder length; number of future time steps that are fed to the forecasting module at prediction time.

  • hidden_size (int) – Hidden state size of the TFT. It is the main hyper-parameter and common across the internal TFT architecture.

  • lstm_layers (int) – Number of layers for the Long Short Term Memory (LSTM) Encoder and Decoder (1 is a good default).

  • num_attention_heads (int) – Number of attention heads (4 is a good default).

  • full_attention (bool) – If True, applies multi-head attention query on past (encoder) and future (decoder) parts. Otherwise, only queries on future part. Defaults to False.

  • feed_forward (str) –

    A feedforward network is a fully-connected layer with an activation. It can be one of the GLU-variant feedforward networks (FFN) [2], a series of FFNs designed to work better with Transformer-based models:

    [“GLU”, “Bilinear”, “ReGLU”, “GEGLU”, “SwiGLU”, “ReLU”, “GELU”]

    or the original TFT feedforward network:

    [“GatedResidualNetwork”]

    Defaults to "GatedResidualNetwork".

  • dropout (float) – Fraction of neurons affected by dropout. This is compatible with Monte Carlo dropout at inference time for model uncertainty estimation (enabled with mc_dropout=True at prediction time).

  • hidden_continuous_size (int) – Default for hidden size for processing continuous variables

  • add_relative_index (bool) – Whether to add positional values to future covariates. Defaults to False. This lets you use the TFTModel without passing future_covariates to fit() and predict(). It assigns a value to the position of each step of the input and output chunks relative to the prediction point. The values are normalized with input_chunk_length.

  • loss_fn (nn.Module) – PyTorch loss function used for training. By default, the TFT model is probabilistic and uses a likelihood instead (QuantileRegression). To make the model deterministic, set likelihood to None and pass a loss_fn argument.

  • torch_metrics – A torch metric or a MetricCollection used for evaluation. A full list of available metrics can be found at https://torchmetrics.readthedocs.io/en/latest/. Default: None.

  • likelihood (Optional[Likelihood]) – The likelihood model to be used for probabilistic forecasts. By default, the TFT uses a QuantileRegression likelihood.

  • **kwargs – Optional arguments to initialize the pytorch_lightning.Module, pytorch_lightning.Trainer, and Darts’ TorchForecastingModel.

  • optimizer_cls – The PyTorch optimizer class to be used. Default: torch.optim.Adam.

  • optimizer_kwargs – Optionally, some keyword arguments for the PyTorch optimizer (e.g., {'lr': 1e-3} for specifying a learning rate). Otherwise, the default values of the selected optimizer_cls will be used. Default: None.

  • lr_scheduler_cls – Optionally, the PyTorch learning rate scheduler class to be used. Specifying None corresponds to using a constant learning rate. Default: None.

  • lr_scheduler_kwargs – Optionally, some keyword arguments for the PyTorch learning rate scheduler. Default: None.

  • batch_size – Number of time series (input and output sequences) used in each training pass. Default: 32.

  • n_epochs – Number of epochs over which to train the model. Default: 100.

  • model_name – Name of the model. Used for creating checkpoints and saving tensorboard data. If not specified, defaults to the following string "YYYY-mm-dd_HH:MM:SS_torch_model_run_PID", where the initial part of the name is formatted with the local date and time, while PID is the process ID (preventing models spawned at the same time by different processes from sharing the same model_name). E.g., "2021-06-14_09:53:32_torch_model_run_44607".

  • work_dir – Path of the working directory, where to save checkpoints and Tensorboard summaries. Default: current working directory.

  • log_tensorboard – If set, use Tensorboard to log the different parameters. The logs will be located in: "{work_dir}/darts_logs/{model_name}/logs/". Default: False.

  • nr_epochs_val_period – Number of epochs to wait before evaluating the validation loss (if a validation TimeSeries is passed to the fit() method). Default: 1.

  • torch_device_str

    Optionally, a string indicating the torch device to use. By default, torch_device_str is None which will run on CPU. Set it to "cuda" to use all available GPUs or "cuda:i" to only use GPU i (i must be an integer). For example “cuda:0” will use the first GPU only.

    Deprecated since version v0.17.0: torch_device_str has been deprecated in v0.17.0 and will be removed in a future version. Instead, specify this with keys "accelerator", "gpus", "auto_select_gpus" in your pl_trainer_kwargs dict. Some examples for setting the devices inside the pl_trainer_kwargs dict:

    • {"accelerator": "cpu"} for CPU,

    • {"accelerator": "gpu", "gpus": [i]} to use only GPU i (i must be an integer),

    • {"accelerator": "gpu", "gpus": -1, "auto_select_gpus": True} to use all available GPUs.

    For more info, see here: https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#trainer-flags , and https://pytorch-lightning.readthedocs.io/en/stable/advanced/multi_gpu.html#select-gpu-devices

  • force_reset – If set to True, any previously-existing model with the same name will be reset (all checkpoints will be discarded). Default: False.

  • save_checkpoints – Whether or not to automatically save the untrained model and checkpoints from training. To load the model from checkpoint, call MyModelClass.load_from_checkpoint(), where MyModelClass is the TorchForecastingModel class that was used (such as TFTModel, NBEATSModel, etc.). If set to False, the model can still be manually saved using save_model() and loaded using load_model(). Default: False.

  • add_encoders

    A large number of past and future covariates can be automatically generated with add_encoders. This can be done by adding multiple pre-defined index encoders and/or custom user-made functions that will be used as index encoders. Additionally, a transformer such as Darts’ Scaler can be added to transform the generated covariates. This all happens under the hood and only needs to be specified at model creation. Read SequentialEncoder to find out more about add_encoders. Default: None. An example showing some of the add_encoders features:

    from darts.dataprocessing.transformers import Scaler

    add_encoders={
        'cyclic': {'future': ['month']},
        'datetime_attribute': {'future': ['hour', 'dayofweek']},
        'position': {'past': ['absolute'], 'future': ['relative']},
        'custom': {'past': [lambda idx: (idx.year - 1950) / 50]},
        'transformer': Scaler()
    }
    

  • random_state – Control the randomness of the weights’ initialization. Default: None.

  • pl_trainer_kwargs

    By default TorchForecastingModel creates a PyTorch Lightning Trainer with several useful presets that performs the training, validation and prediction processes. These presets include automatic checkpointing, tensorboard logging, setting the torch device and more. With pl_trainer_kwargs you can add additional kwargs to instantiate the PyTorch Lightning trainer object. Check the PL Trainer documentation for more information about the supported kwargs. Default: None. With parameter "callbacks" you can add custom or PyTorch-Lightning built-in callbacks to Darts’ TorchForecastingModel. Below is an example for adding EarlyStopping to the training process. The model will stop training early if the validation loss val_loss does not improve beyond specifications. For more information on callbacks, visit: PyTorch Lightning Callbacks

    from pytorch_lightning.callbacks.early_stopping import EarlyStopping
    
    # stop training when validation loss does not decrease more than 0.05 (`min_delta`) over
    # a period of 5 epochs (`patience`)
    my_stopper = EarlyStopping(
        monitor="val_loss",
        patience=5,
        min_delta=0.05,
        mode='min',
    )
    
    pl_trainer_kwargs={"callbacks": [my_stopper]}
    

    Note that you can also use a custom PyTorch Lightning Trainer for training and prediction with optional parameter trainer in fit() and predict().

  • show_warnings – Whether to show warnings raised from PyTorch Lightning. Useful to detect potential issues of your forecasting use case. Default: False.
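The migration from the deprecated torch_device_str to pl_trainer_kwargs described in the parameter list can be sketched as a small lookup. The helper name trainer_kwargs_from_device_str is hypothetical and not part of Darts, and accepting the literal string "cpu" is an assumption; the returned dictionaries are the ones listed in the deprecation note.

```python
def trainer_kwargs_from_device_str(torch_device_str=None):
    """Translate a torch_device_str value into pl_trainer_kwargs entries.

    Hypothetical helper for illustration only; it is not a Darts function.
    """
    # None (the documented default) runs on CPU; "cpu" is accepted as an assumption.
    if torch_device_str is None or torch_device_str == "cpu":
        return {"accelerator": "cpu"}
    # "cuda" means all available GPUs.
    if torch_device_str == "cuda":
        return {"accelerator": "gpu", "gpus": -1, "auto_select_gpus": True}
    # "cuda:i" means GPU i only; int() raises ValueError if i is not an integer.
    if torch_device_str.startswith("cuda:"):
        gpu_index = int(torch_device_str.split(":", 1)[1])
        return {"accelerator": "gpu", "gpus": [gpu_index]}
    raise ValueError(f"Unsupported device string: {torch_device_str!r}")


# Example: the result would be passed as pl_trainer_kwargs at model creation.
pl_trainer_kwargs = trainer_kwargs_from_device_str("cuda:0")
```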

References

[1] https://arxiv.org/pdf/1912.09363.pdf

[2] Shazeer, Noam, “GLU Variants Improve Transformer”, 2020. arXiv https://arxiv.org/abs/2002.05202.

Attributes

epochs_trained

input_chunk_length

likelihood

model_created

model_params

output_chunk_length

Methods

backtest(series[, past_covariates, ...])

Compute error values that the model would have produced when used on series.

fit(series[, past_covariates, ...])

Fit/train the model on one or multiple series.

fit_from_dataset(train_dataset[, ...])

Train the model with a specific darts.utils.data.TrainingDataset instance.

gridsearch(parameters, series[, ...])

Find the best hyper-parameters among a given set using a grid search.

historical_forecasts(series[, ...])

Compute the historical forecasts that would have been obtained by this model on the series.

load_from_checkpoint(model_name[, work_dir, ...])

Load the model from automatically saved checkpoints under '{work_dir}/darts_logs/{model_name}/checkpoints/'.

load_model(path)

Loads a model from a given file path.

predict(n, *args, **kwargs)

Predict the n time steps following the end of the training series, or of the specified series.

predict_from_dataset(n, input_series_dataset)

This method allows for predicting with a specific darts.utils.data.InferenceDataset instance.

reset_model()

Resets the model object and removes all stored data - model, checkpoints, loggers and training history.

residuals(series[, forecast_horizon, verbose])

Compute the residuals produced by this model on a univariate time series.

save_model(path)

Saves the model under a given path.

backtest(series, past_covariates=None, future_covariates=None, num_samples=1, train_length=None, start=0.5, forecast_horizon=1, stride=1, retrain=True, overlap_end=False, last_points_only=False, metric=<function mape>, reduction=<function mean>, verbose=False)

Compute error values that the model would have produced when used on series.

It repeatedly builds a training set from the beginning of series. It trains the current model on the training set, emits a forecast of length equal to forecast_horizon, and then moves the end of the training set forward by stride time steps. A metric (given by the metric function) is then evaluated on the forecast and the actual values. Finally, the method returns a reduction (the mean by default) of all these metric scores.

By default, this method uses each historical forecast (whole) to compute error scores. If last_points_only is set to True, it will use only the last point of each historical forecast. In this case, no reduction is used.

By default, this method always re-trains the models on the entire available history, corresponding to an expanding window strategy. If retrain is set to False (useful for models for which training might be time-consuming, such as deep learning models), the model will only be trained on the initial training window (up to start time stamp), and only if it has not been trained before. Then, at every iteration, the newly expanded input sequence will be fed to the model to produce the new output.
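The expanding-window procedure described above can be sketched in plain Python. This is not the Darts implementation: a naive "repeat the last value" forecaster stands in for the model, the series is a plain list, and the helper names (naive_forecast, mae, expanding_window_backtest) are hypothetical.

```python
from statistics import mean


def naive_forecast(train, horizon):
    """Stand-in 'model': repeat the last observed value `horizon` times."""
    return [train[-1]] * horizon


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def expanding_window_backtest(values, start=0.5, forecast_horizon=1, stride=1,
                              metric=mae, reduction=mean):
    # A float `start` is the proportion of the series before the first prediction point.
    first = int(len(values) * start)
    scores = []
    # Stop once a full forecast no longer fits inside the series.
    for end in range(first, len(values) - forecast_horizon + 1, stride):
        train = values[:end]  # the training set always starts at the beginning
        forecast = naive_forecast(train, forecast_horizon)
        actual = values[end:end + forecast_horizon]
        scores.append(metric(actual, forecast))
    # With reduction=None, return the individual scores instead of an aggregate.
    return reduction(scores) if reduction is not None else scores


series = [1, 2, 3, 4, 5, 6, 7, 8]
score = expanding_window_backtest(series, start=0.5, forecast_horizon=2, stride=1)
all_scores = expanding_window_backtest(series, start=0.5, forecast_horizon=2, reduction=None)
```

In this sketch every iteration "retrains" trivially, since the naive forecaster has no state; with retrain=False a real model would be fitted once and only the input window would grow.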

Parameters
  • series (TimeSeries) – The target time series to use to successively train and evaluate the historical forecasts

  • past_covariates (Optional[TimeSeries]) – An optional past-observed covariate series. This applies only if the model supports past covariates.

  • future_covariates (Optional[TimeSeries]) – An optional future-known covariate series. This applies only if the model supports future covariates.

  • num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • train_length (Optional[int]) – Number of time steps in our training set (size of backtesting window to train on). Default is set to train_length=None where it takes all available time steps up until prediction time, otherwise the moving window strategy is used. If larger than the number of time steps available, all steps up until prediction time are used, as in default case. Needs to be at least min_train_series_length.

  • start (Union[Timestamp, float, int]) – The first prediction time, at which a prediction is computed for a future time. This parameter supports 3 different types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.

  • forecast_horizon (int) – The forecast horizon for the point prediction.

  • stride (int) – The number of time steps between two consecutive training sets.

  • retrain (bool) – Whether to retrain the model for every prediction or not. Not all models support setting retrain to False. Notably, this is supported by neural-network-based models.

  • overlap_end (bool) – Whether the returned forecasts can go beyond the series’ end or not

  • last_points_only (bool) – Whether to use the whole historical forecasts or only the last point of each forecast to compute the error

  • metric (Callable[[TimeSeries, TimeSeries], float]) – A function that takes two TimeSeries instances as inputs and returns an error value.

  • reduction (Optional[Callable[[ndarray], float]]) – A function used to combine the individual error scores obtained when last_points_only is set to False. If explicitly set to None, the method will return a list of the individual error scores instead. Set to np.mean by default.

  • verbose (bool) – Whether to print progress

Returns

The error score, or the list of individual error scores if reduction is None

Return type

float or List[float]
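The three accepted types of the start parameter can be illustrated with a small resolver that turns each of them into a position in the time index. This is a sketch under stated assumptions: the function resolve_start is hypothetical, the time index is a plain list of datetime objects rather than a pandas index, and the rounding used for float values may differ from what Darts does internally.

```python
from datetime import datetime, timedelta


def resolve_start(time_index, start):
    """Return the position of the first prediction time in `time_index`."""
    if isinstance(start, float):
        # Proportion of the series that should lie before the first prediction point.
        if not 0.0 <= start <= 1.0:
            raise ValueError("a float start must lie between 0 and 1")
        return int(len(time_index) * start)
    if isinstance(start, int):
        # Integer position in the time index.
        return start
    # Otherwise treat `start` as a timestamp that must be present in the index.
    return time_index.index(start)


# Ten daily time stamps starting on 2021-01-01 (made-up data).
index = [datetime(2021, 1, 1) + timedelta(days=i) for i in range(10)]
positions = [
    resolve_start(index, 0.5),
    resolve_start(index, 3),
    resolve_start(index, datetime(2021, 1, 8)),
]
```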

property epochs_trained: int
Return type

int

fit(series, past_covariates=None, future_covariates=None, val_series=None, val_past_covariates=None, val_future_covariates=None, trainer=None, verbose=None, epochs=0, max_samples_per_ts=None, num_loader_workers=0)

Fit/train the model on one or multiple series.

This method wraps around fit_from_dataset(), constructing a default training dataset for this model. If you need more control on how the series are sliced for training, consider calling fit_from_dataset() with a custom darts.utils.data.TrainingDataset.

Training is performed with a PyTorch Lightning Trainer. It uses a default Trainer object built from presets and the pl_trainer_kwargs given at model creation. You can also use a custom Trainer with the optional parameter trainer. For more information, see the PyTorch Lightning Trainer documentation.

This function can be called several times to do some extra training. If epochs is specified, the model will be trained for epochs additional epochs.

Below, all possible parameters are documented, but not all models support all parameters. For instance, every PastCovariatesTorchModel supports only past_covariates and not future_covariates. Darts will complain if you try fitting a model with the wrong covariates argument.

When handling covariates, Darts will try to use the time axes of the target and the covariates to come up with the right time slices. So the covariates can be longer than needed; as long as the time axes are correct Darts will handle them correctly. It will also complain if their time span is not sufficient.

Parameters
  • series (Union[TimeSeries, Sequence[TimeSeries]]) – A series or sequence of series serving as target (i.e. what the model will be trained to forecast)

  • past_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, a series or sequence of series specifying past-observed covariates

  • future_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, a series or sequence of series specifying future-known covariates

  • val_series (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, one or a sequence of validation target series, which will be used to compute the validation loss throughout training and keep track of the best performing models.

  • val_past_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, the past covariates corresponding to the validation series (must match covariates)

  • val_future_covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, the future covariates corresponding to the validation series (must match covariates)

  • trainer (Optional[Trainer]) – Optionally, a custom PyTorch-Lightning Trainer object to perform training. Using a custom trainer will override Darts’ default trainer.

  • verbose (Optional[bool]) – Optionally, whether to print progress.

  • epochs (int) – If specified, will train the model for epochs (additional) epochs, irrespective of what n_epochs was provided to the model constructor.

  • max_samples_per_ts (Optional[int]) – Optionally, a maximum number of samples to use per time series. Models are trained in a supervised fashion by constructing slices of (input, output) examples. On long time series, this can result in unnecessarily large number of training samples. This parameter upper-bounds the number of training samples per time series (taking only the most recent samples in each series). Leaving to None does not apply any upper bound.

  • num_loader_workers (int) – Optionally, an integer specifying the num_workers to use in PyTorch DataLoader instances, both for the training and validation loaders (if any). A larger number of workers can sometimes increase performance, but can also incur extra overheads and increase memory usage, as more batches are loaded in parallel.

Returns

Fitted model.

Return type

self
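The effect of max_samples_per_ts can be sketched by slicing a series into (input, output) examples and keeping only the most recent ones. This is an illustration in plain Python, not the dataset classes Darts uses; build_training_samples is a hypothetical helper.

```python
def build_training_samples(values, input_chunk_length, output_chunk_length,
                           max_samples_per_ts=None):
    """Slice `values` into (input, output) pairs, oldest first."""
    window = input_chunk_length + output_chunk_length
    samples = []
    for start in range(len(values) - window + 1):
        split = start + input_chunk_length
        samples.append((values[start:split], values[split:start + window]))
    if max_samples_per_ts is not None:
        # Keep only the most recent samples, up to the requested bound.
        keep = min(max_samples_per_ts, len(samples))
        samples = samples[len(samples) - keep:]
    return samples


series = list(range(10))
all_samples = build_training_samples(series, input_chunk_length=3, output_chunk_length=2)
capped = build_training_samples(series, 3, 2, max_samples_per_ts=2)
```

With ten points and a window of five steps, six samples are available; capping at two keeps the two windows closest to the end of the series.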

fit_from_dataset(train_dataset, val_dataset=None, trainer=None, verbose=None, epochs=0, num_loader_workers=0)

Train the model with a specific darts.utils.data.TrainingDataset instance. These datasets implement a PyTorch Dataset, and specify how the target and covariates are sliced for training. If you are not sure which training dataset to use, consider calling fit() instead, which will create a default training dataset appropriate for this model.

Training is performed with a PyTorch Lightning Trainer. It uses a default Trainer object built from presets and the pl_trainer_kwargs given at model creation. You can also use a custom Trainer with the optional parameter trainer. For more information, see the PyTorch Lightning Trainer documentation.

This function can be called several times to do some extra training. If epochs is specified, the model will be trained for epochs additional epochs.

Parameters
  • train_dataset (TrainingDataset) – A training dataset with a type matching this model (e.g. PastCovariatesTrainingDataset for PastCovariatesTorchModel).

  • val_dataset (Optional[TrainingDataset]) – A training dataset with a type matching this model (e.g. PastCovariatesTrainingDataset for PastCovariatesTorchModel), representing the validation set (to track the validation loss).

  • trainer (Optional[Trainer]) – Optionally, a custom PyTorch-Lightning Trainer object to perform training. Using a custom trainer will override Darts’ default trainer.

  • verbose (Optional[bool]) – Optionally, whether to print progress.

  • epochs (int) – If specified, will train the model for epochs (additional) epochs, irrespective of what n_epochs was provided to the model constructor.

  • num_loader_workers (int) – Optionally, an integer specifying the num_workers to use in PyTorch DataLoader instances, both for the training and validation loaders (if any). A larger number of workers can sometimes increase performance, but can also incur extra overheads and increase memory usage, as more batches are loaded in parallel.

Returns

Fitted model.

Return type

self

classmethod gridsearch(parameters, series, past_covariates=None, future_covariates=None, forecast_horizon=None, stride=1, start=0.5, last_points_only=False, val_series=None, use_fitted_values=False, metric=<function mape>, reduction=<function mean>, verbose=False, n_jobs=1, n_random_samples=None)

Find the best hyper-parameters among a given set using a grid search.

This function has 3 modes of operation: Expanding window mode, split mode and fitted value mode. The three modes of operation evaluate every possible combination of hyper-parameter values provided in the parameters dictionary by instantiating the model_class subclass of ForecastingModel with each combination, and returning the best-performing model with regard to the metric function. The metric function is expected to return an error value, thus the model resulting in the smallest metric output will be chosen.

The relationship of the training data and test data depends on the mode of operation.

Expanding window mode (activated when forecast_horizon is passed): For every hyperparameter combination, the model is repeatedly trained and evaluated on different splits of series. This process is accomplished by using the backtest() function as a subroutine to produce historic forecasts starting from start that are compared against the ground truth values of series. Note that the model is retrained for every single prediction, thus this mode is slower.

Split mode (activated when val_series is passed): For every hyper-parameter combination, the model is trained on series and evaluated on val_series.

Fitted value mode (activated when use_fitted_values is set to True): For every hyper-parameter combination, the model is trained on series and evaluated on the resulting fitted values. Not all models have fitted values, and this method raises an error if the model doesn’t have a fitted_values member. The fitted values are the result of the fit of the model on series. Comparing with the fitted values can be a quick way to assess the model, but one cannot see if the model is overfitting the series.

Derived classes must ensure that a single instance of a model will not share parameters with the other instances, e.g., saving models in the same path. Otherwise, an unexpected behavior can arise while running several models in parallel (when n_jobs != 1). If this cannot be avoided, then gridsearch should be redefined, forcing n_jobs = 1.

Currently this method only supports deterministic predictions (i.e. when models’ predictions have only 1 sample).
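The core of the search, common to all three modes, is to enumerate every combination in the parameters dictionary and keep the one with the smallest metric value. A minimal sketch follows; grid_combinations and pick_best are hypothetical names, and the evaluate callable is a made-up stand-in for "instantiate, train and score the model".

```python
from itertools import product


def grid_combinations(parameters):
    """Expand {name: [values, ...]} into a list of {name: value} dictionaries."""
    keys = list(parameters)
    return [dict(zip(keys, combo))
            for combo in product(*(parameters[key] for key in keys))]


def pick_best(parameters, evaluate):
    """Return the combination with the smallest error and that error."""
    best_params, best_score = None, float("inf")
    for params in grid_combinations(parameters):
        score = evaluate(params)  # error value: smaller is better
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


grid = {"hidden_size": [8, 16, 32], "dropout": [0.1, 0.2]}


def toy_error(params):
    # Made-up error surface for illustration; a real search would train a model here.
    return abs(params["hidden_size"] - 16) + params["dropout"]


best_params, best_score = pick_best(grid, toy_error)
```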

Parameters
  • model_class – The ForecastingModel subclass to be tuned for ‘series’.

  • parameters (dict) – A dictionary containing as keys hyperparameter names, and as values lists of values for the respective hyperparameter.

  • series (TimeSeries) – The TimeSeries instance used as input and target for training.

  • past_covariates (Optional[TimeSeries]) – An optional past-observed covariate series. This applies only if the model supports past covariates.

  • future_covariates (Optional[TimeSeries]) – An optional future-known covariate series. This applies only if the model supports future covariates.

  • forecast_horizon (Optional[int]) – The integer value of the forecasting horizon. Activates expanding window mode.

  • stride (int) – The number of time steps between two consecutive predictions. Only used in expanding window mode.

  • start (Union[Timestamp, float, int]) – The int, float or pandas.Timestamp that represents the starting point in the time index of series from which predictions will be made to evaluate the model. For a detailed description of how the different data types are interpreted, please see the documentation for ForecastingModel.backtest.

  • last_points_only (bool) – Whether to use the whole forecasts or only the last point of each forecast to compute the error

  • val_series (Optional[TimeSeries]) – The TimeSeries instance used for validation in split mode. If provided, this series must start right after the end of series, so that a proper comparison of the forecast can be made.

  • use_fitted_values (bool) – If True, uses the comparison with the fitted values. Raises an error if fitted_values is not an attribute of model_class.

  • metric (Callable[[TimeSeries, TimeSeries], float]) – A function that takes two TimeSeries instances as inputs (actual and prediction, in this order), and returns a float error value.

  • reduction (Callable[[ndarray], float]) – A reduction function (mapping array to float) describing how to aggregate the errors obtained on the different validation series when backtesting. By default it’ll compute the mean of errors.

  • verbose – Whether to print progress.

  • n_jobs (int) – The number of jobs to run in parallel. Parallel jobs are created only when there are two or more parameters combinations to evaluate. Each job will instantiate, train, and evaluate a different instance of the model. Defaults to 1 (sequential). Setting the parameter to -1 means using all the available cores.

  • n_random_samples (Union[int, float, None]) – The number/ratio of hyperparameter combinations to select from the full parameter grid. This will perform a random search instead of using the full grid. If an integer, n_random_samples is the number of parameter combinations selected from the full grid and must be between 0 and the total number of parameter combinations. If a float, n_random_samples is the ratio of parameter combinations selected from the full grid and must be between 0 and 1. Defaults to None, for which random selection will be ignored.

Returns

A tuple containing an untrained model_class instance created from the best-performing hyper-parameters, along with a dictionary containing these best hyper-parameters, and metric score for the best hyper-parameters.

Return type

ForecastingModel, Dict, float
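The two interpretations of n_random_samples (a count when it is an integer, a ratio when it is a float) can be sketched as follows. The helper sample_grid and its seed argument are hypothetical, and truncating the ratio with int() is an assumption about rounding rather than documented Darts behavior.

```python
import random


def sample_grid(combinations, n_random_samples=None, seed=None):
    """Select a random subset of parameter combinations."""
    combinations = list(combinations)
    if n_random_samples is None:
        return combinations  # no random selection: use the full grid
    total = len(combinations)
    if isinstance(n_random_samples, float):
        # A float is the ratio of combinations to keep.
        if not 0.0 <= n_random_samples <= 1.0:
            raise ValueError("a float n_random_samples must lie between 0 and 1")
        count = int(total * n_random_samples)
    else:
        # An integer is the number of combinations to keep.
        if not 0 <= n_random_samples <= total:
            raise ValueError("an int n_random_samples must lie between 0 and the grid size")
        count = n_random_samples
    return random.Random(seed).sample(combinations, count)


full_grid = list(range(10))  # stand-in for ten parameter combinations
by_count = sample_grid(full_grid, 3, seed=0)
by_ratio = sample_grid(full_grid, 0.5, seed=0)
```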

historical_forecasts(series, past_covariates=None, future_covariates=None, num_samples=1, train_length=None, start=0.5, forecast_horizon=1, stride=1, retrain=True, overlap_end=False, last_points_only=True, verbose=False)

Compute the historical forecasts that would have been obtained by this model on the series.

This method uses an expanding training window; it repeatedly builds a training set from the beginning of series. It trains the model on the training set, emits a forecast of length equal to forecast_horizon, and then moves the end of the training set forward by stride time steps.

By default, this method will return a single time series made up of the last point of each historical forecast. This time series will thus have a frequency of series.freq * stride. If last_points_only is set to False, it will instead return a list of the historical forecasts series.

By default, this method always re-trains the models on the entire available history, corresponding to an expanding window strategy. If retrain is set to False, the model will only be trained on the initial training window (up to start time stamp), and only if it has not been trained before. This is not supported by all models.
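How last_points_only shapes the result can be sketched with integer time steps and a naive forecaster. This is an illustration only, not the Darts implementation; the function name, the (time step, value) tuples and the integer start position are assumptions made for the example.

```python
def sketch_historical_forecasts(values, start, forecast_horizon=1, stride=1,
                                last_points_only=True):
    """Produce naive forecasts as lists of (time step, value) pairs."""
    forecasts = []
    for end in range(start, len(values) - forecast_horizon + 1, stride):
        # Naive stand-in model: repeat the last training value over the horizon.
        forecast = [(end + step, values[end - 1]) for step in range(forecast_horizon)]
        forecasts.append(forecast)
    if last_points_only:
        # One series made of the last point of each forecast.
        return [forecast[-1] for forecast in forecasts]
    # Otherwise the full list of forecasts.
    return forecasts


series = list(range(10))
last_points = sketch_historical_forecasts(series, start=4, forecast_horizon=2, stride=2)
full = sketch_historical_forecasts(series, start=4, forecast_horizon=2, stride=2,
                                   last_points_only=False)
```

The time steps in last_points are spaced stride steps apart, which mirrors the series.freq * stride frequency mentioned above.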

Parameters
  • series (TimeSeries) – The target time series to use to successively train and evaluate the historical forecasts.

  • past_covariates (Optional[TimeSeries]) – An optional past-observed covariate series. This applies only if the model supports past covariates.

  • future_covariates (Optional[TimeSeries]) – An optional future-known covariate series. This applies only if the model supports future covariates.

  • num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • train_length (Optional[int]) – Number of time steps in our training set (size of backtesting window to train on). Default is set to train_length=None where it takes all available time steps up until prediction time, otherwise the moving window strategy is used. If larger than the number of time steps available, all steps up until prediction time are used, as in default case. Needs to be at least min_train_series_length.

  • start (Union[Timestamp, float, int]) – The first point of time at which a prediction is computed for a future time. This parameter supports 3 different data types: float, int and pandas.Timestamp. In the case of float, the parameter will be treated as the proportion of the time series that should lie before the first prediction point. In the case of int, the parameter will be treated as an integer index to the time index of series that will be used as first prediction time. In case of pandas.Timestamp, this time stamp will be used to determine the first prediction time directly.

  • forecast_horizon (int) – The forecast horizon for the predictions

  • stride (int) – The number of time steps between two consecutive predictions.

  • retrain (bool) – Whether to retrain the model for every prediction or not. Not all models support setting retrain to False. Notably, this is supported by neural-network-based models.

  • overlap_end (bool) – Whether the returned forecasts can go beyond the series’ end or not

  • last_points_only (bool) – Whether to retain only the last point of each historical forecast. If set to True, the method returns a single TimeSeries containing the successive point forecasts. Otherwise returns a list of historical TimeSeries forecasts.

  • verbose (bool) – Whether to print progress

Returns

By default, a single TimeSeries instance created from the last point of each individual forecast. If last_points_only is set to False, a list of the historical forecasts.

Return type

TimeSeries or List[TimeSeries]

property input_chunk_length: int
Return type

int

property likelihood: darts.utils.likelihood_models.Likelihood
Return type

Likelihood

static load_from_checkpoint(model_name, work_dir=None, file_name=None, best=True)

Load the model from automatically saved checkpoints under ‘{work_dir}/darts_logs/{model_name}/checkpoints/’. This method is used for models that were created with save_checkpoints=True.

If you manually saved your model, consider using load_model().

Example for loading a RNNModel from checkpoint (model_name is the model_name used at model creation):

from darts.models import RNNModel

model_loaded = RNNModel.load_from_checkpoint(model_name, best=True)

If file_name is given, returns the model saved under ‘{work_dir}/darts_logs/{model_name}/checkpoints/{file_name}’.

If file_name is not given, will try to restore the best checkpoint (if best is True) or the most recent checkpoint (if best is False) from ‘{work_dir}/darts_logs/{model_name}/checkpoints/’.

Parameters
  • model_name (str) – The name of the model (used to retrieve the checkpoints folder’s name).

  • work_dir (Optional[str]) – Working directory (containing the checkpoints folder). Defaults to current working directory.

  • file_name (Optional[str]) – The name of the checkpoint file. If not specified, the best or the most recent checkpoint is used, depending on best.

  • best (bool) – If set, will retrieve the best model (according to validation loss) instead of the most recent one. Ignored when file_name is given.

Returns

The corresponding trained TorchForecastingModel.

Return type

TorchForecastingModel
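
The folder layout described above can be written out as a small path helper. This is only a sketch of the documented path pattern: checkpoint_location is a hypothetical name, it does not touch the disk, and it makes no attempt to decide which checkpoint is the best or the most recent one. The file name in the usage line is a placeholder as well.

```python
from pathlib import PurePosixPath


def checkpoint_location(model_name, work_dir=".", file_name=None):
    """Build '{work_dir}/darts_logs/{model_name}/checkpoints/[file_name]'.

    Hypothetical helper for illustration; POSIX-style separators are
    assumed so that the result is the same on every platform.
    """
    folder = PurePosixPath(work_dir) / "darts_logs" / model_name / "checkpoints"
    # With a file name, point at that file; otherwise at the folder itself.
    return str(folder / file_name) if file_name else str(folder)


folder = checkpoint_location("my_model", work_dir="/tmp/run")
one_file = checkpoint_location("my_model", "/tmp/run", "placeholder.ckpt")
```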

static load_model(path)

Loads a model from a given file path. The file name should end with ‘.pth.tar’.

Example for loading an RNNModel:

from darts.models import RNNModel

model_loaded = RNNModel.load_model("my_model.pth.tar")
Parameters

path (str) – Path from which to load the model. The path should end with ‘.pth.tar’.

Return type

TorchForecastingModel

property model_created: bool
Return type

bool

property model_params: dict
Return type

dict

property output_chunk_length: int
Return type

int

predict(n, *args, **kwargs)[source]

Predict the n time steps following the end of the training series, or of the specified series.

Prediction is performed with a PyTorch Lightning Trainer. It uses a default Trainer object built from presets and the pl_trainer_kwargs given at model creation. You can also pass a custom Trainer through the optional trainer parameter. For more information, see the PyTorch Lightning Trainer documentation.

Below, all possible parameters are documented, but not all models support all parameters. For instance, every PastCovariatesTorchModel supports only past_covariates and not future_covariates. Darts will complain if you try calling predict() on a model with the wrong covariates argument.

Darts will also complain if the provided covariates do not have a sufficient time span. In general, not all models require the same covariates’ time spans:

  • Models relying on past covariates require the last input_chunk_length points of
    past_covariates to be known at prediction time. For horizon values n > output_chunk_length, these models
    require at least the next n - output_chunk_length future values to be known as well.
  • Models relying on future covariates require the next n values to be known.
    In addition (for DualCovariatesTorchModel and MixedCovariatesTorchModel), they also
    require the “historic” values of these future covariates (over the past input_chunk_length).

When handling covariates, Darts will try to use the time axes of the target and the covariates to come up with the right time slices. So the covariates can be longer than needed; as long as the time axes are correct Darts will handle them correctly. It will also complain if their time span is not sufficient.
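
The two time-span rules above reduce to simple arithmetic. The sketch below restates them; required_covariate_steps is a hypothetical helper name, and the step counts are relative to the prediction point (history before it, future after it).

```python
def required_covariate_steps(n, input_chunk_length, output_chunk_length):
    """Return the covariate steps needed before and after the prediction point.

    Hypothetical helper that restates the two rules above; it performs
    no validation of actual series.
    """
    past = {
        "history": input_chunk_length,
        # Future values of past covariates are only needed when the
        # horizon exceeds a single output chunk.
        "future": max(0, n - output_chunk_length),
    }
    future = {
        # The historic part applies to dual and mixed covariates models.
        "history": input_chunk_length,
        "future": n,
    }
    return {"past_covariates": past, "future_covariates": future}


spans = required_covariate_steps(n=36, input_chunk_length=24, output_chunk_length=12)
```

For n = 36 with input_chunk_length = 24 and output_chunk_length = 12, past covariates would need 24 historic and 24 future steps, and future covariates 24 historic and 36 future steps.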

Parameters
  • n – The number of time steps after the end of the training time series for which to produce predictions

  • series – Optionally, a series or sequence of series, representing the history of the target series whose future is to be predicted. If specified, the method returns the forecasts of these series. Otherwise, the method returns the forecast of the (single) training series.

  • past_covariates – Optionally, the past-observed covariates series needed as inputs for the model. They must match the covariates used for training in terms of dimension.

  • future_covariates – Optionally, the future-known covariates series needed as inputs for the model. They must match the covariates used for training in terms of dimension.

  • trainer – Optionally, a custom PyTorch-Lightning Trainer object to perform prediction. Using a custom trainer will override Darts’ default trainer.

  • batch_size – Size of batches during prediction. Defaults to the model’s training batch_size value.

  • verbose – Optionally, whether to print progress.

  • n_jobs – The number of jobs to run in parallel. -1 means using all processors. Defaults to 1.

  • roll_size – For self-consuming predictions, i.e. n > output_chunk_length, determines how many outputs of the model are fed back into it at every iteration of feeding the predicted target (and optionally future covariates) back into the model. If this parameter is not provided, it will be set to output_chunk_length by default.

  • num_samples – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • num_loader_workers – Optionally, an integer specifying the num_workers to use in PyTorch DataLoader instances, for the inference/prediction dataset loaders (if any). A larger number of workers can sometimes increase performance, but can also incur extra overheads and increase memory usage, as more batches are loaded in parallel.

  • mc_dropout – Optionally, enable Monte Carlo dropout for predictions using neural-network-based models. This allows Bayesian approximation by specifying an implicit prior over learned models.

Returns

One or several time series containing the forecasts of series, or the forecast of the training series if series is not specified and the model has been trained on a single series.

Return type

Union[TimeSeries, Sequence[TimeSeries]]
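
The role of roll_size in self-consuming predictions (n > output_chunk_length) can be illustrated with a toy loop. This is a simplified sketch and not Darts' implementation: rolling_forecast and toy_model are hypothetical names, the series is a plain list of numbers, covariates are ignored, and the final call is assumed to keep as many values as are still needed.

```python
def rolling_forecast(history, n, model, input_chunk_length,
                     output_chunk_length, roll_size=None):
    """Forecast n steps by repeatedly calling a fixed-horizon model.

    `model` is any callable mapping the last input_chunk_length values
    to a list of output_chunk_length predictions (a stand-in for the network).
    """
    roll_size = output_chunk_length if roll_size is None else roll_size
    if not 1 <= roll_size <= output_chunk_length:
        raise ValueError("roll_size must be between 1 and output_chunk_length")

    window = list(history)[-input_chunk_length:]
    forecast = []
    while len(forecast) < n:
        chunk = list(model(window))
        remaining = n - len(forecast)
        if remaining <= output_chunk_length:
            keep = chunk[:remaining]   # the final call completes the forecast
        else:
            keep = chunk[:roll_size]   # feed back only roll_size outputs
        forecast.extend(keep)
        # Slide the input window over the values that were just predicted.
        window = (window + keep)[-input_chunk_length:]
    return forecast


def toy_model(window):
    # Continues the series by +1 per step, for a horizon of two steps.
    return [window[-1] + 1, window[-1] + 2]


full_roll = rolling_forecast([1, 2, 3], 5, toy_model, 3, 2)
small_roll = rolling_forecast([1, 2, 3], 5, toy_model, 3, 2, roll_size=1)
```

In this sketch a smaller roll_size leads to more model calls for the same n, because fewer predicted values are consumed per iteration.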

predict_from_dataset(n, input_series_dataset, trainer=None, batch_size=None, verbose=None, n_jobs=1, roll_size=None, num_samples=1, num_loader_workers=0, mc_dropout=False)

This method allows for predicting with a specific darts.utils.data.InferenceDataset instance. These datasets implement a PyTorch Dataset, and specify how the target and covariates are sliced for inference. In most cases, you’ll rather want to call predict() instead, which will create an appropriate InferenceDataset for you.

Prediction is performed with a PyTorch Lightning Trainer. It uses a default Trainer object built from presets and the pl_trainer_kwargs given at model creation. You can also pass a custom Trainer through the optional trainer parameter. For more information, see the PyTorch Lightning Trainer documentation.

Parameters
  • n (int) – The number of time steps after the end of the training time series for which to produce predictions

  • input_series_dataset (InferenceDataset) – The inference dataset to predict from. It specifies how the target series (and any covariates) whose future is to be predicted are sliced for inference.

  • trainer (Optional[Trainer]) – Optionally, a custom PyTorch-Lightning Trainer object to perform prediction. Using a custom trainer will override Darts’ default trainer.

  • batch_size (Optional[int]) – Size of batches during prediction. Defaults to the model’s batch_size value.

  • verbose (Optional[bool]) – Optionally, whether to print progress.

  • n_jobs (int) – The number of jobs to run in parallel. -1 means using all processors. Defaults to 1.

  • roll_size (Optional[int]) – For self-consuming predictions, i.e. n > output_chunk_length, determines how many outputs of the model are fed back into it at every iteration of feeding the predicted target (and optionally future covariates) back into the model. If this parameter is not provided, it will be set to output_chunk_length by default.

  • num_samples (int) – Number of times a prediction is sampled from a probabilistic model. Should be left set to 1 for deterministic models.

  • num_loader_workers (int) – Optionally, an integer specifying the num_workers to use in PyTorch DataLoader instances, for the inference/prediction dataset loaders (if any). A larger number of workers can sometimes increase performance, but can also incur extra overheads and increase memory usage, as more batches are loaded in parallel.

  • mc_dropout (bool) – Optionally, enable Monte Carlo dropout for predictions using neural-network-based models. This allows Bayesian approximation by specifying an implicit prior over learned models.

Returns

Returns one or more forecasts for time series.

Return type

Sequence[TimeSeries]

reset_model()

Resets the model object and removes all stored data - model, checkpoints, loggers and training history.

residuals(series, forecast_horizon=1, verbose=False)

Compute the residuals produced by this model on a univariate time series.

This function computes the difference between the actual observations from series and the fitted values vector p obtained by training the model on series. For every index i in series, p[i] is computed by training the model on series[:(i - forecast_horizon)] and forecasting forecast_horizon into the future. (p[i] will be set to the last value of the predicted series.) The vector of residuals will be shorter than series due to the minimum training series length required by the model and the gap introduced by forecast_horizon. Most commonly, the term “residuals” implies a value for forecast_horizon of 1; but this can be configured.

This method works only on univariate series and does not currently support covariates. It uses the median prediction (when dealing with stochastic forecasts).

Parameters
  • series (TimeSeries) – The univariate TimeSeries instance which the residuals will be computed for.

  • forecast_horizon (int) – The forecasting horizon used to predict each fitted value.

  • verbose (bool) – Whether to print progress.

Returns

The vector of residuals.

Return type

TimeSeries
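
The residual computation described above can be sketched with a trivial stand-in model. Everything here is illustrative: naive_residuals is a hypothetical helper, the "model" simply repeats the last observed value, and the sketch assumes the training slice series[:i - forecast_horizon + 1], so that the last forecast step lands exactly on index i. The library's exact slicing may differ.

```python
def naive_residuals(values, forecast_horizon=1, min_train_length=1):
    """Residuals (actual - fitted) for a last-value forecaster.

    The last-value forecaster stands in for a trained model: its
    prediction at any horizon is the last value it was trained on.
    Returns a dict mapping each index i to its residual.
    """
    residuals = {}
    # The first indices are skipped: the model needs min_train_length
    # points, and forecast_horizon adds a further gap.
    first = min_train_length - 1 + forecast_horizon
    for i in range(first, len(values)):
        # Assumption: training data ends forecast_horizon steps before i.
        train = values[: i - forecast_horizon + 1]
        fitted = train[-1]  # last value of the (constant) forecast
        residuals[i] = values[i] - fitted
    return residuals


one_step = naive_residuals([1, 3, 6, 10])
two_step = naive_residuals([1, 3, 6, 10], forecast_horizon=2)
```

As in the description above, the result is shorter than the input, and a larger forecast_horizon shortens it further.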

save_model(path)

Saves the model under a given path. The path should end with ‘.pth.tar’.

Parameters

path (str) – Path under which to save the model at its current state.

Return type

None