Horizon-Based Training Dataset

class darts.utils.data.horizon_based_dataset.HorizonBasedDataset(target_series, covariates=None, output_chunk_length=12, lh=(1, 3), lookback=3, use_static_covariates=True)

Bases: PastCovariatesTrainingDataset

A time series dataset containing tuples of (past_target, past_covariates, static_covariates, future_target) arrays, built in a way inspired by how N-BEATS is trained on the M4 dataset: https://arxiv.org/abs/1905.10437.

The “past” series have length lookback * output_chunk_length, and the “future” series has length output_chunk_length.

Given the horizon output_chunk_length of a model, this dataset computes “past/future” splits as follows. First, a “forecast point” is selected in the range of the last (min_lh * output_chunk_length, max_lh * output_chunk_length) points before the end of the time series. The “future” then consists of the following output_chunk_length points, and the “past” consists of the preceding lookback * output_chunk_length points.
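The split described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: the helper name split_at_forecast_point and its points_before_end argument are hypothetical, and the series is a plain list rather than a TimeSeries.

```python
def split_at_forecast_point(values, output_chunk_length, lookback, points_before_end):
    """Return (past, future) slices around a forecast point.

    `points_before_end` (hypothetical argument) is how many points before the
    end of the series the "future" chunk starts.
    """
    if points_before_end < output_chunk_length:
        raise ValueError("not enough points left for a full future chunk")

    future_start = len(values) - points_before_end
    past_start = future_start - lookback * output_chunk_length
    if past_start < 0:
        raise ValueError("series too short for the requested lookback")

    # "past": the lookback * output_chunk_length points before the forecast point
    past = values[past_start:future_start]
    # "future": the output_chunk_length points that follow the forecast point
    future = values[future_start:future_start + output_chunk_length]
    return past, future


series = list(range(100))
past, future = split_at_forecast_point(
    series, output_chunk_length=12, lookback=3, points_before_end=12
)
print(len(past), len(future))
```

With points_before_end equal to output_chunk_length, the “future” is exactly the last output_chunk_length points of the series, which corresponds to min_lh = 1.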

All the series in the provided sequence must be long enough; i.e. have length at least (lookback + max_lh) * output_chunk_length, and min_lh must be at least 1 (to have targets of length exactly 1 * output_chunk_length). The target and covariates time series are sliced together using their time indexes for alignment.
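The length requirement can be checked with a few lines of arithmetic. The function below is a hypothetical helper written for illustration (it is not part of the library) and only encodes the two constraints stated above.

```python
def min_required_length(output_chunk_length, lh, lookback):
    """Smallest series length satisfying (lookback + max_lh) * output_chunk_length."""
    min_lh, max_lh = lh
    if min_lh < 1:
        raise ValueError("min_lh must be at least 1")
    return (lookback + max_lh) * output_chunk_length


# Using the default values from the class signature.
required = min_required_length(output_chunk_length=12, lh=(1, 3), lookback=3)
print(required)
```

With the defaults shown in the signature, each series therefore needs at least (3 + 3) * 12 = 72 points.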

The sampling is uniform over both the time series and the samples within each series; i.e., the i-th sample of this dataset has a 1/(N*M) chance of being any given one of the M samples of any of the N time series in the sequence.
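One way to picture this uniformity is a flat index over N * M entries that is decomposed into a series index and a per-series sample index. The sketch below assumes that layout for illustration; locate_sample is a hypothetical helper, and the actual indexing scheme used by the library may differ.

```python
def locate_sample(idx, n_series, samples_per_series):
    """Map a flat dataset index to (series index, sample index within that series)."""
    total = n_series * samples_per_series
    if not 0 <= idx < total:
        raise IndexError(f"index {idx} out of range for {total} samples")
    # Each series owns a contiguous block of `samples_per_series` indices.
    return divmod(idx, samples_per_series)


# N = 3 series with M = 24 samples each gives 72 equally likely entries.
first = locate_sample(0, n_series=3, samples_per_series=24)
middle = locate_sample(25, n_series=3, samples_per_series=24)
last = locate_sample(71, n_series=3, samples_per_series=24)
print(first, middle, last)
```

Drawing idx uniformly from the N * M valid indices then selects every (series, sample) pair with probability 1/(N*M).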

Parameters
  • target_series (Union[TimeSeries, Sequence[TimeSeries]]) – One or a sequence of target TimeSeries.

  • covariates (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, one or a sequence of TimeSeries containing past-observed covariates. If this parameter is set, the provided sequence must have the same length as that of target_series. Moreover, all covariates in the sequence must have a time span large enough to contain all the required slices. The joint slicing of the target and covariates relies on the time axes of both series.

  • output_chunk_length (int) – The length of the “output” series emitted by the model.

  • lh (Tuple[int, int]) – A (min_lh, max_lh) interval for the forecast point, starting from the end of the series. For example, (1, 3) will select forecast points uniformly between 1*H and 3*H points before the end of the series, where H is output_chunk_length. It is required that min_lh >= 1.

  • lookback (int) – An integer giving the length of the input in the emitted input and output splits, expressed as a multiple of output_chunk_length. For instance, lookback=3 will emit “inputs” of length 3 * output_chunk_length.

  • use_static_covariates (bool) – Whether to use/include static covariate data from input series.