Pipeline

class darts.dataprocessing.pipeline.Pipeline(transformers, copy=False, verbose=None, n_jobs=None)

Bases: object

Pipeline to combine multiple data transformers, chaining them together.

Parameters:
  • transformers (Sequence[BaseDataTransformer]) – Sequence of data transformers.

  • copy (bool) – If set, makes a (deep) copy of each data transformer before adding it to the pipeline.

  • n_jobs (int | None) – The number of jobs to run in parallel. Parallel jobs are created only when a Sequence[TimeSeries] is passed as input to a method, parallelising the operations across the different TimeSeries. Defaults to 1 (sequential). Setting the parameter to -1 uses all available processors. Note: for a small amount of data, the parallelisation overhead can increase the total run time. Note: this parameter overrides the value set in each individual transformer. Leave it set to None to keep the original transformers’ configurations.

  • verbose (bool | None) – Whether to print the progress of the operations. Note: this parameter overrides the value set in each individual transformer. Leave it set to None to keep the original transformers’ configurations.

Examples

>>> import numpy as np
>>> from darts import TimeSeries
>>> from darts.dataprocessing.transformers import Scaler, MissingValuesFiller
>>> from darts.dataprocessing.pipeline import Pipeline
>>> values = np.arange(start=0, stop=10, step=2.)
>>> values[1:3] = np.nan
>>> series = TimeSeries.from_values(values)
>>> pipeline = Pipeline([MissingValuesFiller(), Scaler()])
>>> series_transformed = pipeline.fit_transform(series)
>>> print(series_transformed.values())
[[0.  ]
 [0.25]
 [0.5 ]
 [0.75]
 [1.  ]]

Attributes

fittable

Returns whether the pipeline is fittable or not.

invertible

Returns whether the pipeline is invertible or not.

Methods

fit(data)

Fit all fittable transformers in the pipeline.

fit_transform(data)

For each data transformer in the pipeline, first fit the transformer on the data if it is fittable, then transform the data using the fitted transformer.

inverse_transform(data[, partial, ...])

For each data transformer in the pipeline, inverse-transform data.

transform(data[, series_idx])

For each data transformer in the pipeline, transform the data.

fit(data)

Fit all fittable transformers in the pipeline.

Parameters:

data (Union[TimeSeries, Sequence[TimeSeries]]) – (Sequence of) TimeSeries to fit on.

fit_transform(data)

For each data transformer in the pipeline, first fit the transformer on the data if it is fittable, then transform the data using the fitted transformer. The transformed data is then passed to the next transformer.

Parameters:

data (Union[TimeSeries, Sequence[TimeSeries]]) – (Sequence of) TimeSeries to fit on and transform.

Returns:

Transformed data.

Return type:

TimeSeriesLike

property fittable: bool

Returns whether the pipeline is fittable or not. A pipeline is fittable if at least one of the transformers in the pipeline is fittable.

Returns:

True if the pipeline is fittable, False otherwise

Return type:

bool

inverse_transform(data, partial=False, series_idx=None, insample=None)

For each data transformer in the pipeline, inverse-transform the data, then pass the inverse-transformed data to the next transformer. Transformers are traversed in reverse order. Raises a ValueError if not all transformers are invertible and partial is set to False. Set partial to True to invert only the transformers of type InvertibleDataTransformer in the pipeline.

Parameters:
  • data (Union[TimeSeries, Sequence[TimeSeries], Sequence[Sequence[TimeSeries]]]) – (Sequence of) TimeSeries to inverse-transform.

  • partial (bool) – If set to True, the inverse transformation is applied even if the pipeline is not fully invertible, calling inverse_transform() only on transformers of type InvertibleDataTransformer.

  • series_idx (int | Sequence[int] | None) – Optionally, the index(es) of each series corresponding to their positions within the series used to fit the transformer (to retrieve the appropriate transformer parameters).

  • insample (Union[TimeSeries, Sequence[TimeSeries], None]) – Optionally, the transformed historic (insample) part of data. This can be used when data is only a tail (for example a forecast) and inverse transforming requires information from earlier times (for example the Diff transformer). Each insample series must start before the data start time and extend at least until one step before the start time of the data. If data is a Sequence[Sequence[TimeSeries]], then insample should be a Sequence[TimeSeries] with the same length. Otherwise, it should have the same type as data. Only used by transformers that require information from earlier times.

Returns:

Inverse-transformed data; same structure as data.

Return type:

TimeSeriesLike | Sequence[Sequence[TimeSeries]]

property invertible: bool

Returns whether the pipeline is invertible or not. A pipeline is invertible if all transformers in the pipeline are themselves invertible.

Returns:

True if the pipeline is invertible, False otherwise

Return type:

bool

transform(data, series_idx=None)

For each data transformer in the pipeline, transform the data, then pass the transformed data to the next transformer.

Parameters:
  • data (Union[TimeSeries, Sequence[TimeSeries]]) – (Sequence of) TimeSeries to be transformed.

  • series_idx (int | Sequence[int] | None) – Optionally, the index(es) of each series corresponding to their positions within the series used to fit the transformer (to retrieve the appropriate transformer parameters).

Returns:

Transformed data.

Return type:

TimeSeriesLike