Pipeline

class darts.dataprocessing.pipeline.Pipeline(transformers, copy=False, verbose=None, n_jobs=None)[source]

Bases: object

Pipeline to combine multiple data transformers, chaining them together.

Parameters
  • transformers (Sequence[BaseDataTransformer]) – Sequence of data transformers.

  • copy (bool) – If set, makes a (deep) copy of each data transformer before adding it to the pipeline.

  • n_jobs (Optional[int, None]) – The number of jobs to run in parallel. Parallel jobs are created only when a Sequence[TimeSeries] is passed as input to a method, parallelising operations across the different TimeSeries. Defaults to 1 (sequential). Setting the parameter to -1 means using all the available processors. Note: for small amounts of data, the parallelisation overhead can end up increasing the total time required. Note: this parameter overrides the value set in each individual transformer. Leave this parameter set to None to keep the original transformers’ configurations.

  • verbose (Optional[bool, None]) – Whether to print progress of the operations. Note: this parameter overrides the value set in each individual transformer. Leave this parameter set to None to keep the original transformers’ configurations.

Examples

>>> import numpy as np
>>> from darts import TimeSeries
>>> from darts.datasets import AirPassengersDataset
>>> from darts.dataprocessing.transformers import Scaler, MissingValuesFiller
>>> from darts.dataprocessing.pipeline import Pipeline
>>> values = np.arange(start=0, stop=12.5, step=2.5)
>>> values[1:3] = np.nan
>>> series = TimeSeries.from_values(values)
>>> pipeline = Pipeline([MissingValuesFiller(), Scaler()])
>>> series_transformed = pipeline.fit_transform(series)
>>> print(series_transformed)
<TimeSeries (DataArray) (time: 5, component: 1, sample: 1)>
array([[[0.  ]],
       [[0.25]],
       [[0.5 ]],
       [[0.75]],
       [[1.  ]]])
Coordinates:
* time       (time) int64 0 1 2 3 4
* component  (component) object '0'
Dimensions without coordinates: sample

Attributes

fittable

Returns whether the pipeline is fittable or not.

invertible

Returns whether the pipeline is invertible or not.

Methods

fit(data)

Fit all fittable transformers in the pipeline.

fit_transform(data)

For each data transformer in the pipeline, first fit the transformer on the data if it is fittable, then transform the data using the fitted transformer.

inverse_transform(data[, partial, series_idx])

For each data transformer in the pipeline, inverse-transform data.

transform(data[, series_idx])

For each data transformer in the pipeline, transform data.

fit(data)[source]

Fit all fittable transformers in the pipeline.

Parameters

data (Union[TimeSeries, Sequence[TimeSeries]]) – (Sequence of) TimeSeries to fit on.

fit_transform(data)[source]

For each data transformer in the pipeline, first fit the transformer on the data if it is fittable, then transform the data using the fitted transformer. The transformed data is then passed to the next transformer.

Parameters

data (Union[TimeSeries, Sequence[TimeSeries]]) – (Sequence of) TimeSeries to fit on and transform.

Returns

Transformed data.

Return type

Union[TimeSeries, Sequence[TimeSeries]]

property fittable: bool

Returns whether the pipeline is fittable or not. A pipeline is fittable if at least one of the transformers in the pipeline is fittable.

Returns

True if the pipeline is fittable, False otherwise

Return type

bool

inverse_transform(data, partial=False, series_idx=None)[source]

For each data transformer in the pipeline, inverse-transform data. The inverse-transformed data is then passed to the next transformer. Transformers are traversed in reverse order. Raises a ValueError if not all of the transformers are invertible and partial is set to False. Set partial to True to invert only the InvertibleDataTransformer instances in the pipeline.

Parameters
  • data (Union[TimeSeries, Sequence[TimeSeries]]) – (Sequence of) TimeSeries to be inverse transformed.

  • partial (bool) – If set to True, the inverse transformation is applied even if the pipeline is not fully invertible, calling inverse_transform() only on the InvertibleDataTransformer instances.

  • series_idx (Union[int, Sequence[int], None]) – Optionally, the index(es) of each series corresponding to their positions within the series used to fit the transformer (to retrieve the appropriate transformer parameters).

Returns

Inverse transformed data.

Return type

Union[TimeSeries, Sequence[TimeSeries]]

property invertible: bool

Returns whether the pipeline is invertible or not. A pipeline is invertible if all transformers in the pipeline are themselves invertible.

Returns

True if the pipeline is invertible, False otherwise

Return type

bool

transform(data, series_idx=None)[source]

For each data transformer in the pipeline, transform data. The transformed data is then passed to the next transformer.

Parameters
  • data (Union[TimeSeries, Sequence[TimeSeries]]) – (Sequence of) TimeSeries to be transformed.

  • series_idx (Union[int, Sequence[int], None]) – Optionally, the index(es) of each series corresponding to their positions within the series used to fit the transformer (to retrieve the appropriate transformer parameters).

Returns

Transformed data.

Return type

Union[TimeSeries, Sequence[TimeSeries]]