Scaler¶
- class darts.dataprocessing.transformers.scaler.Scaler(scaler=None, name='Scaler', n_jobs=1, verbose=False)[source]¶
Bases: darts.dataprocessing.transformers.invertible_data_transformer.InvertibleDataTransformer, darts.dataprocessing.transformers.fittable_data_transformer.FittableDataTransformer
Generic wrapper class for using scalers on time series.
The underlying scaler has to implement the fit(), transform() and inverse_transform() methods (typically from scikit-learn).
When the scaler is applied to a multivariate series, the scaling is done per component. When the series is stochastic, the scaling is done across all samples of each component: the transformation is applied independently to each dimension (component) of the time series, effectively merging all samples of a component in order to compute the transform.
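The per-component pooling of samples described above can be pictured with a small NumPy sketch. It assumes the values are laid out as an array of shape (n_times, n_components, n_samples) and uses plain min-max scaling. The function name `scale_per_component` is hypothetical, and this is an illustration of the idea rather than the library's actual implementation.

```python
import numpy as np


def scale_per_component(values: np.ndarray) -> np.ndarray:
    """Min-max scale an array of shape (n_times, n_components, n_samples).

    All samples of a component are pooled, so each component ends up with
    a single minimum and a single maximum (hypothetical helper).
    """
    n_times, n_components, n_samples = values.shape
    # Put the sample axis next to time, then flatten: one column per component.
    pooled = values.transpose(0, 2, 1).reshape(n_times * n_samples, n_components)
    lo = pooled.min(axis=0)
    hi = pooled.max(axis=0)
    # Guard against division by zero for constant components.
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = (pooled - lo) / span
    # Restore the original (time, component, sample) layout.
    return scaled.reshape(n_times, n_samples, n_components).transpose(0, 2, 1)


rng = np.random.default_rng(0)
# Two components on very different scales, ten samples each.
values = rng.normal(size=(50, 2, 10)) * np.array([1.0, 100.0])[None, :, None]
scaled = scale_per_component(values)
```

Each component is mapped to the range [0, 1] on its own, regardless of the scale of the other component, and every sample of a component shares the same minimum and maximum.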
Notes
The scaler will not scale the series’ static covariates. This has to be done either before constructing the series, or later on by extracting the covariates, transforming the values, and then reapplying them to the series. For this, see the TimeSeries property TimeSeries.static_covariates and the method TimeSeries.with_static_covariates().
- Parameters
scaler – The scaler to transform the data with. It must provide fit(), transform() and inverse_transform() methods. Default: sklearn.preprocessing.MinMaxScaler(feature_range=(0, 1)); this will scale all the values of a time series between 0 and 1.
name – A specific name for the scaler.
n_jobs (int) – The number of jobs to run in parallel. Parallel jobs are created only when a Sequence[TimeSeries] is passed as input to a method, parallelising operations regarding different TimeSeries. Defaults to 1 (sequential). Setting the parameter to -1 means using all the available processors. Note: for a small amount of data, the parallelisation overhead could end up increasing the total required amount of time.
verbose (bool) – Optionally, whether to print operations progress.
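As an illustration of the interface required of the scaler parameter, here is a minimal sketch of an object that provides fit(), transform() and inverse_transform(). The class name `ZScoreScaler` and its attributes are hypothetical. It assumes 2-D input with one column per component, and returns self from fit() to mirror the scikit-learn convention.

```python
import numpy as np


class ZScoreScaler:
    """Hypothetical scaler exposing fit/transform/inverse_transform."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        std = X.std(axis=0)
        # Constant columns get a unit divisor so transform() stays finite.
        self.std_ = np.where(std > 0, std, 1.0)
        return self  # mirrors the scikit-learn convention

    def transform(self, X):
        return (np.asarray(X, dtype=float) - self.mean_) / self.std_

    def inverse_transform(self, X):
        return np.asarray(X, dtype=float) * self.std_ + self.mean_


X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
zs = ZScoreScaler().fit(X)
Z = zs.transform(X)                # each column now has zero mean
X_back = zs.inverse_transform(Z)   # round trip recovers the input
```

Given the requirement stated for the scaler parameter, an instance of such a class would presumably be passed as `Scaler(ZScoreScaler())`; this usage has not been checked against a specific release.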
Notes
In case the Scaler is applied to multiple TimeSeries objects, a deep-copy of the chosen scaler will be instantiated, fitted, and stored, for each TimeSeries.
Examples
>>> from darts.datasets import AirPassengersDataset
>>> from sklearn.preprocessing import MinMaxScaler
>>> from darts.dataprocessing.transformers import Scaler
>>> series = AirPassengersDataset().load()
>>> scaler = MinMaxScaler(feature_range=(-1, 1))
>>> transformer = Scaler(scaler)
>>> series_transformed = transformer.fit_transform(series)
>>> print(min(series_transformed.values()))
[-1.]
>>> print(max(series_transformed.values()))
[1.]
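The note about multiple TimeSeries objects (one deep copy of the scaler per series) can be sketched without the library. The `MinMax` class below is a hypothetical stand-in for the wrapped scaler, and plain NumPy arrays stand in for the series. The point is only that the template object is never fitted itself and that each series gets its own independent parameters.

```python
import copy

import numpy as np


class MinMax:
    """Hypothetical stand-in for the wrapped scaler."""

    def fit(self, X):
        self.lo, self.hi = X.min(axis=0), X.max(axis=0)
        return self

    def transform(self, X):
        return (X - self.lo) / (self.hi - self.lo)


template = MinMax()
series_list = [
    np.array([[0.0], [5.0], [10.0]]),
    np.array([[100.0], [300.0]]),
]

# One independent deep copy of the template is fitted per series.
fitted = [copy.deepcopy(template).fit(s) for s in series_list]
scaled = [f.transform(s) for f, s in zip(fitted, series_list)]
```

Because the fitted copies are stored per series, a later transform of several series would presumably need to receive them in the same order in which they were fitted.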
Attributes
name – Name of the data transformer.
Methods
fit(series, *args, **kwargs) – Fit the transformer to the provided series or sequence of series.
fit_transform(series, *args, **kwargs) – Fit the transformer to the (sequence of) series and return the transformed input.
inverse_transform(series, *args, **kwargs) – Inverse-transform a (sequence of) series.
set_n_jobs(value) – Set the number of processors to be used by the transformer while processing multiple TimeSeries.
set_verbose(value) – Set the verbosity status.
transform(series, *args, **kwargs) – Transform a (sequence of) series.
ts_fit(series, transformer, *args, **kwargs) – The function that will be applied to each series when fit() is called.
ts_inverse_transform(series, transformer, ...) – The function that will be applied to each series when inverse_transform() is called.
ts_transform(series, transformer, **kwargs) – The function that will be applied to each series when transform() is called.
- fit(series, *args, **kwargs)¶
Fit the transformer to the provided series or sequence of series.
Fit the data and store the fitting parameters into self._fitted_params. If a sequence is passed as input data, this function takes care of parallelising the fitting of multiple series in the sequence at the same time (in this case self._fitted_params will contain an array of fitted params, one for each series).
- Parameters
series (Union[TimeSeries, Sequence[TimeSeries]]) – (sequence of) series to fit the transformer on.
args – Additional positional arguments for the ts_fit() method.
kwargs – Additional keyword arguments for the ts_fit() method:
- component_mask (Optional[np.ndarray] = None) – Optionally, a 1-D boolean np.ndarray of length series.n_components that specifies which components of the underlying series the Scaler should consider.
- Returns
Fitted transformer.
- Return type
- fit_transform(series, *args, **kwargs)¶
Fit the transformer to the (sequence of) series and return the transformed input.
- Parameters
series (Union[TimeSeries, Sequence[TimeSeries]]) – the (sequence of) series to transform.
args – Additional positional arguments for the ts_transform() method.
kwargs – Additional keyword arguments for the ts_transform() method:
- component_mask (Optional[np.ndarray] = None) – Optionally, a 1-D boolean np.ndarray of length series.n_components that specifies which components of the underlying series the Scaler should consider.
- Returns
Transformed data.
- Return type
Union[TimeSeries, Sequence[TimeSeries]]
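To make the component_mask keyword more concrete, the following NumPy sketch scales only the columns selected by a 1-D boolean mask. It assumes that components outside the mask are passed through unchanged. That assumption, the function name `minmax_with_mask`, and the use of min-max scaling are all illustrative choices rather than a description of the library's internals.

```python
import numpy as np


def minmax_with_mask(values, component_mask):
    """Min-max scale only the columns where component_mask is True.

    Hypothetical helper; unmasked columns are copied through unchanged,
    which is an assumption of this sketch.
    """
    values = np.asarray(values, dtype=float)
    mask = np.asarray(component_mask, dtype=bool)
    if mask.shape != (values.shape[1],):
        raise ValueError("component_mask must be 1-D with one entry per component")
    out = values.copy()
    selected = values[:, mask]
    lo, hi = selected.min(axis=0), selected.max(axis=0)
    # Guard against division by zero for constant components.
    span = np.where(hi > lo, hi - lo, 1.0)
    out[:, mask] = (selected - lo) / span
    return out


values = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
mask = np.array([True, False])  # scale the first component only
result = minmax_with_mask(values, mask)
```

The mask length must equal the number of components, matching the series.n_components requirement stated for the keyword.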
- inverse_transform(series, *args, **kwargs)¶
Inverse-transform a (sequence of) series.
In case a sequence is passed as input data, this function takes care of parallelising the transformation of multiple series in the sequence at the same time.
- Parameters
series (Union[TimeSeries, Sequence[TimeSeries]]) – the (sequence of) series to be inverse-transformed.
args – Additional positional arguments for the ts_inverse_transform() method.
kwargs – Additional keyword arguments for the ts_inverse_transform() method:
- component_mask (Optional[np.ndarray] = None) – Optionally, a 1-D boolean np.ndarray of length series.n_components that specifies which components of the underlying series the Scaler should consider.
- Returns
Inverse transformed data.
- Return type
Union[TimeSeries, List[TimeSeries]]
- property name¶
Name of the data transformer.
- set_n_jobs(value)¶
Set the number of processors to be used by the transformer while processing multiple TimeSeries.
- Parameters
value (int) – New n_jobs value. Set to -1 for using all the available cores.
- set_verbose(value)¶
Set the verbosity status.
Set to True to enable a detailed report on the progress of the scaler’s operations, or to False for no additional information.
- Parameters
value (bool) – New verbosity status.
- transform(series, *args, **kwargs)¶
Transform a (sequence of) series.
In case a Sequence is passed as input data, this function takes care of parallelising the transformation of multiple series in the sequence at the same time.
- Parameters
series (Union[TimeSeries, Sequence[TimeSeries]]) – (sequence of) series to be transformed.
args – Additional positional arguments for each ts_transform() method call.
kwargs – Additional keyword arguments for each ts_transform() method call.
- Returns
Transformed data.
- Return type
Union[TimeSeries, List[TimeSeries]]
- static ts_fit(series, transformer, *args, **kwargs)[source]¶
The function that will be applied to each series when fit() is called.
The function must take a TimeSeries object as its first argument and return an object containing information about the fitting phase (e.g., parameters or external transformer objects). All these parameters will be stored in self._fitted_params, which can later be used during the transformation step.
This method is not implemented in the base class and must be implemented in the deriving classes.
If more parameters are added as input in the derived classes, _fit_iterator() should be redefined accordingly, to yield the necessary arguments to this function (see _fit_iterator() for further details).
- Parameters
series (TimeSeries) – TimeSeries against which the scaler will be fit.
Notes
This method is designed to be a static method rather than an instance method, to allow efficient parallelisation even when the scaler instance stores a non-negligible amount of data. Using instance methods would imply copying the instance’s data across multiple processes, which can easily introduce a bottleneck and nullify the benefits of parallelisation.
- Return type
Any
- static ts_inverse_transform(series, transformer, *args, **kwargs)[source]¶
The function that will be applied to each series when inverse_transform() is called.
The function must take a TimeSeries object as its first argument and return the transformed TimeSeries object. Additional parameters can be added if necessary, but in this case _inverse_transform_iterator() should be redefined accordingly, to yield the necessary arguments to this function (see _inverse_transform_iterator() for further details).
This method is not implemented in the base class and must be implemented in the deriving classes.
- Parameters
series (TimeSeries) – TimeSeries which will be transformed.
Notes
This method is designed to be a static method rather than an instance method, to allow efficient parallelisation even when the scaler instance stores a non-negligible amount of data. Using instance methods would imply copying the instance’s data across multiple processes, which can easily introduce a bottleneck and nullify the benefits of parallelisation.
- Return type
- static ts_transform(series, transformer, **kwargs)[source]¶
The function that will be applied to each series when transform() is called.
The function must take a TimeSeries object as its first argument and return the transformed TimeSeries object. If more parameters are added as input in the derived classes, _transform_iterator() should be redefined accordingly, to yield the necessary arguments to this function (see _transform_iterator() for further details).
This method is not implemented in the base class and must be implemented in the deriving classes.
- Parameters
series (TimeSeries) – series to be transformed.
Notes
This method is designed to be a static method rather than an instance method, to allow efficient parallelisation even when the scaler instance stores a non-negligible amount of data. Using instance methods would imply copying the instance’s data across multiple processes, which can easily introduce a bottleneck and nullify the benefits of parallelisation.
- Return type