Mapper and InvertibleMapper

class darts.dataprocessing.transformers.mappers.InvertibleMapper(fn, inverse_fn, name='InvertibleMapper', n_jobs=1, verbose=False)[source]

Bases: InvertibleDataTransformer

Data transformer to apply a custom function and its inverse to a (sequence of) TimeSeries (similar to calling TimeSeries.map() on each series).

Parameters
  • fn (Union[Callable[[number], number], Callable[[Timestamp, number], number]]) – Either a function which takes a value and returns a value, i.e. f(x) = y, or a function which takes a timestamp and a value and returns a value, i.e. f(timestamp, x) = y.

  • inverse_fn (Union[Callable[[number], number], Callable[[Timestamp, number], number]]) – Similarly to fn, either a function which takes a value and returns a value, i.e. f(x) = y, or a function which takes a timestamp and a value and returns a value, i.e. f(timestamp, x) = y. inverse_fn should be such that inverse_fn(fn(x)) == x.

  • name (str) – A specific name for the transformer.

  • n_jobs (int) – The number of jobs to run in parallel. Parallel jobs are created only when a Sequence[TimeSeries] is passed as input to a method, parallelising operations regarding different TimeSeries. Defaults to 1 (sequential). Setting the parameter to -1 means using all the available processors. Note: for a small amount of data, the parallelisation overhead could end up increasing the total required amount of time.

  • verbose (bool) – Optionally, whether to print operations progress
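The requirement inverse_fn(fn(x)) == x can be checked before the pair is handed to the transformer. The following is a minimal sketch using only the standard library: datetime stands in for the Timestamp argument, and the helper round_trips is a hypothetical name introduced for illustration, not part of darts. Because floating-point arithmetic is involved, the check uses a tolerance rather than strict equality.

```python
import calendar
import math
from datetime import datetime


def fn(ts, x):
    # Convert a monthly total into a per-day value.
    return x / calendar.monthrange(ts.year, ts.month)[1]


def inverse_fn(ts, x):
    # Undo fn: convert a per-day value back into a monthly total.
    return x * calendar.monthrange(ts.year, ts.month)[1]


def round_trips(fn, inverse_fn, points, rel_tol=1e-9):
    # Hypothetical helper: True if inverse_fn undoes fn on every (timestamp, value) pair.
    return all(
        math.isclose(inverse_fn(ts, fn(ts, x)), x, rel_tol=rel_tol)
        for ts, x in points
    )


points = [
    (datetime(2023, 1, 1), 310.0),
    (datetime(2023, 2, 1), 280.0),
    (datetime(2024, 2, 1), 290.0),
]
ok = round_trips(fn, inverse_fn, points)
```

A pair that passes this check on representative data is a reasonable candidate for fn and inverse_fn; it does not prove invertibility over the whole domain.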

Examples

>>> import numpy as np
>>> from darts import TimeSeries
>>> from darts.dataprocessing.transformers import InvertibleMapper
>>> series = TimeSeries.from_values(np.array([1e0, 1e1, 1e2, 1e3]))
>>> transformer = InvertibleMapper(np.log10, lambda x: 10**x)
>>> series_transformed = transformer.transform(series)
>>> print(series_transformed)
<TimeSeries (DataArray) (time: 4, component: 1, sample: 1)>
array([[[0.]],
    [[1.]],
    [[2.]],
    [[3.]]])
Coordinates:
* time       (time) int64 0 1 2 3
* component  (component) <U1 '0'
Dimensions without coordinates: sample
>>> series_restored = transformer.inverse_transform(series_transformed)
>>> print(series_restored)
<TimeSeries (DataArray) (time: 4, component: 1, sample: 1)>
array([[[   1.]],
    [[  10.]],
    [[ 100.]],
    [[1000.]]])
Coordinates:
* time       (time) int64 0 1 2 3
* component  (component) <U1 '0'
Dimensions without coordinates: sample

Attributes

name

Name of the data transformer.

Methods

apply_component_mask(series[, ...])

Extracts components specified by component_mask from series

inverse_transform(series, *args[, ...])

Inverse transforms a (sequence of) series by calling the user-implemented ts_inverse_transform method.

set_n_jobs(value)

Set the number of processors to be used by the transformer while processing multiple TimeSeries.

set_verbose(value)

Set the verbosity status.

stack_samples(vals)

Creates an array of shape (n_timesteps * n_samples, n_components) from either a TimeSeries or the array_values of a TimeSeries.

transform(series, *args[, component_mask])

Transforms a (sequence of) series by calling the user-implemented ts_transform method.

ts_inverse_transform(series, params)

The function that will be applied to each series when inverse_transform() is called.

ts_transform(series, params)

The function that will be applied to each series when transform() is called.

unapply_component_mask(series, vals[, ...])

Adds back components previously removed by component_mask in apply_component_mask method.

unstack_samples(vals[, n_timesteps, ...])

Reshapes the 2D array returned by stack_samples back into an array of shape (n_timesteps, n_components, n_samples); this 'undoes' the reshaping of stack_samples.

static apply_component_mask(series, component_mask=None, return_ts=False)

Extracts components specified by component_mask from series

Parameters
  • series (TimeSeries) – input TimeSeries to be fed into transformer.

  • component_mask (Optional[ndarray]) – Optionally, np.ndarray boolean mask of shape (n_components, 1) specifying which components to extract from series. The i-th component of series is kept only if component_mask[i] = True. If not specified, no masking is performed.

  • return_ts – Optionally, specifies that a TimeSeries should be returned, rather than an np.ndarray.

Returns

TimeSeries (if return_ts = True) or np.ndarray (if return_ts = False) with only those components specified by component_mask remaining.

Return type

masked
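As a rough illustration of what extracting components means, the sketch below applies a boolean mask to the component axis of a plain NumPy array of shape (n_timesteps, n_components, n_samples). It is an assumption about the effect of masking, not the darts implementation, and it uses a 1-D mask for simplicity.

```python
import numpy as np

# Values with 4 timesteps, 3 components and 2 samples.
vals = np.arange(24, dtype=float).reshape(4, 3, 2)

# Keep components 0 and 2, drop component 1.
component_mask = np.array([True, False, True])

# Boolean indexing on the component axis keeps only the selected components.
masked = vals[:, component_mask, :]
```

The masked array keeps the time and sample axes untouched; only the component axis shrinks.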

inverse_transform(series, *args, component_mask=None, **kwargs)

Inverse transforms a (sequence of) series by calling the user-implemented ts_inverse_transform method.

In case a sequence is passed as input data, this function takes care of parallelising the transformation of multiple series in the sequence at the same time. Additionally, if the mask_components attribute was set to True when instantiating InvertibleDataTransformer, then any provided component_mask will be automatically applied to each input TimeSeries; please refer to ‘Notes’ for further details on component masking.

Any additionally specified *args and **kwargs are automatically passed to ts_inverse_transform.

Parameters
  • series (Union[TimeSeries, Sequence[TimeSeries]]) – the (sequence of) series to be inverse-transformed.

  • args – Additional positional arguments for the ts_inverse_transform() method

  • component_mask (Optional[np.ndarray] = None) – Optionally, a 1-D boolean np.ndarray of length series.n_components that specifies which components of the underlying series the inverse transform should consider.

  • kwargs – Additional keyword arguments for the ts_inverse_transform() method

Returns

Inverse transformed data.

Return type

Union[TimeSeries, List[TimeSeries]]

Notes

If the mask_components attribute was set to True when instantiating InvertibleDataTransformer, then any provided component_mask will be automatically applied to each TimeSeries input to inverse_transform. A component_mask is simply a boolean array of shape (series.n_components,) that specifies which components of each series should be transformed using ts_inverse_transform and which should not. If component_mask[i] is True, then the i-th component of each series will be transformed by ts_inverse_transform. Conversely, if component_mask[i] is False, the i-th component will be removed from each series before it is passed to ts_inverse_transform; after transforming this masked series, the untransformed i-th component will be ‘added back’ to the output. Note that automatic component masking can only be performed if ts_inverse_transform does not change the number of timesteps in each series; otherwise, the transformed and untransformed components cannot be concatenated back together along the component axis.

If mask_components was set to False when instantiating InvertibleDataTransformer, then any provided component_mask will be passed as a keyword argument to ts_inverse_transform; the user can then manually specify how the component_mask should be applied to each series.
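When a mask is used for the forward transform, the same mask is needed for the inverse so that only the previously transformed components are mapped back. The sketch below mimics this on a NumPy array; masked_apply is a hypothetical helper written for illustration and is not a darts function.

```python
import numpy as np


def masked_apply(vals, component_mask, fn):
    # Apply fn to the selected components only; leave the others unchanged.
    out = vals.copy()
    out[:, component_mask, :] = fn(vals[:, component_mask, :])
    return out


# Shape (3 timesteps, 2 components, 1 sample).
vals = np.array([[[1.0], [5.0]], [[10.0], [6.0]], [[100.0], [7.0]]])
mask = np.array([True, False])

transformed = masked_apply(vals, mask, np.log10)
restored = masked_apply(transformed, mask, lambda x: 10.0 ** x)
```

Here the second component passes through both steps untouched, while the first is log-transformed and then restored.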

property name

Name of the data transformer.

set_n_jobs(value)

Set the number of processors to be used by the transformer while processing multiple TimeSeries.

Parameters

value (int) – New n_jobs value. Set to -1 for using all the available cores.

set_verbose(value)

Set the verbosity status.

True enables a detailed report on the progress of the transformer’s operations; False prints no additional information.

Parameters

value (bool) – New verbosity status

static stack_samples(vals)

Creates an array of shape (n_timesteps * n_samples, n_components) from either a TimeSeries or the array_values of a TimeSeries.

Each column of the returned array corresponds to a component (dimension) of the series and is formed by concatenating all of the samples associated with that component together. More specifically, the i-th column is formed by concatenating [component_i_sample_1, component_i_sample_2, …, component_i_sample_n].

Stacking is useful when implementing a transformation that applies the exact same change to every timestep in the timeseries. In such cases, the samples of each component can be stacked together into a single column, and the transformation can then be applied to each column, thereby ‘vectorising’ the transformation over all samples of that component; the unstack_samples method can then be used to reshape the output. For transformations that depend on the time_index or the temporal ordering of the observations, stacking should not be employed.

Parameters

vals (Union[ndarray, TimeSeries]) – TimeSeries or np.ndarray of shape (n_timesteps, n_components, n_samples) to be ‘stacked’.

Returns

np.ndarray of shape (n_timesteps * n_samples, n_components), where the i-th column is formed by concatenating all of the samples of the i-th component in vals.

Return type

stacked
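One way to obtain an array with the documented shape is to merge the time and sample axes with NumPy. The sketch below is an assumption about how such stacking could be done; in particular, the order of rows within each column is chosen for illustration and may differ from what darts produces. stack_samples_sketch is a hypothetical name.

```python
import numpy as np


def stack_samples_sketch(vals):
    # vals has shape (n_timesteps, n_components, n_samples).
    n_timesteps, n_components, n_samples = vals.shape
    # Bring the sample axis next to the time axis, then merge both into rows.
    return np.swapaxes(vals, 1, 2).reshape(n_timesteps * n_samples, n_components)


vals = np.arange(24, dtype=float).reshape(4, 3, 2)
stacked = stack_samples_sketch(vals)
```

Each column of stacked now holds every value of one component across all timesteps and samples, so a per-column operation acts on the whole component at once.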

transform(series, *args, component_mask=None, **kwargs)

Transforms a (sequence of) series by calling the user-implemented ts_transform method.

In case a Sequence[TimeSeries] is passed as input data, this function takes care of parallelising the transformation of multiple series in the sequence at the same time. Additionally, if the mask_components attribute was set to True when instantiating BaseDataTransformer, then any provided component_mask will be automatically applied to each input TimeSeries; please refer to ‘Notes’ for further details on component masking.

Any additionally specified *args and **kwargs are automatically passed to ts_transform.

Parameters
  • series (Union[TimeSeries, Sequence[TimeSeries]]) – (sequence of) series to be transformed.

  • args – Additional positional arguments for each ts_transform() method call

  • component_mask (Optional[np.ndarray] = None) – Optionally, a 1-D boolean np.ndarray of length series.n_components that specifies which components of the underlying series the transform should consider. If the mask_components attribute was set to True when instantiating BaseDataTransformer, then the component mask will be automatically applied to each TimeSeries input. Otherwise, component_mask will be provided as an additional keyword argument to ts_transform. See ‘Notes’ for further details.

  • kwargs – Additional keyword arguments for each ts_transform() method call

Returns

Transformed data.

Return type

Union[TimeSeries, List[TimeSeries]]

Notes

If the mask_components attribute was set to True when instantiating BaseDataTransformer, then any provided component_mask will be automatically applied to each TimeSeries input to transform. A component_mask is simply a boolean array of shape (series.n_components,) that specifies which components of each series should be transformed using ts_transform and which should not. If component_mask[i] is True, then the i-th component of each series will be transformed by ts_transform. Conversely, if component_mask[i] is False, the i-th component will be removed from each series before it is passed to ts_transform; after transforming this masked series, the untransformed i-th component will be ‘added back’ to the output. Note that automatic component masking can only be performed if ts_transform does not change the number of timesteps in each series; otherwise, the transformed and untransformed components cannot be concatenated back together along the component axis.

If mask_components was set to False when instantiating BaseDataTransformer, then any provided component_mask will be passed as a keyword argument to ts_transform; the user can then manually specify how the component_mask should be applied to each series.
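The three steps described in these notes (remove, transform, add back) and the constraint on the number of timesteps can be sketched on a NumPy array as follows. transform_with_mask is a hypothetical helper, and the explicit shape check is an assumption made to show why a transform that changes the number of timesteps cannot be recombined with the untouched components.

```python
import numpy as np


def transform_with_mask(vals, component_mask, ts_transform):
    # 1. Remove the components that should not be transformed.
    masked = vals[:, component_mask, :]
    # 2. Transform only the remaining components.
    transformed = ts_transform(masked)
    # The untouched components still have the original number of timesteps.
    if transformed.shape[0] != vals.shape[0]:
        raise ValueError("transform changed the number of timesteps")
    # 3. Add the untransformed components back along the component axis.
    out = vals.copy()
    out[:, component_mask, :] = transformed
    return out


vals = np.array([[[1.0], [5.0]], [[10.0], [6.0]], [[100.0], [7.0]]])
mask = np.array([True, False])

# A pointwise transform keeps the number of timesteps, so masking works.
result = transform_with_mask(vals, mask, np.log10)

# Differencing drops one timestep, so the components cannot be recombined.
try:
    transform_with_mask(vals, mask, lambda x: np.diff(x, axis=0))
    error_message = None
except ValueError as err:
    error_message = str(err)
```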

static ts_inverse_transform(series, params)[source]

The function that will be applied to each series when inverse_transform() is called.

The function must take as first argument a TimeSeries object and, as a second argument, a dictionary containing the fixed and/or fitted parameters of the transformation; this function should then return an inverse transformed TimeSeries object (i.e. ts_inverse_transform should ‘undo’ the transformation performed by ts_transform).

The params dictionary can contain up to two keys:

1. params[‘fixed’] stores the fixed parameters of the transformation (i.e. attributes defined in the __init__ method of the child-most class before super().__init__ is called); params[‘fixed’] is a dictionary itself, whose keys are the names of the fixed parameter attributes. For example, if _my_fixed_param is defined as an attribute in the child-most class, then this fixed parameter value can be accessed through params[‘fixed’][‘_my_fixed_param’].

2. If the transform inherits from the FittableDataTransformer class, then params[‘fitted’] will store the fitted parameters of the transformation; the fitted parameters are simply the output(s) returned by the ts_fit function, whatever those output(s) may be. See FittableDataTransformer for further details about fitted parameters.
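To make the layout of params concrete, here is a minimal sketch of a class whose static methods read their functions from params['fixed']. PointwiseSketch, its _params method, and the attribute names _fn and _inverse_fn are assumptions introduced for illustration; a plain list of floats stands in for a TimeSeries.

```python
import math


class PointwiseSketch:
    """Hypothetical stand-in for a transformer; not a darts class."""

    def __init__(self, fn, inverse_fn):
        self._fn = fn
        self._inverse_fn = inverse_fn

    def _params(self):
        # Assumed layout: fixed parameters keyed by attribute name.
        return {"fixed": {"_fn": self._fn, "_inverse_fn": self._inverse_fn}}

    @staticmethod
    def ts_transform(values, params):
        # Static: everything needed arrives through params, not through self.
        fn = params["fixed"]["_fn"]
        return [fn(v) for v in values]

    @staticmethod
    def ts_inverse_transform(values, params):
        inverse_fn = params["fixed"]["_inverse_fn"]
        return [inverse_fn(v) for v in values]


sketch = PointwiseSketch(math.log10, lambda y: 10.0 ** y)
params = sketch._params()
transformed = PointwiseSketch.ts_transform([1.0, 10.0, 100.0], params)
restored = PointwiseSketch.ts_inverse_transform(transformed, params)
```

Since this sketch has no fitting step, params contains only the 'fixed' key.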

Any positional/keyword arguments supplied to the inverse_transform method are passed as positional/keyword arguments to ts_inverse_transform; hence, ts_inverse_transform should also accept *args and/or **kwargs if positional/keyword arguments are passed to inverse_transform. Note that if the mask_components attribute of InvertibleDataTransformer is set to False, then the component_mask provided to inverse_transform will be passed as an additional keyword argument to ts_inverse_transform.

The BaseDataTransformer class, from which InvertibleDataTransformer inherits, includes some helper methods which may prove useful when implementing a ts_inverse_transform function:

1. The apply_component_mask and unapply_component_mask methods, which apply and ‘unapply’ a component_mask to a TimeSeries respectively; these methods are automatically called in inverse_transform if the mask_components attribute of InvertibleDataTransformer is set to True, but you may want to call them manually if you set mask_components to False and wish to specify yourself how a component_mask is applied to a TimeSeries.

2. The stack_samples method, which stacks all the samples in a TimeSeries so that its values go from shape (n_timesteps, n_components, n_samples) to shape (n_timesteps * n_samples, n_components). This stacking is useful if a pointwise inverse transform is being implemented (i.e. transforming the value at time t depends only on the value of the series at that time t). Once transformed, the stacked values can be ‘unstacked’ using the unstack_samples method.

This method is not implemented in the base class and must be implemented in the deriving classes.

Parameters
  • series (TimeSeries) – series to be transformed.

  • params (Mapping[str, Mapping[str, Union[Callable[[number], number], Callable[[Timestamp, number], number]]]]) – Dictionary containing the parameters of the transformation function. Fixed parameters (i.e. attributes defined in the child-most class of the transformation prior to calling super().__init__()) are stored under the ‘fixed’ key. If the transformation inherits from the FittableDataTransformer class, then the fitted parameters of the transformation (i.e. the values returned by ts_fit) are stored under the ‘fitted’ key.

  • args – Any additional positional arguments provided to inverse_transform.

  • kwargs – Any additional keyword arguments provided to inverse_transform. Note that if the mask_components attribute of InvertibleDataTransformer is set to False, then component_mask will be passed as a keyword argument.

Notes

This method is designed to be a static method rather than an instance method, to allow efficient parallelisation even when the transformer instance is storing a non-negligible amount of data. Using instance methods would imply copying the instance’s data across multiple processes, which can easily introduce a bottleneck and nullify the benefits of parallelisation.

Return type

TimeSeries

static ts_transform(series, params)[source]

The function that will be applied to each series when transform() is called.

This method is not implemented in the base class and must be implemented in the deriving classes.

The function must take as first argument a TimeSeries object and, as a second argument, a dictionary containing the fixed and/or fitted parameters of the transformation; this function should then return a transformed TimeSeries object.

The params dictionary can contain up to two keys:

1. params[‘fixed’] stores the fixed parameters of the transformation (i.e. attributes defined in the __init__ method of the child-most class before super().__init__ is called); params[‘fixed’] is a dictionary itself, whose keys are the names of the fixed parameter attributes. For example, if _my_fixed_param is defined as an attribute in the child-most class, then this fixed parameter value can be accessed through params[‘fixed’][‘_my_fixed_param’].

2. If the transform inherits from the FittableDataTransformer class, then params[‘fitted’] will store the fitted parameters of the transformation; the fitted parameters are simply the output(s) returned by the ts_fit function, whatever those output(s) may be. See FittableDataTransformer for further details about fitted parameters.

Any positional/keyword arguments supplied to the transform method are passed as positional/keyword arguments to ts_transform; hence, ts_transform should also accept *args and/or **kwargs if positional/keyword arguments are passed to transform. Note that if the mask_components attribute of BaseDataTransformer is set to False, then the component_mask provided to transform will be passed as an additional keyword argument to ts_transform.

The BaseDataTransformer class includes some helper methods which may prove useful when implementing a ts_transform function:

1. The apply_component_mask and unapply_component_mask methods, which apply and ‘unapply’ a component_mask to a TimeSeries respectively; these methods are automatically called in transform if the mask_components attribute of BaseDataTransformer is set to True, but you may want to call them manually if you set mask_components to False and wish to specify yourself how a component_mask is applied to a TimeSeries.

2. The stack_samples method, which stacks all the samples in a TimeSeries so that its values go from shape (n_timesteps, n_components, n_samples) to shape (n_timesteps * n_samples, n_components). This stacking is useful if a pointwise transform is being implemented (i.e. transforming the value at time t depends only on the value of the series at that time t). Once transformed, the stacked values can be ‘unstacked’ using the unstack_samples method.

Parameters
  • series (TimeSeries) – series to be transformed.

  • params (Mapping[str, Mapping[str, Union[Callable[[number], number], Callable[[Timestamp, number], number]]]]) – Dictionary containing the parameters of the transformation function. Fixed parameters (i.e. attributes defined in the child-most class of the transformation prior to calling super().__init__()) are stored under the ‘fixed’ key. If the transformation inherits from the FittableDataTransformer class, then the fitted parameters of the transformation (i.e. the values returned by ts_fit) are stored under the ‘fitted’ key.

  • args – Any positional arguments provided in addition to series when transform() is called.

  • kwargs – Any additional keyword arguments provided to transform. Note that if the mask_components attribute of BaseDataTransformer is set to False, then component_mask will be passed as a keyword argument.

Notes

This method is designed to be a static method rather than an instance method, to allow efficient parallelisation even when the transformer instance is storing a non-negligible amount of data. Using instance methods would imply copying the instance’s data across multiple processes, which can easily introduce a bottleneck and nullify the benefits of parallelisation.

Return type

TimeSeries

static unapply_component_mask(series, vals, component_mask=None)

Adds back components previously removed by component_mask in apply_component_mask method.

Parameters
  • series (TimeSeries) – input TimeSeries that was fed into transformer.

  • vals (Union[ndarray, TimeSeries]) – np.ndarray or TimeSeries to ‘unmask’

  • component_mask (Optional[ndarray]) – Optionally, np.ndarray boolean mask of shape (n_components, 1) specifying which components were extracted from series. If given, insert vals back into the columns of the original array. If not specified, nothing is ‘unmasked’.

Returns

TimeSeries (if vals is a TimeSeries) or np.ndarray (if vals is an np.ndarray) with those components previously removed by component_mask now ‘added back’.

Return type

unmasked

static unstack_samples(vals, n_timesteps=None, n_samples=None, series=None)

Reshapes the 2D array returned by stack_samples back into an array of shape (n_timesteps, n_components, n_samples); this ‘undoes’ the reshaping of stack_samples. Either n_timesteps, n_samples, or series must be specified.

Parameters
  • vals (ndarray) – np.ndarray of shape (n_timesteps * n_samples, n_components) to be ‘unstacked’.

  • n_timesteps (Optional[int]) – Optionally, the number of timesteps in the array originally passed to stack_samples. Does not need to be provided if series is specified.

  • n_samples (Optional[int]) – Optionally, the number of samples in the array originally passed to stack_samples. Does not need to be provided if series is specified.

  • series (Optional[TimeSeries]) – Optionally, the TimeSeries object used to create vals; n_samples is inferred from this.

Returns

np.ndarray of shape (n_timesteps, n_components, n_samples).

Return type

unstacked
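The stack, transform, unstack pattern can be sketched end to end with NumPy. Both helper functions below are hypothetical and only assume that unstacking is the exact inverse of the reshape used for stacking; they are not the darts implementations.

```python
import numpy as np


def stack_samples_sketch(vals):
    # (n_timesteps, n_components, n_samples) -> (n_timesteps * n_samples, n_components)
    n_timesteps, n_components, n_samples = vals.shape
    return np.swapaxes(vals, 1, 2).reshape(n_timesteps * n_samples, n_components)


def unstack_samples_sketch(stacked, n_timesteps, n_samples):
    # Inverse of stack_samples_sketch: split the rows, then restore the axis order.
    n_components = stacked.shape[-1]
    return np.swapaxes(stacked.reshape(n_timesteps, n_samples, n_components), 1, 2)


vals = np.arange(1, 25, dtype=float).reshape(4, 3, 2)
n_timesteps, n_components, n_samples = vals.shape

# Scale every component by its maximum over all timesteps and samples at once.
stacked = stack_samples_sketch(vals)
scaled = stacked / stacked.max(axis=0)
scaled_vals = unstack_samples_sketch(scaled, n_timesteps, n_samples)

# Without any transformation in between, unstacking recovers the original array.
round_trip = unstack_samples_sketch(stacked, n_timesteps, n_samples)
```

Because the scaling treats each column as a whole, the same factor is applied to every sample of a component, which is the vectorisation that stacking is meant to enable.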

class darts.dataprocessing.transformers.mappers.Mapper(fn, name='Mapper', n_jobs=1, verbose=False)[source]

Bases: BaseDataTransformer

Data transformer to apply a custom function to a (sequence of) TimeSeries (similar to calling TimeSeries.map() on each series).

The mapper takes care of parallelizing the operations on multiple series over multiple processors.

Parameters
  • fn (Union[Callable[[number], number], Callable[[Timestamp, number], number]]) – Either a function which takes a value and returns a value, i.e. f(x) = y, or a function which takes a timestamp and a value and returns a value, i.e. f(timestamp, x) = y.

  • name (str) – A specific name for the transformer.

  • n_jobs (int) – The number of jobs to run in parallel. Parallel jobs are created only when a Sequence[TimeSeries] is passed as input to a method, parallelising operations regarding different TimeSeries. Defaults to 1 (sequential). Setting the parameter to -1 means using all the available processors. Note: for a small amount of data, the parallelisation overhead could end up increasing the total required amount of time.

  • verbose (bool) – Optionally, whether to print operations progress
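The two accepted shapes of fn differ only in whether the timestamp is passed as the first argument. The sketch below shows one way a caller could distinguish them by counting parameters; map_values is a hypothetical helper, datetime stands in for Timestamp, and how darts itself inspects fn is not stated here, so treat the dispatch logic as an assumption.

```python
import inspect
from datetime import datetime


def map_values(fn, timestamps, values):
    # Apply fn to each value, passing the timestamp first when fn takes two arguments.
    n_params = len(inspect.signature(fn).parameters)
    if n_params == 1:
        return [fn(v) for v in values]
    if n_params == 2:
        return [fn(ts, v) for ts, v in zip(timestamps, values)]
    raise ValueError("fn must accept either (value) or (timestamp, value)")


timestamps = [datetime(2023, 1, 1), datetime(2023, 2, 1)]
values = [62.0, 56.0]

# Value-only function: f(x) = y.
doubled = map_values(lambda x: 2 * x, timestamps, values)

# Timestamp-aware function: f(timestamp, x) = y.
per_day = map_values(
    lambda ts, x: x / (31 if ts.month == 1 else 28), timestamps, values
)
```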

Examples

>>> import numpy as np
>>> from darts import TimeSeries
>>> from darts.dataprocessing.transformers import Mapper
>>> series = TimeSeries.from_values(np.array([1e0, 1e1, 1e2, 1e3]))
>>> transformer = Mapper(np.log10)
>>> series_transformed = transformer.transform(series)
>>> print(series_transformed)
<TimeSeries (DataArray) (time: 4, component: 1, sample: 1)>
array([[[0.]],
    [[1.]],
    [[2.]],
    [[3.]]])
Coordinates:
* time       (time) int64 0 1 2 3
* component  (component) <U1 '0'
Dimensions without coordinates: sample

Attributes

name

Name of the data transformer.

Methods

apply_component_mask(series[, ...])

Extracts components specified by component_mask from series

set_n_jobs(value)

Set the number of processors to be used by the transformer while processing multiple TimeSeries.

set_verbose(value)

Set the verbosity status.

stack_samples(vals)

Creates an array of shape (n_timesteps * n_samples, n_components) from either a TimeSeries or the array_values of a TimeSeries.

transform(series, *args[, component_mask])

Transforms a (sequence of) series by calling the user-implemented ts_transform method.

ts_transform(series, params)

The function that will be applied to each series when transform() is called.

unapply_component_mask(series, vals[, ...])

Adds back components previously removed by component_mask in apply_component_mask method.

unstack_samples(vals[, n_timesteps, ...])

Reshapes the 2D array returned by stack_samples back into an array of shape (n_timesteps, n_components, n_samples); this 'undoes' the reshaping of stack_samples.

static apply_component_mask(series, component_mask=None, return_ts=False)

Extracts components specified by component_mask from series

Parameters
  • series (TimeSeries) – input TimeSeries to be fed into transformer.

  • component_mask (Optional[ndarray]) – Optionally, np.ndarray boolean mask of shape (n_components, 1) specifying which components to extract from series. The i-th component of series is kept only if component_mask[i] = True. If not specified, no masking is performed.

  • return_ts – Optionally, specifies that a TimeSeries should be returned, rather than an np.ndarray.

Returns

TimeSeries (if return_ts = True) or np.ndarray (if return_ts = False) with only those components specified by component_mask remaining.

Return type

masked

property name

Name of the data transformer.

set_n_jobs(value)

Set the number of processors to be used by the transformer while processing multiple TimeSeries.

Parameters

value (int) – New n_jobs value. Set to -1 for using all the available cores.

set_verbose(value)

Set the verbosity status.

True enables a detailed report on the progress of the transformer’s operations; False prints no additional information.

Parameters

value (bool) – New verbosity status

static stack_samples(vals)

Creates an array of shape (n_timesteps * n_samples, n_components) from either a TimeSeries or the array_values of a TimeSeries.

Each column of the returned array corresponds to a component (dimension) of the series and is formed by concatenating all of the samples associated with that component together. More specifically, the i-th column is formed by concatenating [component_i_sample_1, component_i_sample_2, …, component_i_sample_n].

Stacking is useful when implementing a transformation that applies the exact same change to every timestep in the timeseries. In such cases, the samples of each component can be stacked together into a single column, and the transformation can then be applied to each column, thereby ‘vectorising’ the transformation over all samples of that component; the unstack_samples method can then be used to reshape the output. For transformations that depend on the time_index or the temporal ordering of the observations, stacking should not be employed.

Parameters

vals (Union[ndarray, TimeSeries]) – TimeSeries or np.ndarray of shape (n_timesteps, n_components, n_samples) to be ‘stacked’.

Returns

np.ndarray of shape (n_timesteps * n_samples, n_components), where the i-th column is formed by concatenating all of the samples of the i-th component in vals.

Return type

stacked

transform(series, *args, component_mask=None, **kwargs)

Transforms a (sequence of) series by calling the user-implemented ts_transform method.

In case a Sequence[TimeSeries] is passed as input data, this function takes care of parallelising the transformation of multiple series in the sequence at the same time. Additionally, if the mask_components attribute was set to True when instantiating BaseDataTransformer, then any provided component_mask will be automatically applied to each input TimeSeries; please refer to ‘Notes’ for further details on component masking.

Any additionally specified *args and **kwargs are automatically passed to ts_transform.

Parameters
  • series (Union[TimeSeries, Sequence[TimeSeries]]) – (sequence of) series to be transformed.

  • args – Additional positional arguments for each ts_transform() method call

  • component_mask (Optional[np.ndarray] = None) – Optionally, a 1-D boolean np.ndarray of length series.n_components that specifies which components of the underlying series the transform should consider. If the mask_components attribute was set to True when instantiating BaseDataTransformer, then the component mask will be automatically applied to each TimeSeries input. Otherwise, component_mask will be provided as an additional keyword argument to ts_transform. See ‘Notes’ for further details.

  • kwargs – Additional keyword arguments for each ts_transform() method call

Returns

Transformed data.

Return type

Union[TimeSeries, List[TimeSeries]]

Notes

If the mask_components attribute was set to True when instantiating BaseDataTransformer, then any provided component_mask will be automatically applied to each TimeSeries input to transform. A component_mask is simply a boolean array of shape (series.n_components,) that specifies which components of each series should be transformed using ts_transform and which should not. If component_mask[i] is True, then the i-th component of each series will be transformed by ts_transform. Conversely, if component_mask[i] is False, the i-th component will be removed from each series before it is passed to ts_transform; after transforming this masked series, the untransformed i-th component will be ‘added back’ to the output. Note that automatic component masking can only be performed if ts_transform does not change the number of timesteps in each series; otherwise, the transformed and untransformed components cannot be concatenated back together along the component axis.

If mask_components was set to False when instantiating BaseDataTransformer, then any provided component_mask will be passed as a keyword argument to ts_transform; the user can then manually specify how the component_mask should be applied to each series.

static ts_transform(series, params)[source]

The function that will be applied to each series when transform() is called.

This method is not implemented in the base class and must be implemented in the deriving classes.

The function must take as first argument a TimeSeries object and, as a second argument, a dictionary containing the fixed and/or fitted parameters of the transformation; this function should then return a transformed TimeSeries object.

The params dictionary can contain up to two keys:

1. params[‘fixed’] stores the fixed parameters of the transformation (i.e. attributes defined in the __init__ method of the child-most class before super().__init__ is called); params[‘fixed’] is a dictionary itself, whose keys are the names of the fixed parameter attributes. For example, if _my_fixed_param is defined as an attribute in the child-most class, then this fixed parameter value can be accessed through params[‘fixed’][‘_my_fixed_param’].

2. If the transform inherits from the FittableDataTransformer class, then params[‘fitted’] will store the fitted parameters of the transformation; the fitted parameters are simply the output(s) returned by the ts_fit function, whatever those output(s) may be. See FittableDataTransformer for further details about fitted parameters.

Any positional/keyword arguments supplied to the transform method are passed as positional/keyword arguments to ts_transform; hence, ts_transform should also accept *args and/or **kwargs if positional/keyword arguments are passed to transform. Note that if the mask_components attribute of BaseDataTransformer is set to False, then the component_mask provided to transform will be passed as an additional keyword argument to ts_transform.

The BaseDataTransformer class includes some helper methods which may prove useful when implementing a ts_transform function:

1. The apply_component_mask and unapply_component_mask methods, which apply and ‘unapply’ a component_mask to a TimeSeries respectively; these methods are automatically called in transform if the mask_components attribute of BaseDataTransformer is set to True, but you may want to call them manually if you set mask_components to False and wish to specify yourself how a component_mask is applied to a TimeSeries.

2. The stack_samples method, which stacks all the samples in a TimeSeries so that its values go from shape (n_timesteps, n_components, n_samples) to shape (n_timesteps * n_samples, n_components). This stacking is useful if a pointwise transform is being implemented (i.e. transforming the value at time t depends only on the value of the series at that time t). Once transformed, the stacked values can be ‘unstacked’ using the unstack_samples method.

Parameters
  • series (TimeSeries) – series to be transformed.

  • params (Mapping[str, Any]) – Dictionary containing the parameters of the transformation function. Fixed parameters (i.e. attributes defined in the child-most class of the transformation prior to calling super().__init__()) are stored under the ‘fixed’ key. If the transformation inherits from the FittableDataTransformer class, then the fitted parameters of the transformation (i.e. the values returned by ts_fit) are stored under the ‘fitted’ key.

  • args – Any positional arguments provided in addition to series when transform() is called.

  • kwargs – Any additional keyword arguments provided to transform. Note that if the mask_components attribute of BaseDataTransformer is set to False, then component_mask will be passed as a keyword argument.

Notes

This method is designed to be a static method rather than an instance method, to allow efficient parallelisation even when the transformer instance is storing a non-negligible amount of data. Using instance methods would imply copying the instance’s data across multiple processes, which can easily introduce a bottleneck and nullify the benefits of parallelisation.

Return type

TimeSeries

static unapply_component_mask(series, vals, component_mask=None)

Adds back components previously removed by component_mask in apply_component_mask method.

Parameters
  • series (TimeSeries) – input TimeSeries that was fed into transformer.

  • vals (Union[ndarray, TimeSeries]) – np.ndarray or TimeSeries to ‘unmask’

  • component_mask (Optional[ndarray]) – Optionally, np.ndarray boolean mask of shape (n_components, 1) specifying which components were extracted from series. If given, insert vals back into the columns of the original array. If not specified, nothing is ‘unmasked’.

Returns

TimeSeries (if vals is a TimeSeries) or np.ndarray (if vals is an np.ndarray) with those components previously removed by component_mask now ‘added back’.

Return type

unmasked

static unstack_samples(vals, n_timesteps=None, n_samples=None, series=None)

Reshapes the 2D array returned by stack_samples back into an array of shape (n_timesteps, n_components, n_samples); this ‘undoes’ the reshaping of stack_samples. Either n_timesteps, n_samples, or series must be specified.

Parameters
  • vals (ndarray) – np.ndarray of shape (n_timesteps * n_samples, n_components) to be ‘unstacked’.

  • n_timesteps (Optional[int]) – Optionally, the number of timesteps in the array originally passed to stack_samples. Does not need to be provided if series is specified.

  • n_samples (Optional[int]) – Optionally, the number of samples in the array originally passed to stack_samples. Does not need to be provided if series is specified.

  • series (Optional[TimeSeries]) – Optionally, the TimeSeries object used to create vals; n_samples is inferred from this.

Returns

np.ndarray of shape (n_timesteps, n_components, n_samples).

Return type

unstacked