Timeseries

TimeSeries is the Darts container for storing and handling time series data. It supports univariate or multivariate time series that can be deterministic or stochastic.

The values are stored in an array of shape (times, components, samples), where times is the number of time steps, components is the number of columns, and samples is the number of samples in the series.

Definitions:

  • A series with components = 1 is univariate, and a series with components > 1 is multivariate.

  • A series with samples = 1 is deterministic and a series with samples > 1 is stochastic (or probabilistic).

Each series also stores a time_index, which contains either datetimes (pandas.DatetimeIndex) or integer indices (pandas.RangeIndex).

Optionally, TimeSeries can store static covariates, a hierarchy, and / or metadata.

  • Static covariates are time-invariant external data / information about the series and can be used by some models to help improve predictions. Find more info on covariates here.

  • A hierarchy describes the hierarchical structure of the components which can be used to reconcile forecasts. For more info, see hierarchical reconciliation.

  • Metadata can be used to store any additional information about the series which will not be used by any model.

TimeSeries are guaranteed to:

  • Have a strictly monotonically increasing time index with a well-defined frequency (without holes / missing dates). For more info on available DatetimeIndex frequencies, see date offset aliases. For integer-indexed series the frequency corresponds to the constant step size between consecutive indices.

  • Contain numeric data types only

  • Have unique component / column names

  • Have static covariates consistent with their components (global or component-specific), or no static covariates

  • Have a hierarchy consistent with their components, or no hierarchy
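To illustrate what a "well-defined frequency without holes" means for a datetime index, here is a small pandas-only sketch. It is not the check Darts performs internally; it only shows that a daily index with one missing date no longer has an inferable frequency.

```python
import pandas as pd

# A complete daily index: frequency can be inferred.
complete = pd.date_range("2020-01-01", periods=5, freq="D")

# The same index with 2020-01-03 removed: there is now a hole.
with_hole = complete.delete(2)

freq_complete = pd.infer_freq(complete)
freq_with_hole = pd.infer_freq(with_hole)

# The index is still strictly increasing and unique, but has no single step size.
still_increasing = with_hole.is_monotonic_increasing and with_hole.is_unique

# Rebuilding the full range shows which dates would need to be filled.
full_range = pd.date_range(with_hole[0], with_hole[-1], freq="D")
missing = full_range.difference(with_hole)
print(freq_complete, freq_with_hole, list(missing))
```

An index like with_hole is the situation that fill_missing_dates (together with freq) is meant to address.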

class darts.timeseries.TimeSeries(times, values, fill_missing_dates=False, freq=None, components=None, fillna_value=None, static_covariates=None, hierarchy=None, metadata=None, copy=True)[source]

Bases: object

Create a TimeSeries from a time index times and values values.

See also

TimeSeries.from_dataframe

Create from a DataFrame (pandas.DataFrame, polars.DataFrame, and other backends).

TimeSeries.from_group_dataframe

Create multiple TimeSeries by groups from a pandas.DataFrame.

TimeSeries.from_series

Create from a Series (pandas.Series, polars.Series, and other backends).

TimeSeries.from_values

Create from a numpy.ndarray.

TimeSeries.from_times_and_values

Create from a time index and a numpy.ndarray.

TimeSeries.from_csv

Create from a CSV file.

TimeSeries.from_json

Create from a JSON file.

TimeSeries.from_xarray

Create from an xarray.DataArray.

Parameters
  • times (Union[DatetimeIndex, RangeIndex, Index]) – A pandas DatetimeIndex, RangeIndex, or Index that can be converted to a RangeIndex representing the time axis for the time series. It is better if the index has no holes; alternatively setting fill_missing_dates can in some cases solve these issues (filling holes with NaN, or with the provided fillna_value numeric value, if any).

  • values (ndarray) – A Numpy array of values for the TimeSeries. Both 2-dimensional arrays, for deterministic series, and 3-dimensional arrays, for probabilistic series, are accepted. In the former case the dimensions should be (time, component), and in the latter case (time, component, sample).

  • fill_missing_dates (Optional[bool, None]) – Optionally, a boolean value indicating whether to fill missing dates (or indices in case of integer index) with NaN values. This requires either a provided freq or the possibility to infer the frequency from the provided timestamps. See _fill_missing_dates() for more info.

  • freq (Union[str, int, None]) – Optionally, a string or integer representing the frequency of the underlying index. This is useful in order to fill in missing values if some dates are missing and fill_missing_dates is set to True. If a string, represents the frequency of the pandas DatetimeIndex (see offset aliases for more info on supported frequencies). If an integer, represents the step size of the pandas Index or pandas RangeIndex.

  • components (Union[ForwardRef, ndarray, ForwardRef, ForwardRef, SequenceNotStr, range, None]) – Optionally, some column names to use for the second values dimension.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

  • static_covariates (Union[Series, DataFrame, None]) – Optionally, a set of static covariates to be added to the TimeSeries. Either a pandas Series or a pandas DataFrame. If a Series, the index represents the static variables. The covariates are globally ‘applied’ to all components of the TimeSeries. If a DataFrame, the columns represent the static variables and the rows represent the components of the uni/multivariate TimeSeries. If a single-row DataFrame, the covariates are globally ‘applied’ to all components of the TimeSeries. If a multi-row DataFrame, the number of rows must match the number of components of the TimeSeries (in this case, the number of columns in values). This adds control for component-specific static covariates.

  • hierarchy (Optional[dict, None]) –

    Optionally, a dictionary describing the grouping(s) of the time series. The keys are component names, and for a given component name c, the value is a list of component names that c “belongs” to. For instance, if there is a total component, split both in two divisions d1 and d2 and in two regions r1 and r2, and four products d1r1 (in division d1 and region r1), d2r1, d1r2 and d2r2, the hierarchy would be encoded as follows.

    hierarchy={
        "d1r1": ["d1", "r1"],
        "d1r2": ["d1", "r2"],
        "d2r1": ["d2", "r1"],
        "d2r2": ["d2", "r2"],
        "d1": ["total"],
        "d2": ["total"],
        "r1": ["total"],
        "r2": ["total"]
    }
    

    The hierarchy can be used to reconcile forecasts (so that the sums of the forecasts at different levels are consistent), see hierarchical reconciliation.

  • metadata (Optional[dict, None]) – Optionally, a dictionary with metadata to be added to the TimeSeries.

  • copy (bool) – Whether to copy the times and values objects. If copy=False, mutating the series data will affect the original data. Additionally, if times lack a frequency or step size, it will be assigned to the original object.

Returns

The resulting series.

Return type

TimeSeries

Examples

>>> import numpy as np
>>> from darts import TimeSeries
>>> from darts.utils.utils import generate_index
>>> # create values and times with daily frequency
>>> vals, times = np.arange(3), generate_index("2020-01-01", length=3, freq="D")
>>> series = TimeSeries(times=times, values=vals)
>>> series.shape
(3, 1, 1)

Attributes

bottom_level_components

The bottom level component names of this series, or None if the series has no hierarchy.

bottom_level_series

The series containing the bottom-level components of this series in the same order as they appear in the series, or None if the series has no hierarchy.

columns

The component (column) names of the series, as a pandas.Index.

components

The component (column) names of the series, as a pandas.Index.

dtype

The dtype of the series' values.

duration

The duration of the series (as a pandas.Timedelta or int).

freq

The frequency of the series.

freq_str

The string representation of the series' frequency.

has_datetime_index

Whether the series is indexed with a pandas.DatetimeIndex (otherwise it is indexed with a pandas.RangeIndex).

has_hierarchy

Whether the series contains a hierarchy.

has_metadata

Whether the series contains metadata.

has_range_index

Whether the series is indexed with a pandas.RangeIndex (otherwise it is indexed with a pandas.DatetimeIndex).

has_static_covariates

Whether the series contains static covariates.

hierarchy

The hierarchy of this series.

is_deterministic

Whether the series is deterministic.

is_probabilistic

Whether the series is stochastic (probabilistic).

is_stochastic

Whether the series is stochastic (probabilistic).

is_univariate

Whether the series is univariate.

metadata

The metadata of this series.

n_components

The number of components (columns) in the series.

n_samples

The number of samples contained in the series.

n_timesteps

The number of time steps in the series.

shape

The shape of the series (n_timesteps, n_components, n_samples).

static_covariates

The static covariates of this series.

time_dim

The time dimension name of the series.

time_index

The time index of the series.

top_level_component

The top level component name of this series, or None if the series has no hierarchy.

top_level_series

The univariate series containing the single top-level component of this series, or None if the series has no hierarchy.

width

The width (number of components) of the series.

Methods

add_datetime_attribute(attribute[, one_hot, ...])

Return a new series with one (or more) additional component(s) that contain an attribute of the series' time index.

add_holidays(country_code[, prov, state, tz])

Return a new series with an added holiday component.

all_values([copy])

Return a 3-D array of dimension (time, component, sample) containing the series' values for all samples.

append(other)

Return a new series with the other series appended to this series along the time axis (added to the end).

append_values(values)

Return a new series with values appended to this series along the time axis (added to the end).

astype(dtype)

Return a new series with the values converted to the desired dtype.

concatenate(other[, axis, ignore_time_axis, ...])

Return a new series where this series is concatenated with the other series along a given axis.

copy()

Create a copy of the series.

cumsum()

Return a new series with the cumulative sum along the time axis.

data_array([copy])

Return an xarray.DataArray representation of the series.

diff([n, periods, dropna])

Return a new series with differenced values.

drop_after(split_point)

Return a new series where everything after and including the provided time split_point was dropped.

drop_before(split_point)

Return a new series where everything before and including the provided time split_point was dropped.

drop_columns(col_names)

Return a new series with dropped components (columns).

end_time()

End time of the series.

first_value()

First value of the univariate series.

first_values()

First values of the potentially multivariate series.

from_csv(filepath_or_buffer[, time_col, ...])

Create a TimeSeries from a CSV file.

from_dataframe(df[, time_col, value_cols, ...])

Create a TimeSeries from a selection of columns of a DataFrame.

from_group_dataframe(df, group_cols[, ...])

Create a list of TimeSeries grouped by a selection of columns from a DataFrame.

from_json(json_str[, static_covariates, ...])

Create a TimeSeries from the JSON String representation of a TimeSeries.

from_pickle(path)

Read a pickled TimeSeries.

from_series(pd_series[, fill_missing_dates, ...])

Create a TimeSeries from a Series.

from_times_and_values(times, values[, ...])

Create a TimeSeries from a time index and value array.

from_values(values[, columns, fillna_value, ...])

Create a TimeSeries from an array of values.

from_xarray(xa[, fill_missing_dates, freq, ...])

Create a TimeSeries from an xarray.DataArray.

gaps([mode])

Compute and return gaps in the series.

get_index_at_point(point[, after])

Convert a point along the time index into an integer index ranging from (0, len(series)-1) inclusive.

get_timestamp_at_point(point)

Convert a point into a pandas.Timestamp (if datetime-indexed) or integer (if integer-indexed).

has_same_time_as(other)

Whether the series has the same time index as the other series.

head([size, axis])

Return a new series with the first size points.

is_within_range(ts)

Whether the given timestamp or integer is within the time interval of the series.

kurtosis(**kwargs)

Return a deterministic series with the kurtosis of each component computed over the samples of the stochastic series.

last_value()

Last value of the univariate series.

last_values()

Last values of the potentially multivariate series.

longest_contiguous_slice([max_gap_size, mode])

Return the largest slice of the deterministic series without any gaps (contiguous all-NaN value entries) larger than max_gap_size.

map(fn)

Return a new series with the function fn applied to the values of this series.

max([axis])

Return a new series with the maximum computed over the specified axis.

mean([axis])

Return a new series with the mean computed over the specified axis.

median([axis])

Return a new series with the median computed over the specified axis.

min([axis])

Return a new series with the minimum computed over the specified axis.

plot([new_plot, central_quantile, ...])

Plot the series.

prepend(other)

Return a new series with the other series prepended to this series along the time axis (added to the beginning).

prepend_values(values)

Return a new series with values prepended to this series along the time axis (added to the beginning).

quantile([q])

Return a deterministic series with the desired quantile(s) q of each component computed over the samples of the stochastic series.

random_component_values([copy])

Return a 2-D array of shape (time, component), containing the series' values for one sample taken uniformly at random from all samples.

resample(freq[, method, method_kwargs])

Return a new series where the time index and values were resampled with a given frequency.

rescale_with_value(value_at_first_step)

Return a new series, which is a multiple of this series such that the first value is value_at_first_step.

schema([copy])

Return the schema of the series as a dictionary.

shift(n)

Return a new series where the time index was shifted by n steps.

skew(**kwargs)

Return a deterministic series with the skew of each component computed over the samples of the stochastic series.

slice(start_ts, end_ts)

Return a slice of the series starting at start_ts and ending before end_ts.

slice_intersect(other)

Return a slice of the series where the time index was intersected with the other series.

slice_intersect_times(other[, copy])

Return the time index of the series where the time index was intersected with the other series.

slice_intersect_values(other[, copy])

Return the sliced values of the series where the time index was intersected with the other series.

slice_n_points_after(start_ts, n)

Return a slice of the series starting at start_ts (inclusive) and having at most n points.

slice_n_points_before(end_ts, n)

Return a slice of the series ending at end_ts (inclusive) and having at most n points.

split_after(split_point)

Split the series in two, after a provided split_point.

split_before(split_point)

Split the series in two, before a provided split_point.

stack(other)

Return a new series with the other series stacked to this series along the component axis.

start_time()

Start time of the series.

static_covariates_values([copy])

Return a 2-D array of dimension (component, static variable) containing the series' static covariate values.

std([ddof])

Return a deterministic series with the standard deviation of each component computed over the samples of the stochastic series.

strip([how])

Return a slice of the deterministic time series where NaN-containing entries at the beginning and the end were removed.

sum([axis])

Return a new series with the sum computed over the specified axis.

tail([size, axis])

Return a new series with the last size points.

to_csv(*args, **kwargs)

Write the deterministic series to a CSV file.

to_dataframe([copy, backend, time_as_index, ...])

Return a DataFrame representation of the series in a given backend.

to_json()

Return a JSON string representation of the deterministic series.

to_pickle(path[, protocol])

Save the series in pickle format.

to_series([copy, backend])

Return a Series representation of the series in a given backend.

univariate_component(index)

Return a new univariate series with a selected component.

univariate_values([copy, sample])

Return a 1-D Numpy array of shape (time,) containing the univariate series' values for one sample.

values([copy, sample])

Return a 2-D array of shape (time, component), containing the series' values for one sample.

var([ddof])

Return a deterministic series with the variance of each component computed over the samples of the stochastic series.

window_transform(transforms[, treat_na, ...])

Return a new series with the specified window transformations applied.

with_columns_renamed(col_names, col_names_new)

Return a new series with new columns/components names.

with_hierarchy(hierarchy)

Return a new series with added hierarchy.

with_metadata(metadata)

Return a new series with added metadata.

with_static_covariates(covariates)

Return a new series with added static covariates.

with_times_and_values(times, values[, ...])

Return a new series similar to this one but with new times and values.

with_values(values)

Return a new series similar to this one but with new values.

add_datetime_attribute(attribute, one_hot=False, cyclic=False, tz=None)[source]

Return a new series with one (or more) additional component(s) that contain an attribute of the series’ time index.

The additional components are specified with attribute, such as ‘weekday’, ‘day’ or ‘month’.

This works only for deterministic time series (i.e., made of 1 sample).

Notes

0-indexing is enforced across all the encodings, see datetime_attribute_timeseries() for more information.

Parameters
  • attribute – A pandas.DatetimeIndex attribute which will serve as the basis of the new column(s).

  • one_hot (bool) – Boolean value indicating whether to add the specified attribute as a one hot encoding (results in more columns).

  • cyclic (bool) – Boolean value indicating whether to add the specified attribute as a cyclic encoding (adds 2 columns, corresponding to the sin and cos transformations). This is an alternative to one_hot encoding; enable only one of the two.

  • tz (Optional[str, None]) – Optionally, a time zone to convert the time index to before computing the attributes.

Returns

A new series with an added datetime attribute component(s).

Return type

TimeSeries

add_holidays(country_code, prov=None, state=None, tz=None)[source]

Return a new series with an added holiday component.

The holiday component is binary where 1 corresponds to a time step falling on a holiday.

Available countries can be found here.

This works only for deterministic time series (i.e., made of 1 sample).

Parameters
  • country_code (str) – The country ISO code

  • prov (Optional[str, None]) – The province

  • state (Optional[str, None]) – The state

  • tz (Optional[str, None]) – Optionally, a time zone to convert the time index to before computing the attributes.

Returns

A new series with an added holiday component.

Return type

TimeSeries

all_values(copy=True)[source]

Return a 3-D array of dimension (time, component, sample) containing the series’ values for all samples.

Parameters

copy (bool) – Whether to return a copy of the values, otherwise returns a view. Leave it to True unless you know what you are doing.

Returns

The values composing the time series.

Return type

numpy.ndarray

append(other)[source]

Return a new series with the other series appended to this series along the time axis (added to the end).

Parameters

other (Self) – A second TimeSeries.

Returns

A new series, obtained by appending the second series to the first.

Return type

TimeSeries

See also

TimeSeries.concatenate

concatenate another series along a given axis.

TimeSeries.prepend

prepend another series along the time axis.

append_values(values)[source]

Return a new series with values appended to this series along the time axis (added to the end).

This adds time steps to the end of the new series.

Parameters

values (ndarray) – An array with the values to append.

Returns

A new series with the new values appended.

Return type

TimeSeries

See also

TimeSeries.prepend_values

prepend the values of another series along the time axis.

astype(dtype)[source]

Return a new series with the values converted to the desired dtype.

Parameters

dtype (Union[str, dtype]) – A NumPy dtype (numpy.float32 or numpy.float64)

Returns

A series having the desired dtype.

Return type

TimeSeries

property bottom_level_components: Optional[list[str]]

The bottom level component names of this series, or None if the series has no hierarchy.

Return type

Optional[list[str], None]

property bottom_level_series: Optional[list[typing_extensions.Self]]

The series containing the bottom-level components of this series in the same order as they appear in the series, or None if the series has no hierarchy.

Return type

Optional[list[Self], None]

property columns: Index

The component (column) names of the series, as a pandas.Index.

Return type

Index

property components: Index

The component (column) names of the series, as a pandas.Index.

Return type

Index

concatenate(other, axis=0, ignore_time_axis=False, ignore_static_covariates=False, drop_hierarchy=True, drop_metadata=False)[source]

Return a new series where this series is concatenated with the other series along a given axis.

Parameters
  • other (TimeSeries) – another timeseries to concatenate to this one

  • axis (str or int) – axis along which timeseries will be concatenated. [‘time’, ‘component’ or ‘sample’; Default: 0 (time)]

  • ignore_time_axis (bool, default False) – Ignore errors when time axis varies for some timeseries. Note that this may yield unexpected results

  • ignore_static_covariates (bool) – whether to ignore all requirements for static covariate concatenation and only transfer the static covariates of the current (self) timeseries to the concatenated timeseries. Only effective when axis=1.

  • drop_hierarchy (bool) – When axis=1, whether to drop hierarchy information. True by default. When False, the hierarchies will be “concatenated” as well (by merging the hierarchy dictionaries), which may cause issues if the component names of the resulting series and that of the merged hierarchy do not match. When axis=0 or axis=2, the hierarchy of the first series is always kept.

  • drop_metadata (bool) – Whether to drop the metadata information of the concatenated timeseries. False by default. When False, the concatenated series will inherit the metadata from the current (self) timeseries.

Returns

The concatenated series.

Return type

TimeSeries

See also

concatenate

a function to concatenate multiple series along a given axis.

Notes

When concatenating along the time dimension, the current series marks the start date of the resulting series, and the other series will have its time index ignored.

copy()[source]

Create a copy of the series.

Returns

A copy of the series.

Return type

TimeSeries

cumsum()[source]

Return a new series with the cumulative sum along the time axis.

Returns

A new series, with the cumulatively summed values.

Return type

TimeSeries

data_array(copy=True)[source]

Return an xarray.DataArray representation of the series.

Parameters

copy (bool) – Whether to return a copy of the series. Leave it to True unless you know what you are doing.

Returns

An xarray.DataArray representation of the time series.

Return type

xarray.DataArray

diff(n=1, periods=1, dropna=True)[source]

Return a new series with differenced values.

This is often used to make a time series stationary.

Parameters
  • n (Optional[int, None]) – Optionally, a positive integer indicating the number of differencing steps (default = 1). For instance, n=2 computes the second order differences.

  • periods (Optional[int, None]) – Optionally, periods to shift for calculating difference. For instance, periods=12 computes the difference between values at time t and time t-12.

  • dropna (Optional[bool, None]) – Whether to drop the missing values after each differencing step. If set to False, the corresponding first periods time steps will be filled with NaNs.

Returns

A new series, with the differenced values.

Return type

TimeSeries

drop_after(split_point)[source]

Return a new series where everything after and including the provided time split_point was dropped.

The timestamp may not be in the series. If it is, the timestamp will be dropped.

Parameters

split_point (Union[Timestamp, float, int]) – The timestamp that indicates cut-off time.

Returns

A series that contains all entries until split_point (exclusive).

Return type

TimeSeries

drop_before(split_point)[source]

Return a new series where everything before and including the provided time split_point was dropped.

The timestamp may not be in the series. If it is, the timestamp will be dropped.

Parameters

split_point (Union[Timestamp, float, int]) – The timestamp that indicates cut-off time.

Returns

A series that contains all entries starting after split_point (exclusive).

Return type

TimeSeries

drop_columns(col_names)[source]

Return a new series with dropped components (columns).

Parameters

col_names (Union[list[str], str]) – String or list of strings corresponding to the columns to be dropped.

Returns

A new series with the specified columns dropped.

Return type

TimeSeries

property dtype

The dtype of the series’ values.

property duration: Union[Timedelta, int]

The duration of the series (as a pandas.Timedelta or int).

Return type

Union[Timedelta, int]

end_time()[source]

End time of the series.

Returns

A timestamp containing the last time of the TimeSeries (if indexed by DatetimeIndex), or an integer (if indexed by RangeIndex)

Return type

Union[pandas.Timestamp, int]

first_value()[source]

First value of the univariate series.

Returns

The first value of this univariate deterministic time series

Return type

float

first_values()[source]

First values of the potentially multivariate series.

Returns

The first values of every component of this deterministic time series

Return type

numpy.ndarray

property freq: Union[DateOffset, int]

The frequency of the series.

A pandas.DateOffset if the series is indexed with a pandas.DatetimeIndex. An integer (step size) if the series is indexed with a pandas.RangeIndex.

Return type

Union[DateOffset, int]

property freq_str: str

The string representation of the series’ frequency.

Return type

str

classmethod from_csv(filepath_or_buffer, time_col=None, value_cols=None, fill_missing_dates=False, freq=None, fillna_value=None, static_covariates=None, hierarchy=None, metadata=None, **kwargs)[source]

Create a TimeSeries from a CSV file.

One column can be used to represent the time (if not present, the time index will be a RangeIndex) and a list of columns value_cols can be used to indicate the values for this time series.

Parameters
  • filepath_or_buffer – The path to the CSV file, or the file object; consistent with the argument of pandas.read_csv function

  • time_col (Optional[str, None]) – The time column name. If set, the column will be cast to a pandas DatetimeIndex (if it contains timestamps) or a RangeIndex (if it contains integers). If not set, the pandas RangeIndex will be used.

  • value_cols (Union[str, list[str], None]) – A string or list of strings representing the value column(s) to be extracted from the CSV file. If set to None, all columns from the CSV file will be used (except for the time_col, if specified)

  • fill_missing_dates (Optional[bool, None]) – Optionally, a boolean value indicating whether to fill missing dates (or indices in case of integer index) with NaN values. This requires either a provided freq or the possibility to infer the frequency from the provided timestamps. See _fill_missing_dates() for more info.

  • freq (Union[str, int, None]) –

    Optionally, a string or integer representing the frequency of the underlying index. This is useful in order to fill in missing values if some dates are missing and fill_missing_dates is set to True. If a string, represents the frequency of the pandas DatetimeIndex (see offset aliases for more info on supported frequencies). If an integer, represents the step size of the pandas Index or pandas RangeIndex.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

  • static_covariates (Union[Series, DataFrame, None]) – Optionally, a set of static covariates to be added to the TimeSeries. Either a pandas Series or a pandas DataFrame. If a Series, the index represents the static variables. The covariates are globally ‘applied’ to all components of the TimeSeries. If a DataFrame, the columns represent the static variables and the rows represent the components of the uni/multivariate TimeSeries. If a single-row DataFrame, the covariates are globally ‘applied’ to all components of the TimeSeries. If a multi-row DataFrame, the number of rows must match the number of components of the TimeSeries (in this case, the number of columns in the CSV file). This adds control for component-specific static covariates.

  • hierarchy (Optional[dict, None]) –

    Optionally, a dictionary describing the grouping(s) of the time series. The keys are component names, and for a given component name c, the value is a list of component names that c “belongs” to. For instance, if there is a total component, split both in two divisions d1 and d2 and in two regions r1 and r2, and four products d1r1 (in division d1 and region r1), d2r1, d1r2 and d2r2, the hierarchy would be encoded as follows.

    hierarchy={
        "d1r1": ["d1", "r1"],
        "d1r2": ["d1", "r2"],
        "d2r1": ["d2", "r1"],
        "d2r2": ["d2", "r2"],
        "d1": ["total"],
        "d2": ["total"],
        "r1": ["total"],
        "r2": ["total"]
    }
    

    The hierarchy can be used to reconcile forecasts (so that the sums of the forecasts at different levels are consistent), see hierarchical reconciliation.

  • metadata (Optional[dict, None]) – Optionally, a dictionary with metadata to be added to the TimeSeries.

  • **kwargs – Optional arguments to be passed to pandas.read_csv function

Returns

The resulting series.

Return type

TimeSeries

Examples

>>> from darts import TimeSeries
>>> TimeSeries.from_csv("data.csv", time_col="time")

classmethod from_dataframe(df, time_col=None, value_cols=None, fill_missing_dates=False, freq=None, fillna_value=None, static_covariates=None, hierarchy=None, metadata=None, copy=True)[source]

Create a TimeSeries from a selection of columns of a DataFrame.

One column (or the DataFrame index) has to represent the time, and a list of columns value_cols has to represent the values for this time series.

Parameters
  • df (Union[ForwardRef, ForwardRef]) – The DataFrame, or anything which can be converted to a narwhals DataFrame (e.g. pandas.DataFrame, polars.DataFrame, …). See the narwhals documentation for more information.

  • time_col (Optional[str, None]) – The time column name. If set, the column will be cast to a pandas DatetimeIndex (if it contains timestamps) or a RangeIndex (if it contains integers). If not set, the DataFrame index will be used. In this case the DataFrame must contain an index that is either a pandas DatetimeIndex, a pandas RangeIndex, or a pandas Index that can be converted to a RangeIndex. It is better if the index has no holes; alternatively setting fill_missing_dates can in some cases solve these issues (filling holes with NaN, or with the provided fillna_value numeric value, if any).

  • value_cols (Union[str, list[str], None]) – A string or list of strings representing the value column(s) to be extracted from the DataFrame. If set to None, the whole DataFrame will be used.

  • fill_missing_dates (Optional[bool, None]) – Optionally, a boolean value indicating whether to fill missing dates (or indices in case of integer index) with NaN values. This requires either a provided freq or the possibility to infer the frequency from the provided timestamps. See _fill_missing_dates() for more info.

  • freq (Union[str, int, None]) –

    Optionally, a string or integer representing the frequency of the underlying index. This is useful in order to fill in missing values if some dates are missing and fill_missing_dates is set to True. If a string, represents the frequency of the pandas DatetimeIndex (see offset aliases for more info on supported frequencies). If an integer, represents the step size of the pandas Index or pandas RangeIndex.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

  • static_covariates (Union[Series, DataFrame, None]) – Optionally, a set of static covariates to be added to the TimeSeries. Either a pandas Series or a pandas DataFrame. If a Series, the index represents the static variables. The covariates are globally ‘applied’ to all components of the TimeSeries. If a DataFrame, the columns represent the static variables and the rows represent the components of the uni/multivariate TimeSeries. If a single-row DataFrame, the covariates are globally ‘applied’ to all components of the TimeSeries. If a multi-row DataFrame, the number of rows must match the number of components of the TimeSeries (in this case, the number of columns in value_cols). This adds control for component-specific static covariates.

  • hierarchy (Optional[dict, None]) –

    Optionally, a dictionary describing the grouping(s) of the time series. The keys are component names, and for a given component name c, the value is a list of component names that c “belongs” to. For instance, if there is a total component, split both in two divisions d1 and d2 and in two regions r1 and r2, and four products d1r1 (in division d1 and region r1), d2r1, d1r2 and d2r2, the hierarchy would be encoded as follows.

    hierarchy={
        "d1r1": ["d1", "r1"],
        "d1r2": ["d1", "r2"],
        "d2r1": ["d2", "r1"],
        "d2r2": ["d2", "r2"],
        "d1": ["total"],
        "d2": ["total"],
        "r1": ["total"],
        "r2": ["total"]
    }
    

    The hierarchy can be used to reconcile forecasts (so that the sums of the forecasts at different levels are consistent), see hierarchical reconciliation.

  • metadata (Optional[dict, None]) – Optionally, a dictionary with metadata to be added to the TimeSeries.

  • copy (bool) – Whether to copy the times (DataFrame index or the time_col column) and DataFrame values. If copy=False, mutating the series data will affect the original data. Additionally, if times lack a frequency or step size, it will be assigned to the original object.

Returns

The resulting series.

Return type

TimeSeries

Examples

>>> import pandas as pd
>>> from darts import TimeSeries
>>> from darts.utils.utils import generate_index
>>> # create values and times with daily frequency
>>> data = {"vals": range(3), "time": generate_index("2020-01-01", length=3, freq="D")}
>>> # create from `pandas.DataFrame`
>>> df = pd.DataFrame(data)
>>> series = TimeSeries.from_dataframe(df, time_col="time")
>>> # shape (n time steps, n components, n samples)
>>> series.shape
(3, 1, 1)
>>> # or from `polars.DataFrame` (make sure Polars is installed)
>>> import polars as pl
>>> df = pl.DataFrame(data)
>>> series = TimeSeries.from_dataframe(df, time_col="time")
>>> series.shape
(3, 1, 1)
classmethod from_group_dataframe(df, group_cols, time_col=None, value_cols=None, static_cols=None, metadata_cols=None, fill_missing_dates=False, freq=None, fillna_value=None, drop_group_cols=None, n_jobs=1, verbose=False, copy=True)[source]

Create a list of TimeSeries grouped by a selection of columns from a DataFrame.

One column (or the DataFrame index) has to represent the time, a list of columns group_cols must be used for extracting the individual TimeSeries by groups, and a list of columns value_cols has to represent the values for the individual time series. Values from columns group_cols and static_cols are added as static covariates to the resulting TimeSeries objects. These can be viewed with my_series.static_covariates. Unlike group_cols, static_cols only add static values and are not used to extract the TimeSeries groups.

Parameters
  • df (DataFrame) – The DataFrame

  • group_cols (Union[list[str], str]) – A string or list of strings representing the columns from the DataFrame by which to extract the individual TimeSeries groups.

  • time_col (Optional[str, None]) – The time column name. If set, the column will be cast to a pandas DatetimeIndex (if it contains timestamps) or a RangeIndex (if it contains integers). If not set, the DataFrame index will be used. In this case the DataFrame must contain an index that is either a pandas DatetimeIndex, a pandas RangeIndex, or a pandas Index that can be converted to a RangeIndex. Be aware that the index must represent the actual index of each individual time series group (it can contain non-unique values). It is better if the index has no holes; alternatively setting fill_missing_dates can in some cases solve these issues (filling holes with NaN, or with the provided fillna_value numeric value, if any).

  • value_cols (Union[str, list[str], None]) – A string or list of strings representing the value column(s) to be extracted from the DataFrame. If set to None, the whole DataFrame will be used.

  • static_cols (Union[str, list[str], None]) – A string or list of strings representing static variable columns from the DataFrame that should be appended as static covariates to the resulting TimeSeries groups. Unlike group_cols, the DataFrame is not grouped by these columns. The first encountered value per group and column is used (this assumes that there is only one unique value). Static covariates can be used as input features by all Darts models that support them.

  • metadata_cols (Union[str, list[str], None]) – Same as static_cols but appended as metadata to the resulting TimeSeries groups. Metadata will never be used by the underlying Darts models.

  • fill_missing_dates (Optional[bool, None]) – Optionally, a boolean value indicating whether to fill missing dates (or indices in case of integer index) with NaN values. This requires either a provided freq or the possibility to infer the frequency from the provided timestamps. See _fill_missing_dates() for more info.

  • freq (Union[str, int, None]) –

    Optionally, a string or integer representing the frequency of the underlying index. This is useful in order to fill in missing values if some dates are missing and fill_missing_dates is set to True. If a string, represents the frequency of the pandas DatetimeIndex (see offset aliases for more info on supported frequencies). If an integer, represents the step size of the pandas Index or pandas RangeIndex.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

  • drop_group_cols (Union[str, list[str], None]) – Optionally, a string or list of strings with group_cols column(s) to exclude from the static covariates.

  • n_jobs (Optional[int, None]) – Optionally, an integer representing the number of parallel jobs to run. Behavior is the same as in the joblib.Parallel class.

  • verbose (Optional[bool, None]) – Optionally, a boolean value indicating whether to display a progress bar.

  • copy (bool) – Whether to copy the times (DataFrame index or the time_col column) and DataFrame values. If copy=False, mutating the series data will affect the original data. Additionally, if times lack a frequency or step size, it will be assigned to the original object.

Returns

A list of series, where each series represents one group from the DataFrame.

Return type

List[TimeSeries]

Examples

>>> import pandas as pd
>>> from darts import TimeSeries
>>> from darts.utils.utils import generate_index
>>>
>>> # create a DataFrame with two series that have different ids,
>>> # values, and frequencies
>>> df_1 = pd.DataFrame({
>>>     "ID": [0] * 3,
>>>     "vals": range(3),
>>>     "time": generate_index("2020-01-01", length=3, freq="D")}
>>> )
>>> df_2 = pd.DataFrame({
>>>     "ID": [1] * 6,
>>>     "vals": range(6),
>>>     "time": generate_index("2020-01-01", length=6, freq="h")}
>>> )
>>> df = pd.concat([df_1, df_2], axis=0)
>>>
>>> # extract the series by "ID" groups from the DataFrame
>>> series_multi = TimeSeries.from_group_dataframe(
>>>     df,
>>>     group_cols="ID",
>>>     time_col="time"
>>> )
>>> len(series_multi), series_multi[0].shape, series_multi[1].shape
(2, (3, 1, 1), (6, 1, 1))
classmethod from_json(json_str, static_covariates=None, hierarchy=None, metadata=None)[source]

Create a TimeSeries from the JSON String representation of a TimeSeries.

The JSON String representation can be generated with TimeSeries.to_json().

At the moment this only supports deterministic time series (i.e., made of 1 sample).

Parameters
  • json_str (str) – The JSON String to convert.

  • static_covariates (Union[Series, DataFrame, None]) – Optionally, a set of static covariates to be added to the TimeSeries. Either a pandas Series or a pandas DataFrame. If a Series, the index represents the static variables. The covariates are globally ‘applied’ to all components of the TimeSeries. If a DataFrame, the columns represent the static variables and the rows represent the components of the uni/multivariate TimeSeries. If a single-row DataFrame, the covariates are globally ‘applied’ to all components of the TimeSeries. If a multi-row DataFrame, the number of rows must match the number of components of the TimeSeries (in this case, the number of columns stored in the JSON string). This adds control for component-specific static covariates.

  • hierarchy (Optional[dict, None]) –

    Optionally, a dictionary describing the grouping(s) of the time series. The keys are component names, and for a given component name c, the value is a list of component names that c “belongs” to. For instance, if there is a total component, split both in two divisions d1 and d2 and in two regions r1 and r2, and four products d1r1 (in division d1 and region r1), d2r1, d1r2 and d2r2, the hierarchy would be encoded as follows.

    hierarchy={
        "d1r1": ["d1", "r1"],
        "d1r2": ["d1", "r2"],
        "d2r1": ["d2", "r1"],
        "d2r2": ["d2", "r2"],
        "d1": ["total"],
        "d2": ["total"],
        "r1": ["total"],
        "r2": ["total"]
    }
    

    The hierarchy can be used to reconcile forecasts (so that the sums of the forecasts at different levels are consistent), see hierarchical reconciliation.

  • metadata (Optional[dict, None]) – Optionally, a dictionary with metadata to be added to the TimeSeries.

Returns

The resulting series.

Return type

TimeSeries

Examples

>>> from darts import TimeSeries
>>> json_str = (
>>>     '{"columns":["vals"],"index":["2020-01-01","2020-01-02","2020-01-03"],"data":[[0.0],[1.0],[2.0]]}'
>>> )
>>> series = TimeSeries.from_json(json_str)
>>> series.shape
(3, 1, 1)
classmethod from_pickle(path)[source]

Read a pickled TimeSeries.

Parameters

path (string) – path pointing to a pickle file that will be loaded.

Returns

The resulting series.

Return type

TimeSeries

classmethod from_series(pd_series, fill_missing_dates=False, freq=None, fillna_value=None, static_covariates=None, metadata=None, copy=True)[source]

Create a TimeSeries from a Series.

The series must contain an index that is either a pandas DatetimeIndex, a pandas RangeIndex, or a pandas Index that can be converted into a RangeIndex. It is better if the index has no holes; alternatively setting fill_missing_dates can in some cases solve these issues (filling holes with NaN, or with the provided fillna_value numeric value, if any).

Parameters
  • pd_series (NativeSeries) –

    The Series, or anything which can be converted to a narwhals Series (e.g. pandas.Series, …). See the narwhals documentation for more information.

  • fill_missing_dates (Optional[bool, None]) – Optionally, a boolean value indicating whether to fill missing dates (or indices in case of integer index) with NaN values. This requires either a provided freq or the possibility to infer the frequency from the provided timestamps. See _fill_missing_dates() for more info.

  • freq (Union[str, int, None]) –

    Optionally, a string or integer representing the frequency of the underlying index. This is useful in order to fill in missing values if some dates are missing and fill_missing_dates is set to True. If a string, represents the frequency of the pandas DatetimeIndex (see offset aliases for more info on supported frequencies). If an integer, represents the step size of the pandas Index or pandas RangeIndex.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

  • static_covariates (Union[Series, DataFrame, None]) – Optionally, a set of static covariates to be added to the TimeSeries. Either a pandas Series or a single-row pandas DataFrame. If a Series, the index represents the static variables. If a DataFrame, the columns represent the static variables and the single row represents the univariate TimeSeries component.

  • metadata (Optional[dict, None]) – Optionally, a dictionary with metadata to be added to the TimeSeries.

  • copy (bool) – Whether to copy the Series’ values. If copy=False, mutating the series data will affect the original data.

Returns

The resulting series.

Return type

TimeSeries

Examples

>>> import pandas as pd
>>> from darts import TimeSeries
>>> from darts.utils.utils import generate_index
>>> # create values and times with daily frequency
>>> vals, times = range(3), generate_index("2020-01-01", length=3, freq="D")
>>>
>>> # create from `pandas.Series`
>>> pd_series = pd.Series(vals, index=times)
>>> series = TimeSeries.from_series(pd_series)
>>> series.shape
(3, 1, 1)
classmethod from_times_and_values(times, values, fill_missing_dates=False, freq=None, columns=None, fillna_value=None, static_covariates=None, hierarchy=None, metadata=None, copy=True)[source]

Create a TimeSeries from a time index and value array.

Parameters
  • times (Union[DatetimeIndex, RangeIndex, Index]) – A pandas DatetimeIndex, RangeIndex, or Index that can be converted to a RangeIndex representing the time axis for the time series. It is better if the index has no holes; alternatively setting fill_missing_dates can in some cases solve these issues (filling holes with NaN, or with the provided fillna_value numeric value, if any).

  • values (ndarray) – A Numpy array, or array-like of values for the TimeSeries. Both 2-dimensional arrays, for deterministic series, and 3-dimensional arrays, for probabilistic series, are accepted. In the former case the dimensions should be (time, component), and in the latter case (time, component, sample).

  • fill_missing_dates (Optional[bool, None]) – Optionally, a boolean value indicating whether to fill missing dates (or indices in case of integer index) with NaN values. This requires either a provided freq or the possibility to infer the frequency from the provided timestamps. See _fill_missing_dates() for more info.

  • freq (Union[str, int, None]) –

    Optionally, a string or integer representing the frequency of the underlying index. This is useful in order to fill in missing values if some dates are missing and fill_missing_dates is set to True. If a string, represents the frequency of the pandas DatetimeIndex (see offset aliases for more info on supported frequencies). If an integer, represents the step size of the pandas Index or pandas RangeIndex.

  • columns (Union[ForwardRef, ndarray, ForwardRef, ForwardRef, SequenceNotStr, range, None]) – Optionally, some column names to use for the second values dimension.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

  • static_covariates (Union[Series, DataFrame, None]) – Optionally, a set of static covariates to be added to the TimeSeries. Either a pandas Series or a pandas DataFrame. If a Series, the index represents the static variables. The covariates are globally ‘applied’ to all components of the TimeSeries. If a DataFrame, the columns represent the static variables and the rows represent the components of the uni/multivariate TimeSeries. If a single-row DataFrame, the covariates are globally ‘applied’ to all components of the TimeSeries. If a multi-row DataFrame, the number of rows must match the number of components of the TimeSeries (in this case, the number of columns in values). This adds control for component-specific static covariates.

  • hierarchy (Optional[dict, None]) –

    Optionally, a dictionary describing the grouping(s) of the time series. The keys are component names, and for a given component name c, the value is a list of component names that c “belongs” to. For instance, if there is a total component, split both in two divisions d1 and d2 and in two regions r1 and r2, and four products d1r1 (in division d1 and region r1), d2r1, d1r2 and d2r2, the hierarchy would be encoded as follows.

    hierarchy={
        "d1r1": ["d1", "r1"],
        "d1r2": ["d1", "r2"],
        "d2r1": ["d2", "r1"],
        "d2r2": ["d2", "r2"],
        "d1": ["total"],
        "d2": ["total"],
        "r1": ["total"],
        "r2": ["total"]
    }
    

    The hierarchy can be used to reconcile forecasts (so that the sums of the forecasts at different levels are consistent), see hierarchical reconciliation.

  • metadata (Optional[dict, None]) – Optionally, a dictionary with metadata to be added to the TimeSeries.

  • copy (bool) – Whether to copy the times and values objects. If copy=False, mutating the series data will affect the original data. Additionally, if times lack a frequency or step size, it will be assigned to the original object.

Returns

The resulting series.

Return type

TimeSeries

Examples

>>> import numpy as np
>>> from darts import TimeSeries
>>> from darts.utils.utils import generate_index
>>> # create values and times with daily frequency
>>> vals, times = np.arange(3), generate_index("2020-01-01", length=3, freq="D")
>>> series = TimeSeries.from_times_and_values(times=times, values=vals)
>>> series.shape
(3, 1, 1)
classmethod from_values(values, columns=None, fillna_value=None, static_covariates=None, hierarchy=None, metadata=None, copy=True)[source]

Create a TimeSeries from an array of values.

The series will have an integer time index (RangeIndex).

Parameters
  • values (ndarray) – A Numpy array of values for the TimeSeries. Both 2-dimensional arrays, for deterministic series, and 3-dimensional arrays, for probabilistic series, are accepted. In the former case the dimensions should be (time, component), and in the latter case (time, component, sample).

  • columns (Union[ForwardRef, ndarray, ForwardRef, ForwardRef, SequenceNotStr, range, None]) – Columns to be used by the underlying pandas DataFrame.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

  • static_covariates (Union[Series, DataFrame, None]) – Optionally, a set of static covariates to be added to the TimeSeries. Either a pandas Series or a pandas DataFrame. If a Series, the index represents the static variables. The covariates are globally ‘applied’ to all components of the TimeSeries. If a DataFrame, the columns represent the static variables and the rows represent the components of the uni/multivariate TimeSeries. If a single-row DataFrame, the covariates are globally ‘applied’ to all components of the TimeSeries. If a multi-row DataFrame, the number of rows must match the number of components of the TimeSeries (in this case, the number of columns in values). This adds control for component-specific static covariates.

  • hierarchy (Optional[dict, None]) –

    Optionally, a dictionary describing the grouping(s) of the time series. The keys are component names, and for a given component name c, the value is a list of component names that c “belongs” to. For instance, if there is a total component, split both in two divisions d1 and d2 and in two regions r1 and r2, and four products d1r1 (in division d1 and region r1), d2r1, d1r2 and d2r2, the hierarchy would be encoded as follows.

    hierarchy={
        "d1r1": ["d1", "r1"],
        "d1r2": ["d1", "r2"],
        "d2r1": ["d2", "r1"],
        "d2r2": ["d2", "r2"],
        "d1": ["total"],
        "d2": ["total"],
        "r1": ["total"],
        "r2": ["total"]
    }
    

    The hierarchy can be used to reconcile forecasts (so that the sums of the forecasts at different levels are consistent), see hierarchical reconciliation.

  • metadata (Optional[dict, None]) – Optionally, a dictionary with metadata to be added to the TimeSeries.

  • copy (bool) – Whether to copy the values. If copy=False, mutating the series data will affect the original data.

Returns

The resulting series.

Return type

TimeSeries

Examples

>>> import numpy as np
>>> from darts import TimeSeries
>>> # create values; the integer time index is generated automatically
>>> vals = np.arange(3)
>>> series = TimeSeries.from_values(values=vals)
>>> series.shape
(3, 1, 1)
classmethod from_xarray(xa, fill_missing_dates=False, freq=None, fillna_value=None, copy=True)[source]

Create a TimeSeries from an xarray.DataArray.

The dimensions of the DataArray have to be (time, component, sample), in this order. The time dimension can have an arbitrary name, but component and sample must be named “component” and “sample”, respectively.

The first dimension (time), and second dimension (component) must be indexed (i.e., have coordinates). The time must be indexed either with a pandas DatetimeIndex, a pandas RangeIndex, or a pandas Index that can be converted to a RangeIndex. It is better if the index has no holes; alternatively setting fill_missing_dates can in some cases solve these issues (filling holes with NaN, or with the provided fillna_value numeric value, if any).

If two components have the same name or are not strings, this method will disambiguate the component names by appending a suffix of the form “<name>_N” to the N-th column with name “name”. The component names in the static covariates and hierarchy (if any) are not disambiguated.

Parameters
  • xa (DataArray) – The xarray.DataArray

  • fill_missing_dates (Optional[bool, None]) – Optionally, a boolean value indicating whether to fill missing dates (or indices in case of integer index) with NaN values. This requires either a provided freq or the possibility to infer the frequency from the provided timestamps. See _fill_missing_dates() for more info.

  • freq (Union[str, int, None]) –

    Optionally, a string or integer representing the frequency of the underlying index. This is useful in order to fill in missing values if some dates are missing and fill_missing_dates is set to True. If a string, represents the frequency of the pandas DatetimeIndex (see offset aliases for more info on supported frequencies). If an integer, represents the step size of the pandas Index or pandas RangeIndex.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

  • copy (bool) – Whether to copy the times (time index dimension) and values (data) objects. If copy=False, mutating the series data will affect the original data. Additionally, if times lack a frequency or step size, it will be assigned to the original object.

Returns

The resulting series.

Return type

TimeSeries

Examples

>>> import xarray as xr
>>> import numpy as np
>>> from darts.timeseries import DIMS
>>> from darts import TimeSeries
>>> from darts.utils.utils import generate_index
>>>
>>> # create values with the required dimensions (time, component, sample)
>>> vals = np.random.random((3, 1, 1))
>>> # create time index with daily frequency
>>> times = generate_index("2020-01-01", length=3, freq="D")
>>> columns = ["vals"]
>>>
>>> # create xarray with the required dimensions and coordinates
>>> xa = xr.DataArray(
>>>     vals,
>>>     dims=DIMS,
>>>     coords={DIMS[0]: times, DIMS[1]: columns}
>>> )
>>> series = TimeSeries.from_xarray(xa)
>>> series.shape
(3, 1, 1)
gaps(mode='all')[source]

Compute and return gaps in the series.

Works only on deterministic time series (1 sample).

Parameters

mode (Literal[‘all’, ‘any’]) – Only relevant for multivariate time series. The mode defines how gaps are defined. Set to ‘any’ if a NaN value in any column should be considered a gap. ‘all’ will only consider periods where all columns’ values are NaN. Defaults to ‘all’.

Returns

A pandas.DataFrame containing a row for every gap (rows with all-NaN values in underlying DataFrame) in this time series. The DataFrame contains three columns that include the start and end time stamps of the gap and the integer length of the gap (in self.freq units if the series is indexed by a DatetimeIndex).

Return type

pandas.DataFrame

get_index_at_point(point, after=True)[source]

Convert a point along the time index into an integer index in the range [0, len(series) - 1] (inclusive).

Parameters
  • point (Union[Timestamp, float, int]) –

    This parameter supports 3 different data types: pandas.Timestamp, float and int.

    pandas.Timestamp works only on series that are indexed with a pandas.DatetimeIndex. In such cases, the returned point will be the index of this timestamp if it is present in the series time index. If it’s not present in the time index, the index of the next timestamp is returned if after=True (if it exists in the series), otherwise the index of the previous timestamp is returned (if it exists in the series).

    In case of a float, the parameter will be treated as the proportion of the time series that should lie before the point.

    If an int and series is datetime-indexed, the value of point is returned. If an int and series is integer-indexed, the index position of point in the RangeIndex is returned (accounting for steps).

  • after – If the provided pandas Timestamp is not in the time series index, whether to return the index of the next timestamp or the index of the previous one.

Returns

The index position corresponding to the provided point in the series.

Return type

int

get_timestamp_at_point(point)[source]

Convert a point into a pandas.Timestamp (if datetime-indexed) or integer (if integer-indexed).

Parameters

point (Union[Timestamp, float, int]) – This parameter supports 3 different data types: float, int and pandas.Timestamp. In case of a float, the parameter will be treated as the proportion of the time series that should lie before the point. In case of int, the parameter will be treated as an integer index to the time index of series. Will raise a ValueError if not a valid index in series. In case of a pandas.Timestamp, point will be returned as is provided that the timestamp is present in the series time index, otherwise will raise a ValueError.

Returns

The index value corresponding to the provided point in the series. If the series is indexed by a pandas.DatetimeIndex, returns a pandas.Timestamp. If the series is indexed by a pandas.RangeIndex, returns an integer.

Return type

Union[pandas.Timestamp, int]

property has_datetime_index: bool

Whether the series is indexed with a pandas.DatetimeIndex (otherwise it is indexed with a pandas.RangeIndex).

Return type

bool

property has_hierarchy: bool

Whether the series contains a hierarchy.

Return type

bool

property has_metadata: bool

Whether the series contains metadata.

Return type

bool

property has_range_index: bool

Whether the series is indexed with a pandas.RangeIndex (otherwise it is indexed with a pandas.DatetimeIndex).

Return type

bool

has_same_time_as(other)[source]

Whether the series has the same time index as the other series.

Parameters

other (Self) – the other series

Returns

True if both series have the same index, False otherwise.

Return type

bool

property has_static_covariates: bool

Whether the series contains static covariates.

Return type

bool

head(size=5, axis=0)[source]

Return a new series with the first size points.

Parameters
  • size (int, default 5) – number of points to retain

  • axis (str or int, optional, default: 0) – axis along which to slice the series

Returns

The series made of the first size points along the desired axis.

Return type

TimeSeries

property hierarchy: Optional[dict]

The hierarchy of this series.

If defined, the hierarchy is given as a dictionary. The keys are the individual components and values are the set of parent(s) of these components in the hierarchy.

Return type

Optional[dict, None]

property is_deterministic: bool

Whether the series is deterministic.

Return type

bool

property is_probabilistic: bool

Whether the series is stochastic (probabilistic).

Return type

bool

property is_stochastic: bool

Whether the series is stochastic (probabilistic).

Return type

bool

property is_univariate: bool

Whether the series is univariate.

Return type

bool

is_within_range(ts)[source]

Whether the given timestamp or integer is within the time interval of the series.

ts does not need to be an element of the series’ time index.

Parameters

ts (Union[Timestamp, int]) – The pandas.Timestamp (if indexed with DatetimeIndex) or integer (if indexed with RangeIndex) to check.

Returns

Whether ts is contained within the interval of this series.

Return type

bool

kurtosis(**kwargs)[source]

Return a deterministic series with the kurtosis of each component computed over the samples of the stochastic series.

This works only on stochastic series (i.e., with more than 1 sample).

Parameters

kwargs – Other keyword arguments are passed down to scipy.stats.kurtosis()

Returns

A new series containing the kurtosis of each component.

Return type

TimeSeries

last_value()[source]

Last value of the univariate series.

Returns

The last value of this univariate deterministic time series

Return type

float

last_values()[source]

Last values of the potentially multivariate series.

Returns

The last values of every component of this deterministic time series

Return type

numpy.ndarray

longest_contiguous_slice(max_gap_size=0, mode='all')[source]

Return the largest slice of the deterministic series without any gaps (contiguous all-NaN value entries) larger than max_gap_size.

This method is only applicable to deterministic series (i.e., having 1 sample).

Parameters
  • max_gap_size (int) – Indicate the maximum gap size that the series can contain.

  • mode (str) – Only relevant for multivariate time series. The mode defines how gaps are defined. Set to ‘any’ if a NaN value in any column should be considered a gap. ‘all’ will only consider periods where all columns’ values are NaN. Defaults to ‘all’.

Returns

A new series with the largest slice of the original that has no gaps longer than max_gap_size.

Return type

TimeSeries

See also

TimeSeries.gaps

return the gaps in the TimeSeries

map(fn)[source]

Return a new series with the function fn applied to the values of this series.

If fn takes 1 argument it is simply applied on the values array of shape (time, n_components, n_samples). If fn takes 2 arguments, it is applied repeatedly on the (ts, value[ts]) tuples, where ts denotes a timestamp value, and value[ts] denotes the array of values at this timestamp, of shape (n_components, n_samples).

Parameters

fn (Union[Callable[[number], number], Callable[[Union[Timestamp, int], number], number]]) – Either a function which takes a NumPy array and returns a NumPy array of same shape; e.g., lambda x: x ** 2, lambda x: x / x.shape[0] or np.log. It can also be a function which takes a timestamp and array, and returns a new array of same shape; e.g., lambda ts, x: x / ts.days_in_month. The type of ts is either pandas.Timestamp (if the series is indexed with a DatetimeIndex), or an integer otherwise (if the series is indexed with a RangeIndex).

Returns

A new series with the function fn applied to the values.

Return type

TimeSeries
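
The two calling conventions of fn can be sketched with plain NumPy and pandas. The helper apply_fn below is hypothetical (it is not Darts code) and only mimics how a one-argument and a two-argument function see the data; the index and values are made up for illustration.

```python
import inspect

import numpy as np
import pandas as pd

# Hypothetical stand-in for a series: a monthly index and a values array
# of shape (time, n_components, n_samples).
times = pd.date_range("2020-01-01", periods=3, freq="MS")
values = np.array([31.0, 58.0, 62.0]).reshape(3, 1, 1)


def apply_fn(fn, times, values):
    """Illustrative helper (not part of Darts) mirroring the two forms of fn."""
    n_args = len(inspect.signature(fn).parameters)
    if n_args == 1:
        # One argument: fn sees the whole (time, component, sample) array.
        return fn(values)
    # Two arguments: fn is called once per time step with (ts, value[ts]),
    # where value[ts] has shape (n_components, n_samples).
    return np.stack([fn(ts, values[i]) for i, ts in enumerate(times)])


squared = apply_fn(lambda x: x ** 2, times, values)
per_day = apply_fn(lambda ts, x: x / ts.days_in_month, times, values)
```

Here per_day divides each monthly value by the number of days in its month, which is the kind of time-dependent transformation the two-argument form is meant for.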

max(axis=2)[source]

Return a new series with the maximum computed over the specified axis.

If we reduce over time (axis=0), the series will have length one and will use the first entry of the original time_index. If we perform the calculation over the components (axis=1), the resulting single component will be renamed to “components_max”. When applied to the samples (axis=2), a deterministic series is returned.

If axis=1, the static covariates and the hierarchy are discarded from the series.

Parameters

axis (int) – The axis to reduce over. The default is to calculate over samples, i.e. axis=2.

Returns

A new series with max applied to the indicated axis.

Return type

TimeSeries
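
The effect of axis on the resulting shape can be checked with plain NumPy. This is only a sketch on a made-up values array, not Darts code; the same reasoning applies to the other reductions that take an axis argument.

```python
import numpy as np

# Hypothetical values array: 4 time steps, 2 components, 3 samples.
values = np.arange(24, dtype=float).reshape(4, 2, 3)

# keepdims=True keeps the reduced axis with size 1, since a series is
# always three-dimensional (time, component, sample).
over_time = values.max(axis=0, keepdims=True)        # shape (1, 2, 3)
over_components = values.max(axis=1, keepdims=True)  # shape (4, 1, 3)
over_samples = values.max(axis=2, keepdims=True)     # shape (4, 2, 1)
```

Reducing over samples leaves a single sample per time step and component, which is why the result is deterministic.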

mean(axis=2)[source]

Return a new series with the mean computed over the specified axis.

If we reduce over time (axis=0), the series will have length one and will use the first entry of the original time_index. If we perform the calculation over the components (axis=1), the resulting single component will be renamed to “components_mean”. When applied to the samples (axis=2), a deterministic series is returned.

If axis=1, the static covariates and the hierarchy are discarded from the series.

Parameters

axis (int) – The axis to reduce over. The default is to calculate over samples, i.e. axis=2.

Returns

A new series with mean applied to the indicated axis.

Return type

TimeSeries

median(axis=2)[source]

Return a new series with the median computed over the specified axis.

If we reduce over time (axis=0), the series will have length one and will use the first entry of the original time_index. If we perform the calculation over the components (axis=1), the resulting single component will be renamed to “components_median”. When applied to the samples (axis=2), a deterministic series is returned.

If axis=1, the static covariates and the hierarchy are discarded from the series.

Parameters

axis (int) – The axis to reduce over. The default is to calculate over samples, i.e. axis=2.

Returns

A new series with median applied to the indicated axis.

Return type

TimeSeries

property metadata: Optional[dict]

The metadata of this series.

If defined, the metadata is given as a dictionary.

Return type

Optional[dict, None]

min(axis=2)[source]

Return a new series with the minimum computed over the specified axis.

If we reduce over time (axis=0), the series will have length one and will use the first entry of the original time_index. If we perform the calculation over the components (axis=1), the resulting single component will be renamed to “components_min”. When applied to the samples (axis=2), a deterministic series is returned.

If axis=1, the static covariates and the hierarchy are discarded from the series.

Parameters

axis (int) – The axis to reduce over. The default is to calculate over samples, i.e. axis=2.

Returns

A new series with min applied to the indicated axis.

Return type

TimeSeries

property n_components: int

The number of components (columns) in the series.

Return type

int

property n_samples: int

The number of samples contained in the series.

Return type

int

property n_timesteps: int

The number of time steps in the series.

Return type

int

plot(new_plot=False, central_quantile=0.5, low_quantile=0.05, high_quantile=0.95, default_formatting=True, title=None, label='', max_nr_components=10, ax=None, alpha=None, color=None, c=None, *args, **kwargs)[source]

Plot the series.

This is a wrapper method around xarray.DataArray.plot().

Parameters
  • new_plot (bool) – Whether to spawn a new axis to plot on. See also parameter ax.

  • central_quantile (Union[float, str]) – The quantile (between 0 and 1) to plot as a “central” value, if the series is stochastic (i.e., if it has multiple samples). This will be applied on each component separately (i.e., to display quantiles of the components’ marginal distributions). For instance, setting central_quantile=0.5 will plot the median of each component. central_quantile can also be set to ‘mean’.

  • low_quantile (Optional[float, None]) – The quantile to use for the lower bound of the plotted confidence interval. Similar to central_quantile, this is applied to each component separately (i.e., displaying marginal distributions). No confidence interval is shown if low_quantile is None (default 0.05).

  • high_quantile (Optional[float, None]) – The quantile to use for the upper bound of the plotted confidence interval. Similar to central_quantile, this is applied to each component separately (i.e., displaying marginal distributions). No confidence interval is shown if high_quantile is None (default 0.95).

  • default_formatting (bool) – Whether to use the darts default scheme.

  • title (Optional[str, None]) – Optionally, a custom plot title. If None, will use the name of the underlying xarray.DataArray.

  • label (Union[str, Sequence[str], None]) – Can either be a string or list of strings. If a string and the series only has a single component, it is used as the label for that component. If a string and the series has multiple components, it is used as a prefix for each component name. If a list of strings with length equal to the number of components in the series, the labels will be mapped to the components in order.

  • max_nr_components (int) – The maximum number of components of a series to plot. -1 means all components will be plotted.

  • ax (Optional[Axes, None]) – Optionally, an axis to plot on. If None, and new_plot=False, will use the current axis. If new_plot=True, will create a new axis.

  • alpha (Optional[float, None]) – Optionally, set the line alpha for deterministic series, or the confidence interval alpha for probabilistic series.

  • color (Union[str, tuple, Sequence[str, tuple], None]) – Can either be a single color or list of colors. Any matplotlib color is accepted (string, hex string, RGB/RGBA tuple). If a single color and the series only has a single component, it is used as the color for that component. If a single color and the series has multiple components, it is used as the color for each component. If a list of colors with length equal to the number of components in the series, the colors will be mapped to the components in order.

  • c (Union[str, tuple, Sequence[str, tuple], None]) – An alias for color.

  • args – some positional arguments for the plot() method

  • kwargs – some keyword arguments for the plot() method

Returns

Either the passed ax axis, a newly created one if new_plot=True, or the existing one.

Return type

matplotlib.axes.Axes

prepend(other)[source]

Return a new series with the other series prepended to this series along the time axis (added to the beginning).

Parameters

other (Self) – A second TimeSeries.

Returns

A new series, obtained by prepending the second series to the first.

Return type

TimeSeries

See also

TimeSeries.concatenate

concatenate another series along a given axis.

TimeSeries.append

append another series along the time axis.

prepend_values(values)[source]

Return a new series with values prepended to this series along the time axis (added to the beginning).

This adds time steps to the beginning of the new series.

Parameters

values (ndarray) – An array with the values to prepend to the start.

Returns

A new series with the new values prepended.

Return type

TimeSeries

See also

TimeSeries.append_values

append the values of another series along the time axis.

quantile(q=0.5, **kwargs)[source]

Return a deterministic series with the desired quantile(s) q of each component computed over the samples of the stochastic series.

The component quantiles in the new series are named “<component>_q<quantile>”, where “<component>” is the column name, and “<quantile>” is the quantile value.

The order of the component quantiles is: [<c_1>_q<q_1>, <c_1>_q<q_2>, …, <c_n>_q<q_n>].

This works only on stochastic series (i.e., with more than 1 sample).

Parameters
  • q (Union[float, Sequence[float]]) – The desired quantile value or sequence of quantile values. Each value must be between 0. and 1. inclusive. For instance, 0.5 will return a TimeSeries containing the median of the (marginal) distribution of each component.

  • kwargs – Other keyword arguments are passed down to numpy.quantile().

Returns

A new series containing the desired quantile(s) of each component.

Return type

TimeSeries
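
The computation and the component ordering can be sketched with NumPy alone. The array, the component names and the number formatting in the generated names are assumptions for illustration; the exact formatting Darts uses for "<quantile>" may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stochastic values: 5 time steps, 2 components, 100 samples.
values = rng.normal(size=(5, 2, 100))
components = ["a", "b"]   # illustrative component names
quantiles = [0.1, 0.9]

# Quantiles are taken over the sample axis (axis=2), separately for each
# time step and component.
q_values = np.quantile(values, quantiles, axis=2)  # shape (n_q, time, component)

# Reorder to (time, component * n_q) so that all quantiles of one
# component are adjacent, matching the order described above.
deterministic = q_values.transpose(1, 2, 0).reshape(5, -1)
names = [f"{c}_q{q}" for c in components for q in quantiles]
```

The result has one column per (component, quantile) pair and no sample dimension left, i.e. it is deterministic.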

random_component_values(copy=True)[source]

Return a 2-D array of shape (time, component), containing the series’ values for one sample taken uniformly at random from all samples.

Parameters

copy (bool) – Whether to return a copy of the values, otherwise returns a view. Leave it to True unless you know what you are doing.

Returns

The values composing one sample taken at random from the time series.

Return type

numpy.ndarray

resample(freq, method='pad', method_kwargs=None, **kwargs)[source]

Return a new series where the time index and values were resampled with a given frequency.

The provided method is used to aggregate/fill holes in the resampled series, by default ‘pad’.

Parameters
  • freq (Union[str, DateOffset]) – The new time difference between two adjacent entries in the returned TimeSeries. Expects a pandas.DateOffset or DateOffset alias.

  • method (str) – A method to either aggregate grouped values (for down-sampling) or fill holes (for up-sampling) in the reindexed TimeSeries. For more information, see the xarray DataArrayResample documentation. Supported methods: [“all”, “any”, “asfreq”, “backfill”, “bfill”, “count”, “ffill”, “first”, “interpolate”, “last”, “max”, “mean”, “median”, “min”, “nearest”, “pad”, “prod”, “quantile”, “reduce”, “std”, “sum”, “var”].

  • method_kwargs (Optional[dict[str, Any], None]) – Additional keyword arguments for the specified method. Some methods require additional arguments. Xarray’s errors will be raised on invalid keyword arguments.

  • kwargs – Keyword arguments for the xarray.resample method, notably offset or base to indicate where to start the resampling and avoid NaN at the first value of the resampled TimeSeries. For more information, see the xarray resample() documentation.

Returns

A resampled series with given frequency.

Return type

TimeSeries

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from darts import TimeSeries
>>> times = pd.date_range(start=pd.Timestamp("20200101233000"), periods=6, freq="15min")
>>> pd_series = pd.Series(range(6), index=times)
>>> ts = TimeSeries.from_series(pd_series)
>>> print(ts.time_index)
DatetimeIndex(['2020-01-01 23:30:00', '2020-01-01 23:45:00',
               '2020-01-02 00:00:00', '2020-01-02 00:15:00',
               '2020-01-02 00:30:00', '2020-01-02 00:45:00'],
              dtype='datetime64[ns]', name='time', freq='15T')
>>> # without an offset, the hourly bins start on the full hour
>>> resampled_nokwargs_ts = ts.resample(freq="1h")
>>> print(resampled_nokwargs_ts.time_index)
DatetimeIndex(['2020-01-01 23:00:00', '2020-01-02 00:00:00'],
              dtype='datetime64[ns]', name='time', freq='H')
>>> print(resampled_nokwargs_ts.values())
[[nan]
 [ 2.]]
>>> # with a 30 minute offset, the bins start at the first time step
>>> resampled_ts = ts.resample(freq="1h", offset=pd.Timedelta("30min"))
>>> print(resampled_ts.time_index)
DatetimeIndex(['2020-01-01 23:30:00', '2020-01-02 00:30:00'],
              dtype='datetime64[ns]', name='time', freq='H')
>>> print(resampled_ts.values())
[[0.]
 [4.]]
>>> # down-sampling with an aggregation method
>>> downsampled_mean_ts = ts.resample(freq="30min", method="mean")
>>> print(downsampled_mean_ts.values())
[[0.5]
 [2.5]
 [4.5]]
>>> # "reduce" needs the aggregation function, passed through method_kwargs
>>> downsampled_reduce_ts = ts.resample(freq="30min", method="reduce", method_kwargs={"func": np.mean})
>>> print(downsampled_reduce_ts.values())
[[0.5]
 [2.5]
 [4.5]]

rescale_with_value(value_at_first_step)[source]

Return a new series, which is a multiple of this series such that the first value is value_at_first_step.

Note: Numerical errors can appear with value_at_first_step > 1e+24.

Parameters

value_at_first_step (float) – The new value for the first entry of the TimeSeries.

Returns

A new series, where the first value is value_at_first_step and other values have been scaled accordingly.

Return type

TimeSeries
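
The scaling amounts to multiplying every value by one constant factor. A minimal NumPy sketch on made-up univariate values (not Darts code):

```python
import numpy as np

values = np.array([2.0, 4.0, 5.0])  # hypothetical univariate values
value_at_first_step = 100.0

# Multiply every entry by the constant factor that maps the first value
# onto value_at_first_step (this requires a non-zero first value).
rescaled = values * (value_at_first_step / values[0])
```

The ratios between entries are unchanged: the second value is still twice the first one.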

schema(copy=True)[source]

Return the schema of the series as a dictionary.

Can be used to create new TimeSeries with the same schema.

The keys and values are:

  • “time_freq”: the frequency (or step size) of the time (or range) index

  • “time_name”: the name of the time index

  • “columns”: the columns / components

  • “static_covariates”: the static covariates

  • “hierarchy”: the hierarchy

  • “metadata”: the metadata

Return type

dict[str, Any]

property shape: tuple[int, int, int]

The shape of the series (n_timesteps, n_components, n_samples).

Return type

tuple[int, int, int]

shift(n)[source]

Return a new series where the time index was shifted by n steps.

If \(n > 0\), shifts into the future. If \(n < 0\), shifts into the past.

For example, with \(n=2\) and freq=’M’, March 2013 becomes May 2013. With \(n=-2\), March 2013 becomes Jan 2013.

Parameters

n (int) – The number of time steps (in self.freq unit) to shift by. Can be negative.

Returns

A new series, with a shifted time index.

Return type

TimeSeries
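
The March 2013 example can be reproduced on a bare pandas index. This sketch uses the month-start alias "MS" as an assumption (the text above mentions 'M') and only shows what happens to the time index; the values of a series are not modified by a shift.

```python
import pandas as pd

# Hypothetical monthly index starting in March 2013.
times = pd.date_range("2013-03-01", periods=3, freq="MS")

# DatetimeIndex.shift moves every timestamp by n steps of the index frequency.
shifted_forward = times.shift(2)    # into the future
shifted_backward = times.shift(-2)  # into the past
```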

skew(**kwargs)[source]

Return a deterministic series with the skew of each component computed over the samples of the stochastic series.

This works only on stochastic series (i.e., with more than 1 sample).

Parameters

kwargs – Other keyword arguments are passed down to scipy.stats.skew()

Returns

A new series containing the skew of each component.

Return type

TimeSeries

slice(start_ts, end_ts)[source]

Return a slice of the series between start_ts and end_ts.

For series having DatetimeIndex, this is inclusive on both ends. For series having a RangeIndex, end_ts is exclusive.

start_ts and end_ts don’t have to be in the series.

Parameters
  • start_ts (Union[Timestamp, int]) – The timestamp that indicates the left cut-off.

  • end_ts (Union[Timestamp, int]) – The timestamp that indicates the right cut-off.

Returns

A new series, with indices greater than or equal to start_ts and smaller than or equal to end_ts (strictly smaller than end_ts for series having a RangeIndex).

Return type

TimeSeries
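
The difference between the two index types can be sketched on bare pandas indexes. This is not Darts code; it only applies the inclusive and exclusive bounds described above to made-up indexes.

```python
import pandas as pd

# Datetime index: both bounds are included, and neither bound has to be
# present in the index.
dt_index = pd.date_range("2020-01-01", periods=10, freq="D")
start_ts, end_ts = pd.Timestamp("2020-01-03"), pd.Timestamp("2020-01-05 12:00")
dt_slice = dt_index[(dt_index >= start_ts) & (dt_index <= end_ts)]

# Integer index: the right bound is excluded.
int_index = pd.RangeIndex(start=0, stop=10, step=1)
int_slice = int_index[(int_index >= 2) & (int_index < 5)]
```

The datetime slice keeps January 3 to January 5, while the integer slice keeps the positions 2, 3 and 4.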

slice_intersect(other)[source]

Return a slice of the series where the time index was intersected with the other series.

This method is in general not symmetric.

Parameters

other (Self) – the other time series

Returns

A new series, containing the values of this series, over the time-span common to both series.

Return type

TimeSeries

slice_intersect_times(other, copy=True)[source]

Return the time index of the series where the time index was intersected with the other series.

This method is in general not symmetric.

Parameters
  • other (Self) – The other time series

  • copy (bool) – Whether to return a copy of the time index, otherwise returns a view. Leave it to True unless you know what you are doing.

Returns

The time index of this series, over the time-span common to both series.

Return type

Union[pandas.DatetimeIndex, pandas.RangeIndex]

slice_intersect_values(other, copy=False)[source]

Return the sliced values of the series where the time index was intersected with the other series.

This method is in general not symmetric.

Parameters
  • other (Self) – The other time series

  • copy (bool) – Whether to return a copy of the values, otherwise returns a view. Leave it to True unless you know what you are doing.

Returns

The values of this series, over the time-span common to both series.

Return type

numpy.ndarray

slice_n_points_after(start_ts, n)[source]

Return a slice of the series starting at start_ts (inclusive) and having at most n points.

Parameters
  • start_ts (Union[Timestamp, int]) – The timestamp or index that indicates the splitting time.

  • n (int) – The maximal length of the new TimeSeries.

Returns

A new series, with length at most n, starting at start_ts.

Return type

TimeSeries

slice_n_points_before(end_ts, n)[source]

Return a slice of the series ending at end_ts (inclusive) and having at most n points.

Parameters
  • end_ts (Union[Timestamp, int]) – The timestamp or index that indicates the splitting time.

  • n (int) – The maximal length of the new TimeSeries.

Returns

A new series, with length at most n, ending at end_ts.

Return type

TimeSeries

split_after(split_point)[source]

Split the series in two, after a provided split_point.

Parameters

split_point (Union[Timestamp, float, int]) – A timestamp, float or integer. If float, represents the proportion of the series to include in the first TimeSeries (must be between 0.0 and 1.0). If integer, represents the index position after which the split is performed. A pandas.Timestamp can be provided for TimeSeries that are indexed by a pandas.DatetimeIndex. In such cases, the timestamp will be contained in the first TimeSeries, but not in the second one. The timestamp itself does not have to appear in the original TimeSeries index.

Returns

A tuple of two series. The first time series contains the first entries up to the split_point (inclusive), and the second contains the remaining ones.

Return type

Tuple[TimeSeries, TimeSeries]

split_before(split_point)[source]

Split the series in two, before a provided split_point.

Parameters

split_point (Union[Timestamp, float, int]) – A timestamp, float or integer. If float, represents the proportion of the series to include in the first TimeSeries (must be between 0.0 and 1.0). If integer, represents the index position before which the split is performed. A pandas.Timestamp can be provided for TimeSeries that are indexed by a pandas.DatetimeIndex. In such cases, the timestamp will be contained in the second TimeSeries, but not in the first one. The timestamp itself does not have to appear in the original TimeSeries index.

Returns

A tuple of two series. The first time series contains the first entries up to the split_point (exclusive), and the second contains the remaining ones.

Return type

Tuple[TimeSeries, TimeSeries]
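
The only difference between split_after and split_before is which part receives the split point. A sketch on a bare pandas index (made-up dates, not Darts code):

```python
import pandas as pd

times = pd.date_range("2020-01-01", periods=5, freq="D")
split_point = pd.Timestamp("2020-01-03")

# split_after: the split point belongs to the first part.
after_first = times[times <= split_point]
after_second = times[times > split_point]

# split_before: the split point belongs to the second part.
before_first = times[times < split_point]
before_second = times[times >= split_point]
```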

stack(other)[source]

Return a new series with the other series stacked to this series along the component axis.

The resulting TimeSeries will have the same name for its time dimension as this TimeSeries, and the same number of samples.

Parameters

other (Self) – A TimeSeries instance with the same index and the same number of samples as the current one.

Returns

A new series with the components of the other series added to the original.

Return type

TimeSeries

start_time()[source]

Start time of the series.

Returns

A timestamp containing the first time of the TimeSeries (if indexed by DatetimeIndex), or an integer (if indexed by RangeIndex)

Return type

Union[pandas.Timestamp, int]

property static_covariates: Optional[DataFrame]

The static covariates of this series.

If defined, the static covariates are given as a pandas.DataFrame. The columns represent the static variables and the rows represent the components of the series.

Return type

Optional[DataFrame, None]

static_covariates_values(copy=True)[source]

Return a 2-D array of dimension (component, static variable) containing the series’ static covariate values.

Parameters

copy (bool) – Whether to return a copy of the values, otherwise returns a view. Can only return a view if all values have the same dtype. Leave it to True unless you know what you are doing.

Returns

The static covariate values if the series has static covariates, else None.

Return type

Optional[numpy.ndarray]

std(ddof=1)[source]

Return a deterministic series with the standard deviation of each component computed over the samples of the stochastic series.

This works only on stochastic series (i.e., with more than 1 sample).

Parameters

ddof (int) – “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof where N represents the number of elements. By default, ddof is 1.

Returns

A new series containing the standard deviation of each component.

Return type

TimeSeries
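
The effect of ddof can be checked with NumPy on a made-up array with four samples. This is only a sketch of the divisor N - ddof, not Darts code.

```python
import numpy as np

# Hypothetical stochastic values: 1 time step, 1 component, 4 samples.
values = np.array([[[1.0, 2.0, 3.0, 4.0]]])

# The squared deviations from the mean (2.5) sum to 5.0.
sample_std = values.std(axis=2, ddof=1)      # divisor N - 1 = 3
population_std = values.std(axis=2, ddof=0)  # divisor N = 4
```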

strip(how='all')[source]

Return a slice of the deterministic time series where NaN-containing entries at the beginning and the end were removed.

No entries after (and including) the first non-NaN entry and before (and including) the last non-NaN entry are removed.

This method is only applicable to deterministic series (i.e., having 1 sample).

Parameters

how (str) – Define if the entries containing NaN in all the components (‘all’) or in any of the components (‘any’) should be stripped. Default: ‘all’

Returns

A new series where NaN-containing entries at start and end were removed.

Return type

TimeSeries
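
The difference between how='all' and how='any' can be sketched with NumPy on a 2-D array of shape (time, component). The helper strip_rows is hypothetical and not part of Darts; it assumes at least one row survives.

```python
import numpy as np

# Hypothetical deterministic values of shape (time, component).
values = np.array([
    [np.nan, np.nan],
    [np.nan, 1.0],
    [2.0, 3.0],
    [4.0, np.nan],
    [np.nan, np.nan],
])


def strip_rows(values, how="all"):
    """Illustrative helper (not Darts API): trim leading/trailing NaN rows."""
    nan_mask = np.isnan(values)
    # 'all': a row can be stripped only if every component is NaN;
    # 'any': a row can be stripped as soon as one component is NaN.
    strippable = nan_mask.all(axis=1) if how == "all" else nan_mask.any(axis=1)
    keep = np.flatnonzero(~strippable)
    # Rows between the first and last kept row are never removed.
    return values[keep[0]: keep[-1] + 1]


stripped_all = strip_rows(values, how="all")  # keeps rows 1 to 3
stripped_any = strip_rows(values, how="any")  # keeps row 2 only
```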

sum(axis=2)[source]

Return a new series with the sum computed over the specified axis.

If we reduce over time (axis=0), the series will have length one and will use the first entry of the original time_index. If we perform the calculation over the components (axis=1), the resulting single component will be renamed to “components_sum”. When applied to the samples (axis=2), a deterministic series is returned.

If axis=1, the static covariates and the hierarchy are discarded from the series.

Parameters

axis (int) – The axis to reduce over. The default is to calculate over samples, i.e. axis=2.

Returns

A new series with sum applied to the indicated axis.

Return type

TimeSeries

tail(size=5, axis=0)[source]

Return a new series with the last size points.

Parameters
  • size (int, default: 5) – number of points to retain

  • axis (str or int, optional, default: 0 (time dimension)) – axis along which we intend to display records

Returns

The series made of the last size points along the desired axis.

Return type

TimeSeries

property time_dim: str

The time dimension name of the series.

Return type

str

property time_index: Union[DatetimeIndex, RangeIndex]

The time index of the series.

Return type

Union[DatetimeIndex, RangeIndex]

to_csv(*args, **kwargs)[source]

Write the deterministic series to a CSV file.

For a list of parameters, refer to the documentation of pandas.DataFrame.to_csv() [1].

References

[1] https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html?highlight=to_csv

to_dataframe(copy=True, backend=Implementation.PANDAS, time_as_index=True, suppress_warnings=False)[source]

Return a DataFrame representation of the series in a given backend.

Each of the series components will appear as a column in the DataFrame. If the series is stochastic, the samples are returned as columns of the dataframe with column names as ‘component_s#’ (e.g. with two components and two samples: ‘comp0_s0’, ‘comp0_s1’, ‘comp1_s0’, ‘comp1_s1’).

Parameters
  • copy (bool) – Whether to return a copy of the dataframe. Leave it to True unless you know what you are doing.

  • backend (Union[module, Implementation, str]) –

    The backend to which to export the TimeSeries. See the narwhals documentation for all supported backends.

  • time_as_index (bool) – Whether to set the time index as the index of the dataframe or in the left-most column. Only effective with the pandas backend.

  • suppress_warnings (bool) – Whether to suppress the warnings for the DataFrame creation.

Returns

A DataFrame representation of the series in a given backend.

Return type

DataFrame
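
The column layout for a stochastic series can be sketched with NumPy and pandas. The array, component names and index below are made up, and the sketch only reproduces the ‘component_s#’ naming described above; it is not Darts code.

```python
import numpy as np
import pandas as pd

# Hypothetical stochastic values: 3 time steps, 2 components, 2 samples.
values = np.arange(12, dtype=float).reshape(3, 2, 2)
components = ["comp0", "comp1"]
times = pd.date_range("2020-01-01", periods=3, freq="D", name="time")

# Flatten (component, sample) into one column axis, component-major, so
# that all samples of a component sit next to each other.
columns = [f"{c}_s{s}" for c in components for s in range(values.shape[2])]
df = pd.DataFrame(values.reshape(3, -1), index=times, columns=columns)
```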

to_json()[source]

Return a JSON string representation of the deterministic series.

At the moment this function works only on deterministic time series (i.e., made of 1 sample).

Notes

Static covariates are not returned in the JSON string. When using TimeSeries.from_json(), the static covariates can be added with input argument static_covariates.

Returns

A JSON String representing the series

Return type

str

to_pickle(path, protocol=5)[source]

Save the series in pickle format.

Parameters
  • path (string) – path to a file where current object will be pickled

  • protocol (integer, default highest) – Pickling protocol. The default is best in most cases; change it only if you run into backward compatibility issues.

to_series(copy=True, backend=Implementation.PANDAS)[source]

Return a Series representation of the series in a given backend.

Works only for univariate series that are deterministic (i.e., made of 1 sample).

Parameters
  • copy (bool) – Whether to return a copy of the series. Leave it to True unless you know what you are doing.

  • backend (Union[module, Implementation, str]) –

    The backend to which to export the TimeSeries. See the narwhals documentation for all supported backends.

Returns

A Series representation of the series in a given backend.

property top_level_component: Optional[str]

The top level component name of this series, or None if the series has no hierarchy.

Return type

Optional[str, None]

property top_level_series: Optional[Self]

The univariate series containing the single top-level component of this series, or None if the series has no hierarchy.

Return type

Optional[Self, None]

univariate_component(index)[source]

Return a new univariate series with a selected component.

This drops the hierarchy (if any), and retains only the relevant static covariates column.

Parameters

index (Union[str, int]) – If a string, the name of the component to retrieve. If an integer, the positional index of the component.

Returns

A new series with a selected component.

Return type

TimeSeries

univariate_values(copy=True, sample=0)[source]

Return a 1-D Numpy array of shape (time,) containing the univariate series’ values for one sample.

Parameters
  • copy (bool) – Whether to return a copy of the values. Leave it to True unless you know what you are doing.

  • sample (int) – For stochastic series, the sample for which to return values. Default: 0 (first sample).

Returns

The values composing the time series guaranteed to be univariate.

Return type

numpy.ndarray

values(copy=True, sample=0)[source]

Return a 2-D array of shape (time, component), containing the series’ values for one sample.

Parameters
  • copy (bool) – Whether to return a copy of the values, otherwise returns a view. Leave it to True unless you know what you are doing.

  • sample (int) – For stochastic series, the sample for which to return values. Default: 0 (first sample).

Returns

The values composing the time series.

Return type

numpy.ndarray

var(ddof=1)[source]

Return a deterministic series with the variance of each component computed over the samples of the stochastic series.

This works only on stochastic series (i.e., with more than 1 sample).

Parameters

ddof (int) – “Delta Degrees of Freedom”: the divisor used in the calculation is N - ddof where N represents the number of elements. By default, ddof is 1.

Returns

A new series containing the variance of each component.

Return type

TimeSeries

property width: int

The width (number of components) of the series.

Return type

int

window_transform(transforms, treat_na=None, forecasting_safe=True, keep_non_transformed=False, include_current=True, keep_names=False)[source]

Return a new series with the specified window transformations applied.

Supports moving/rolling, expanding or exponentially weighted window transformations.

Parameters
  • transforms (Union[dict, Sequence[dict]]) –

    A dictionary or a list of dictionaries. Each dictionary specifies a different window transform.

    The dictionaries can contain the following keys:

    "function"

    Mandatory. The name of one of the pandas builtin transformation functions, or a callable function that can be applied to the input series. Pandas’ functions can be found in the documentation.

    "mode"

    Optional. The name of the pandas windowing mode on which the "function" is going to be applied. The options are “rolling”, “expanding” and “ewm”. If not provided, Darts defaults to “expanding”. User defined functions can use either “rolling” or “expanding” modes. More information on pandas windowing operations can be found in the documentation.

    "components"

    Optional. A string or list of strings specifying the TimeSeries components on which the transformation should be applied. If not specified, the transformation will be applied on all components.

    "function_name"

    Optional. A string specifying the function name referenced as part of the transformation output name. For example, given a user-provided function transformation on rolling window size of 5 on the component “comp”, the default transformation output name is “rolling_udf_5_comp” whereby “udf” refers to “user defined function”. If specified, the "function_name" will replace the default name “udf”. Similarly, the "function_name" will replace the name of the pandas builtin transformation function name in the output name.

    All other dictionary items provided will be treated as keyword arguments for the windowing mode (i.e., rolling/ewm/expanding) or for the specific function in that mode (i.e., pandas.DataFrame.rolling.mean/std/max/min... or pandas.DataFrame.ewm.mean/std/sum). This allows for more flexibility in configuring the transformation, by providing for example:

    • "window"

      Size of the moving window for the “rolling” mode. If an integer, the fixed number of observations used for each window. If an offset, the time period of each window with data type pandas.Timedelta representing a fixed duration.

    • "min_periods"

      The minimum number of observations in the window required to have a value (otherwise NaN). Darts reuses pandas defaults of 1 for “rolling” and “expanding” modes and of 0 for “ewm” mode.

    • "win_type"

      The type of weighting to apply to the window elements. If provided, it should be one of scipy.signal.windows.

    • "center"

      True/False to set the observation at the current timestep at the center of the window (when forecasting_safe is True, Darts enforces "center" to False).

    • "closed"

      "right"/"left"/"both"/"neither" to specify whether the right, left or both ends of the window are included in the window, or neither of them. Darts defaults to pandas default of "right".

    More information on the available functions and their parameters can be found in the Pandas documentation.

    For user-provided functions, extra keyword arguments in the transformation dictionary are passed to the user-defined function. By default, Darts expects user-defined functions to receive numpy arrays as input. This can be modified by adding item "raw": False in the transformation dictionary. It is expected that the function returns a single value for each window. Other possible configurations can be found in the pandas.DataFrame.rolling().apply() documentation and pandas.DataFrame.expanding().apply() documentation.

  • treat_na (Union[str, int, float, None]) –

    Specifies how to treat missing values that were added by the window transformations at the beginning of the resulting TimeSeries. By default, Darts will leave NaNs in the resulting TimeSeries. This parameter can be one of the following:

    • "dropna"

      to truncate the TimeSeries and drop rows containing missing values. If multiple columns contain different numbers of missing values, only the minimum number of rows is dropped. This operation might reduce the length of the resulting TimeSeries.

    • "bfill" or "backfill"

      to specify that NaNs should be filled with the next valid transformed observation. If the original TimeSeries starts with NaNs, those are kept. When forecasting_safe is True, this option raises an exception to avoid future observations contaminating the past.

    • an integer or float

      in which case NaNs will be filled with this value. All columns will be filled with the same provided value.

  • forecasting_safe (Optional[bool, None]) – If True, Darts enforces that the resulting TimeSeries is safe to be used in forecasting models as target or as feature. The window transformation will not allow future values to be included in the computations at their corresponding current timestep. Default is True. “ewm” and “expanding” modes are forecasting safe by default. “rolling” mode is forecasting safe if "center": False is guaranteed.

  • keep_non_transformed (Optional[bool, None]) – False to return the transformed components only, True to return all original components along with the transformed ones. Default is False. If the series has a hierarchy, must be set to False.

  • include_current (Optional[bool, None]) – True to include the current time step in the window, False to exclude it. Default is True.

  • keep_names (Optional[bool, None]) – Whether the transformed components should keep the original component names. Must be set to False if keep_non_transformed = True or if the number of transformations is greater than 1.

Returns

A new series with the transformed components. If keep_non_transformed is True, the series will contain the original non-transformed components along with the transformed ones. If the input series is stochastic, all samples are identically transformed. The naming convention for the transformed components is as follows: [window_mode]_[function_name]_[window_size if provided]_[min_periods if not default]_[original_comp_name], e.g., rolling_sum_3_comp_0 (i.e., window_mode=rolling, function_name=sum, window_size=3, original_comp_name=comp_0); ewm_mean_comp_1 (i.e., window_mode=ewm, function_name=mean, original_comp_name=comp_1); expanding_sum_3_comp_2 (i.e., window_mode=expanding, function_name=sum, window_size=3, original_comp_name=comp_2). For user-defined functions, function_name=udf.

Return type

TimeSeries

with_columns_renamed(col_names, col_names_new)[source]

Return a new series with new column/component names.

It also adapts the names in the hierarchy, if any.

Parameters
  • col_names (Union[list[str], str]) – String or list of strings corresponding to the column names to be changed.

  • col_names_new (Union[list[str], str]) – String or list of strings corresponding to the new column names. Must be the same length as col_names.

Returns

A new series with renamed columns.

Return type

TimeSeries

with_hierarchy(hierarchy)[source]

Return a new series with added hierarchy.

Parameters

hierarchy (dict[str, Union[str, list[str]]]) –

A dictionary mapping components to a list of their parent(s) in the hierarchy. Single parents may be specified as a string or as a list containing one string. For example, assuming the series contains the components ["total", "a", "b", "x", "y", "ax", "ay", "bx", "by"], the following dictionary would encode the groupings shown in this figure:

hierarchy = {'ax': ['a', 'x'],
             'ay': ['a', 'y'],
             'bx': ['b', 'x'],
             'by': ['b', 'y'],
             'a': ['total'],
             'b': ['total'],
             'x': 'total',  # or use a single string
             'y': 'total'}

Returns

A new series with the given hierarchy.

Return type

TimeSeries

with_metadata(metadata)[source]

Return a new series with added metadata.

Parameters

metadata (Optional[dict, None]) – A dictionary with metadata to be added to the TimeSeries.

Returns

A new series with the given metadata.

Return type

TimeSeries

Examples

>>> from darts.utils.timeseries_generation import linear_timeseries
>>> series = linear_timeseries(length=3)
>>> # add metadata
>>> metadata = {'name': 'my_series'}
>>> series = series.with_metadata(metadata)
>>> series.metadata
{'name': 'my_series'}

with_static_covariates(covariates)[source]

Return a new series with added static covariates.

Static covariates hold information / data about the time series which does not vary over time.

Parameters

covariates (Union[Series, DataFrame, None]) – Optionally, a set of static covariates to be added to the TimeSeries. Either a pandas Series, a pandas DataFrame, or None. If None, will set the static covariates to None. If a Series, the index represents the static variables. The covariates are then globally ‘applied’ to all components of the TimeSeries. If a DataFrame, the columns represent the static variables and the rows represent the components of the uni/multivariate TimeSeries. If a single-row DataFrame, the covariates are globally ‘applied’ to all components of the TimeSeries. If a multi-row DataFrame, the number of rows must match the number of components of the TimeSeries. This adds component-specific static covariates.

Returns

A new series with the given static covariates.

Return type

TimeSeries

Notes

If there is a large number of static covariate variables (i.e., the static covariates have a very large dimension), there might be a noticeable performance penalty for creating TimeSeries, unless the covariates already have the same dtype as the series data.

Examples

>>> import pandas as pd
>>> from darts.utils.timeseries_generation import linear_timeseries
>>> # add global static covariates
>>> static_covs = pd.Series([0., 1.], index=["static_cov_1", "static_cov_2"])
>>> series = linear_timeseries(length=3)
>>> series_new1 = series.with_static_covariates(static_covs)
>>> series_new1.static_covariates
                   static_cov_1  static_cov_2
component
linear              0.0           1.0
>>> # add component specific static covariates
>>> static_covs_multi = pd.DataFrame([[0., 1.], [2., 3.]], columns=["static_cov_1", "static_cov_2"])
>>> series_multi = series.stack(series)
>>> series_new2 = series_multi.with_static_covariates(static_covs_multi)
>>> series_new2.static_covariates
                   static_cov_1  static_cov_2
component
linear              0.0           1.0
linear_1            2.0           3.0

with_times_and_values(times, values, fill_missing_dates=False, freq=None, fillna_value=None)[source]

Return a new series similar to this one but with new times and values.

Parameters
  • times (Union[DatetimeIndex, RangeIndex, Index]) – A pandas DatetimeIndex, RangeIndex (or Index that can be converted to a RangeIndex) representing the new time axis for the time series. The index should ideally have no holes; alternatively, setting fill_missing_dates can in some cases solve these issues (filling holes with NaN, or with the provided fillna_value numeric value, if any).

  • values (ndarray) – A Numpy array with new values. It must have the dimensions for times and components, but may contain a different number of samples.

  • fill_missing_dates (Optional[bool, None]) – Optionally, a boolean value indicating whether to fill missing dates (or indices in case of integer index) with NaN values. This requires either a provided freq or the possibility to infer the frequency from the provided timestamps. See _fill_missing_dates() for more info.

  • freq (Union[str, int, None]) –

    Optionally, a string or integer representing the frequency of the underlying index. This is useful in order to fill in missing values if some dates are missing and fill_missing_dates is set to True. If a string, represents the frequency of the pandas DatetimeIndex (see offset aliases for more info on supported frequencies). If an integer, represents the step size of the pandas Index or pandas RangeIndex.

  • fillna_value (Optional[float, None]) – Optionally, a numeric value to fill missing values (NaNs) with.

Returns

A new series with the new time index and values but identical static covariates and hierarchy.

Return type

TimeSeries

with_values(values)[source]

Return a new series similar to this one but with new values.

Parameters

values (ndarray) – A Numpy array with new values. It must have the dimensions for time and components, but may contain a different number of samples.

Returns

A new series with the new values but the same index, static covariates, and hierarchy.

Return type

TimeSeries

darts.timeseries.concatenate(series, axis=0, ignore_time_axis=False, ignore_static_covariates=False, drop_hierarchy=True, drop_metadata=False)[source]

Concatenate multiple series along a given axis.

axis can be an integer in (0, 1, 2) to denote (time, component, sample) or, alternatively, a string denoting the corresponding dimension of the underlying DataArray.

Parameters
  • series (Sequence[TimeSeries]) – Sequence of TimeSeries to concatenate.

  • axis (Union[str, int]) – Axis along which the series will be concatenated.

  • ignore_time_axis (bool) – Allow concatenation even when some series do not have matching time axes. When done along component or sample dimensions, concatenation will work as long as the series have the same lengths (in this case the resulting series will have the time axis of the first provided series). When done along time dimension, concatenation will work even if the time axes are not contiguous (in this case, the resulting series will have a start time matching the start time of the first provided series). Default: False.

  • ignore_static_covariates (bool) – Whether to ignore all requirements for static covariate concatenation and only transfer the static covariates of the first TimeSeries element in series to the concatenated TimeSeries. Only effective when axis=1.

  • drop_hierarchy (bool) – When axis=1, whether to drop hierarchy information. True by default. When False, the hierarchies will be “concatenated” as well (by merging the hierarchy dictionaries), which may cause issues if the component names of the resulting series and that of the merged hierarchy do not match. When axis=0 or axis=2, the hierarchy of the first series is always kept.

  • drop_metadata (bool) – Whether to drop the metadata information of the concatenated series. False by default. When False, the concatenated series will inherit the metadata from the first TimeSeries element in series.

Returns

The concatenated series.

Return type

TimeSeries

darts.timeseries.slice_intersect(series)[source]

Return a list of series, where all series have been intersected along the time index.

Parameters

series (Sequence[TimeSeries]) – Sequence of TimeSeries to intersect.

Returns

The intersected series.

Return type

Sequence[TimeSeries]