Datasets

A few popular time series datasets

class darts.datasets.AirPassengersDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Monthly Air Passengers Dataset, from 1949 to 1960.

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries
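The checksum failure listed under Raises can be pictured with a short sketch. It uses only the Python standard library and is not the darts implementation; `file_md5`, `verify_md5`, and `ChecksumError` are hypothetical names chosen for illustration.

```python
import hashlib
from pathlib import Path


class ChecksumError(Exception):
    """Hypothetical stand-in for a loading failure caused by a bad checksum."""


def file_md5(path: Path, chunk_size: int = 8192) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path: Path, expected: str) -> None:
    """Raise ChecksumError if the file's MD5 digest differs from `expected`."""
    actual = file_md5(path)
    if actual != expected.lower():
        raise ChecksumError(f"expected {expected}, got {actual} for {path}")
```

In practice a caller would more likely wrap `load()` in a `try`/`except` for `DatasetLoadingException` than verify checksums by hand; the sketch only illustrates what such a check involves.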

class darts.datasets.AusBeerDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Total quarterly beer production in Australia (in megalitres) from 1956:Q1 to 2008:Q3 [1].

References

1

https://rdrr.io/cran/fpp/man/ausbeer.html

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.AustralianTourismDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

A single multivariate TimeSeries, containing monthly tourism numbers over 36 months in Australia. The numbers are broken down per region (“NSW”, “VIC”, “QLD”, “SA”, “WA”, “TAS”, “NT”), reason (“Hol”, “VFR”, “Bus”, “Oth”), (region, reason) pairs, and (region, reason, <city>) tuples, where <city> can be either “city” or “noncity”.

This is an augmented version of the Australian tourism dataset available in [1], where we pre-computed the groupings per region (not available in the original dataset).
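The grouping levels described above can be enumerated to see how many series each level contributes. This is a minimal sketch built only from the labels listed in the description; the actual component names in the loaded TimeSeries are not specified here and may differ, and the variable names are illustrative.

```python
from itertools import product

REGIONS = ["NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT"]
REASONS = ["Hol", "VFR", "Bus", "Oth"]
CITY_TYPES = ["city", "noncity"]

# Finer grouping levels are Cartesian products of the coarser labels.
region_reason_pairs = list(product(REGIONS, REASONS))
region_reason_city = list(product(REGIONS, REASONS, CITY_TYPES))

# Number of series contributed by each grouping level.
level_sizes = {
    "region": len(REGIONS),
    "reason": len(REASONS),
    "(region, reason)": len(region_reason_pairs),
    "(region, reason, city)": len(region_reason_city),
}

for level, size in level_sizes.items():
    print(f"{level}: {size} series")
```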

References

1

https://robjhyndman.com/publications/hierarchical-tourism/

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.ETTh1Dataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Data from one electricity transformer at one station, including load and oil temperature. The dataset ranges from 2016/07 to 2018/07, sampled hourly. Source: [1], [2]

Field Descriptions:
  • date: The recorded date
  • HUFL: High UseFul Load
  • HULL: High UseLess Load
  • MUFL: Medium UseFul Load
  • MULL: Medium UseLess Load
  • LUFL: Low UseFul Load
  • LULL: Low UseLess Load
  • OT: Oil Temperature (Target)

References

1

https://github.com/zhouhaoyi/ETDataset

2

https://arxiv.org/abs/2012.07436

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.ETTh2Dataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Data from one electricity transformer at one station, including load and oil temperature. The dataset ranges from 2016/07 to 2018/07, sampled hourly. Source: [1], [2]

Field Descriptions:
  • date: The recorded date
  • HUFL: High UseFul Load
  • HULL: High UseLess Load
  • MUFL: Medium UseFul Load
  • MULL: Medium UseLess Load
  • LUFL: Low UseFul Load
  • LULL: Low UseLess Load
  • OT: Oil Temperature (Target)

References

1

https://github.com/zhouhaoyi/ETDataset

2

https://arxiv.org/abs/2012.07436

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.ETTm1Dataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Data from one electricity transformer at one station, including load and oil temperature. The dataset ranges from 2016/07 to 2018/07, sampled every 15 minutes. Source: [1], [2]

Field Descriptions:
  • date: The recorded date
  • HUFL: High UseFul Load
  • HULL: High UseLess Load
  • MUFL: Medium UseFul Load
  • MULL: Medium UseLess Load
  • LUFL: Low UseFul Load
  • LULL: Low UseLess Load
  • OT: Oil Temperature (Target)

References

1

https://github.com/zhouhaoyi/ETDataset

2

https://arxiv.org/abs/2012.07436

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.ETTm2Dataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Data from one electricity transformer at one station, including load and oil temperature. The dataset ranges from 2016/07 to 2018/07, sampled every 15 minutes. Source: [1], [2]

Field Descriptions:
  • date: The recorded date
  • HUFL: High UseFul Load
  • HULL: High UseLess Load
  • MUFL: Medium UseFul Load
  • MULL: Medium UseLess Load
  • LUFL: Low UseFul Load
  • LULL: Low UseLess Load
  • OT: Oil Temperature (Target)

References

1

https://github.com/zhouhaoyi/ETDataset

2

https://arxiv.org/abs/2012.07436

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.ElectricityDataset(multivariate=True)[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Measurements of electric power consumption for 370 households (clients), sampled every 15 minutes and recorded in kW. Source: [1]

Loading this dataset provides a multivariate TimeSeries with 370 columns, one per household. It can be converted to a list of univariate TimeSeries, one for each household.
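The conversion to a list of univariate series can be sketched as below. The helper name `to_univariate_list` is hypothetical, and the sketch assumes that the loaded TimeSeries exposes its column names through `components` and supports indexing by component name.

```python
from typing import Any, List


def to_univariate_list(series: Any) -> List[Any]:
    """Split a multivariate series into one univariate series per component.

    Assumes `series.components` lists the component names and that
    `series[name]` returns the univariate series for that component.
    """
    return [series[name] for name in series.components]
```

With darts installed, a call such as `to_univariate_list(ElectricityDataset().load())` would then be expected to yield one series per household. Passing `multivariate=False` to the constructor is the built-in way to obtain the same kind of list.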

References

1

https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014

Parameters

multivariate (bool) – Whether to return a single multivariate TimeSeries. If False, a list of univariate TimeSeries is returned instead. Default is True.

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.EnergyDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Hourly energy dataset coming from [1].

Contains a time series with 28 hourly components between 2014-12-31 23:00:00 and 2018-12-31 22:00:00.

References

1

https://www.kaggle.com/nicholasjhana/energy-consumption-generation-prices-and-weather

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.GasRateCO2Dataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Gas Rate CO2 dataset. Two components, length 296 (integer time index).

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.HeartRateDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

The series contains 1800 evenly-spaced measurements of instantaneous heart rate from a single subject. The measurements (in units of beats per minute) occur at 0.5 second intervals, so that the length of each series is exactly 15 minutes.

This is series 1 in [1]. It uses an integer time index.
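Because the series uses an integer time index, elapsed time has to be derived from the sampling interval. The sketch below checks the arithmetic stated above (1800 samples at 0.5 second spacing) and maps an index to elapsed time. It assumes a 0-based index, which is not stated in the description, and `elapsed_time` is a hypothetical helper.

```python
from datetime import timedelta

N_MEASUREMENTS = 1800                     # number of samples in the series
SAMPLE_INTERVAL = timedelta(seconds=0.5)  # spacing between consecutive samples


def elapsed_time(index: int) -> timedelta:
    """Elapsed time since the first sample, assuming a 0-based integer index."""
    return index * SAMPLE_INTERVAL


# Total duration covered by the series: 1800 * 0.5 s = 900 s = 15 minutes.
total_duration = N_MEASUREMENTS * SAMPLE_INTERVAL
print(total_duration)
```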

References

1

http://ecg.mit.edu/time-series/

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.IceCreamHeaterDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Monthly sales of heaters and ice cream between January 2004 and June 2020.

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.MonthlyMilkDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Monthly production of milk (in pounds per cow) between January 1962 and December 1975.

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.MonthlyMilkIncompleteDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Monthly production of milk (in pounds per cow) between January 1962 and December 1975. Has some missing values.

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.SunspotsDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Monthly Sunspot Numbers, 1749 - 1983

Monthly mean relative sunspot numbers from 1749 to 1983. Collected at Swiss Federal Observatory, Zurich until 1960, then Tokyo Astronomical Observatory.

Source: [1]

References

1

https://www.rdocumentation.org/packages/datasets/versions/3.6.1/topics/sunspots

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.TaylorDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Half-hourly electricity demand in England and Wales from Monday 5 June 2000 to Sunday 27 August 2000. Discussed in Taylor (2003) [1], and kindly provided by James W Taylor [2]. Units: Megawatts (Uses an integer time index).

References

1

Taylor, J.W. (2003) Short-term electricity demand forecasting using double seasonal exponential smoothing. Journal of the Operational Research Society, 54, 799-805.

2

https://www.rdocumentation.org/packages/forecast/versions/8.13/topics/taylor

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.TemperatureDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Daily temperature in Melbourne between 1981 and 1990.

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.USGasolineDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Weekly U.S. Product Supplied of Finished Motor Gasoline between 1991-02-08 and 2021-04-30.

Obtained from [1].

References

1

https://www.eia.gov/dnav/pet/hist/LeafHandler.ashx?n=PET&s=wgfupus2&f=W

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.UberTLCDataset(sample_freq='hourly', multivariate=True)[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

14.3 million Uber pickups from January to June 2015. The data is resampled to hourly or daily frequency based on sample_freq, using the locationID as the target. Source: [1]

Loading this dataset provides a multivariate TimeSeries with 262 columns, one per locationID. It can be converted to a list of univariate TimeSeries, one for each locationID.
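One plausible reading of the resampling described above is that individual pickups are counted per time bucket and locationID. The sketch below illustrates that idea with the standard library only. It is not the preprocessing darts performs; the function names, the record layout, and the sample records are all made up for illustration.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, Tuple

Pickup = Tuple[datetime, int]  # (pickup time, locationID)


def bucket_start(ts: datetime, sample_freq: str) -> datetime:
    """Truncate a timestamp to the start of its hour or day."""
    if sample_freq == "hourly":
        return ts.replace(minute=0, second=0, microsecond=0)
    if sample_freq == "daily":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    raise ValueError("sample_freq must be 'hourly' or 'daily'")


def count_pickups(
    pickups: Iterable[Pickup], sample_freq: str = "hourly"
) -> Dict[Tuple[datetime, int], int]:
    """Count pickups per (time bucket, locationID) pair."""
    counts = Counter((bucket_start(ts, sample_freq), loc) for ts, loc in pickups)
    return dict(counts)


# Synthetic records for illustration only, not taken from the dataset.
sample = [
    (datetime(2015, 1, 1, 0, 5), 141),
    (datetime(2015, 1, 1, 0, 40), 141),
    (datetime(2015, 1, 1, 1, 10), 141),
    (datetime(2015, 1, 1, 1, 15), 65),
]
hourly = count_pickups(sample, "hourly")
daily = count_pickups(sample, "daily")
```

Under this reading, each locationID would become one column whose values are the counts per bucket.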

References

1

https://github.com/fivethirtyeight/uber-tlc-foil-response

Parameters
  • sample_freq (str) – The sampling frequency of the data. Can be “hourly” or “daily”. Default is “hourly”.

  • multivariate (bool) – Whether to return a single multivariate TimeSeries. If False, a list of univariate TimeSeries is returned instead. Default is True.

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.WineDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Australian total wine sales by wine makers in bottles <= 1 litre. Monthly between Jan 1980 and Aug 1994. Source: [1]

References

1

https://www.rdocumentation.org/packages/forecast/versions/8.1/topics/wineind

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries

class darts.datasets.WoolyDataset[source]

Bases: darts.datasets.dataset_loaders.DatasetLoaderCSV

Quarterly production of woollen yarn in Australia, in tonnes, from Mar 1965 to Sep 1994. Source: [1]

References

1

https://www.rdocumentation.org/packages/forecast/versions/8.1/topics/woolyrnq

Methods

load()

Load the dataset in memory, as a TimeSeries.

load()

Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not already present.

Raises

DatasetLoadingException – If loading fails (invalid MD5 checksum, failed download, or failed read from disk).

Returns

time_series – A TimeSeries object that contains the dataset

Return type

TimeSeries