Datasets
A few popular time series datasets
- class darts.datasets.AirPassengersDataset
Bases: DatasetLoaderCSV
Monthly Air Passengers Dataset, from 1949 to 1960.
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
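A minimal usage sketch, assuming Darts is installed and the file can be downloaded on first use. The helper name `load_dataset` is hypothetical and not part of Darts. The Darts import is deferred into a function so that nothing is imported or downloaded until that function is called.

```python
def load_dataset(loader_cls, **loader_kwargs):
    """Instantiate a dataset loader class and return the result of load().

    Hypothetical convenience helper (not part of Darts). `loader_cls` is any
    class from darts.datasets; `loader_kwargs` are passed to its constructor.
    """
    return loader_cls(**loader_kwargs).load()


def load_air_passengers():
    """Return the monthly Air Passengers data as a TimeSeries."""
    # Deferred import: Darts is only required when this function is called.
    from darts.datasets import AirPassengersDataset

    return load_dataset(AirPassengersDataset)
```

Calling `load_air_passengers()` is equivalent to `AirPassengersDataset().load()`.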
- class darts.datasets.AusBeerDataset
Bases: DatasetLoaderCSV
Total quarterly beer production in Australia (in megalitres) from 1956:Q1 to 2008:Q3 [1].
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
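Because load() may raise DatasetLoadingException when a download or checksum fails, a caller might want to retry. The sketch below is one way to do that. `load_with_retry` is a hypothetical helper, not part of Darts, and the import path `darts.datasets.dataset_loaders` for the exception class is an assumption to verify against the installed version. The exception is caught by its own class rather than by `Exception`, so the sketch does not depend on which base class it derives from.

```python
import time


def load_with_retry(load_fn, error_type, attempts=3, delay_seconds=0.0):
    """Call load_fn() up to `attempts` times, retrying when error_type is raised.

    Hypothetical helper (not part of Darts). The last error is re-raised if
    every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return load_fn()
        except error_type:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


def load_aus_beer_with_retry():
    """Load the AusBeer data, retrying on DatasetLoadingException."""
    # Deferred imports: assumes Darts is installed and that the exception
    # class is importable from this module (an assumption of this sketch).
    from darts.datasets import AusBeerDataset
    from darts.datasets.dataset_loaders import DatasetLoadingException

    return load_with_retry(AusBeerDataset().load, DatasetLoadingException)
```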
- class darts.datasets.AustralianTourismDataset
Bases: DatasetLoaderCSV
A single multivariate TimeSeries, containing monthly tourism numbers over 36 months in Australia. The numbers are broken down per region (“NSW”, “VIC”, “QLD”, “SA”, “WA”, “TAS”, “NT”), reason (“Hol”, “VFR”, “Bus”, “Oth”), (region, reason) pairs, and (region, reason, <city>) tuples, where <city> can be either “city” or “noncity”.
This is an augmented version of the Australian tourism dataset available in [1], where we pre-computed the groupings per region (not available in the original dataset).
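To make the grouping levels described above concrete, the sketch below enumerates them and counts the members of each level. The tuples are illustrative keys only; they are an assumption of this sketch and not necessarily the component names used in the loaded TimeSeries.

```python
from itertools import product

REGIONS = ["NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT"]
REASONS = ["Hol", "VFR", "Bus", "Oth"]
CITY_FLAGS = ["city", "noncity"]


def enumerate_groupings():
    """Return the four grouping levels named in the description, as tuples."""
    return {
        "region": [(region,) for region in REGIONS],
        "reason": [(reason,) for reason in REASONS],
        "region_reason": list(product(REGIONS, REASONS)),
        "region_reason_city": list(product(REGIONS, REASONS, CITY_FLAGS)),
    }


# Number of groups at each level: 7, 4, 7 * 4 and 7 * 4 * 2.
GROUP_SIZES = {level: len(keys) for level, keys in enumerate_groupings().items()}
```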
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.ETTh1Dataset
Bases: DatasetLoaderCSV
Data from one electricity transformer at one station, including load and oil temperature. The dataset ranges from 2016/07 to 2018/07 and is recorded hourly. Source: [1] [2]
Field Descriptions:
date: The recorded date
HUFL: High UseFul Load
HULL: High UseLess Load
MUFL: Medium UseFul Load
MULL: Medium UseLess Load
LUFL: Low UseFul Load
LULL: Low UseLess Load
OT: Oil Temperature (Target)
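Since OT is marked as the target, a common first step is to separate it from the six load fields. The sketch below assumes that the component names of the loaded series match the field codes listed above and that the series supports selection by component name, as a Darts TimeSeries does. The helper names are hypothetical.

```python
ETT_TARGET = "OT"
ETT_LOAD_FIELDS = ["HUFL", "HULL", "MUFL", "MULL", "LUFL", "LULL"]


def split_target_and_loads(series):
    """Split an ETT series into (target, load covariates).

    Assumes `series[name]` selects one component and `series[list_of_names]`
    selects several, and that component names equal the field codes.
    """
    return series[ETT_TARGET], series[ETT_LOAD_FIELDS]


def load_etth1_split():
    """Load ETTh1 and return (oil temperature, load covariates)."""
    # Deferred import: Darts is only required when this function is called.
    from darts.datasets import ETTh1Dataset

    return split_target_and_loads(ETTh1Dataset().load())
```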
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.ETTh2Dataset
Bases: DatasetLoaderCSV
Data from one electricity transformer at one station, including load and oil temperature. The dataset ranges from 2016/07 to 2018/07 and is recorded hourly. Source: [1] [2]
Field Descriptions:
date: The recorded date
HUFL: High UseFul Load
HULL: High UseLess Load
MUFL: Medium UseFul Load
MULL: Medium UseLess Load
LUFL: Low UseFul Load
LULL: Low UseLess Load
OT: Oil Temperature (Target)
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.ETTm1Dataset
Bases: DatasetLoaderCSV
Data from one electricity transformer at one station, including load and oil temperature. The dataset ranges from 2016/07 to 2018/07 and is recorded every 15 minutes. Source: [1] [2]
Field Descriptions:
date: The recorded date
HUFL: High UseFul Load
HULL: High UseLess Load
MUFL: Medium UseFul Load
MULL: Medium UseLess Load
LUFL: Low UseFul Load
LULL: Low UseLess Load
OT: Oil Temperature (Target)
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.ETTm2Dataset
Bases: DatasetLoaderCSV
Data from one electricity transformer at one station, including load and oil temperature. The dataset ranges from 2016/07 to 2018/07 and is recorded every 15 minutes. Source: [1] [2]
Field Descriptions:
date: The recorded date
HUFL: High UseFul Load
HULL: High UseLess Load
MUFL: Medium UseFul Load
MULL: Medium UseLess Load
LUFL: Low UseFul Load
LULL: Low UseLess Load
OT: Oil Temperature (Target)
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.ElectricityConsumptionZurichDataset
Bases: DatasetLoaderCSV
Electricity Consumption of households & SMEs (low voltage) and businesses & services (medium voltage) in the city of Zurich [1], with values recorded every 15 minutes.
The electricity consumption is combined with weather measurements recorded by three different stations in the city of Zurich at an hourly frequency [2]. The missing time stamps are filled with NaN. Before the weather features are added to the electricity consumption, the hourly data is resampled to a 15-minute frequency, and missing values are interpolated.
To simplify the dataset, the measurements from the Zch_Schimmelstrasse and Zch_Rosengartenstrasse weather stations are discarded to keep only the data recorded in the Zch_Stampfenbachstrasse station.
Both dataset sources are updated continuously, but this dataset only retains values between 2015-01-01 and 2022-08-31. The time index was converted from the CET time zone to UTC.
Components Descriptions:
Value_NE5 : Households & SMEs electricity consumption (low voltage, grid level 7) in kWh
Value_NE7 : Business and services electricity consumption (medium voltage, grid level 5) in kWh
Hr [%Hr] : Relative humidity
RainDur [min] : Duration of precipitation (divided by 4 for conversion from hourly to quarter-hourly records)
T [°C] : Temperature
WD [°] : Wind direction
WVv [m/s] : Wind vector speed
p [hPa] : Air pressure
WVs [m/s] : Wind scalar speed
StrGlo [W/m2] : Global solar irradiation
Note: before 2018, the scalar speeds were calculated from the 30 minutes vector data.
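The two hourly-to-quarter-hourly conversions mentioned above (interpolating measurements, and dividing a duration by 4) can be illustrated on plain lists. This is a sketch only, not the preprocessing code used to build the dataset. Linear interpolation is an assumption, since the description only says that missing values are interpolated, and both function names are hypothetical.

```python
def upsample_hourly_values(hourly_values):
    """Linearly interpolate hourly readings onto a 15-minute grid.

    Linear interpolation is an assumption of this sketch.
    """
    if not hourly_values:
        return []
    quarter_hourly = []
    for left, right in zip(hourly_values, hourly_values[1:]):
        for step in range(4):
            quarter_hourly.append(left + (right - left) * step / 4)
    # Keep the final hourly reading as the last point of the grid.
    quarter_hourly.append(hourly_values[-1])
    return quarter_hourly


def split_hourly_duration(hourly_minutes):
    """Spread each hourly duration evenly over its four quarter-hour slots."""
    return [minutes / 4 for minutes in hourly_minutes for _ in range(4)]
```

For example, an hour with 20 minutes of rain becomes four quarter-hour records of 5 minutes each, so the hourly total is preserved.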
References
[1] https://data.stadt-zuerich.ch/dataset/ewz_stromabgabe_netzebenen_stadt_zuerich
[2] https://data.stadt-zuerich.ch/dataset/ugz_meteodaten_stundenmittelwerte
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.ElectricityDataset(multivariate=True)
Bases: DatasetLoaderCSV
Measurements of per-household electric power consumption with a 15-minute sampling rate. The consumption of 370 clients is recorded in kW. Source: [1]
Loading this dataset provides a multivariate timeseries with 370 columns, one per household. To obtain a list of univariate timeseries instead, one for each household, pass multivariate=False.
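As a minimal sketch of converting an already loaded multivariate series into a list of univariate series, the helper below selects each component by name. It assumes the series exposes `components` and supports `series[name]`, as a Darts TimeSeries does. The helper names are hypothetical, and the sketch does no filtering or cleaning of the individual series.

```python
def split_into_univariate(series):
    """Return one univariate series per component of a multivariate series.

    Assumes `series.components` lists the component names and `series[name]`
    selects a single component.
    """
    return [series[name] for name in series.components]


def load_electricity_per_household():
    """Load the multivariate Electricity data and split it by household."""
    # Deferred import: Darts is only required when this function is called.
    from darts.datasets import ElectricityDataset

    return split_into_univariate(ElectricityDataset().load())
```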
References
- Parameters
multivariate (bool) – Whether to return a single multivariate timeseries. If False, returns a list of univariate TimeSeries. Default is True.
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.EnergyDataset
Bases: DatasetLoaderCSV
Hourly energy dataset coming from [1].
Contains a time series with 28 hourly components between 2014-12-31 23:00:00 and 2018-12-31 22:00:00
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.ExchangeRateDataset(multivariate=True)
Bases: DatasetLoaderCSV
A collection of the daily exchange rates of eight countries (Australia, Britain, Canada, Switzerland, China, Japan, New Zealand, and Singapore), ranging from 1990 to 2016. Because of some inconsistencies in the dates, the resulting TimeSeries is integer-indexed. Source: [1]
References
- Parameters
multivariate (bool) – Whether to return a single multivariate timeseries. If False, returns a list of univariate TimeSeries. Default is True.
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.GasRateCO2Dataset
Bases: DatasetLoaderCSV
Gas Rate CO2 dataset. Two components, length 296 (integer time index).
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.HeartRateDataset
Bases: DatasetLoaderCSV
The series contains 1800 evenly-spaced measurements of instantaneous heart rate from a single subject. The measurements (in units of beats per minute) occur at 0.5 second intervals, so that the length of each series is exactly 15 minutes.
This is the series1 in [1]. It uses an integer time index.
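Because the series uses an integer time index, elapsed time has to be derived from the 0.5 second sampling interval. The small sketch below shows that arithmetic. It assumes the index starts at 0, which is not stated in the description, and the names are hypothetical.

```python
SAMPLE_INTERVAL_SECONDS = 0.5
NUM_SAMPLES = 1800


def elapsed_seconds(index):
    """Map an integer time index to seconds since the first sample.

    Assumes the first sample has index 0.
    """
    return index * SAMPLE_INTERVAL_SECONDS


# 1800 samples at 0.5 second spacing span 900 seconds, i.e. 15 minutes.
TOTAL_MINUTES = NUM_SAMPLES * SAMPLE_INTERVAL_SECONDS / 60
```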
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.ILINetDataset(multivariate=True)
Bases: DatasetLoaderCSV
ILI describes the number of patients seen with influenza-like illness and the total number of patients. It includes weekly data from the Centers for Disease Control and Prevention of the United States from 1997 to 2022. Source: [1] [2] [3] [4]
Components Descriptions:
% WEIGHTED ILI: Combined state-specific data of patient visits to healthcare providers for ILI reported each week, weighted by state population
% UNWEIGHTED ILI: Combined state-specific data of patient visits to healthcare providers for ILI reported each week, unweighted by state population
AGE 0-4: Number of patients between 0 and 4 years of age
AGE 25-49: Number of patients between 25 and 49 years of age
AGE 25-64: Number of patients between 25 and 64 years of age
AGE 5-24: Number of patients between 5 and 24 years of age
AGE 50-64: Number of patients between 50 and 64 years of age
AGE 65: Number of patients 65 years of age and above (>=65)
ILITOTAL: Total number of ILI patients. For this system, ILI is defined as fever (temperature of 100°F [37.8°C] or greater) and a cough and/or a sore throat
NUM. OF PROVIDERS: Number of outpatient healthcare providers
TOTAL PATIENTS: Total number of patients
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.IceCreamHeaterDataset
Bases: DatasetLoaderCSV
Monthly sales of heaters and ice cream between January 2004 and June 2020.
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.MonthlyMilkDataset
Bases: DatasetLoaderCSV
Monthly production of milk (in pounds per cow) between January 1962 and December 1975
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.MonthlyMilkIncompleteDataset
Bases: DatasetLoaderCSV
Monthly production of milk (in pounds per cow) between January 1962 and December 1975. Has some missing values.
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.SunspotsDataset
Bases: DatasetLoaderCSV
Monthly Sunspot Numbers, 1749 - 1983
Monthly mean relative sunspot numbers from 1749 to 1983. Collected at Swiss Federal Observatory, Zurich until 1960, then Tokyo Astronomical Observatory.
Source: [1]
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.TaxiNewYorkDataset
Bases: DatasetLoaderCSV
Taxi passengers in New York, from 2014-07 to 2015-01. The data consists of the total number of taxi passengers aggregated into 30-minute buckets. Univariate series. Source: [1]
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.TaylorDataset
Bases: DatasetLoaderCSV
Half-hourly electricity demand in England and Wales from Monday 5 June 2000 to Sunday 27 August 2000. Discussed in Taylor (2003) [1], and kindly provided by James W Taylor [2]. Units: Megawatts (Uses an integer time index).
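Since the series uses an integer time index, the sketch below shows how an index could be mapped back to a timestamp, and how many half-hourly slots the stated date range covers. It assumes the first observation falls at 00:00 on 5 June 2000 and has index 0; neither is stated in the description, and the names are hypothetical.

```python
from datetime import datetime, timedelta

START = datetime(2000, 6, 5)      # Monday 5 June 2000, assumed 00:00
LAST_DAY = datetime(2000, 8, 27)  # Sunday 27 August 2000
STEP = timedelta(minutes=30)


def index_to_timestamp(index):
    """Map an integer time index to a timestamp, assuming index 0 is START."""
    return START + index * STEP


# Both end days are included, and each day has 48 half-hourly slots.
NUM_DAYS = (LAST_DAY - START).days + 1
NUM_HALF_HOURS = NUM_DAYS * 48
```

Under these assumptions the range covers 84 days, that is 12 full weeks, and 4032 half-hourly slots.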
References
[1] Taylor, J.W. (2003) Short-term electricity demand forecasting using double seasonal exponential smoothing. Journal of the Operational Research Society, 54, 799-805.
[2] https://www.rdocumentation.org/packages/forecast/versions/8.13/topics/taylor
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.TemperatureDataset
Bases: DatasetLoaderCSV
Daily temperature in Melbourne between 1981 and 1990
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.TrafficDataset(multivariate=True)
Bases: DatasetLoaderCSV
A collection of 48 months (2015-2016) of hourly data from the California Department of Transportation. The data describes the road occupancy rates (between 0 and 1) measured by 862 different sensors on San Francisco Bay Area freeways. The raw data is available at http://pems.dot.ca.gov. Source: [1]
References
- Parameters
multivariate (bool) – Whether to return a single multivariate timeseries. If False, returns a list of univariate TimeSeries. Default is True.
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.USGasolineDataset
Bases: DatasetLoaderCSV
Weekly U.S. Product Supplied of Finished Motor Gasoline between 1991-02-08 and 2021-04-30
Obtained from [1].
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.UberTLCDataset(sample_freq='hourly', multivariate=True)
Bases: DatasetLoaderCSV
14.3 million Uber pickups from January to June 2015. The data is resampled to an hourly or daily frequency, depending on sample_freq, using the locationID as the target. Source: [1]
Loading this dataset provides a multivariate timeseries with 262 columns, one per locationID. To obtain a list of univariate timeseries instead, one for each locationID, pass multivariate=False.
References
- Parameters
sample_freq (str) – The sampling frequency of the data. Can be “hourly” or “daily”. Default is “hourly”.
multivariate (bool) – Whether to return a single multivariate timeseries. If False, returns a list of univariate TimeSeries. Default is True.
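A short sketch of how the two constructor parameters might be combined, assuming Darts is installed and the download succeeds. The helper names and the up-front validation are hypothetical additions of this sketch, not Darts behaviour.

```python
VALID_SAMPLE_FREQS = ("hourly", "daily")


def uber_loader_kwargs(sample_freq="hourly", multivariate=True):
    """Validate and bundle the constructor arguments documented above."""
    if sample_freq not in VALID_SAMPLE_FREQS:
        raise ValueError(
            f"sample_freq must be one of {VALID_SAMPLE_FREQS}, got {sample_freq!r}"
        )
    return {"sample_freq": sample_freq, "multivariate": bool(multivariate)}


def load_uber(sample_freq="hourly", multivariate=True):
    """Load the Uber TLC data with the given sampling frequency and layout."""
    # Deferred import: Darts is only required when this function is called.
    from darts.datasets import UberTLCDataset

    return UberTLCDataset(**uber_loader_kwargs(sample_freq, multivariate)).load()
```

For example, `load_uber("daily", multivariate=False)` would request daily data as a list of univariate series, one per locationID.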
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.WeatherDataset(multivariate=True)
Bases: DatasetLoaderCSV
The dataset includes 21 weather indicators, such as air temperature and humidity. The data was recorded every 10 minutes throughout 2020 in Germany. Source: [1] [2]
References
- Parameters
multivariate (bool) – Whether to return a single multivariate timeseries. If False, returns a list of univariate TimeSeries. Default is True.
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.WineDataset
Bases: DatasetLoaderCSV
Australian total wine sales by wine makers in bottles <= 1 litre. Monthly between Jan 1980 and Aug 1994. Source: [1]
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries
- class darts.datasets.WoolyDataset
Bases: DatasetLoaderCSV
Quarterly production of woollen yarn in Australia: tonnes. Mar 1965 – Sep 1994. Source: [1]
References
Methods
load(): Load the dataset in memory, as a TimeSeries.
- load()
Load the dataset in memory, as a TimeSeries. Downloads the dataset if it is not present already
- Raises
DatasetLoadingException – If loading fails (MD5 Checksum is invalid, Download failed, Reading from disk failed)
- Returns
time_series – A TimeSeries object that contains the dataset
- Return type
TimeSeries