Time Series Datasets

Time-series data consists of sequences of data points indexed in time order. It is a fundamental data type in many domains, including finance, weather forecasting, and sensor data analysis. Common tasks include forecasting future values, classifying sequences, and detecting anomalies.

xTorch provides handlers for popular time-series datasets to facilitate research and development in this area. These datasets are located under the xt::datasets namespace and can be found in the <xtorch/datasets/time_series/> header directory.

General Usage

Working with time-series data often involves specific preprocessing steps like creating sliding windows of data, normalization, and feature engineering. These can be applied using xTorch's Transform pipeline.
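As a rough illustration of two of these preprocessing steps, the sketch below normalizes a series to zero mean and unit variance and then cuts it into overlapping sliding windows. It uses only the C++ standard library; the function names `zscore_normalize` and `sliding_windows` are hypothetical helpers written for this example and are not part of xTorch's Transform pipeline.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: z-score normalization (subtract the mean, divide by
// the population standard deviation). A constant series maps to all zeros.
std::vector<double> zscore_normalize(const std::vector<double>& series) {
    assert(!series.empty());
    const double n = static_cast<double>(series.size());
    const double mean = std::accumulate(series.begin(), series.end(), 0.0) / n;

    double sq_sum = 0.0;
    for (double v : series) sq_sum += (v - mean) * (v - mean);
    const double stddev = std::sqrt(sq_sum / n);

    std::vector<double> out;
    out.reserve(series.size());
    for (double v : series) {
        out.push_back(stddev > 0.0 ? (v - mean) / stddev : 0.0);
    }
    return out;
}

// Hypothetical helper: cut a series into windows of `window` points,
// advancing `stride` points between consecutive windows. Trailing points
// that do not fill a whole window are dropped.
std::vector<std::vector<double>> sliding_windows(const std::vector<double>& series,
                                                 std::size_t window,
                                                 std::size_t stride) {
    assert(window > 0 && stride > 0);
    std::vector<std::vector<double>> windows;
    for (std::size_t start = 0; start + window <= series.size(); start += stride) {
        const auto first = series.begin() + static_cast<std::ptrdiff_t>(start);
        windows.emplace_back(first, first + static_cast<std::ptrdiff_t>(window));
    }
    return windows;
}
```

With a five-point series, a window of 3, and a stride of 1, `sliding_windows` yields three windows. In practice the normalization statistics would usually be computed on the training split only and reused for validation and test data.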

The general workflow involves instantiating the dataset, which handles loading the raw sequences, and then passing it to a DataLoader for batching.

#include <xtorch/xtorch.h>
#include <iostream>
 
int main() {
    // 1. Define any necessary transformations (optional)
    // For example, normalizing the time-series values.
    // auto transforms = std::make_unique<xt::transforms::...>();
 
    // 2. Instantiate a dataset for the M4 Forecasting Competition.
    auto dataset = xt::datasets::M4Competition(
        "./data",
        /*download=*/true
        // std::move(transforms)
    );
 
    std::cout << "M4 Competition dataset loaded." << std::endl;
    std::cout << "Number of time series: " << *dataset.size() << std::endl;
 
    // 3. Pass the dataset to a DataLoader
    // Batching time-series data can depend heavily on the model architecture (e.g., RNNs vs Transformers).
    xt::dataloaders::ExtendedDataLoader data_loader(dataset, 32, true);
 
    // The data loader is now ready for use in a training loop
    for (auto& batch : data_loader) {
        auto history = batch.first;   // Input sequence
        auto future = batch.second;  // Sequence to predict
        // ... training step with a forecasting model like an LSTM or Informer ...
    }
}

!!! info "Data Structure"
    The exact structure of the data returned by get(index) can vary depending on the dataset and task. For forecasting, it's often a pair of tensors representing historical context and future values. For classification, it's a sequence and a single class label.
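The two sample shapes described above can be sketched with plain standard-library types. This is only an illustration of the idea: the aliases `ForecastSample` and `ClassificationSample` and both helper functions are hypothetical, and `std::vector<double>` stands in for the tensors that an actual xTorch dataset would return.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Sequence = std::vector<double>;
using ForecastSample = std::pair<Sequence, Sequence>;   // (history, future)
using ClassificationSample = std::pair<Sequence, int>;  // (sequence, class label)

// Hypothetical helper: hold out the last `horizon` points as the values to
// predict and keep everything before them as historical context.
ForecastSample make_forecast_sample(const Sequence& series, std::size_t horizon) {
    assert(horizon > 0 && horizon < series.size());
    const auto split = series.end() - static_cast<std::ptrdiff_t>(horizon);
    return {Sequence(series.begin(), split), Sequence(split, series.end())};
}

// Hypothetical helper: pair a whole sequence with a single class label.
ClassificationSample make_classification_sample(Sequence series, int label) {
    return {std::move(series), label};
}
```

For a five-point series and a horizon of 2, `make_forecast_sample` returns a three-point history and a two-point future, which mirrors the `batch.first` / `batch.second` split used in the training loop above.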

Available Datasets by Task

Time Series Forecasting

The task of predicting future values in a sequence given past values.

| Dataset Class | Description | Header File |
| --- | --- | --- |
| `M4Competition` | The dataset from the 4th Makridakis Forecasting Competition, containing a large and diverse set of time series from different domains. | `time_series_forecasting/m4_competition.h` |
| `ElectricityLoadDiagrams` | The ElectricityLoadDiagrams20112014 Data Set, which contains the electricity consumption of 370 clients. | `time_series_forecasting/electricity_load_diagrams.h` |
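To make the forecasting task concrete, here is a minimal baseline sketch: a naive forecast that repeats the last observed value, scored with the symmetric mean absolute percentage error (sMAPE), one of the accuracy measures used in the M4 competition. Both functions are hypothetical helpers written for this example with the standard library only; they are not xTorch APIs, and skipping points where both values are zero is a convention chosen here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: naive forecast that repeats the last observed value
// for every step of the horizon.
std::vector<double> naive_forecast(const std::vector<double>& history,
                                   std::size_t horizon) {
    assert(!history.empty());
    return std::vector<double>(horizon, history.back());
}

// Hypothetical helper: sMAPE in percent,
//   (100 / h) * sum( 2 * |y - yhat| / (|y| + |yhat|) ).
// Points where both values are zero contribute nothing to the sum.
double smape(const std::vector<double>& actual,
             const std::vector<double>& predicted) {
    assert(!actual.empty() && actual.size() == predicted.size());
    double total = 0.0;
    for (std::size_t i = 0; i < actual.size(); ++i) {
        const double denom = std::abs(actual[i]) + std::abs(predicted[i]);
        if (denom > 0.0) {
            total += 2.0 * std::abs(actual[i] - predicted[i]) / denom;
        }
    }
    return 100.0 * total / static_cast<double>(actual.size());
}
```

A baseline like this is mainly useful as a sanity check: a trained forecasting model should score a lower sMAPE than the naive forecast on the same held-out values.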

Time Series Classification

The task of assigning a categorical label to an entire time-series sequence.

| Dataset Class | Description | Header File |
| --- | --- | --- |
| `UCRTimeSeriesArchive` | A large collection of datasets from the UCR Time Series Classification Archive, widely used for benchmarking classification algorithms. | `time_series_classification/ucr_time_series_archive.h` |
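As an illustration of sequence-level classification, the sketch below implements a one-nearest-neighbor classifier with Euclidean distance, a simple baseline commonly reported on UCR-style benchmarks. It assumes all sequences have equal length; the `LabeledSeries` struct and both functions are hypothetical names for this example and do not reflect how `UCRTimeSeriesArchive` exposes its data.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical container for one training example.
struct LabeledSeries {
    std::vector<double> values;
    int label;
};

// Squared Euclidean distance between two equal-length sequences.
// The square root is skipped because it does not change the ranking.
double squared_euclidean(const std::vector<double>& a,
                         const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Return the label of the training sequence closest to `query`.
// Ties are resolved in favor of the earliest training example.
int classify_1nn(const std::vector<LabeledSeries>& train,
                 const std::vector<double>& query) {
    assert(!train.empty());
    double best = std::numeric_limits<double>::infinity();
    int best_label = train.front().label;
    for (const auto& example : train) {
        const double d = squared_euclidean(example.values, query);
        if (d < best) {
            best = d;
            best_label = example.label;
        }
    }
    return best_label;
}
```

Sequences are usually z-normalized before distances are computed, so that the comparison reflects shape and not offset or scale.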

Anomaly Detection

The task of identifying rare items, events, or observations which raise suspicions by differing significantly from the majority of the data.

| Dataset Class | Description | Header File |
| --- | --- | --- |
| `NAB` | The Numenta Anomaly Benchmark, a benchmark for evaluating algorithms for streaming anomaly detection. | `anomaly_detection/nab.h` |
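One simple way to flag points that differ significantly from the rest of the data is a z-score threshold, sketched below with the standard library only. This is a batch simplification for illustration: `zscore_anomalies` is a hypothetical helper, not an xTorch API, and a streaming benchmark such as NAB would call for statistics that are updated incrementally as new points arrive.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: return the indices of points whose distance from the
// mean exceeds `threshold` population standard deviations.
std::vector<std::size_t> zscore_anomalies(const std::vector<double>& series,
                                          double threshold) {
    assert(!series.empty() && threshold > 0.0);
    const double n = static_cast<double>(series.size());

    double mean = 0.0;
    for (double v : series) mean += v;
    mean /= n;

    double sq_sum = 0.0;
    for (double v : series) sq_sum += (v - mean) * (v - mean);
    const double stddev = std::sqrt(sq_sum / n);

    std::vector<std::size_t> flagged;
    if (stddev == 0.0) return flagged;  // constant series: nothing stands out

    for (std::size_t i = 0; i < series.size(); ++i) {
        if (std::abs(series[i] - mean) > threshold * stddev) {
            flagged.push_back(i);
        }
    }
    return flagged;
}
```

Because a large outlier inflates the standard deviation it is measured against, the threshold usually needs tuning per dataset; robust statistics such as the median are a common alternative.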