# Regularization Techniques
Regularization refers to a collection of techniques designed to prevent a model from overfitting the training data. By adding a penalty for model complexity, regularization helps the model generalize better to unseen data.
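For penalty-based methods such as weight decay, this amounts to minimizing the data loss plus a complexity term. As a standard example (the L2 form), with λ controlling the penalty strength:

$$
L_{\text{total}}(w) = L_{\text{data}}(w) + \lambda \lVert w \rVert_2^2
$$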
## Standard Regularization in LibTorch
The most common forms of regularization are readily available when using LibTorch and are standard practice in deep learning.
- Weight Decay (L2 Regularization): This is the most common technique. It is not a separate module but rather an option built directly into the optimizers; enable it by setting the `weight_decay` parameter in the optimizer's options:

    ```cpp
    // Enable weight decay in the Adam optimizer
    torch::optim::Adam optimizer(
        model.parameters(),
        torch::optim::AdamOptions(1e-3).weight_decay(1e-4) // L2 penalty
    );
    ```

- Dropout: This technique randomly zeroes out activations during training. It is a powerful regularizer implemented as a set of modules (a minimal sketch follows this list). See the dedicated Dropouts page for a comprehensive list of variants.
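As a quick illustration of the Dropout pattern (the layer sizes and dropout probability here are arbitrary), a dropout module is registered like any other layer and applied in `forward`; it only perturbs activations while the model is in training mode:

```cpp
#include <torch/torch.h>

// Minimal MLP that applies dropout between its two linear layers.
struct MlpImpl : torch::nn::Module {
    MlpImpl() {
        fc1  = register_module("fc1", torch::nn::Linear(784, 256));
        drop = register_module("drop", torch::nn::Dropout(torch::nn::DropoutOptions(0.5)));
        fc2  = register_module("fc2", torch::nn::Linear(256, 10));
    }

    torch::Tensor forward(torch::Tensor x) {
        x = torch::relu(fc1->forward(x));
        x = drop->forward(x);  // randomly zeroes activations with p = 0.5 during training; no-op in eval() mode
        return fc2->forward(x);
    }

    torch::nn::Linear fc1{nullptr}, fc2{nullptr};
    torch::nn::Dropout drop{nullptr};
};
TORCH_MODULE(Mlp);
```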
## xTorch Extended Regularization Techniques
Beyond weight decay and dropout, there is a wide range of explicit regularization methods that can be applied to activations, weights, or the loss function itself. xTorch provides a rich collection of these techniques, allowing for advanced experimentation.
### Usage
Most xTorch regularization techniques are implemented as `torch::nn::Module` subclasses and live in the `xt::regulariztions` namespace. They can be applied in your model's forward pass or used to wrap your loss function.
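As a rough illustration of the forward-pass pattern, the sketch below drops an activation-level regularizer into a small network. This is a hypothetical sketch only: it assumes `ActivationRegularization` (listed further down) is default-constructible and callable on a single tensor, mirroring how `LabelSmoothing` is called in the next example. Verify the real constructor and options in its header before use.

```cpp
#include <xtorch/xtorch.hh>

// Hypothetical sketch: the construction and call signature of
// ActivationRegularization are assumptions -- check <xtorch/regulariztions/>.
struct RegularizedNetImpl : torch::nn::Module {
    RegularizedNetImpl() {
        fc1 = register_module("fc1", torch::nn::Linear(128, 64));
        fc2 = register_module("fc2", torch::nn::Linear(64, 10));
    }

    torch::Tensor forward(torch::Tensor x) {
        x = torch::relu(fc1->forward(x));
        x = act_reg(x);  // assumed: applies the regularization to the activations
        return fc2->forward(x);
    }

    torch::nn::Linear fc1{nullptr}, fc2{nullptr};
    xt::regulariztions::ActivationRegularization act_reg;  // assumed default-constructible
};
TORCH_MODULE(RegularizedNet);
```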
A common example is LabelSmoothing, which helps prevent the model from becoming overconfident in its predictions.
```cpp
#include <xtorch/xtorch.hh>

#include <iostream>

int main() {
    // Assume we have model outputs and targets
    auto logits  = torch::randn({16, 10});                     // Raw outputs (logits)
    auto targets = torch::randint(0, 10, {16}, torch::kLong);  // Class indices

    // 1. Define a standard loss function for comparison
    torch::nn::CrossEntropyLoss cross_entropy_loss;

    // 2. Instantiate the LabelSmoothing regularizer.
    //    Epsilon is the smoothing factor.
    xt::regulariztions::LabelSmoothing label_smoother(
        xt::regulariztions::LabelSmoothingOptions(0.1) // 10% smoothing
    );

    // 3. Apply label smoothing to the loss calculation.
    //    LabelSmoothing acts as a loss function itself, but the exact usage
    //    can vary between techniques, so always check the header.
    torch::Tensor smoothed_loss = label_smoother(logits, targets);

    std::cout << "Standard Cross Entropy Loss: "
              << cross_entropy_loss(logits, targets).item<float>() << std::endl;
    std::cout << "Loss with Label Smoothing: "
              << smoothed_loss.item<float>() << std::endl;

    // 4. In a Trainer, you would set this as your loss function
    xt::Trainer trainer;
    trainer.set_loss_fn(label_smoother);
    // trainer.fit(...);
}
```
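For reference, this is roughly what label smoothing computes. The snippet below is a from-scratch sketch using plain LibTorch ops (it is not xTorch's implementation, whose details may differ): the one-hot target is replaced by a mixture of the one-hot vector and a uniform distribution over the K classes, and the cross entropy is taken against those softened targets.

```cpp
#include <torch/torch.h>

// From-scratch label smoothing (one common formulation), for illustration only.
// q_k = eps / K for every class, plus (1 - eps) extra mass on the true class.
torch::Tensor manual_label_smoothing_loss(const torch::Tensor& logits,
                                          const torch::Tensor& targets,
                                          double eps = 0.1) {
    const auto num_classes = logits.size(1);
    auto log_probs = torch::log_softmax(logits, /*dim=*/1);

    // Softened target distribution.
    auto soft_targets = torch::full_like(log_probs, eps / num_classes);
    soft_targets.scatter_(1, targets.unsqueeze(1), 1.0 - eps + eps / num_classes);

    // Cross entropy against the softened targets, averaged over the batch.
    return -(soft_targets * log_probs).sum(1).mean();
}
```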

## Available Regularization Techniques
Below is the list of regularization modules available in the `xt::regulariztions` namespace.
- ActivationRegularization
- ALS
- AuxiliaryBatchNormalization
- BatchNuclearNormMaximization
- DiscriminativeRegularization
- EntropyRegularization
- EuclideanNormRegularization
- Fierce
- GANFeatureMatching
- GMVAE
- LabelSmoothing
- LayerScale
- LCC
- LVR
- ManifoldMixup
- OffDiagonalOrthogonalRegularization
- OrthogonalRegularization
- PathLengthRegularization
- PGM
- R1Regularization
- Rome
- SCN
- ShakeShakeRegularization
- SRN
- StochasticDepth
- STTP
- SVDParameterization
- TargetPolicySmoothing
- TemporalActivationRegularization
- WeightsReset
!!! info "Implementation Details"
    The implementation of regularization techniques can vary greatly. Some act like loss functions, others are modules to be applied in the forward pass, and some might be implemented as training callbacks. Always refer to the specific header file in `<xtorch/regulariztions/>` for detailed usage instructions and available options.
