Getting Started: Build and Train a Simple Neural Network
This tutorial follows the Quick Start example, but instead of using a pre-built model like LeNet5, we will define our own simple neural network from scratch.
This is a crucial step in learning any deep learning framework. You will learn how to:
- Define a custom network architecture by inheriting from torch::nn::Module.
- Register layers (like Linear and ReLU) within your model.
- Implement the forward pass to define how data flows through your network.
- Train your custom model using the xt::Trainer.
Our goal is to build a simple Multi-Layer Perceptron (MLP) to classify digits from the MNIST dataset.
1. Defining the Network (Net module)
The core of any model in PyTorch or xTorch is a class that inherits from torch::nn::Module. Inside this class, you define the layers of your network and then implement the forward method, which specifies the computation that happens at every forward pass.
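In its most stripped-down form, the pattern looks like the sketch below. This is a minimal illustration (not part of this example's code): a layer is registered in the constructor so its parameters are tracked, and then used in forward.
#include <torch/torch.h>

// Minimal module skeleton: one registered layer and a forward pass.
struct TinyNet : torch::nn::Module {
    torch::nn::Linear fc{nullptr};

    TinyNet() {
        // register_module makes the layer's parameters visible to
        // parameters(), to(device), and any optimizer built on them.
        fc = register_module("fc", torch::nn::Linear(784, 10));
    }

    torch::Tensor forward(torch::Tensor x) {
        return fc(x);
    }
};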
Our MLP will have a simple architecture:
- An input layer that flattens the 28x28 images into a 784-element vector.
- A linear layer that maps 784 features to 64 hidden features.
- A ReLU activation function.
- A second linear layer that maps the 64 hidden features to the 10 output classes.
- A LogSoftmax layer to convert the output logits into log-probabilities, which is suitable for use with the nll_loss function (see the short sketch after this list).
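As a quick aside, here is a standalone LibTorch sketch (not part of the example code) showing why this pairing works: log_softmax followed by nll_loss produces the same value as cross_entropy applied directly to the raw logits.
#include <torch/torch.h>
#include <iostream>

int main() {
    // A dummy batch of logits: 4 samples, 10 classes, plus random labels.
    auto logits  = torch::randn({4, 10});
    auto targets = torch::randint(0, 10, {4},
                                  torch::TensorOptions().dtype(torch::kLong));

    // log_softmax followed by nll_loss (the convention used by our Net)...
    auto loss_a = torch::nll_loss(torch::log_softmax(logits, /*dim=*/1), targets);
    // ...matches cross_entropy applied directly to the logits.
    auto loss_b = torch::nn::functional::cross_entropy(logits, targets);

    std::cout << loss_a.item<float>() << " vs " << loss_b.item<float>() << std::endl;
}
With the loss convention in mind, here is the Net module definition: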
#include <xtorch/xtorch.h>
// Define a custom module named 'Net'
struct Net : torch::nn::Module {
torch::nn::Linear fc1{nullptr}, fc2{nullptr};
// The constructor is where we define and register our layers.
Net() {
// First fully-connected layer (784 in, 64 out)
fc1 = register_module("fc1", torch::nn::Linear(784, 64));
// Second fully-connected layer (64 in, 10 out)
fc2 = register_module("fc2", torch::nn::Linear(64, 10));
}
// The forward method defines the data flow.
torch::Tensor forward(torch::Tensor x) {
// Flatten the image tensor from [Batch, 1, 28, 28] to [Batch, 784]
x = x.view({-1, 784});
// Apply the first linear layer, followed by a ReLU activation
x = torch::relu(fc1(x));
// Apply the second linear layer
x = fc2(x);
// Apply log_softmax to get log-probabilities for the loss function
return torch::log_softmax(x, /*dim=*/1);
}
};
2. The Full Training Pipeline
The rest of the code is very similar to the Quick Start guide. We load the MNIST dataset, create a DataLoader, instantiate our custom Net model, define an optimizer, and then use the xt::Trainer to handle the entire training process.
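To make clear what the trainer is doing for you, the sketch below shows the conventional hand-written LibTorch training loop it replaces. This is a simplified, self-contained illustration, not xt::Trainer's actual implementation: it uses random stand-in batches instead of the real MNIST loader, and it omits the device transfers, logging, and callbacks that the trainer manages.
#include <torch/torch.h>
#include <iostream>
#include <utility>
#include <vector>

// Net as defined in section 1 (repeated so this sketch compiles on its own).
struct Net : torch::nn::Module {
    torch::nn::Linear fc1{nullptr}, fc2{nullptr};
    Net() {
        fc1 = register_module("fc1", torch::nn::Linear(784, 64));
        fc2 = register_module("fc2", torch::nn::Linear(64, 10));
    }
    torch::Tensor forward(torch::Tensor x) {
        x = x.view({-1, 784});
        x = torch::relu(fc1(x));
        return torch::log_softmax(fc2(x), /*dim=*/1);
    }
};

int main() {
    auto model = std::make_shared<Net>();
    torch::optim::Adam optimizer(model->parameters(), torch::optim::AdamOptions(1e-3));

    // Stand-in for the real data loader: a few random batches of
    // MNIST-shaped inputs with random labels.
    std::vector<std::pair<torch::Tensor, torch::Tensor>> batches;
    for (int i = 0; i < 3; ++i) {
        batches.emplace_back(
            torch::randn({64, 1, 28, 28}),
            torch::randint(0, 10, {64}, torch::TensorOptions().dtype(torch::kLong)));
    }

    // The classic per-batch step sequence that a trainer performs.
    for (auto& [data, target] : batches) {
        optimizer.zero_grad();                        // clear old gradients
        auto output = model->forward(data);           // forward pass
        auto loss = torch::nll_loss(output, target);  // compute the loss
        loss.backward();                              // back-propagate
        optimizer.step();                             // update parameters
        std::cout << "loss: " << loss.item<float>() << std::endl;
    }
}
Every training run ultimately reduces to this zero_grad / forward / loss / backward / step cycle; the trainer packages it together with epochs, callbacks, and device handling.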
Full C++ Code
Below is the complete source code for this example. The original file can be found at getting_started/building_and_training_a_simple_neural_network.cpp.
#include <xtorch/xtorch.h>
#include <iostream>
// --- 1. Define the custom Neural Network Module ---
struct Net : torch::nn::Module {
torch::nn::Linear fc1{nullptr}, fc2{nullptr};
Net() {
fc1 = register_module("fc1", torch::nn::Linear(784, 64));
fc2 = register_module("fc2", torch::nn::Linear(64, 10));
}
torch::Tensor forward(torch::Tensor x) {
x = x.view({-1, 784});
x = torch::relu(fc1(x));
x = fc2(x);
return torch::log_softmax(x, /*dim=*/1);
}
};
int main() {
// --- 2. Setup Device and Data ---
torch::Device device(torch::cuda::is_available() ? torch::kCUDA : torch::kCPU);
// Define a simple normalization transform
auto transform = std::make_unique<xt::transforms::Compose>(
std::make_shared<xt::transforms::general::Normalize>(0.5, 0.5)
);
auto dataset = xt::datasets::MNIST(
"./data",
xt::datasets::DataMode::TRAIN,
/*download=*/true,
std::move(transform)
);
xt::dataloaders::ExtendedDataLoader data_loader(dataset, 64, true);
// --- 3. Instantiate Model and Optimizer ---
auto model = std::make_shared<Net>();
model->to(device);
torch::optim::Adam optimizer(model->parameters(), torch::optim::AdamOptions(1e-3));
// --- 4. Configure and Run the Trainer ---
xt::Trainer trainer;
trainer.set_max_epochs(5)
.set_optimizer(optimizer)
.set_loss_fn(torch::nll_loss)
.add_callback(std::make_shared<xt::LoggingCallback>("[SimpleNN-MNIST]", 100));
trainer.fit(*model, data_loader, nullptr, device);
std::cout << "Training finished!" << std::endl;
return 0;
}
How to Compile and Run
You can find this example in the xtorch-examples repository. To compile and run it:
- Navigate to the getting_started/ directory within the examples.
- Create a build directory and use CMake:
  mkdir build
  cd build
  cmake ..
  make
- Run the compiled executable:
  ./build_simple_nn
Expected Output
You will see the training progress printed to the console, showing that your custom model is successfully learning to classify the MNIST digits.
[SimpleNN-MNIST] Epoch 1/5, Batch 100/938 - Loss: 0.4512345678 - Time per batch: ...ms
[SimpleNN-MNIST] Epoch 1/5, Batch 200/938 - Loss: 0.3125487566 - Time per batch: ...ms
...
Training finished!
This example demonstrates a fundamental skill: defining and training your own architectures. You can now use this pattern to build more complex and powerful models for your own tasks.
