NebTorch is a minimal Autograd engine built from scratch using NumPy, inspired by PyTorch's automatic differentiation system.

In 11-785: Introduction to Deep Learning, a graduate-level course at CMU taught by Prof. Bhiksha Raj Ramakrishnan, I completed a sequence of assignments covering everything from foundational concepts to advanced topics in Deep Learning, including neural networks, optimization, and more. The course provided both a theoretical and practical understanding of neural networks, along with a brief introduction to Autograd.

After completing the course, I was inspired to dive deeper and build my own Autograd engine from scratch. Building NebTorch has been very rewarding: I've solidified my understanding of Deep Learning and Automatic Differentiation, and most of all, I've gained an appreciation for frameworks such as PyTorch and TensorFlow.
Here's a complete example demonstrating how to use NebTorch to train a simple Multi-Layer Perceptron (MLP) on the Iris dataset:
```python
import numpy as np

import nebtorch
from nebtorch import Module, Tensor
from nebtorch.nn import Linear, ReLU, CrossEntropyLoss, Softmax
from nebtorch.optim import SGD
from sklearn import datasets
from sklearn.model_selection import train_test_split


class MLP(Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear_1 = Linear(in_features=in_features, out_features=256)
        self.act = ReLU()
        self.linear_2 = Linear(in_features=256, out_features=out_features)

    def forward(self, input: Tensor):
        out = self.linear_1(input)
        out = self.act(out)
        logits = self.linear_2(out)
        return logits


# Load and prepare data
iris = datasets.load_iris()
X = iris.data
Y = iris.target
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42
)

# Convert to NebTorch tensors
X_train = nebtorch.tensor(X_train)
Y_train = nebtorch.tensor(Y_train)
X_test = nebtorch.tensor(X_test)
Y_test = nebtorch.tensor(Y_test)

# Hyperparameters
INPUT_FEATURES = X_train.shape[1]
NUM_CLASSES = np.max(Y) + 1
EPOCHS = 100
BATCH_SIZE = 5

# Initialize model, loss, and optimizer
model = MLP(INPUT_FEATURES, NUM_CLASSES)
criterion = CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.01)

num_batches = X_train.shape[0] // BATCH_SIZE

for epoch in range(EPOCHS):
    model.train()
    for i in range(num_batches):
        optimizer.zero_grad()

        # Get batch
        start_idx = i * BATCH_SIZE
        end_idx = start_idx + BATCH_SIZE
        input = X_train[start_idx:end_idx]
        target = Y_train[start_idx:end_idx]

        # Forward pass
        out = model(input)
        loss = criterion(out, target)

        # Backward pass
        loss.backward()
        optimizer.step()

    # Print progress
    if epoch % 10 == 0:
        print(f"Epoch {epoch:3d} | Loss: {loss.data.item():.4f}")

# Evaluate on test set
model.eval()
out = model(X_test)
loss = criterion(out, Y_test)

# Calculate accuracy
softmax = Softmax(dim=1)
predictions = np.argmax(softmax(out).data, axis=1)
accuracy = np.sum(predictions == Y_test.data) / Y_test.shape[0] * 100
print(f"Test Accuracy: {accuracy:.2f}%")
```
### Core Components

| Component | Description |
|-----------|-------------|
| `Module` | Base class for all neural network modules |
| `Tensor` | Multi-dimensional data structure with automatic differentiation support |
| `Parameter` | Special tensor for trainable model parameters |
| `Optimizer` | Base class for all optimizers |
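Taken together, these pieces cover the whole autograd workflow. Below is a minimal sketch: `nebtorch.tensor`, `backward()`, and `.data` all appear in the Iris example above, while the `.grad` attribute on leaf tensors and the method-style `sum()` are assumptions modeled on PyTorch's API.

```python
import numpy as np
import nebtorch

# Leaf tensors that participate in the autograd graph
x = nebtorch.tensor(np.array([[1.0, 2.0], [3.0, 4.0]]))
w = nebtorch.tensor(np.array([[0.5], [0.5]]))

# Each operation records itself on the graph as it runs
y = x @ w          # matrix multiplication (see the ops table below)
loss = y.sum()     # scalar reduction (method form is an assumption)

# Reverse-mode pass from the scalar output back to the leaves
loss.backward()

# Gradients land on the inputs (assumed .grad attribute)
print(w.grad)      # d(sum(x @ w))/dw, i.e. the column sums of x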
### Tensor Operations

| Component | Description |
|-----------|-------------|
| `Add` | Element-wise addition with broadcasting |
| `Subtract` | Element-wise subtraction with broadcasting |
| `Negate` | Element-wise negation |
| `Multiply` | Element-wise multiplication with broadcasting |
| `Divide` | Element-wise division with broadcasting |
| Matrix Multiplication | Matrix multiplication (`@` operator) |
| `Transpose` | Matrix transposition |
| `Reshape` | Tensor reshaping |
| `Log` | Natural logarithm |
| `Exp` | Exponential function |
| `Power` | Element-wise power operation |
| `Mean` | Mean reduction with axis support |
| `Variance` | Variance reduction with axis support |
| `Sum` | Sum reduction with axis support |
| `Max` | Maximum reduction with axis support |
| `Slice` | Tensor indexing and slicing |
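Each operation implements both a forward computation and its local derivative, and the broadcasting ops additionally sum gradients back to the original input shape. A hedged sketch of how they compose (the `mean()` method form is assumed to mirror PyTorch's tensor methods):

```python
import numpy as np
import nebtorch

a = nebtorch.tensor(np.random.randn(4, 3))
b = nebtorch.tensor(np.ones((1, 3)))  # broadcasts along the first axis

# Broadcasting add, element-wise multiply, then a mean reduction
out = ((a + b) * a).mean()
out.backward()

# The backward pass un-broadcasts: b's gradient is reduced to shape (1, 3)
print(b.grad.shape)
```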
### Activation Functions

| Component | Description |
|-----------|-------------|
| `Sigmoid` | Sigmoid activation function |
| `Tanh` | Hyperbolic tangent activation |
| `ReLU` | Rectified Linear Unit |
| `GELU` | Gaussian Error Linear Unit |
| `Softmax` | Softmax with dimension support |
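Activations are modules and are applied by calling them, as `ReLU` and `Softmax` are in the MLP example above; the assumption here is that `GELU` (and the others) import from `nebtorch.nn` alongside them:

```python
import numpy as np
import nebtorch
from nebtorch.nn import ReLU, GELU, Softmax

x = nebtorch.tensor(np.random.randn(2, 5))

relu_out = ReLU()(x)       # zeroes out negative entries
gelu_out = GELU()(x)       # smooth ReLU variant
probs = Softmax(dim=1)(x)  # each row of probs sums to 1

print(probs.data.sum(axis=1))
```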
### Layers

| Component | Description |
|-----------|-------------|
| `Linear` | Fully connected layer |
| `Conv1d_stride1` | 1D convolution with stride 1 |
| `Conv2d_stride1` | 2D convolution with stride 1 |
| `Conv2d` | 2D convolution with configurable stride |
| `MaxPool2d_stride1` | 2D max pooling with stride 1 |
| `MeanPool2d_stride1` | 2D mean pooling with stride 1 |
| `MaxPool2d` | 2D max pooling with configurable stride |
| `MeanPool2d` | 2D mean pooling with configurable stride |
| `BatchNorm1d` | 1D batch normalization |
| `LayerNorm` | Layer normalization |
| `Dropout` | Dropout regularization |
| `Embedding` | Embedding layer for sparse inputs |
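The `_stride1` variants hold the core forward/backward logic, and the configurable-stride versions presumably build on them. A sketch of the convolutional layers in use; the constructor arguments and the NCHW input layout are assumptions modeled on PyTorch:

```python
import numpy as np
import nebtorch
from nebtorch.nn import Conv2d, MaxPool2d

# A batch of 4 single-channel 28x28 inputs (NCHW layout assumed)
x = nebtorch.tensor(np.random.randn(4, 1, 28, 28))

# Constructor arguments assumed to mirror PyTorch's Conv2d / MaxPool2d
conv = Conv2d(in_channels=1, out_channels=8, kernel_size=3, stride=1)
pool = MaxPool2d(kernel_size=2, stride=2)

out = pool(conv(x))
print(out.shape)  # expected (4, 8, 13, 13) under the assumed conventions
```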
### Recurrent Neural Networks

| Component | Description |
|-----------|-------------|
| `RNNCell` | Recurrent neural network cell |
| `GRUCell` | Gated Recurrent Unit cell |
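Both cells compute a single time step, so a sequence is processed by unrolling a loop over time. The `(input_size, hidden_size)` constructor and `(input, hidden)` call signature below are assumptions modeled on PyTorch's `nn.RNNCell`:

```python
import numpy as np
import nebtorch
from nebtorch.nn import RNNCell

cell = RNNCell(input_size=10, hidden_size=20)

x = nebtorch.tensor(np.random.randn(5, 3, 10))  # (seq_len, batch, input_size)
h = nebtorch.tensor(np.zeros((3, 20)))          # initial hidden state

# Unroll the recurrence: each step feeds the previous hidden state back in
for t in range(5):
    h = cell(x[t], h)
```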
### Resampling

| Component | Description |
|-----------|-------------|
| `Upsampling1d` | 1D upsampling |
| `Downsample1d` | 1D downsampling |
| `Upsample2d` | 2D upsampling |
| `Downsample2d` | 2D downsampling |
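A short sketch of the 2D pair; the factor argument names here are hypothetical:

```python
import numpy as np
import nebtorch
from nebtorch.nn import Upsample2d, Downsample2d

x = nebtorch.tensor(np.random.randn(2, 3, 8, 8))  # NCHW layout assumed

up = Upsample2d(upsampling_factor=2)        # hypothetical argument name
down = Downsample2d(downsampling_factor=2)  # hypothetical argument name

y = down(up(x))  # round-trips back to the original spatial size
print(y.shape)   # expected (2, 3, 8, 8)
```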
### Attention

| Component | Description |
|-----------|-------------|
| `MultiheadAttention` | Multi-head attention mechanism |
| Scaled Dot-Product Attention | Scaled dot-product attention |
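Scaled dot-product attention computes `softmax(Q K^T / sqrt(d_k)) V`, and `MultiheadAttention` runs several such attentions in parallel over learned projections. The constructor and `(query, key, value)` call signature below are assumptions modeled on PyTorch's `nn.MultiheadAttention`:

```python
import numpy as np
import nebtorch
from nebtorch.nn import MultiheadAttention

# embed_dim is assumed to split evenly across the heads
attn = MultiheadAttention(embed_dim=16, num_heads=4)

x = nebtorch.tensor(np.random.randn(2, 10, 16))  # (batch, seq_len, embed_dim)

# Self-attention: query, key, and value are all the same sequence
out = attn(x, x, x)
```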
### Loss Functions

| Component | Description |
|-----------|-------------|
| `Loss` | Base class for all loss functions |
| `MSELoss` | Mean Squared Error loss |
| `CrossEntropyLoss` | Cross-entropy loss with softmax |
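`CrossEntropyLoss` takes raw logits and integer class labels and applies the softmax itself, exactly as in the training loop above; `MSELoss` is assumed to take same-shape predictions and targets:

```python
import numpy as np
import nebtorch
from nebtorch.nn import MSELoss, CrossEntropyLoss

logits = nebtorch.tensor(np.random.randn(4, 3))
labels = nebtorch.tensor(np.array([0, 2, 1, 0]))  # integer class ids

ce = CrossEntropyLoss()(logits, labels)  # softmax applied internally
ce.backward()                            # gradients flow back to the logits

# MSELoss: real-valued targets with the same shape as the predictions
pred = nebtorch.tensor(np.random.randn(4, 1))
target = nebtorch.tensor(np.random.randn(4, 1))
mse = MSELoss()(pred, target)
```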
### Optimizers

| Component | Description |
|-----------|-------------|
| `SGD` | Stochastic Gradient Descent |
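`SGD` implements the update `p <- p - lr * grad`, driven by the same `zero_grad` / `backward` / `step` contract used in the Iris training loop. A self-contained sketch, assuming `Linear` inherits `parameters()` from `Module` and `MSELoss` is called like `CrossEntropyLoss`:

```python
import numpy as np
import nebtorch
from nebtorch.nn import Linear, MSELoss
from nebtorch.optim import SGD

# Tiny regression problem: fit a single Linear layer with plain SGD
layer = Linear(in_features=2, out_features=1)
criterion = MSELoss()
optimizer = SGD(layer.parameters(), lr=0.1)

x = nebtorch.tensor(np.random.randn(8, 2))
y = nebtorch.tensor(np.random.randn(8, 1))

for _ in range(50):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = criterion(layer(x), y)  # forward pass
    loss.backward()                # populate parameter gradients
    optimizer.step()               # p <- p - lr * grad
```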