A tape-based reverse-mode automatic differentiation engine implemented in Lean 4, inspired by Andrej Karpathy's micrograd.
This project demonstrates:
- Software engineering in Lean — a complete autodiff engine with neural network layers, loss functions, and SGD training.
- Formal verification — Lean proofs that symbolic differentiation correctly implements the constant rule, sum rule, product rule, negation rule, and linearity.
- Machine learning — trains on XOR (100% accuracy) and MNIST handwritten digits (~87% test accuracy with a small network).
| XOR (2→8→8→2, tanh, 2000 epochs) | MNIST (784→32→10, relu, 10 epochs) |
|---|---|
| ![]() | ![]() |
The engine uses a Wengert tape — a flat array where each entry stores an operation and its result. The forward pass appends nodes in topological order; the backward pass walks the tape in reverse, accumulating gradients.
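The tape entry can be pictured as an operation tag plus the value computed in the forward pass. The following is a minimal sketch; the names `Op`, `Node`, and `Tape` are illustrative, not the actual definitions in `Engine.lean`:

```lean
-- A minimal sketch of a tape entry (illustrative names, not Engine.lean's API).
inductive Op where
  | const : Float → Op
  | add   : Nat → Nat → Op   -- operands are indices of earlier tape entries
  | mul   : Nat → Nat → Op
deriving Inhabited

structure Node where
  op  : Op
  val : Float                -- result computed during the forward pass
deriving Inhabited

abbrev Tape := Array Node
```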
For example, y = w·x + b with x=2, w=3, b=1:
```mermaid
graph BT
    n0["<b>#0 Const</b><br/>val = 2.0   (x)<br/>grad = 3.0"]
    n1["<b>#1 Const</b><br/>val = 3.0   (w)<br/>grad = 2.0"]
    n2["<b>#2 Const</b><br/>val = 1.0   (b)<br/>grad = 1.0"]
    n3["<b>#3 Mul</b>(1,0)<br/>val = 6.0<br/>grad = 1.0"]
    n4["<b>#4 Add</b>(3,2)<br/>val = 7.0<br/>grad = 1.0"]
    n0 --> n3
    n1 --> n3
    n3 --> n4
    n2 --> n4
```
The tape is just `#[node₀, node₁, node₂, node₃, node₄]`. Each node only
references earlier indices, so no sorting is needed — the array is the
topological order.
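Written out with the illustrative types from the sketch above, the example tape could be built literally as:

```lean
-- The example tape for y = w·x + b, written out by hand (sketch types above).
def exampleTape : Tape := #[
  { op := .const 2.0, val := 2.0 },  -- #0  x
  { op := .const 3.0, val := 3.0 },  -- #1  w
  { op := .const 1.0, val := 1.0 },  -- #2  b
  { op := .mul 1 0,   val := 6.0 },  -- #3  w·x
  { op := .add 3 2,   val := 7.0 }   -- #4  w·x + b
]
```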
The backward pass walks the tape in reverse, applying the chain rule at each node to accumulate gradients into a parallel array:
```mermaid
graph TD
    subgraph "i = 4   Add(3,2)   grad[4] = 1.0"
        s4["grad[3] += 1.0   grad[2] += 1.0"]
    end
    subgraph "i = 3   Mul(1,0)   grad[3] = 1.0"
        s3["grad[1] += val[0] · 1.0 = 2.0<br/>grad[0] += val[1] · 1.0 = 3.0"]
    end
    subgraph "i = 2   Const   grad[2] = 1.0"
        s2["(leaf — nothing to propagate)"]
    end
    subgraph "i = 1   Const   grad[1] = 2.0"
        s1["(leaf)"]
    end
    subgraph "i = 0   Const   grad[0] = 3.0"
        s0["(leaf)"]
    end
    s4 --> s3 --> s2 --> s1 --> s0
```
Result: grad[0] = 3.0 → ∂y/∂x = w = 3, and grad[1] = 2.0 → ∂y/∂w = x = 2.
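A minimal sketch of this reverse sweep over the illustrative `Tape` above (again, not the actual `Engine.lean` API):

```lean
-- grad[i] accumulates ∂y/∂(node i); the output node is seeded with 1.
def backward (tape : Tape) : Array Float := Id.run do
  let n := tape.size
  let mut grad := Array.replicate n 0.0
  if n > 0 then
    grad := grad.set! (n - 1) 1.0
  for k in [0:n] do
    let i := n - 1 - k                 -- walk the tape in reverse
    match tape[i]!.op with
    | .const _ => pure ()              -- leaf: nothing to propagate
    | .add a b =>
        grad := grad.set! a (grad[a]! + grad[i]!)
        grad := grad.set! b (grad[b]! + grad[i]!)
    | .mul a b =>
        grad := grad.set! a (grad[a]! + tape[b]!.val * grad[i]!)
        grad := grad.set! b (grad[b]! + tape[a]!.val * grad[i]!)
  return grad

#eval backward exampleTape   -- gradients match the diagram: 3.0, 2.0, 1.0, 1.0, 1.0
```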
```
Micrograd/
  Engine.lean       -- Tape-based reverse-mode autodiff
  NN.lean           -- Layers, MLP, losses, SGD, RNG
  Data/
    Synthetic.lean  -- XOR and linear datasets
    MNIST.lean      -- IDX binary format loader
  Verified/
    Axioms.lean     -- Axiomatized Float field properties
    Expr.lean       -- Pure expression tree, eval, symbolic diff
    GradRules.lean  -- Derivative correctness proofs
Main.lean           -- Unit tests, XOR training, MNIST training
```
Requires Lean 4 v4.28.0 (via elan).
```
lake build
lake exe micrograd
```

Download the MNIST data (requires `curl` and `gunzip`):

```
bash scripts/download_mnist.sh
```

Then re-run `lake exe micrograd`; MNIST training will run automatically.
The Micrograd/Verified/ directory contains Lean proofs of derivative
correctness for a pure expression tree (Expr). Proved theorems include:
| Theorem | Statement |
|---|---|
| `grad_const` | d/dx(c) = 0 |
| `grad_var_self` | d/dx(x) = 1 |
| `grad_var_other` | d/dx(y) = 0 when y ≠ x |
| `grad_add` | d/dx(f + g) = df/dx + dg/dx |
| `grad_mul` | d/dx(f·g) = (df/dx)·g + f·(dg/dx) |
| `grad_neg` | d/dx(-f) = -(df/dx) |
| `grad_square'` | d/dx(x²) = x + x |
| `grad_linear_combo` | d/dx(a·f + b·g) = a·(df/dx) + b·(dg/dx) |
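For a flavor of the style of statement involved, here is a compressed, self-contained sketch; the repo's `Expr` in `Verified/Expr.lean` uses named variables and more operations, so the definitions and theorem below are illustrative only:

```lean
inductive Expr where
  | const : Float → Expr
  | var   : Expr                      -- a single variable, for brevity
  | add   : Expr → Expr → Expr
  | mul   : Expr → Expr → Expr

def eval (x : Float) : Expr → Float
  | .const c => c
  | .var     => x
  | .add f g => eval x f + eval x g
  | .mul f g => eval x f * eval x g

def diff : Expr → Expr
  | .const _ => .const 0.0
  | .var     => .const 1.0
  | .add f g => .add (diff f) (diff g)
  | .mul f g => .add (.mul (diff f) g) (.mul f (diff g))

-- The sum rule closes by computation for this toy `diff`; the product rule is
-- where the axiomatized Float field properties come in.
theorem grad_add (f g : Expr) (x : Float) :
    eval x (diff (.add f g)) = eval x (diff f) + eval x (diff g) := by
  simp [diff, eval]
```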
These proofs rely on axiomatized Float field properties (see Axioms.lean).
The axioms are clearly documented as assumptions — IEEE 754 floats do not
perfectly satisfy field axioms due to NaN, infinity, and rounding.
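For a concrete flavor of what such an assumption looks like (a hypothetical name and statement, not necessarily one from Axioms.lean):

```lean
-- Hypothetical example of an axiomatized field property over Float.
-- Real IEEE 754 addition is not associative because of rounding, which is
-- why such properties are stated as explicit axioms rather than proved.
axiom float_add_assoc (a b c : Float) : (a + b) + c = a + (b + c)
```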
- R.E. Wengert, "A simple automatic derivative evaluation program," Communications of the ACM, 7(8):463–464, August 1964. The original formulation of tape-based automatic differentiation.
- A. Karpathy, micrograd. The Python autodiff engine that inspired this project.

