nanochat

Forked from karpathy/nanochat. This fork makes it easy to do pretraining architecture research and RL algo research.

nanochat

Pretraining architecture research on the nanochat stack.

Adding a new architecture

Create a new file nanochat/model/gpt_yourmodel.py with a config dataclass and model class. The model must implement the same interface as GPT: forward(idx, targets, kv_cache, loss_reduction), init_weights(), setup_optimizer(...), estimate_flops(), num_scaling_params(), generate(...), get_device().
Register it in nanochat/model_registry.py:

def _register_variants():
    from nanochat.model.gpt_yourmodel import YourModelConfig, YourModel
    register("yourmodel", YourModelConfig, YourModel)

The model_type is saved in checkpoint metadata, so evaluation auto-detects the right model class.

Training

# Setup (one-time)
uv sync --extra gpu
source .venv/bin/activate
python -m nanochat.dataset -n 170   # download data
python -m scripts.tok_train          # train tokenizer

# Train a model
torchrun --standalone --nproc_per_node=4 -m scripts.base_train -- \
    --depth=12 --model-type=gpt --model-tag=my_experiment

Evaluation

torchrun --standalone --nproc_per_node=4 -m scripts.base_eval -- --model-tag=my_experiment

Leaderboard

FLOP-controlled, depth-24 runs. Validation bpb is the minimum reached over training (lower is better).

Model	Mechanism	Min val bpb
karpathy/nanochat default	—	0.750794
AttnRes	Softmax attention over all prior sublayer outputs	0.742577
karpathy/nanochat autoresearch 2	—	0.71800
AttnRes + load balancing	AttnRes with load-balancing aux loss	0.698283

nanoRL

Standalone RL package (nanorl/) that runs on top of HuggingFace base models (e.g. Qwen3-0.6B) with vLLM for rollouts. See knowledge/rl_training.md for full docs.

Adding a loss

Add a function to nanorl/loss.py with signature loss_fn(logprobs, old_logprobs, rewards, **kwargs) -> scalar and register it in ALGORITHMS. If it needs a non-standard advantage estimator, add a branch to compute_advantages.

Adding data

The pipeline consumes JSONL with rows {id, prompt, kind, payload, meta} where kind selects a verifier (code_call_based, code_stdin_stdout, ...). To add a task:

Write a prep script in nanorl/scripts/ that emits the JSONL.
Register the path in _RL_DATASET_PATHS in nanorl/data.py.
For a new kind, add an elif branch in nanorl.data.verify and a sibling helper.

Training

./nanorl/runs/train.sh [tag]

Launches a vLLM rollout worker on ROLLOUT_GPU and a torchrun trainer on TRAIN_GPUS (set inside the script). Strict-sync: each step's rollouts come from the just-checkpointed weights.

Cite

If you build on this fork:

@misc{nanochat-arch,
  author = {Muyu He},
  title = {nanochat-arch: Architecture experiments on nanochat},
  year = {2026},
  publisher = {GitHub},
  url = {https://github.com/RiddleHe/nanochat}
}

This fork is based on:

@misc{nanochat,
  author = {Andrej Karpathy},
  title = {nanochat: The best ChatGPT that \$100 can buy},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/karpathy/nanochat}
}

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 102 Commits
.claude/skills/read-arxiv-paper		.claude/skills/read-arxiv-paper
configs		configs
dev		dev
knowledge		knowledge
nanochat		nanochat
nanorl		nanorl
runs		runs
scripts		scripts
tasks		tasks
tests		tests
.gitignore		.gitignore
.python-version		.python-version
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

nanochat

nanochat

Adding a new architecture

Training

Evaluation

Leaderboard

nanoRL

Adding a loss

Adding data

Training

Cite

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Languages

Folders and files

Latest commit

History

Repository files navigation

nanochat

nanochat

Adding a new architecture

Training

Evaluation

Leaderboard

nanoRL

Adding a loss

Adding data

Training

Cite

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 0

Languages

Packages

Contributors