Hyperparameter tuning via an LLM for a two-layer CNN on the MNIST dataset. Also an example of something that can be done, but probably shouldn't be.
Uses:
- PyTorch for training the model
- OpenRouter for LLM API calls
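A two-layer CNN of the kind being tuned might look like the following sketch; the layer sizes, dropout, and default hyperparameters here are illustrative assumptions, not the repository's actual model.

```python
import torch
import torch.nn as nn

class TwoLayerCNN(nn.Module):
    """Illustrative two-layer CNN for 28x28 MNIST digits (sizes are assumptions)."""
    def __init__(self, channels1: int = 16, channels2: int = 32, dropout: float = 0.25):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, channels1, kernel_size=3, padding=1),  # 28x28 -> 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                                    # -> 14x14
            nn.Conv2d(channels1, channels2, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                                    # -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(dropout),
            nn.Linear(channels2 * 7 * 7, 10),  # 10 digit classes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```

The channel counts, dropout rate, learning rate, and batch size are the sort of knobs the LLM is asked to tune between runs.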
Validation accuracy:
Reminder: no regression model is used here. The improvements come strictly from the LLM's reasoning about and analysis of the previous runs:
Example LLM reasoning (which leaves a lot of room for improvement):
Setup:
```shell
uv sync
```
The MNIST dataset is downloaded from Hugging Face: https://huggingface.co/datasets/ylecun/mnist
```shell
uvx --from huggingface_hub hf download ylecun/mnist --repo-type dataset --local-dir data
```
Use the `scripts/prepare_data.py` script to split the dataset into train and validation sets.
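The split itself lives in `scripts/prepare_data.py`; a minimal sketch of a shuffled index split of the kind it might perform (the 90/10 ratio and fixed seed are assumptions):

```python
import random

def split_indices(n: int, val_fraction: float = 0.1, seed: int = 0):
    """Shuffle indices 0..n-1 and split them into train/validation index lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible across runs
    indices = list(range(n))
    rng.shuffle(indices)
    n_val = int(n * val_fraction)
    return indices[n_val:], indices[:n_val]  # (train, val)
```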
This example runs for 5 generations with 4 training runs per generation.
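An experiment file such as `experiments/evo-mini-v3.toml` might encode that budget; the key names below are assumptions for illustration, not the file's actual schema.

```toml
# Hypothetical schema; key names are assumptions.
[evolution]
generations = 5
runs_per_generation = 4
```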
```shell
uv run evolutionary-mnist experiments/evo-mini-v3.toml
```
Future work:
- Improve the system prompt.
- Neural Architecture Search (NAS) for on-the-fly architecture exploration.
- Keep training time constant per run within each generation.
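Putting the pieces together, the loop of 5 generations with 4 runs each can be sketched as follows; `propose_hyperparams` stands in for the OpenRouter LLM call and `train_and_evaluate` for a full PyTorch training run (both are hypothetical names, with random stubs here so the sketch runs standalone).

```python
import random

def propose_hyperparams(history):
    """Stand-in for the LLM call: the real system sends the run history
    to an LLM via OpenRouter and parses its suggested settings."""
    return {"lr": random.choice([1e-2, 1e-3, 1e-4]),
            "batch_size": random.choice([32, 64, 128])}

def train_and_evaluate(params):
    """Stand-in for a PyTorch training run; returns a validation accuracy."""
    return random.random()

def evolve(generations: int = 5, runs_per_generation: int = 4):
    history = []  # (params, val_accuracy) pairs shown to the LLM each call
    for gen in range(generations):
        for _ in range(runs_per_generation):
            params = propose_hyperparams(history)
            acc = train_and_evaluate(params)
            history.append((params, acc))
    return history
```

No fitting or regression happens anywhere in this loop; the only "learning" is the LLM reading `history` and proposing the next settings.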


