This project uses the Massive Multitask Language Understanding (MMLU) benchmark.

MMLU citation: Hendrycks et al. (2021). Measuring Massive Multitask Language Understanding. ICLR 2021.

Ethics citation: Hendrycks et al. (2021). Aligning AI With Shared Human Values. ICLR 2021.