This repository contains the official implementation of Competency Gaps, a representation-grounded evaluation method that uses sparse autoencoders (SAEs) to automatically surface both model gaps and benchmark gaps. The approach extracts SAE concepts and computes saliency-weighted performance scores to reveal why models succeed or fail and which concepts benchmarks over- or under-represent. Applied to multiple open-source LLMs and benchmarks, the method recovers known weaknesses without supervision.
Website — Paper — Contact us
Pre-print. Under review.
Code coming soon.
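Until the release, the sketch below illustrates one plausible reading of the saliency-weighted scoring described above, not the paper's actual implementation: per-example SAE concept activations are normalized into per-concept saliency weights, and each concept's score is the correctness of the model averaged under those weights. The array shapes, function names, and normalization choice are our assumptions.

```python
import numpy as np

def concept_scores(activations: np.ndarray, correct: np.ndarray) -> np.ndarray:
    """Saliency-weighted performance per SAE concept (illustrative sketch).

    activations: (n_examples, n_concepts) non-negative SAE concept activations.
    correct:     (n_examples,) 1.0 if the model answered correctly, else 0.0.
    Returns:     (n_concepts,) weighted mean correctness per concept.
    """
    # Normalize each concept's activations over examples so weights sum to 1.
    saliency = activations / (activations.sum(axis=0, keepdims=True) + 1e-9)
    return saliency.T @ correct

def concept_coverage(activations: np.ndarray) -> np.ndarray:
    """Mean activation per concept: how strongly a benchmark represents it."""
    return activations.mean(axis=0)

# Toy usage with random data.
rng = np.random.default_rng(0)
acts = rng.random((100, 16))                     # stand-in SAE activations
labels = rng.integers(0, 2, 100).astype(float)   # stand-in correctness
print(concept_scores(acts, labels).argsort()[:3])  # weakest concepts
print(concept_coverage(acts).argsort()[:3])        # least-covered concepts
```

Under this reading, low `concept_scores` entries would flag model gaps, while low `concept_coverage` entries would flag concepts a benchmark under-represents.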
If you find our work useful, please consider citing our paper:
```bibtex
@inproceedings{tbd2026competencygaps,
  title     = {Uncovering Competency Gaps in Large Language Models and Their Benchmarks},
  author    = {TBD},
  booktitle = {TBD},
  year      = {TBD}
}
```