maty-bohacek/competency-gaps

Uncovering Competency Gaps in Large Language Models and Their Benchmarks

This repository contains the official implementation of Competency Gaps, a representation-grounded evaluation method that uses sparse autoencoders (SAEs) to automatically surface both model gaps and benchmark gaps. The approach extracts SAE concepts and computes saliency-weighted performance scores to reveal why models succeed or fail and which concepts benchmarks over- or under-represent. Applied to multiple open-source LLMs and benchmarks, the method recovers known weaknesses without supervision.
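The code is not yet released, so the exact scoring is unknown; as a rough illustration of the idea described above, here is a minimal, hypothetical sketch of saliency-weighted performance scoring. It assumes per-example SAE concept saliencies and binary correctness labels; the function name and the specific formulas (mass-normalized weighted accuracy per concept, plus each concept's share of total saliency mass as a proxy for benchmark coverage) are our own assumptions, not the paper's method.

```python
import math

def concept_scores(saliency, correct):
    """Hypothetical saliency-weighted scoring sketch (not the paper's code).

    saliency: list of per-example lists, saliency[i][c] = activation of
              SAE concept c on benchmark example i (non-negative).
    correct:  list of 0/1 correctness labels, one per example.

    Returns (scores, coverage):
      scores[c]   -- saliency-weighted accuracy on examples where concept c
                     is active (low values suggest a model gap on c).
      coverage[c] -- concept c's share of total saliency mass (very low or
                     very high values suggest a benchmark gap on c).
    """
    n_concepts = len(saliency[0])
    total_mass = sum(sum(row) for row in saliency)
    scores, coverage = [], []
    for c in range(n_concepts):
        mass = sum(row[c] for row in saliency)
        weighted = sum(row[c] * y for row, y in zip(saliency, correct))
        scores.append(weighted / mass if mass > 0 else math.nan)
        coverage.append(mass / total_mass if total_mass > 0 else 0.0)
    return scores, coverage
```

For example, with three examples and two concepts, `concept_scores([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1, 0, 1])` yields scores `[1.0, 0.5]` and coverage `[0.5, 0.5]`, flagging the second concept as the weaker one for this toy model.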

Website — Paper — Contact us

Pre-print. Under review.

Getting Started

Code coming soon.

Citation

If you find our work useful, please consider citing our paper.

@inproceedings{tbd2026competencygaps,
  title={Uncovering Competency Gaps in Large Language Models and Their Benchmarks},
  author={TBD},
  booktitle={TBD},
  year={TBD}
}
