
CardioBench 🫀

Do Echocardiography Foundation Models Generalize Beyond the Lab?

CardioBench is a standardized benchmark that unifies 8 public echocardiography datasets spanning 4 regression and 5 classification tasks. It evaluates cardiac-specific, biomedical, and general-purpose foundation models under zero-shot, probing, and alignment protocols.

Preprint · Datasets · Getting Started · Evaluation · Prediction Formats · Citation

CardioBench overview

Datasets & Tasks

Repository Layout

  • data/ – split CSVs generated by the workflow.
  • docs/downloads.md – where and how to obtain each dataset.
  • evaluation/ – per-dataset evaluation scripts, config, sample predictions, and helpers.
  • src/ – training / probing utilities and baseline model code.

Getting Started

1. Download Data

Follow the per-dataset instructions in docs/downloads.md.

2. Configure Paths

Edit evaluation/config.py to point to:

  • Your prediction folders.
  • Desired output directories.

The config file also holds shared constants such as bootstrap count (B), seed, view labels, and evaluation split (SPLIT="test" by default).
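As an orientation aid, a minimal sketch of what such a config might look like follows. The constant names beyond those mentioned above, the default values, and the paths are all illustrative assumptions, not the actual contents of evaluation/config.py:

```python
# Illustrative sketch of evaluation/config.py -- edit the real file
# with your own paths; every value below is an assumed example.

# Where prediction CSVs live and where metrics should be written
# (hypothetical paths).
PRED_DIR = "predictions/"
OUTPUT_DIR = "results/"

# Shared constants mentioned in the README.
B = 1000        # bootstrap resample count (assumed default)
SEED = 42       # RNG seed for reproducibility (assumed default)
SPLIT = "test"  # evaluation split; "test" is the documented default

# View labels used by the view classifiers (example values only).
VIEW_CLASS_NAMES = ["A2C", "A4C", "PLAX", "Other"]
```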

Running Evaluations

Each dataset/task has a standalone script:

```shell
python evaluation/<script>.py
# Example:
python evaluation/camus.py
```

Outputs (metrics CSVs, per-class accuracy, prediction histograms, plots, etc.) are written to the directories configured in evaluation/config.py.

Helpful tips:

  • Ensure prediction IDs (patient_id, HashedFileName, FileName, etc.) match the split CSVs exactly.
  • View classifiers must include prob_<VIEW> columns for every entry in VIEW_CLASS_NAMES plus view_pred.
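The two checks above can be automated before running an evaluation. The helper below is a sketch, not part of the repository: the ID column name, the example view labels, and the function name are all hypothetical, so adapt them to your split CSVs and to VIEW_CLASS_NAMES from evaluation/config.py:

```python
import pandas as pd

# Example view labels only -- use VIEW_CLASS_NAMES from evaluation/config.py.
VIEW_CLASS_NAMES = ["A2C", "A4C", "PLAX"]

def check_view_predictions(pred_csv: str, split_csv: str,
                           id_col: str = "FileName") -> None:
    """Sanity-check a view-classifier prediction file before evaluation."""
    preds = pd.read_csv(pred_csv)
    split = pd.read_csv(split_csv)

    # Every prob_<VIEW> column plus the hard prediction must exist.
    required = [f"prob_{v}" for v in VIEW_CLASS_NAMES] + ["view_pred"]
    missing_cols = [c for c in required if c not in preds.columns]
    if missing_cols:
        raise ValueError(f"missing columns: {missing_cols}")

    # Prediction IDs must cover the split CSV exactly.
    missing_ids = set(split[id_col]) - set(preds[id_col])
    if missing_ids:
        raise ValueError(f"{len(missing_ids)} split IDs have no prediction")
```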

Prediction Formats

Use the CSVs inside evaluation/example_predictions/ as templates.

  • View models can omit prob_Other; the evaluator derives it as 1 - sum(prob_<known view>).
  • Multi-target datasets (e.g., EchoNet-LVH) expect one CSV per measurement, named <TARGET>_pred.
  • When in doubt, mirror the filenames inside evaluation/example_predictions/<DATASET>/.
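The prob_Other derivation mentioned above amounts to assigning the leftover probability mass to the "Other" class. A minimal sketch, assuming a pandas DataFrame of predictions (the function name and example view labels are illustrative; the evaluator's own implementation may differ):

```python
import pandas as pd

def add_prob_other(preds: pd.DataFrame,
                   known_views=("A2C", "A4C", "PLAX")) -> pd.DataFrame:
    """Derive prob_Other as 1 minus the summed known-view probabilities.

    `known_views` is an example list; use the actual VIEW_CLASS_NAMES
    from evaluation/config.py, excluding "Other".
    """
    known_cols = [f"prob_{v}" for v in known_views]
    out = preds.copy()
    # Clip at 0 in case the known probabilities already sum to ~1
    # with floating-point noise.
    out["prob_Other"] = (1.0 - out[known_cols].sum(axis=1)).clip(lower=0.0)
    return out
```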

Baselines & Protocols

  • src/ contains code for zero-shot similarity, language-aligned prompting, and linear probes on top of frozen encoders.
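The linear-probing protocol can be sketched generically with scikit-learn: freeze the encoder, precompute embeddings, and fit a linear classifier on top. This is an assumed sketch of the protocol, not the actual code in src/:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def linear_probe(train_emb: np.ndarray, train_labels: np.ndarray,
                 test_emb: np.ndarray) -> np.ndarray:
    """Fit a linear classifier on frozen-encoder embeddings.

    A generic sketch of linear probing: embeddings are assumed to be
    precomputed with the encoder frozen, so only the linear head trains.
    """
    clf = LogisticRegression(max_iter=1000)
    clf.fit(train_emb, train_labels)
    return clf.predict(test_emb)
```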

Contributing

Contributions are welcome! Please open an issue or pull request for:

  1. New evaluation tasks.
  2. Bug fixes within the existing scripts.
  3. Documentation or visualization improvements.

When submitting predictions or scripts, ensure you do not upload raw patient data—only derived metrics.

Citation

If CardioBench is useful for your work, please cite:

```bibtex
@article{taratynova2025cardiobench,
  title={CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?},
  author={Taratynova, Darya and Aly, Ahmed and Saeed, Numan and Yaqub, Mohammad},
  journal={arXiv preprint arXiv:2510.00520},
  year={2025}
}
```

Preprint: arXiv:2510.00520