
CardioBench 🫀

Do Echocardiography Foundation Models Generalize Beyond the Lab?

CardioBench is a standardized benchmark that unifies 8 public echocardiography datasets spanning 4 regression and 5 classification tasks. It evaluates cardiac-specific, biomedical, and general-purpose foundation models under zero-shot, probing, and alignment protocols.

Preprint · Datasets · Getting Started · Evaluation · Prediction Formats · Citation

CardioBench overview

Datasets & Tasks

Repository Layout

  • data/ – split CSVs generated by the workflow.
  • docs/downloads.md – where and how to obtain each dataset.
  • evaluation/ – per-dataset evaluation scripts, config, sample predictions, and helpers.
  • src/ – training / probing utilities and baseline model code.

Getting Started

1. Download Data

Follow the per-dataset instructions in docs/downloads.md.

2. Configure Paths

Edit evaluation/config.py to point to:

  • Your prediction folders.
  • Desired output directories.

The config file also holds shared constants such as bootstrap count (B), seed, view labels, and evaluation split (SPLIT="test" by default).
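As an orientation aid, a minimal sketch of what such a config might look like follows. The constant names beyond those mentioned above, the default values, and the paths are all illustrative assumptions, not the actual contents of evaluation/config.py:

```python
# Illustrative sketch of evaluation/config.py -- edit the real file
# with your own paths; every value below is an assumed example.

# Where prediction CSVs live and where metrics should be written
# (hypothetical paths).
PRED_DIR = "predictions/"
OUTPUT_DIR = "results/"

# Shared constants mentioned in the README.
B = 1000        # bootstrap resample count (assumed default)
SEED = 42       # RNG seed for reproducibility (assumed default)
SPLIT = "test"  # evaluation split; "test" is the documented default

# View labels used by the view classifiers (example values only).
VIEW_CLASS_NAMES = ["A2C", "A4C", "PLAX", "Other"]
```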

Running Evaluations

Each dataset/task has a standalone script:

```shell
python evaluation/<script>.py
# Example:
python evaluation/camus.py
```

Outputs (metrics CSVs, per-class accuracy, prediction histograms, plots, etc.) are written to the directories configured in evaluation/config.py.

Helpful tips:

  • Ensure prediction IDs (patient_id, HashedFileName, FileName, etc.) match the split CSVs exactly.
  • View classifiers must include prob_<VIEW> columns for every entry in VIEW_CLASS_NAMES plus view_pred.
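The two checks above can be automated before running an evaluation. The helper below is a sketch, not part of the repository: the ID column name, the example view labels, and the function name are all hypothetical, so adapt them to your split CSVs and to VIEW_CLASS_NAMES from evaluation/config.py:

```python
import pandas as pd

# Example view labels only -- use VIEW_CLASS_NAMES from evaluation/config.py.
VIEW_CLASS_NAMES = ["A2C", "A4C", "PLAX"]

def check_view_predictions(pred_csv: str, split_csv: str,
                           id_col: str = "FileName") -> None:
    """Sanity-check a view-classifier prediction file before evaluation."""
    preds = pd.read_csv(pred_csv)
    split = pd.read_csv(split_csv)

    # Every prob_<VIEW> column plus the hard prediction must exist.
    required = [f"prob_{v}" for v in VIEW_CLASS_NAMES] + ["view_pred"]
    missing_cols = [c for c in required if c not in preds.columns]
    if missing_cols:
        raise ValueError(f"missing columns: {missing_cols}")

    # Prediction IDs must cover the split CSV exactly.
    missing_ids = set(split[id_col]) - set(preds[id_col])
    if missing_ids:
        raise ValueError(f"{len(missing_ids)} split IDs have no prediction")
```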

Prediction Formats

Use the CSVs inside evaluation/example_predictions/ as templates.

  • View models can omit prob_Other; the evaluator derives it as 1 - sum(prob_<known view>).
  • Multi-target datasets (e.g., EchoNet-LVH) expect one CSV per measurement, named <TARGET>_pred.
  • When in doubt, mirror the filenames inside evaluation/example_predictions/<DATASET>/.
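The prob_Other derivation mentioned above amounts to assigning the leftover probability mass to the "Other" class. A minimal sketch, assuming a pandas DataFrame of predictions (the function name and example view labels are illustrative; the evaluator's own implementation may differ):

```python
import pandas as pd

def add_prob_other(preds: pd.DataFrame,
                   known_views=("A2C", "A4C", "PLAX")) -> pd.DataFrame:
    """Derive prob_Other as 1 minus the summed known-view probabilities.

    `known_views` is an example list; use the actual VIEW_CLASS_NAMES
    from evaluation/config.py, excluding "Other".
    """
    known_cols = [f"prob_{v}" for v in known_views]
    out = preds.copy()
    # Clip at 0 in case the known probabilities already sum to ~1
    # with floating-point noise.
    out["prob_Other"] = (1.0 - out[known_cols].sum(axis=1)).clip(lower=0.0)
    return out
```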

Baselines & Protocols

  • src/ contains code for zero-shot similarity, language-aligned prompting, and linear probes on top of frozen encoders.
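The linear-probing protocol can be sketched generically with scikit-learn: freeze the encoder, precompute embeddings, and fit a linear classifier on top. This is an assumed sketch of the protocol, not the actual code in src/:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def linear_probe(train_emb: np.ndarray, train_labels: np.ndarray,
                 test_emb: np.ndarray) -> np.ndarray:
    """Fit a linear classifier on frozen-encoder embeddings.

    A generic sketch of linear probing: embeddings are assumed to be
    precomputed with the encoder frozen, so only the linear head trains.
    """
    clf = LogisticRegression(max_iter=1000)
    clf.fit(train_emb, train_labels)
    return clf.predict(test_emb)
```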

Contributing

Contributions are welcome! Please open an issue or pull request for:

  1. New evaluation tasks.
  2. Bug fixes within the existing scripts.
  3. Documentation or visualization improvements.

When submitting predictions or scripts, ensure you do not upload raw patient data—only derived metrics.

Citation

If CardioBench is useful for your work, please cite:

```bibtex
@article{taratynova2025cardiobench,
  title={CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?},
  author={Taratynova, Darya and Aly, Ahmed and Saeed, Numan and Yaqub, Mohammad},
  journal={arXiv preprint arXiv:2510.00520},
  year={2025}
}
```

Preprint: arXiv:2510.00520