A comprehensive benchmark for adapting foundation models to cytological image classification under few-shot settings.
📄 Paper: [Submitted to JMI]
This repository accompanies our work on benchmarking foundation models for cytological image classification in low-data regimes.
Cytology datasets are typically small and require expert annotations, making them ideal candidates for few-shot learning approaches. In this project, we evaluate multiple foundation models and parameter-efficient fine-tuning (PEFT) strategies across a diverse set of cytology datasets.
We compare:
- Vision Transformers (ViTs) and Vision-Language Models (VLMs)
- Different pretraining domains (natural, biomedical, histopathology)
- Several PEFT methods (LoRA, VPT, prompt learning, adapters)
All experiments are conducted in a few-shot setting (1 to 16 samples per class).
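To illustrate the protocol, a k-shot split can be built by drawing k images per class with a fixed seed. This is a minimal sketch of the idea only; the function name and data layout are hypothetical, and the repository's own split logic may differ:

```python
import random
from collections import defaultdict

def sample_k_shot(samples, k, seed=0):
    """Pick k examples per class from (path, label) pairs.

    Hypothetical helper: illustrates the few-shot protocol only.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append(path)
    subset = []
    for label, paths in sorted(by_class.items()):
        # Guard against classes with fewer than k images
        chosen = rng.sample(paths, min(k, len(paths)))
        subset.extend((p, label) for p in chosen)
    return subset

# Example: a 16-shot split from a toy labeled list with 3 classes
data = [(f"img_{i}.png", i % 3) for i in range(100)]
few = sample_k_shot(data, k=16, seed=1)
```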
Key findings:

- LoRA consistently outperforms other PEFT methods for adapting foundation models
- Larger backbones improve performance, especially in extreme low-shot regimes
- Histopathology-pretrained models perform better in low-shot settings
- General-purpose models (e.g., CLIP) become competitive as more data is available
- Simple ensembling improves robustness and accuracy
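The ensembling referred to above can be as simple as averaging per-class probabilities from several models before taking the argmax. A minimal sketch with toy numbers, not the exact scheme used in the paper:

```python
def ensemble_mean(prob_lists):
    """Average class-probability vectors produced by several models."""
    n = len(prob_lists)
    return [sum(p[i] for p in prob_lists) / n for i in range(len(prob_lists[0]))]

def predict(prob_lists):
    """Return the class index with the highest averaged probability."""
    probs = ensemble_mean(prob_lists)
    return max(range(len(probs)), key=probs.__getitem__)

# Toy example: three models, three classes; the averaged
# probabilities decide the final label.
model_a = [0.6, 0.3, 0.1]
model_b = [0.2, 0.7, 0.1]
model_c = [0.5, 0.4, 0.1]
pred = predict([model_a, model_b, model_c])
```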
This repository integrates several existing frameworks, adapted to support new backbones and cytology datasets.
```
CoOp/
KgCoOp/
TaskRes/
Tip-Adapter/
Prompt-align/
multimodal-prompt-learning/
Cytology-fine-tuning/
```
Each submodule originates from a different repository and has been adapted for:
- additional foundation models
- unified dataset handling
- consistent few-shot evaluation protocols
| GitHub repository | 🔗 Link |
|---|---|
| CoOp | 📥 Link |
| KgCoOp | 📥 Link |
| TaskRes | 📥 Link |
| Tip-Adapter | 📥 Link |
| Prompt-align | 📥 Link |
| multimodal-prompt-learning | 📥 Link |
| Cytology-fine-tuning | 📥 Link |
Two Python environments are used:
For CoOp, multimodal-prompt-learning, TaskRes, Tip-Adapter, and Cytology-fine-tuning, we use the dassl environment.
This code is built on top of the Dassl.pytorch toolbox, so you need to install the dassl environment first. Follow the instructions there to install dassl as well as PyTorch.
After that, run pip install -r requirements.txt under Cytology_Benchmark/ to install the few extra packages required (do this with the dassl environment activated). Then you are ready to go.
For Prompt-align and KgCoOp, we use the dassl_pro environment. This code is built on top of the Dassl.ProGrad.pytorch toolbox.
After that, run pip install -r requirements.txt under Cytology_Benchmark/ to install the few extra packages required (do this with the dassl_pro environment activated). Then you are ready to go.
We evaluate on 10 public cytological datasets covering multiple organs and classification tasks.
| Dataset | 🔗 Download Link |
|---|---|
| APACC | 📥 Link |
| BCFC | 📥 Link |
| BloodMNIST | 📥 Link |
| BMCD | 📥 Link |
| BMT | 📥 Link |
| FNAC | 📥 Link |
| Herlev | 📥 Link |
| HiCervix | 📥 Link |
| MLCC | 📥 Link |
| SIPaKMeD | 📥 Link |
See DATASETS.md for:
- download links
- preprocessing details
- dataset structure
Experiments are launched through bash scripts.
These scripts define the full experimental configuration, including:
- dataset
- model / backbone
- number of shots
- seed
- training hyperparameters
- output paths
They are designed to be easily adapted to new settings and can also be used with Slurm array jobs for large-scale runs.
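For reference, each script essentially composes a training invocation from those settings. A hypothetical Python equivalent is sketched below; the entry-point name (train.py), the flag names, and the output-path layout are illustrative placeholders, not the repository's actual interface:

```python
import shlex

def build_run(dataset, backbone, shots, seed, output_root="output"):
    """Compose a training command the way the bash launch scripts do.

    Hypothetical: 'train.py' and the flag names are placeholders.
    """
    out_dir = f"{output_root}/{dataset}/{backbone}/shots_{shots}/seed_{seed}"
    cmd = (f"python train.py --dataset {dataset} --backbone {backbone} "
           f"--shots {shots} --seed {seed} --output-dir {out_dir}")
    return shlex.split(cmd)

# One run: SIPaKMeD, BiomedCLIP backbone, 16 shots, seed 1
cmd = build_run("SIPaKMeD", "BiomedCLIP", shots=16, seed=1)
```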
Example:

```bash
bash scripts/launch_run.sh
# or
bash scripts/main_ivlp.sh
```

Supported adaptation methods:

- Linear probing
- LoRA (Low Rank Adaptation)
- CoOp / CoCoOp / KgCoOp / ProGrad
- Tip-Adapter / TaskRes
- VPT (Visual Prompt Tuning)
- TPT (Textual Prompt Tuning)
- IVLP (Independent Visual Language Prompting)
All methods are adapted to work with multiple backbones: BiomedCLIP, PLIP, PubMedCLIP, QUILT, and CONCH for VLMs, and DinoBLOOM and UNI for ViTs.
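Since LoRA is the strongest PEFT method in our experiments, here is a minimal, framework-free sketch of the core idea with toy dimensions: the frozen weight W receives a low-rank update B @ A, and only A and B are trained. In practice this is applied inside the backbone's attention layers; everything below is illustrative only:

```python
def matmul(A, B):
    """Naive matrix product for small toy matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_weight(W, A, B, alpha=1.0):
    """Effective weight W + alpha * (B @ A).

    A has shape (r x d_in), B has shape (d_out x r); only they are trained.
    """
    delta = matmul(B, A)
    return [[w + alpha * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy example: 2x2 frozen weight, rank-1 update
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]        # r=1, d_in=2
B = [[0.5], [0.25]]     # d_out=2, r=1
W_eff = lora_weight(W, A, B, alpha=0.1)
```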
If you have any questions, you can contact us by email: manon.dausort@uclouvain.be