⚡ Improve Metadata Handling for WSI Readers#1001
Conversation
- Try OpenSlide Reader for tiff files first
- Fallback to calculating objective power from mpp
Codecov Report

✅ All modified and coverable lines are covered by tests.

```text
@@            Coverage Diff             @@
##           develop    #1001     +/-   ##
==========================================
  Coverage    99.53%   99.53%
==========================================
  Files           83       83
  Lines        11353    11397     +44
  Branches      1493     1499      +6
==========================================
+ Hits         11300    11344     +44
  Misses          28       28
  Partials        25       25
```
OpenSlideWSIReader for tiff Files First
Pull request overview
This PR standardizes metadata inference (objective power and MPP) across WSI readers by introducing a shared inference helper and updating multiple reader implementations and tests. It also adjusts reader-selection priority so TIFF inputs are attempted via OpenSlide first.
Changes:
- Add a centralized `WSIReader._estimate_mpp_objective_power()` and use it across multiple readers.
- Update reader selection to try OpenSlide first for `.tif`/`.tiff`.
- Update remote samples and expand tests for DICOM/TIFF metadata inference and warning behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `tiatoolbox/wsicore/wsireader.py` | Adds OpenSlide-first selection for TIFF and introduces centralized MPP/objective-power inference used across readers. |
| `tiatoolbox/utils/misc.py` | Broadens typing for `objective_power2mpp` to accept `np.ndarray`. |
| `tiatoolbox/data/remote_samples.yaml` | Updates DICOM sample filename and adds a second DICOM sample entry for new metadata tests. |
| `tests/test_wsireader.py` | Adds/updates DICOM metadata assertions and adds a new test for objective-power presence/inference. |
| `tests/test_tiffreader.py` | Updates OME-TIFF metadata tests and adds a warning-logging test for missing metadata. |
measty
left a comment
This introduces a regression on that ome.tiff you sent me a while ago to check the multi-channel reader on (20240625_144804_1_08TcnB_Kidney_panel_June_RP_52top51bottom.ome.tiff).
In develop, the pyramid is seen; whereas opening the same slide in this PR, no pyramid is seen, so it is slow and uses loads of memory, and the slide doesn't display right (it seems to be black & white).
@measty Commit 4c0ba9a resolves this issue. I have tested the WSI Registration notebook and the mIF images; both work fine now. However, it still fails on the MONKEY challenge image.
- Update wsireader mpp checks
## TIAToolbox v2.0.0 (2026-03-11)

### ✨ Major Updates and Feature Improvements

#### ⚙️ Engine Redesign (PR #578)

TIAToolbox 2.0.0 introduces a completely re-engineered inference engine designed for significant performance, scalability, and memory-efficiency improvements.

#### Key Enhancements

- A modern processing stack built on **Dask** (parallel/distributed execution) and **Zarr** (chunked, out-of-core storage)
- **Standardised output formats** across all engines:
  - Python `dict`
  - **Zarr**
  - **AnnotationStore** (SQLite-backed)
  - **QuPath JSON**
- Cleaner runtime behavior with reduced warning noise and a unified progress bar
- More predictable memory usage through chunked streaming
- Broader test coverage across engine components

### 🗺️ Improved QuPath Support

Enhancements include:

- Better handling of **GeoJSON**
- Support for **multipoint geometries** (#841)
- Improved semantic output helpers:
  - `dict_to_store_semantic_segmentor` (#926)
  - OME-TIFF probability overlays (#929)

### 🔬 New Nucleus Detection Engine

A dedicated nucleus detection pipeline has been added, built on the redesigned engine for improved accuracy and efficient large-scale processing.

#### 🧠 KongNet Model Family

TIAToolbox 2.0.0 introduces **KongNet**, a high-performance architecture that achieved top results across multiple international challenges:

- 🥇 **1st place: MONKEY Challenge (overall detection)**
- 🥇 **1st place: MIDOG (mitosis detection)**
- ⭐ Top-tier performance on **PUMA**

Multiple pretrained variants are available (CoNIC, PanNuke, MONKEY, PUMA, MIDOG), each with standardised IO configurations.

### 🧬 Expanded Foundation Model Support

Additional foundation models are now supported (#906), broadening the range of high-capacity architectures available for feature extraction and downstream tasks.

### 🖼️ SAM Segmentation in TIAViz

TIAViz now integrates Meta’s Segment Anything Model (SAM), enabling:

- Interactive segmentation
- Rapid region extraction
- Exploratory annotation workflows

Simplified SAM usage (#968) streamlines its integration into analysis pipelines.

### 🧩 Enhanced WSIReader & Metadata Handling

Major improvements include:

- More robust cross-vendor **metadata extraction** (#1001)
- **Multichannel image support** (PR #825) for immunofluorescence and non-RGB modalities
- Simplified Windows installation using `openslide-bin` (no manual DLL steps)
- macOS Tileserver fix (#976)
- Improved DICOM reading (#934)

### ☁️ New Cloud-Native Reader: FsspecJSONWSIReader (PR #897)

A new reader supporting **fsspec-compatible filesystems**, enabling seamless access to WSIs stored on:

- S3
- GCS
- Azure
- HPC clusters
- Any fsspec-supported backend

This enables cloud-native and distributed data workflows. Contributed by @aacic

### 🤗 Pretrained Models Migrated to Hugging Face

All pretrained models and sample assets have been migrated (#945, #983), improving:

- Download reliability
- Versioning and reproducibility
- Caching and CI integration
- Licensing clarity per model family

### 🛡️ Security, Compatibility & Tooling

#### 🔐 Security & Dependency Updates

- Dependency upgrades
- Internal security improvements
- Explicit workflow permissions added (#1021, #1023)

#### 🐍 Python Version Support

- **Dropped:** Python **3.9**
- **Added:** Python **3.13**
- **Supported:** Python 3.10–3.13
- Updated CUDA wheel source to **cu126**

#### 🛠️ Developer Tooling & CI/CD

- Expanded **mypy** type-checking coverage (#912, #931, #935, #951)
- Updated pre-commit hooks and general formatting
- CI uses **CPU-only PyTorch** for faster, more reliable builds (#974, #979)
- Updated pip install workflow (#1013)
- Added new **Python 3.13 Docker images** (#1014, #1019)

### 🧹 Bug Fixes & Stability Improvements

- Fixed multi-GPU behaviour with `torch.compile` (#923)
- Fixed DICOM reading issue (#934)
- Fixed annotation contour handling with holes (#956)
- Fixed consecutive annotation load bug (#927)
- Fixed SCCNN model issues (#970)
- Fixed MapDe `dist_filter` shape issue (#914)
- Improved notebook reliability on Colab (#1026–#1030)
- macOS TileServer issues resolved (#976)

### 🧭 Migration Guide for Users

#### 🔄 Updating from 1.x to 2.0.0

#### Update calls: replace `.predict()` with `.run()`

```python
# Old
results = segmentor.predict(imgs=[...], ioconfig=config)

# New
results = segmentor.run(images=[...], ioconfig=config)
```

#### Use `patch_mode`: replace `mode="patch"` with `patch_mode=True`, and `mode="tile"` or `mode="wsi"` with `patch_mode=False`

```python
# Old
results = segmentor.predict(imgs=[...], mode="patch", ioconfig=config)

# New
results = segmentor.run(images=[...], patch_mode=True, ioconfig=config)
```

```python
# Old
results = segmentor.predict(imgs=[...], mode="wsi", ioconfig=config)

# New
results = segmentor.run(images=[...], patch_mode=False, ioconfig=config)
```

#### Use the new I/O configs

```python
from tiatoolbox.models.engine.io_config import IOSegmentorConfig

config = IOSegmentorConfig(
    patch_input_shape=(256, 256),
    stride_shape=(240, 240),
    input_resolutions=[{"resolution": 0.25, "units": "mpp"}],
    save_resolution={"units": "baseline", "resolution": 1.0},
)
```

#### Specify the output format

```python
results = segmentor.run(
    images=[...],
    ioconfig=ioconfig,
    output_type="zarr",  # or "dict", "annotationstore", "qupath"
    save_dir="outputs/",
)
```

#### Update imports

- `tiatoolbox.typing` → `tiatoolbox.type_hints`

#### Install requirements

- Python **3.10+** required
- On Windows: install OpenSlide via `pip install openslide-bin`

**Full Changelog:** v1.6.0...v2.0.0

---------

Signed-off-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: measty <20169086+measty@users.noreply.github.com>
Co-authored-by: Jiaqi-Lv <60471431+Jiaqi-Lv@users.noreply.github.com>
Co-authored-by: adamshephard <39619155+adamshephard@users.noreply.github.com>
Co-authored-by: Mostafa Jahanifar <74412979+mostafajahanifar@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Yijie Zhu <120978607+YijieZhu15@users.noreply.github.com>
Co-authored-by: Aleksandar Acic <32873451+aacic@users.noreply.github.com>
Co-authored-by: Abdol A <u2271662@live.warwick.ac.uk>
Co-authored-by: Abishek <abishekraj6797@gmail.com>
Co-authored-by: behnazelhaminia <30952176+behnazelhaminia@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Adam Shephard <adam.shephard@warwick.ac.uk>
Co-authored-by: gozdeg <gozdegunesli@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: mbasheer04 <78800844+mbasheer04@users.noreply.github.com>
Co-authored-by: vqdang <24943262+vqdang@users.noreply.github.com>
🔖 Release 2.0.0 (#1031)




Summary
This PR standardises and improves metadata inference across all WSI readers by introducing a unified mechanism for estimating missing objective power and MPP. It updates all major reader implementations (TIFF, DICOM, OpenSlide, JP2, NGFF, fsspec), fixes reader‑selection ordering, and adds extensive tests to validate inference behaviour and warnings. New sample data is included to support expanded DICOM metadata coverage.
🔑 Key Changes
1. Centralised Metadata Inference
Introduces `WSIReader._estimate_mpp_objective_power()` as the shared method for inferring missing objective power and MPP.

2. Unified Metadata Handling Across Readers
All major WSI readers now use the central inference method:
- `TIFFWSIReader`
- `DICOMWSIReader`
- `OpenSlideWSIReader`
- `JP2WSIReader`
- `NGFFWSIReader`
- `FsspecJsonWSIReader`

This ensures consistent behaviour when metadata is missing or partially defined.
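The shared helper's behaviour can be sketched roughly as follows. This is an illustrative reconstruction, not the actual tiatoolbox implementation: the function name mirrors the PR's `_estimate_mpp_objective_power`, and the common mpp ≈ 10 / objective-power approximation is assumed.

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def estimate_mpp_objective_power(mpp=None, objective_power=None):
    """Fill in whichever of (mpp, objective_power) is missing, if possible.

    Assumes the mpp ~= 10 / objective-power heuristic and logs a warning
    whenever a value is inferred rather than read from slide metadata.
    """
    if mpp is None and objective_power is not None:
        # Infer an isotropic (x, y) mpp pair from the objective power.
        mpp = (10.0 / objective_power, 10.0 / objective_power)
        logger.warning("Estimating mpp from objective power.")
    elif objective_power is None and mpp is not None:
        # Infer objective power from the mean of the (x, y) mpp values.
        objective_power = 10.0 / float(np.mean(mpp))
        logger.warning("Estimating objective power from mpp.")
    elif mpp is None and objective_power is None:
        logger.warning("Unable to determine scale: no mpp or objective power.")
    return mpp, objective_power
```

With a shared helper like this, every reader that discovers only one of the two values reports the other consistently instead of each reader duplicating (and subtly diverging in) the inference logic.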
3. Improved Reader Selection Logic
Introduces `try_openslide()` and updates selection priority so TIFF files are first attempted via OpenSlide.

4. Expanded and Strengthened Test Coverage
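The selection priority can be illustrated with a small sketch. Note this is a hypothetical stand-in, not the real `WSIReader.open` dispatch: `select_reader` and the `openslide_can_open` probe stand in for the actual resolution code and `try_openslide()`.

```python
from pathlib import Path


def select_reader(path, openslide_can_open):
    """Pick a reader backend for `path` (illustrative only).

    For .tif/.tiff inputs, OpenSlide is attempted first; the generic
    TIFF reader is kept as a fallback for files OpenSlide rejects.
    """
    suffix = Path(path).suffix.lower()
    if suffix in {".tif", ".tiff"}:
        if openslide_can_open(path):
            return "OpenSlideWSIReader"
        return "TIFFWSIReader"  # fallback, e.g. OME-TIFF pyramids
    if suffix == ".jp2":
        return "JP2WSIReader"
    return "other"


# A TIFF that OpenSlide recognises is routed to OpenSlide;
# otherwise we fall back to the TIFF reader.
print(select_reader("slide.tiff", lambda p: True))   # OpenSlideWSIReader
print(select_reader("slide.tiff", lambda p: False))  # TIFFWSIReader
```

The fallback branch matters for files like the multi-channel OME-TIFF in the review discussion above, where OpenSlide either fails to open the file or loses the pyramid structure.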
New and updated tests now cover:
- `dicom-2` sample with known objective/MPP values

Assertions have been updated to reflect the new inference logic.
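The warning-behaviour tests could look roughly like the following pytest sketch using the `caplog` fixture. Here `estimate_power` is a hypothetical stand-in defined inline, not the toolbox's actual code, and the logger name is invented for the example.

```python
import logging


def estimate_power(mpp):
    """Hypothetical stand-in for the reader's inference step."""
    if mpp is None:
        logging.getLogger("wsireader_demo").warning(
            "Unable to determine objective power from metadata."
        )
        return None
    return 10 / mpp  # common mpp -> objective-power approximation


def test_warns_when_metadata_missing(caplog):
    """Missing metadata should produce a warning, mirroring the PR's tests."""
    with caplog.at_level(logging.WARNING, logger="wsireader_demo"):
        assert estimate_power(None) is None
    assert "objective power" in caplog.text
```

Asserting on the captured log text (rather than just the return value) pins down the user-facing behaviour: a silent `None` and a warned `None` are different contracts.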
5. Updated Remote Sample Data
Replaces `CMU-1.dicom.zip` with `CMU-1-Small-Region.dicom.zip`. Adds a `dicom-2` sample (`JP2K-33003-1.zip`) to support metadata-specific tests.

6. Cleanup and Minor Fixes
Minor cleanups in `TransformedWSIReader` and typing updates for `objective_power2mpp`.

This PR resolves Jupyter Notebook 10 – WSI Reading (#998) and KongNet Notebook for MONKEY dataset (#987).
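For reference, the `objective_power2mpp` conversion is based on the widely used mpp ≈ 10 / objective-power heuristic (e.g. a 40x objective corresponds to roughly 0.25 mpp). A minimal sketch under that assumption; with the broadened typing, both scalars and `np.ndarray` inputs are accepted:

```python
import numpy as np


def objective_power2mpp(objective_power):
    """Approximate microns-per-pixel from objective power (mpp ~= 10 / power)."""
    return 10.0 / np.asarray(objective_power, dtype=float)


def mpp2objective_power(mpp):
    """Inverse approximation: objective power ~= 10 / mpp."""
    return 10.0 / np.asarray(mpp, dtype=float)


# Scalar and array inputs both work thanks to np.asarray.
print(objective_power2mpp(40))                 # ~0.25 mpp
print(mpp2objective_power(np.array([0.25, 0.5])))  # ~[40. 20.]
```

This heuristic is only an approximation: true resolution depends on the scanner's sensor and optics, which is exactly why inferred values are logged with a warning rather than treated as authoritative metadata.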