OME-Arrow uses Open Microscopy Environment (OME) specifications through Apache Arrow for fast, queryable, and language-agnostic bioimage data.
Images are often left out of the data model: referenced, but excluded from databases.
OME-Arrow brings images back into the story.
OME Arrow enables image data to be stored alongside metadata or derived data such as single-cell morphology features. Images in OME Arrow are composed of multilayer structs so they may be stored as values within tables. This means you can store, query, and build relationships on data from the same location using any system that is compatible with Apache Arrow (including Parquet) through common data interfaces (such as SQL and DuckDB).
This package is intentionally dedicated to working at a per-image level rather than large-batch handling (though it may be used for those purposes by users or in other projects).
- For visualizing OME Arrow and OME Parquet data in Napari, please see the napari-ome-arrow Napari plugin.
- For more comprehensive handling of many images and features in the context of the OME Parquet format, please see the CytoDataFrame project (and relevant example notebook).
Install OME Arrow from PyPI or from source:
# install from pypi
pip install ome-arrow
# install directly from source
pip install git+https://github.com/wayscience/ome-arrow.git
See below for a quick start guide. Please also reference an example notebook: Learning to fly with OME-Arrow.
from ome_arrow import OMEArrow
# Ingest a tif image through a convenient OME Arrow class
# We can also ingest OME-Zarr or NumPy arrays.
oa_image = OMEArrow(
data="your_image.tif"
)
# Access the OME Arrow struct itself
# (compatible with Arrow-compliant data storage).
oa_image.data
# Show information about the image.
oa_image.info()
# Display the image with matplotlib.
oa_image.view(how="matplotlib")
# Display the image with pyvista
# (great for ZYX 3D images; install extras: `pip install 'ome-arrow[viz]'`).
oa_image.view(how="pyvista")
# Export to OME-Parquet.
# We can also export OME-TIFF, OME-Zarr or NumPy arrays.
oa_image.export(how="ome-parquet", out="your_image.ome.parquet")
# Export to Vortex (install extras: `pip install 'ome-arrow[vortex]'`).
oa_image.export(how="vortex", out="your_image.vortex")
For tensor-focused workflows (PyTorch/JAX), use tensor_view and DLPack export.
from ome_arrow import OMEArrow
oa = OMEArrow("your_image.ome.parquet")
# Spatial ROI per plane (YX convention)
view = oa.tensor_view(t=0, z=0, roi=(32, 32, 128, 128), layout="CYX")
# Convenience 3D ROI (x, y, z, w, h, d)
view3d = oa.tensor_view(roi3d=(32, 32, 2, 128, 128, 4), layout="TZCYX")
# 3D tiled iteration over (z, y, x)
for cap in view3d.iter_tiles_3d(tile_size=(2, 64, 64), mode="numpy"):
    pass
Lazy scan-style convention (Polars-like):
from ome_arrow import OMEArrow
oa = OMEArrow.scan("your_image.ome.parquet") # deferred load
# First: queue lazy spatial/index slicing
lazy_crop = oa.slice_lazy(0, 512, 0, 512).slice_lazy(64, 256, 64, 256)
cropped = lazy_crop.collect()
# slice_lazy returns a new OMEArrow plan; collect does not mutate `oa`.
# Build tensor_view from the returned sliced object to reuse that plan.
tensor_view_result = cropped.tensor_view(t=0, z=slice(0, 4), roi=(0, 0, 192, 192))
arr = tensor_view_result.to_numpy()
Advanced options:
- chunk_policy="auto" | "combine" | "keep" controls ChunkedArray handling.
- channel_policy="error" | "first" controls behavior when dropping C from layout.
See full docs: docs/src/dlpack.md
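The lazy scan-style example above queues two crops and notes that `slice_lazy` returns a new plan while `collect` leaves the original object untouched. The sketch below illustrates that pattern in plain Python. It is not the ome-arrow implementation: `CropPlan` is a hypothetical stand-in, and both the `(x_min, x_max, y_min, y_max)` argument order and the rule that each crop is relative to the previous one are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CropPlan:
    """Hypothetical deferred-crop plan; not part of ome-arrow."""

    # Queued crops as (x_min, x_max, y_min, y_max); this order is assumed.
    steps: tuple = ()

    def slice_lazy(self, x_min, x_max, y_min, y_max):
        # Return a new plan with one more queued crop; self is unchanged.
        return CropPlan(self.steps + ((x_min, x_max, y_min, y_max),))

    def collect(self):
        # Resolve the queued crops into absolute bounds on the source image.
        # Each crop is interpreted relative to the result of the previous one.
        x0 = y0 = 0
        x1 = y1 = None  # None means no crop has been applied yet
        for x_min, x_max, y_min, y_max in self.steps:
            new_x0, new_x1 = x0 + x_min, x0 + x_max
            new_y0, new_y1 = y0 + y_min, y0 + y_max
            if x1 is not None:
                # A later crop cannot extend past the earlier one.
                new_x1 = min(new_x1, x1)
                new_y1 = min(new_y1, y1)
            x0, x1, y0, y1 = new_x0, new_x1, new_y0, new_y1
        return (x0, x1, y0, y1)


base = CropPlan()
plan = base.slice_lazy(0, 512, 0, 512).slice_lazy(64, 256, 64, 256)
bounds = plan.collect()
```

Under these assumptions the two queued crops resolve to a 192 by 192 region, which is consistent with the `roi=(0, 0, 192, 192)` used on the collected object in the example above, and `base` still has no queued steps after `collect` runs.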
Use the lightweight benchmark utility in benchmarks/ to compare lazy tensor
read paths (TIFF source-backed, Parquet planes, Parquet chunks):
uv run python benchmarks/benchmark_lazy_tensor.py --repeats 5 --warmup 1
Notes:
- This benchmark is for local iteration and relative comparisons.
- It is not part of CI pass/fail checks.
- CI also runs this benchmark in a dedicated benchmark_canary job and uploads benchmark-results.json as a workflow artifact.
Recalibrating benchmarks/ci-baseline.json:
- Run the benchmark on main a few times (for example 3-5 runs):
  uv run python benchmarks/benchmark_lazy_tensor.py --repeats 7 --warmup 2 --json-out benchmark-results.json
- For each case, collect the observed median_ms values.
- Update benchmarks/ci-baseline.json with stable medians from those runs (prefer a conservative value near the slower side, not the fastest sample).
- Keep CI canary tolerance (regression_factor + absolute_slack_ms) unchanged unless you have repeated false positives.
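The recalibration steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's tooling: the helper names `conservative_baseline` and `is_regression` are hypothetical, taking the upper median is only one way to lean toward the slower side, and the way `regression_factor` and `absolute_slack_ms` combine into a threshold is an assumption about the canary check rather than its documented behavior. The run values are made-up placeholders, not measurements.

```python
from statistics import median_high


def conservative_baseline(median_ms_runs):
    """Pick a baseline from several runs' median_ms values.

    median_high returns the upper of the two middle values for an
    even-length sample, so the result leans toward the slower side
    and is always a value that some run actually produced.
    """
    if not median_ms_runs:
        raise ValueError("need at least one run")
    return median_high(median_ms_runs)


def is_regression(observed_ms, baseline_ms, regression_factor, absolute_slack_ms):
    """Assumed canary rule: flag only when the observed median exceeds
    baseline_ms * regression_factor + absolute_slack_ms."""
    threshold = baseline_ms * regression_factor + absolute_slack_ms
    return observed_ms > threshold


# Placeholder median_ms values for one benchmark case across four runs.
runs = [12.1, 11.8, 12.6, 12.3]
baseline = conservative_baseline(runs)
```

Combining a multiplicative factor with an absolute slack, as assumed here, keeps very fast cases from tripping the check on a fraction of a millisecond of noise while still scaling the allowance for slower cases.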
Please see our contributing documentation for more details on contributions, development, and testing.
OME Arrow is used or inspired by the following projects; check them out!
- napari-ome-arrow: enables you to view OME Arrow and related images.
- nViz: focuses on ingesting and visualizing various 3D image data.
- CytoDataFrame: provides a DataFrame-like experience for viewing feature and microscopy image data within Jupyter notebook interfaces and creating OME Parquet files.
- coSMicQC: performs quality control on microscopy feature datasets, visualized using CytoDataFrames.


