Add Zarr support #190

Open
keller-mark wants to merge 155 commits into scverse:devel from
keller-mark:keller-mark/zarr

keller-mark commented Nov 5, 2024

Fixes #91

These changes are from both me and @Artur-man

The main public-facing changes here are:

  • The ZarrAnnData class
  • read_zarr and write_zarr top-level functions
  • Support for from_Seurat(output_class="ZarrAnnData")
  • Support for from_SingleCellExperiment(output_class="ZarrAnnData")

Internally:

  • read_zarr_helpers.R is the zarr analog of read_h5ad_helpers.R
  • write_zarr_helpers.R is the zarr analog of write_h5ad_helpers.R
  • Test fixtures within inst/extdata/example.zarr (this makes the diff noisy, apologies)
  • Lots of tests:
    • test-Zarr-read.R (35 new tests)
    • test-Zarr-write.R (70)
    • test-ZarrAnnData.R (26)
    • test-h5ad-zarr.R (17)

A number of these functions emit warnings in the R console. The warnings are meant as reminders to follow up and improve the code, and they should probably be resolved because end users may not appreciate them. The tests pass despite these warnings.

Known things that are not implemented here:

  • support for recarrays
  • use of the mode parameter, mode = c("r", "r+", "a", "w", "w-", "x")
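
For reference, the mode values listed above follow the convention used by h5py (where "x" is an alias for "w-"). The sketch below shows what each value conventionally implies for a store that does or does not already exist. It is written in Python with the standard library only. The function name `resolve_mode` and the returned action labels are hypothetical and chosen for illustration. How anndataR will eventually interpret these values is an assumption, not something this PR defines.

```python
# Conventional access-mode semantics (h5py style). The names below are
# illustrative only and are not part of anndataR, h5py, or zarr-python.
VALID_MODES = ("r", "r+", "a", "w", "w-", "x")


def resolve_mode(mode: str, exists: bool) -> str:
    """Return the action a given mode implies for an existing or missing store."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")

    if mode in ("r", "r+"):
        # Both read modes require the store to exist already.
        if not exists:
            raise FileNotFoundError("store must exist for mode " + mode)
        return "read_only" if mode == "r" else "read_write"

    if mode == "a":
        # Append: open read/write if present, otherwise create.
        return "read_write" if exists else "create"

    if mode == "w":
        # Write: always start from an empty store.
        return "overwrite" if exists else "create"

    # "w-" and "x": create only, never touch an existing store.
    if exists:
        raise FileExistsError("store already exists for mode " + mode)
    return "create"


for m in VALID_MODES:
    try:
        print(m, "on a missing store:", resolve_mode(m, exists=False))
    except FileNotFoundError as err:
        print(m, "on a missing store: error:", err)
```

Keeping the decision in one pure function makes it easy to test every mode against both cases before wiring it into any reader or writer.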

Member

rcannood left a comment

Fantastic work @keller-mark and @Artur-man!

I went through the PR a first time and left some minor comments. Next, I will review the code by running it a couple of times :)

Comment thread R/read_zarr.R
Comment thread R/read_zarr_helpers.R Outdated
Comment thread inst/extdata/example2.zarr/.zgroup Outdated
@lazappi lazappi mentioned this pull request Apr 15, 2026
6 tasks
@lazappi lazappi marked this pull request as ready for review April 15, 2026 14:57
@lazappi lazappi requested review from LouiseDck and rcannood April 15, 2026 14:57
@lazappi lazappi changed the title Zarr support Add Zarr support Apr 16, 2026

Artur-man commented Apr 17, 2026

@lazappi I think I did not check this before finishing the hackathon. Do you set the default Zarr version for the write methods now? Or do you write only Zarr v2 or only v3?

Again, here is the parameter from anndata that sets the default behaviour:
https://github.com/keller-mark/anndataR/blob/528eff0024f73b39c5147d5eb15e7ad8b9b7bade/inst/scripts/example_files.py#L142

It would be nice to do that too if we are not rushing the Bioc release.

Artur-man commented Apr 19, 2026

OK, I took another look at it. The roundtrip tests and write tests do not cover Zarr v3, but that's fine for now, I think. What is more important is that we can read both versions. Since this is going to stay as the devel branch for another 6 months, I can open another PR later. Let's not overdo it.

UPDATE: I wrote a version of test-Zarr-write.R that loops over both v2 and v3 ... looks like it's working.
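
For context on reading both versions: in a directory store, Zarr v2 keeps node metadata in `.zgroup` or `.zarray` files, while Zarr v3 keeps it in a single `zarr.json` file, and both record a `zarr_format` field. The sketch below checks which format a store's root node uses. It is written in Python with the standard library only. The helper name `zarr_format_version` is hypothetical, and this is not how anndataR or {Rarr} detect the version.

```python
import json
import os
import tempfile


def zarr_format_version(store_path):
    """Return the zarr_format of a directory store's root node, or None.

    Hypothetical helper: checks for v3 metadata (zarr.json) first, then
    falls back to v2 metadata (.zgroup for groups, .zarray for arrays).
    """
    for name in ("zarr.json", ".zgroup", ".zarray"):
        meta_path = os.path.join(store_path, name)
        if os.path.isfile(meta_path):
            with open(meta_path) as fh:
                return json.load(fh).get("zarr_format")
    return None


# Build two tiny root groups, one per format, and inspect them.
with tempfile.TemporaryDirectory() as tmp:
    v2_store = os.path.join(tmp, "v2.zarr")
    v3_store = os.path.join(tmp, "v3.zarr")
    os.makedirs(v2_store)
    os.makedirs(v3_store)

    with open(os.path.join(v2_store, ".zgroup"), "w") as fh:
        json.dump({"zarr_format": 2}, fh)
    with open(os.path.join(v3_store, "zarr.json"), "w") as fh:
        json.dump({"zarr_format": 3, "node_type": "group"}, fh)

    print("v2 store reports format", zarr_format_version(v2_store))
    print("v3 store reports format", zarr_format_version(v3_store))
```

A test suite that loops over both versions could use a check like this to confirm that each written store actually has the format it was asked for.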

Collaborator

lazappi commented Apr 20, 2026

I thought something missing from {Rarr} was blocking writing v3, but I forgot to open an issue and come back to it. If you think the extra tests are passing, do you want to push them here?

Development

Successfully merging this pull request may close these issues.

Implement Zarr backend

6 participants