Generalize deploy steps and add TOA xarray data structure by MariaRocco · Pull Request #91 · emit-sds/SpecTf

MariaRocco · 2026-02-07T01:13:08Z

Summary of changes:

Pulls out and adds functions for smaller general workflows that can be reused
adds a new way to start from a TOA xarray dataset and run inference as opposed to providing radiance asset paths.

Specific changes

spectf/dataset.py

New ToaDataset class that initializes the settings/parameters needed for downstream workflows
RasterDatasetTOA inherits from ToaDataset
New XarrayDatasetTOA class that inherits from ToaDataset that takes an input of an xarray top of atmosphere

spectf_cloud/deploy/infra_setup.py

adds a new open_model_arch_spec that parses and loads the model spec

spectf_cloud/deploy/deploy_pt.py

make the outfp input optional - if nothing is provided - no file will be saved
return the inference mask from the deploy_pt workflow
move core model setup to initialize_pt_model function
move core model run to run_pt_inference_model function
Uses the new initialize_pt_model and run_pt_inference_model in the deplot_pt function
Adds a new deploy_pt_from_toa function that takes an xarray dataset as an input instead of filepaths
uses the more generic open_model_arch_spec

spectf_cloud/deploy/deploy_trt.py

make the outfp input optional - if nothing is provided - no file will be saved
return the inference mask from the deploy_pt workflow
move core model setup + run to run_trt_inference_model function
Uses the new run_trt_inference_model in the deploy_trt function

other notes

I think there are some further optimizations we can do such as converting the pt/trt models into classes - they do similar things and this can be further improved using class structure. I'll make an issue for some of the ideas I had as I was looking through.

Let me know your thoughts!

pgbrodrick · 2026-02-07T03:22:38Z

Thanks for this PR @MariaRocco .

General thoughts below.

xarray stuff:
Presently xarray isn't a dependency - meaning to accept these changes we'd need to add it (modify the .toml). Before we go there though, can you help me on the requirement/need? My quick read is that the class addition here is basically a thin wrapper to support a native xarray input...but it seems this could be equally served by just casting when providing the input. I'm curious if I missed something or if you had other intents.

deploy modifications:
I see the desire here, and this makes sense. Approach-wise, my gut would be to add a new function that returns the array-based version, and then just call this inside the current deploy functions and add the output file write - this would mitigate downstream changes to currently operating code, while still giving the same functionality. Curious on @michaelkiper's thoughts though.

On style:
I don't believe we have an enforced or stated styleguide recommendation for this repo, so your changes are fair game. However they do make the PR harder to review as it's more challenging to isolate actual code changes. I would suggest minimizing these when possible to facilitate faster reviews.

MariaRocco · 2026-02-09T16:55:55Z

Hey @pgbrodrick - thanks of the feedback!

for the xarray portion:
we could change this to expect a numpy array instead of xarray - let me know if that's your preference and I will update. I went with xarray because that's the format our data is in and is a fairly common format for data cubes/data arrays that allows dimensions to be more descriptive - which would be nice for instance for ensuring access to the right dimension order when unpacking /reshaping the toa data. Numpy arrays will work too.

for the deploy portions:
the implementation in this PR shouldn't impact any downstream usage/workflows - the input format and outputs are all the same - just if no output filepath is provided - no file will get saved. Let me know if I'm missing something - but the way this is written now, any existing downstream usage of this code will have the same outputs.

on style: yeah that's definitely an oops - just a habit but you're right it does make review more cumbersome. I'll avoid that in the future.

michaelkiper · 2026-02-09T22:29:54Z

 import numpy as np
 from osgeo import gdal
+
+from spectf_cloud.deploy.infra_setup import open_model_arch_spec


[Nitpick] Please move this import to be with the other spectf_cloud imports. (ideally here)

michaelkiper · 2026-02-09T22:43:44Z

+    return padded_batch
+
+
+def run_trt_inference_model(dataset, engine, inference, device_):


[Question] Is this function part of your pipeline? I don't see any reference to the XArray dataset in this change.

If it is, I'd recommend adding it in.

michaelkiper · 2026-02-09T23:09:29Z

@MariaRocco Thank you for submitting a PR! We appreciate contributions.
Do you have a specific reason as to why returning a 2D matrix would be a useful contribution for you? Do you have some supplementary subroutine that would ingest this mask as a 2D matrix?

On XArray:
Yes, if you're able to make it use Numpy arrays as well, that'd be preferred. It'd help us use one less dependancy.

On Deployment Modifications:
@pgbrodrick yes, I agree that there is likely a less-invasive solution to getting the inputs from XArray and suppressing a file output, but I don't think that this implementation is unreasonable. If we're valuing PR turnaround time, it's looking fine to me. Adding in a new CLI flag/function parameter would be better, but not a killer if we value time.

@MariaRocco could you walk me through exactly how you use these functions? What is the parent process and what is the output subroutine you're looking to improve with these changes? That'd help me get a bigger picture. Thanks.

On Style:
Yes, I'd probably prefer to get rid of all the style related changes - they crowed out the intention of the PR. However, they're not a killer.

Ideally, we'd use an idealized version of black formatting, but for now, a good rule is simply to make it legible.

michaelkiper · 2026-02-09T23:10:08Z

I've left some comments on the PR. Once those are implemented, I'll test it to make sure functionality is the same.

michaelkiper · 2026-02-09T23:12:17Z

+    def __init__(
+        self,
+        toa: xr.DataArray,
+        rm_bands: List[List[int]] | None = None,


This type hint format made the 3.9 build fail - please use Optional instead.

====================================================================== ERROR: test_train_cli_commands (unittest.loader._FailedTest) ---------------------------------------------------------------------- ImportError: Failed to import test module: test_train_cli_commands Traceback (most recent call last): File "/opt/hostedtoolcache/Python/3.9.25/x64/lib/python3.9/unittest/loader.py", line 436, in _find_test_path module = self._get_module_from_name(name) File "/opt/hostedtoolcache/Python/3.9.25/x64/lib/python3.9/unittest/loader.py", line 377, in _get_module_from_name __import__(name) File "/home/runner/work/SpecTf/SpecTf/tests/test_train_cli_commands.py", line 12, in <module> from spectf_cloud.train import train File "/home/runner/work/SpecTf/SpecTf/spectf_cloud/train.py", line 22, in <module> from spectf.dataset import SpectraDataset File "/home/runner/work/SpecTf/SpecTf/spectf/dataset.py", line 137, in <module> class XarrayDatasetTOA(ToaDataset): File "/home/runner/work/SpecTf/SpecTf/spectf/dataset.py", line 146, in XarrayDatasetTOA rm_bands: List[List[int]] | None = None, TypeError: unsupported operand type(s) for |: '_GenericAlias' and 'NoneType'

MariaRocco · 2026-02-10T01:44:58Z

Excellent - thank you both for the review!

I am actually about to leave on 2 week vacation - so I just wanted to open the PR for @jakehlee (and y'all) to review while I'm out. I'm so happy you were able to take a look quickly :) but I won't be able to make these updates until I am back. Sorry about that.

In the meantime - our workflow is:

As we process tanager radiance data, we have a method for calculating TOA as an xarray dataset
this TOA (as an xarray dataset at the moment) we pass to deploy_pt_from_toa along with weights that our ML scientist has generated using tanager training datasets
we take the 2d array inference output and then save it as a cloud optimized geotiff, both ortho-rectified and unotho'd- so we do something very similar to what is done with the make_geotiff method - but returning the dataset allows us to add band definitions/other metadata we typically add to our geotiffs - so it just allows us to have slightly more control over the format + metadata in the output files.

One other note related to xarray - another benefit it adds - is instead of the band definitions being passed as a separate input variable - they can be part of the dataset - which just ensures the bands/toa dataset are the right sizes + aligned. We can still pass this data in as 2 separate numpy arrays, but I think there are benefits to using an xarray dataset.

MariaRocco · 2026-02-25T19:07:58Z

Alrighty - I am back from vacation - thanks for your patience. I believe I addressed all the comments - let me know if I missed anything though

michaelkiper · 2026-02-26T01:23:58Z

Alrighty - I am back from vacation - thanks for your patience. I believe I addressed all the comments - let me know if I missed anything though

@MariaRocco sweet! Thanks for the commits. I'll take a look.

MariaRocco · 2026-03-11T16:59:07Z

@michaelkiper and @pgbrodrick just wanted to touch base on this - have you been able to take a look / test these updates?

michaelkiper · 2026-03-11T17:14:49Z

@michaelkiper and @pgbrodrick just wanted to touch base on this - have you been able to take a look / test these updates?

@MariaRocco I haven't been able to yet. I can get them done by Friday for you though.

michaelkiper · 2026-03-17T20:29:07Z

@MariaRocco I was able to test the new commit you made. There are 3 critical changes that do have to be updated before we can merge this PR - should be very simple, I laid them out in these 3 comments:

Once those changes are made, I'm fine with approving this PR.

Hash showing equivalence between generating a tif with the old dev branch using PT/TRT, and this new PR branch using PT/TRT:

(SpecTf) [<> SpecTf]$ md5sum new-trt.tif
7b9840abc02f39cece7e91c773973bfa  new-trt.tif
(SpecTf) [<> SpecTf]$ md5sum old-trt.tif
7b9840abc02f39cece7e91c773973bfa  old-trt.tif
(SpecTf) [<>SpecTf]$ md5sum old-pt.tif
7b9840abc02f39cece7e91c773973bfa  old-pt.tif
(SpecTf) [<> SpecTf]$ md5sum new-pt.tif
7b9840abc02f39cece7e91c773973bfa  new-pt.tif

MariaRocco · 2026-03-18T21:35:57Z

Excellent - just made those changes :) thanks for the review!

michaelkiper · 2026-03-20T01:03:50Z

Excellent - just made those changes :) thanks for the review!

Perfect! All good with me. I can approve - @pgbrodrick do you sign off as well?

pgbrodrick · 2026-03-31T19:07:12Z

Thanks for the careful review @michaelkiper , and for the PR @MariaRocco !

Tests seem to pass with current ops workflows without any issues as well. Merging once workflow completes.

pgbrodrick · 2026-04-01T03:49:38Z

@MariaRocco , I opened a tiny PR to clear the 3.9 typing issue flagged by the test. Once you accept we can merge this in.

typing change

jakehlee · 2026-05-23T05:06:25Z

@MariaRocco apologies for leaving this open for as long as it has, let's get this merged in ASAP.

EDIT: we'll wait for regular business hours on the 26th to avoid merging on a late Friday night :)

mariarocco added 3 commits January 29, 2026 12:17

initial mods/rearrange of methods

ec6d0db

apply formatting

7903434

add additional notes + reshape cloud_mask before returning

f3171f0