DTW DynaCLR Monorepo #398

Open

edyoshikun wants to merge 161 commits into modular-viscy-staging from dynadtw

Conversation

@edyoshikun
Member

No description provided.

edyoshikun and others added 30 commits March 31, 2026 13:43
Add normalization columns (norm_mean/std/median/iqr/max/min),
z_focus_mean, and TCZYX shape columns to the cell index schema.
preprocess_cell_index reads per-FOV zattrs and writes stats as
parquet columns for fast per-row normalization at training time.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ExperimentRegistry.from_cell_index: build registry directly from
  preprocessed parquet + zarr metadata (no collection YAML needed)
- datamodule: cell_index_path as primary entry point, _train_final_crop
  changed from BatchedRandSpatialCropd to BatchedCenterSpatialCropd
  (random crop for Z/XY translation is now a user-configured augmentation)
- dataset: read norm stats from parquet columns, build_norm_meta fallback
- index: _align_parquet_columns, _resolve_dims from parquet Y/X_shape

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DynaCLR-3D-BagOfChannels-v2: z_window=32, yx_patch=256,
  RandSpatialCrop(40,228,228) after affine for Z focus invariance
  + XY translation, CenterCrop(32,160,160) auto-appended.
  batch_size=256, 2 GPUs, 2-day wall time.
- Add dataloader_demo.py: Jupyter-style visualization of raw vs
  augmented anchor/positive batches with per-sample metadata
- Update demo configs and inspection scripts for new pipeline

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
np.nanmin/nanmax fail on scipy sparse arrays. Convert to dense
before computing range stats so the command works on Seurat-exported
anndata zarr stores.
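A minimal sketch of the workaround (the function name is hypothetical; real scipy sparse matrices expose the same `.toarray()` used here):

```python
import numpy as np

def dense_range_stats(X):
    """Min/max over possibly-sparse data (hypothetical helper).

    np.nanmin/np.nanmax raise on scipy sparse matrices, so densify
    first via the duck-typed .toarray() that all scipy formats share.
    """
    if hasattr(X, "toarray"):
        X = X.toarray()
    return float(np.nanmin(X)), float(np.nanmax(X))
```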

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CLI for running evals
- DAG for evals
- yaml files for evals
… 3 base callbacks

   - model/contrastive_encoder_convnext_tiny.yml: ConvNeXt-Tiny class_paths
   - model/dinov3_frozen_mlp.yml: frozen DINOv3 + MLP projection block
   - augmentations/ops_2d_mild.yml: OPS-specific mild augmentation pipeline
   - data/ops_gene_reporter.yml: OPS data defaults (patch sizes, sampling)
- train_linear_classifier() now returns a third value: raw val outputs
  (y_val, y_val_proba, classes) for downstream ROC curve plotting
- orchestrated run-linear-classifiers generates metrics_summary.pdf
  alongside the CSV: bar chart of AUROC/accuracy/F1 + per-task ROC curves
- Delete evaluate_dataset.py (argparse-based, not in CLI, superseded by
  orchestrator) and its example config
- Strip generate_comparison_report and its helpers from report.py;
  file is now CV-only
- Remove dead _detect_n_features() from cross_validation.py
- Update all callers of train_linear_classifier() to unpack 3-tuple
- Update DAG doc and linear classifiers README

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- FOVRecord.channel_markers: dict[str, str] maps zarr channel name to
  marker for a specific well (populated from Airtable channel_N_marker fields)
- ChannelEntry.wells: list[str] restricts a channel to a subset of wells;
  empty means valid in all wells
- build_collection auto-populates wells by comparing which wells have a
  non-None marker for each channel across all FOVRecords
- _build_experiment_tracks skips channel rows where ch.wells is non-empty
  and the current well is not in that set, preventing noise rows from
  mixed-plate experiments (e.g. viral sensor only in B/3, C/2)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The glob */*/* on zarr v3 stores yields zarr.json files (e.g. A/2/zarr.json)
in addition to position directories. The previous check only stripped names
starting with "." (.zattrs, .zgroup) but missed zarr.json.
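The corrected filter can be sketched like this (function name illustrative; the glob pattern matches the description above):

```python
from pathlib import Path

def position_paths(store: Path):
    """Yield HCS position directories under row/col, skipping metadata.

    zarr v3 writes a zarr.json at every group level, so "*/*/*" picks
    up e.g. A/2/zarr.json alongside A/2/0; zarr v2 hides its metadata
    in dotfiles (.zattrs, .zgroup) instead.
    """
    for p in sorted(store.glob("*/*/*")):
        if p.name.startswith(".") or p.name == "zarr.json":
            continue
        yield p
```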

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ollection

- DynaCLR-2D-MIP-BagOfChannels: add viral_sensor + Phase3D for
  2025_01_28, 2024_10_09, 2024_10_16; fix dragonfly tracks_path
  to point to inner zarr store (tracking.zarr/2024_08_14_...zarr)
- DynaCLR-3D-BagOfChannels-v2: add viral_sensor + Phase3D for
  2025_01_28, 2024_10_09, 2024_10_16
- DynaCLR-3D-BagOfChannels-v3: new collection copied from v2 with
  dragonfly tracks_path fix; v2 left intact for running training job
- DynaCLR-BoC-lc-evaluation-v1: add viral_sensor for all datasets;
  add Phase3D for 2025_01_28

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wire load_config to delegate to load_composed_config so eval configs
  support base: recipe inheritance (same mechanism as training configs)
- Extract shared eval settings into 4 recipes: predict.yml, reduce.yml,
  plot_infectomics.yml, linear_classifiers_infectomics.yml
- Slim down DynaCLR-2D-BagOfChannels-v3, DynaCLR-2D-MIP-BagOfChannels-v1,
  DINOv3-temporal-MLP-2D-BagOfChannels-v1, and test_evaluation configs
  to use base: references — eliminating copy-pasted 14-experiment
  annotation blocks and shared step configs
- Fix ONNX inference to use GPU (CUDAExecutionProvider) and suppress
  pthread_setaffinity_np noise with intra/inter_op_num_threads=1
- Switch CTC tracking SLURM script to gpu partition

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix \bbf[\b_] -> \bbf(\b|_): inside a character class, \b is a
  backspace character, not a word boundary
- Add \bphc\b to detect phase-contrast (PhC) as label-free
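The distinction is easy to verify in Python's `re`; the combined pattern below is an illustrative reconstruction of the detector, not the repo's exact regex:

```python
import re

# Inside a character class, \b is the backspace character (\x08),
# not a word boundary, so [\b_] silently never matched a bare "bf".
assert re.search(r"[\b_]", "\x08") is not None
assert re.search(r"[\b_]", "bf") is None

# The fix moves the boundary into an alternation:
labelfree = re.compile(r"\bbf(\b|_)|\bphc\b", re.IGNORECASE)
assert labelfree.search("BF_10x")
assert labelfree.search("PhC channel")
assert not labelfree.search("rbfp")
```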

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
pandas 3+ uses Arrow-backed strings by default, which breaks anndata's
zarr writer. Apply the same fix in two code paths:
- embedding_writer.py: replace select_dtypes("string") with per-column
  isinstance checks for pd.StringDtype and Arrow-backed Categoricals
- zarr_utils.py: convert ArrowStringArray columns and index to object
  dtype before calling append_to_anndata_zarr
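The per-column check can be sketched as follows (helper name hypothetical; shown with the python-backed StringDtype, which behaves the same as the Arrow-backed one for this purpose):

```python
import pandas as pd

def object_backed(df: pd.DataFrame) -> pd.DataFrame:
    """Downcast extension-string columns and index to object dtype.

    pandas "string" extension dtypes can break writers that expect
    plain object arrays; per-column isinstance checks avoid the
    select_dtypes("string") edge cases mentioned above.
    """
    out = df.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.StringDtype):
            out[col] = out[col].astype(object)
    if isinstance(out.index.dtype, pd.StringDtype):
        out.index = out.index.astype(object)
    return out
```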

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- PHATE: default n_jobs from -1 (all cores) to 1 to prevent hogging
  shared SLURM nodes; exposed in PHATEConfig and compute_phate()
- Annotation: support (fov_name, t, track_id) join as fallback when
  both sides lack an 'id' column; normalize fov_name by stripping
  leading/trailing slashes to prevent join mismatches
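A small pandas sketch of the fallback join (column names follow the description; the data values are illustrative):

```python
import pandas as pd

# Embeddings carry "/A/1/0" while annotations carry "A/1/0";
# normalize both sides before the (fov_name, t, track_id) join.
emb = pd.DataFrame({"fov_name": ["/A/1/0", "B/2/1/"], "t": [3, 5],
                    "track_id": [12, 7], "pc1": [0.4, -1.1]})
ann = pd.DataFrame({"fov_name": ["A/1/0", "B/2/1"], "t": [3, 5],
                    "track_id": [12, 7], "state": ["infected", "uninfected"]})
for df in (emb, ann):
    df["fov_name"] = df["fov_name"].str.strip("/")
merged = emb.merge(ann, on=["fov_name", "t", "track_id"], how="left")
```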

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
For multiclass problems, compute one-vs-rest AUROC per class and report
as val_{class_name}_auroc columns in the results DataFrame.
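A self-contained sketch of the per-class computation (pure NumPy via the Mann-Whitney rank form of AUROC; function and column names are illustrative, not the repo's API):

```python
import numpy as np

def binary_auroc(y, score):
    # rank form: AUROC = P(score_pos > score_neg); ties not handled
    order = np.argsort(score)
    ranks = np.empty(len(score))
    ranks[order] = np.arange(1, len(score) + 1)
    n_pos = int(y.sum())
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def ovr_auroc_columns(y_val, y_val_proba, classes):
    # one val_{class_name}_auroc entry per class, one-vs-rest
    return {f"val_{c}_auroc": binary_auroc((y_val == c).astype(int),
                                           y_val_proba[:, i])
            for i, c in enumerate(classes)}
```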

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- viscy-utils: add onnx, onnxscript to core deps; copairs to eval extras
- dynaclr: add tracking optional group (gurobipy, onnxruntime-gpu,
  py-ctcmetrics, tabulate, tracksdata) for CTC tracking benchmark
- Regenerate uv.lock

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- index.py: replace O(N*tau) Python loop in _compute_valid_anchors with
  vectorized pd.MultiIndex.isin(); add fit=False predict-mode fast path
  that skips anchor computation; add precomputed_valid_anchors to
  clone_with_subset() to avoid redundant recomputation; accept
  cell_index_df to avoid double-reading parquet
- dataset.py: replace per-row loops in _build_match_lookup with
  groupby().indices; skip lookup build in predict mode; add organelle,
  well, microscope to exported metadata columns
- datamodule.py: tune defaults (num_workers=4, cache_pool=500MB,
  pin_memory=True, buffer_size=4); use vectorized MultiIndex.isin for
  FOV split; reuse pre-loaded cell_index_df from ExperimentRegistry
- experiment.py: from_cell_index returns (registry, dataframe) tuple
  so callers can reuse the DataFrame without re-reading from disk
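The vectorized anchor test can be sketched with pandas alone (column names follow the description; the data is illustrative):

```python
import pandas as pd

# A row is a valid anchor iff its (fov_name, track_id) pair appears
# in the precomputed valid set -- one MultiIndex.isin call replaces
# the O(N*tau) per-row Python loop.
cell_index = pd.DataFrame({
    "fov_name": ["A/1", "A/1", "A/1", "B/2"],
    "track_id": [1, 1, 2, 7],
    "t":        [0, 1, 0, 0],
})
valid = pd.MultiIndex.from_tuples([("A/1", 1)],
                                  names=["fov_name", "track_id"])
mask = pd.MultiIndex.from_frame(
    cell_index[["fov_name", "track_id"]]).isin(valid)
anchors = cell_index[mask]
```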

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use .get() with None default for transcriptome_anndata and skip the
barcode join when it is absent, allowing embeddings on datasets that
lack paired scRNA-seq.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Centralize cell_index_path to shared /hpc/projects/.../collections/
  dir across all training configs
- MIP model: extend z_extraction_window 11->20, z_focus_offset 0.5->0.3,
  yx_patch_size 192->256, add BatchedRandSpatialCropd for Z-invariance
- 3D BoC: num_workers 2->4; SLURM time limit 2d->4d
- Collection: mark DynaCLR-2D-BagOfChannels-v3 as [LEGACY]; fix well
  assignments in BoC-lc-evaluation-v1 (add A/1 for 07_24, remove
  incorrect B/1 and B/2 from 01_28)
- Add new collections: annotated MIP subset, test subset, alfi-eval
  (ALFI mitosis, 3 cell lines), microglia-eval (5 perturbations),
  benchmark_2exp (dataloader profiling)
- predict.yml: add TQDMProgressBar callback (refresh_rate=10)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- evaluate.py: remove all SLURM script generation (_generate_*_sh,
  _slurm_header, _run_local*); replace with prepare_configs() that
  generates YAML configs and prints a JSON manifest to stdout; rename
  CLI command evaluate -> prepare-eval-configs; add MMD config generators
- evaluate_config.py: remove SlurmConfig; add MMDStepConfig and
  ComparisonSpec imports; split PlotStepConfig.color_by into per-exp
  and combined_color_by; update TaskSpec.marker_filters docstring for
  auto-expand behaviour
- cli.py: add prepare-eval-configs, check-evals, append-annotations,
  append-predictions, split-embeddings, compute-mmd, plot-mmd-heatmap,
  evaluate-tracking-accuracy commands
- split_embeddings.py: new CLI to split combined embeddings.zarr by
  experiment, replacing inline SLURM script logic
- check_evals.py: new CLI to print eval completion status from registry
- eval_registry.yaml: declarative registry of models to evaluate
- Delete 4 stale SLURM-era eval configs (SlurmConfig schema removed)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three modes for measuring embedding-space distribution shifts:
- Per-experiment (explicit comparison pairs, faceted by marker)
- Combined (pairwise cross-experiment with batch centering)
- Pooled (concatenates all experiments, BH FDR correction)

Core implementation:
- viscy_utils/evaluation/mmd.py: kernel MMD with median heuristic,
  Gaussian RBF kernel, unbiased estimator, and vectorized permutation
  test (avoids Python loops via binary label matrix multiplication)
- viscy_utils/evaluation/embedding_map.py: mAP via copairs for
  phenotypic profiling (optional dependency)
- evaluation/mmd/config.py: Pydantic config hierarchy for all three
  modes; temporal binning, shared bandwidth, balance_samples
- evaluation/mmd/compute_mmd.py: orchestrates the three analysis modes;
  computes activity_zscore = (mmd2 - null_mean) / null_std for
  cross-marker comparability; outputs per-marker CSV files
- evaluation/mmd/plotting.py: kinetics lines, heatmaps, activity
  z-score heatmaps, combined cross-experiment heatmaps, multi-panel
  grids, paired heatmaps with shared colorbar
- configs/evaluation/recipes/mmd_defaults.yml: shared algorithm defaults
  (1000 permutations, max 2000 cells, seed 42) for YAML inheritance
- tests/test_mmd.py: unit tests for MMD implementation
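A self-contained sketch of the estimator plus the vectorized permutation test, under the stated assumptions (Gaussian RBF kernel, median-heuristic bandwidth, unbiased estimator); names and the exact masking scheme are illustrative, not the repo's implementation:

```python
import numpy as np

def mmd2_permutation_test(X, Y, n_perm=200, seed=0):
    """Unbiased kernel MMD^2 with a vectorized permutation test.

    Permutations are evaluated by multiplying binary label matrices
    against the precomputed kernel, avoiding a Python loop over pairs.
    """
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    n, m = len(Z), len(X)
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    sigma2 = np.median(d2[np.triu_indices(n, 1)])  # median heuristic
    K = np.exp(-d2 / (2 * sigma2))
    kdiag = np.diag(K)

    def stat(mask):  # mask: (P, n) rows of 1s marking the "X" side
        mm = mask.sum(1)
        nn = n - mm
        kxx = np.einsum("pi,ij,pj->p", mask, K, mask) - mask @ kdiag
        inv = 1.0 - mask
        kyy = np.einsum("pi,ij,pj->p", inv, K, inv) - inv @ kdiag
        kxy = np.einsum("pi,ij,pj->p", mask, K, inv)
        return (kxx / (mm * (mm - 1)) + kyy / (nn * (nn - 1))
                - 2 * kxy / (mm * nn))

    labels = np.r_[np.ones(m), np.zeros(n - m)]
    mmd2 = stat(labels[None, :])[0]
    null = stat(np.stack([rng.permutation(labels) for _ in range(n_perm)]))
    p_value = (np.sum(null >= mmd2) + 1) / (n_perm + 1)
    return mmd2, p_value
```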

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ver-time

- orchestrated.py: when marker_filters is None, auto-discover all unique
  obs["marker"] values and run one classifier per marker; save trained
  pipelines as {task}_{marker}.joblib with manifest.json; add
  _plot_f1_over_time for per-class F1 at each timepoint; output one
  {task}_summary.pdf per task (was a single merged PDF)
- orchestrated_test.py: update fixtures to expect 2 rows per task with
  auto-expansion; add test for sparse-marker skipping and F1-over-time
  plot generation
- append_annotations.py: new CLI to persist ground-truth annotation
  columns directly into per-experiment zarr obs
- append_predictions.py: new CLI to apply saved classifier pipelines to
  all cells in per-experiment zarrs, writing predicted_{task} to obs and
  predicted_{task}_proba to obsm

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When group_by is set (default "marker"), evaluate_smoothness iterates
over unique group values, computes smoothness per group, saves per-group
CSV, generates per-group plots, then aggregates via mean/std. Output
filenames now include experiment_name for disambiguation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Evaluates whether DynaCLR embeddings improve cell tracking on Cell
Tracking Challenge datasets vs an IoU baseline.

- tracking_accuracy/config.py: Pydantic models for ONNX model entries,
  CTC dataset entries, ILP solver weights, and full benchmark config
- tracking_accuracy/utils.py: seg_dir layout helper, pad_to_shape,
  normalize_crop (z-score using whole-frame statistics)
- tracking_accuracy/evaluate_tracking.py: main benchmark driver
- ctc_tracking_2d_mip_boc.yaml: DynaCLR-2D-MIP vs IoU on DIC-C2DL-HeLa
- ctc_tracking_2d_mip_boc_all.yaml: all CTC sequences variant
- export_onnx_2d_mip_boc.yml: config for exporting the MIP model to ONNX
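The two utils can be sketched in a few lines (signatures are assumptions based on the descriptions above):

```python
import numpy as np

def pad_to_shape(arr, shape):
    # bottom/right zero-pad up to a target shape
    pads = [(0, t - s) for s, t in zip(arr.shape, shape)]
    return np.pad(arr, pads)

def normalize_crop(crop, frame):
    # z-score a cell crop with whole-frame statistics, so intensity
    # scaling stays consistent across crops from the same timepoint
    return (crop - frame.mean()) / (frame.std() + 1e-8)
```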

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pairplot: change diag_kind kde -> hist; rasterize scatter points to
  prevent PDF bloat; improve legend (alpha=1.0, larger marker sizes)
- Scatter 2D: improve legend (markerscale=6, fontsize=10, framealpha=1.0,
  edgecolor="black")

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
edyoshikun and others added 30 commits May 5, 2026 14:36
os.cpu_count() returns the node's physical core count, not the
SLURM-allocated count. On a 48-core node where SLURM gave us 16,
ad-hoc users of os.cpu_count() oversubscribe. Centralize the
SLURM_CPUS_PER_TASK fallback in viscy_utils.mp_utils.available_cpus
and route MultiExperimentDataModule's tensorstore concurrency
through it.
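A minimal sketch of the centralized helper (illustrative reconstruction of viscy_utils.mp_utils.available_cpus, not its exact body):

```python
import os

def available_cpus() -> int:
    """SLURM-aware CPU count.

    os.cpu_count() reports the node's physical cores; inside a SLURM
    cgroup the allocation in SLURM_CPUS_PER_TASK is the honest budget.
    """
    slurm = os.environ.get("SLURM_CPUS_PER_TASK")
    if slurm:
        return int(slurm)
    try:  # respects taskset/cgroup affinity where supported
        return len(os.sched_getaffinity(0))
    except AttributeError:
        return os.cpu_count() or 1
```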

Pin BLAS to 1 thread per process in REDUCE_COMBINED — PHATE's
joblib n_jobs spawns one worker per allocated CPU, so unbounded
BLAS would yield ~cores^2 threads. Standard sklearn parallelism
pattern (one axis at a time).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PHATE's internal PCA pre-reduction (graphtools -> sklearn ->
scipy.linalg.lu) deadlocks silently on scipy 1.17.1 + sklearn
1.8.0 — process sits at ~0% CPU forever. Wire X_pca_combined
back into PHATE so it skips its own pre-PCA: when phate.n_pca is
null, fit on the already-reduced PCA output instead of raw .X.

Add caller-owned fit-set indexing (fit_idx) to
viscy_utils.evaluation.compute_phate so the orchestrator can
draw a per-store lineage cap. Whole lineages are sampled per
store (cap=N each); PHATE fits on the union and transforms the
full input. Re-enables PHATE in the eval recipe with a 100-cell
per-store cap for fast iteration; bump for paper figures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-modal InfoNCE head pulls image features toward a paired
per-cell vector target (e.g. transcriptomic embedding). Image
and target sides each pass through a small projector into a
shared space; samples whose target contains NaN (unpaired
cells) are masked out so the head runs on partially-paired
batches.

Extend ContrastiveModule._get_labels to handle vector-valued
metadata: list/tuple/array entries are stacked into (B, D) float
tensors, scalars stay as (B,) long tensors. Required for the
new head's paired-target lookup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CELL-DINO is a DINOv2-architecture ViT pretrained on
fluorescence microscopy (Human Protein Atlas). The
channel_adaptive_dino_vitl16 checkpoint processes one channel
at a time through a single-channel ViT-L/16 stem; the wrapper
reshapes (B, C, H, W) -> (B*C, 1, H, W), runs the backbone, and
mean-pools the cls token across channels for a fixed-dim
embedding regardless of input channel count. Weights load from
a local .pth state_dict — nothing fetched from the network.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PCA pairplot rendering is per-coloring-variable independent;
fan out across colorings using joblib loky workers (one worker
per coloring, capped by available_cpus). Workers re-import
matplotlib + seaborn (~1s overhead) so the gain only kicks in
for pairplot_components >= 4 on >100k cells, which matches the
paper-figure config.

Add the pairplot_components knob to the infectomics recipe at
4 (PC1..PC4 grid = 16 panels per coloring); bump higher for
final paper figures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The fixed coupling between PLOT and PLOT_COMBINED forced every
infectomics run to fan out per-experiment scatter even when only
the combined figure was needed. Make both stages
independently togglable via steps:; the Nextflow DAG already
checks `steps` but had hardcoded behaviour assuming both
always run.

Switch infectomics-annotated to plot_combined only — the
per-experiment scatter doesn't carry into paper figures.

Drop the redundant marker_filters on cell_death_state
(applies to all markers; the filter was leftover from when LC
was only run on G3BP1/SEC61B sensors).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Job 31449149 OOM'd in cgroup on rank 3 host RAM (not VRAM) on
the 384² single-marker variant. Loader prefetch buffers scale
with workers × prefetch_factor, not batch_size.

- Drop prefetch_factor 2→1 in the BoC base config — halves
  in-flight batches per worker, restores earlier behaviour.
- Drop the 192 sbatch from 4→2 GPUs and bump --mem-per-cpu
  14G→17G (255 GB/rank, 510 GB/node) so each rank has more
  headroom; also eases queue priority. Pin trainer.devices=2
  in the override yml so the Lightning config matches.

Batch size kept at 256/rank — host RAM was the OOM driver.
If this still OOMs, suspect a real leak (loky semaphores,
tensorstore decoder scratch) rather than papering over with
more RAM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same physical microglia cells appeared three times in the
collection (BF, Phase3D, Retardance), tripling the experiment's
row count and biasing marker/experiment sampling without adding
biological signal. Keep Phase3D only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add status legend (✅ landed / 🔄 running / ⬜ pending) and
inline notes per model so the registry reads as a state of the
bake-off. Stable name strings ensure the model→color palette
matches across infectomics-annotated, alfi, and microglia
registries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the primary_analysis.csv + cage_crop parsing path with
a direct read from the cells AnnData zarr (dinov2.zarr / rna.zarr
under a shared anndata_dir). The fov_name column is the zarr
path; load_cells_anndata returns it as zarr_path so the rest of
the pipeline is unchanged.

Split CLI: data_paths.yml carries the shared zarr_store +
anndata_dir + output_dir, and embed_<model>.yml carries
per-model config (channels, output_key, target_pixel_size,
batch_size). Both files are merged at startup.

Add a max_cells smoke-test knob that truncates the cell table
post-filter for fast iteration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- viscy-utils: pin anndata<0.12.9 across all/anndata/dev/test extras
  (matches pyproject; the constraint was added but the lock hadn't
  been refreshed)
- viscy: normalize gurobipy specifier to the same range
- nvidia-* and cuda-bindings: add platform_machine != 's390x'
  markers per uv solver auto-update

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PCA-RGB timelapse MP4 export needs imageio's FFmpeg plugin;
without it the timelapse CLI silently falls back to GIF.
Bundle matplotlib so the visualization helpers don't pull it
through a transitive eval-extra dependency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EmbeddingWriter's primary array (adata.X) is hard-coded to the
encoder backbone "features". DINOv3-temporal-MLP and similar
frozen-backbone-with-trained-head models put all the learned
task signal in the projection head — predicting features in
that case discards the only learned component.

Add a predict.embedding_key knob ("features" | "projections")
that the eval orchestrator threads into the generated predict
YAML. The unselected array remains as a sidecar in obsm.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The MLP head is the only finetuned component — the DINOv3
backbone is frozen during training. Defaulting to features would
make this row a duplicate of DINOv3-frozen and discard the only
learned task signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Override on top of DynaCLR-2D-MIP-BagOfChannels.yml that flips positive
sampling to fully self-supervised: anchor and positive are the same crop
pre-augmentation, view diversity comes from the augmentation pipeline.

Same v3 parquet, same single-marker batches, same marker-uniform
group_weights as the temporal -single-marker variant — only the
positive-sampling strategy differs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4-GPU h100|h200 DDP launcher mirroring the temporal -single-marker.sh.
RUN_NAME tagged with 'classical' for clean WandB separation; no
warm-start checkpoint (fresh init since the classical and temporal
variants are independent training runs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verifies the classical override against the real v3 parquet:

1. Self-mode short-circuits index lookup (no _lineage_timepoints,
   no _match_lookup built).
2. positive == anchor.clone() pre-augmentation, with anchor_meta
   and positive_meta matching 1:1.
3. Augmentation pipeline runs end-to-end on both keys independently
   (shape 16x256x256 -> 1x160x160, NormalizeSampled centers to ~0,
   views diverge with mean |a-p| ~= 0.75 * anchor_std).
4. batch_group_by: marker enforces single-marker batches; over 30
   batches the sampler rotates through all 9 configured markers
   with marker-uniform weighting working as designed.
5. stratify_by: experiment mixes ~3 experiments per single-marker
   batch, preventing dynamorph domination of Phase3D batches.

Run:
  uv run python applications/dynaclr/scripts/dataloader_inspection/test_classical_self_positives.py

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sbatch copies the script into the spool dir, so dirname "$0" resolves
to /var/spool/slurm/job<id>/ — the relative ../slurm/train.sh path
then doesn't exist and the job fails with exit 1 in <1s. Switch to
the absolute repo path (matches the working -single-marker-192.sh
launcher).
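The failure is reproducible without SLURM; this sketch mimics what sbatch does by copying the launcher out of the repo (paths are temporary stand-ins):

```shell
base=$(mktemp -d)
repo="$base/repo"; spool="$base/spool"
mkdir -p "$repo/slurm" "$spool"
touch "$repo/slurm/train.sh"
# launcher that locates train.sh relative to its own directory
printf '#!/bin/sh\ndirname "$0"\n' > "$repo/launch.sh"
cp "$repo/launch.sh" "$spool/launch.sh"     # what sbatch effectively does
sh "$spool/launch.sh"                       # prints the spool dir, not the repo
test ! -e "$spool/../slurm/train.sh"        # so ../slurm/train.sh is gone
```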

Job 32371268 hit this; resubmitting after fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Same fix as the classical launcher (9474a5f): sbatch copies the
script into /var/spool/slurm/job<id>/ so dirname "$0" doesn't resolve
to the repo. Job 32535968 hit this when resuming from epoch 52.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the DINOv3 leaf with the channel-adaptive ViT-L/16 backbone
(1024-dim cls token after channel mean-pool). Uses the v3 parquet
collection. SLURM launcher uses absolute path to train.sh.
Mirrors the dynacell-models pattern. recipes/trainer/fit.yml holds
logger + callbacks + log cadence; recipes/topology/{single_gpu,
ddp_2gpu,ddp_4gpu,ddp_8gpu}.yml hold accelerator/strategy/devices.
Leaves migrate from re-declaring strategy/devices inline to composing
both fragments via base:.
Each leaf now composes recipes/trainer/fit.yml + a recipes/topology/
fragment instead of recipes/trainer.yml plus inline strategy/devices.
Composed configs are byte-identical before and after (verified by
snapshot diff across all 10 leaves).
All leaves now compose recipes/trainer/fit.yml + recipes/topology/
fragments. Old monolithic recipe is unreferenced. Docs updated to
show the orthogonal-axis layout.
Lightning subcommands are now handled by viscy_utils.cli.main, which
performs base: recipe composition before LightningCLI parses the
resolved config. Click subcommands (eval tooling) are unchanged.

Mirrors the dynacell __main__.py routing pattern (without the Hydra
branch — dynaclr eval tooling stays Click-based).
The new dynaclr fit entry point composes base: recipes via
viscy_utils.cli.main and is the place future resolver hooks will hang.
limit_train_batches: 800 / limit_val_batches: 200. The prior unbounded
run (job 32536880) was host-OOM-killed at 2h elapsed; the leak is in
training-step logging accumulating MetaTensors. The cap also provides
regular val checkpoints.
CELL-DINO ViT-L/16 upsamples 160² patches to 224² before forward,
which doubled per-batch host RAM vs. ConvNeXt-tiny at the same
batch_size. Combined with random plate sampling (zero cache hit rate,
each tensorstore.Context costs ~500 MB) the BoC parquet pegged the
240 GB cgroup ceiling within an hour at bs=512/nw=2 (jobs 32536880
and 32568006).

Match the DynaCLR-2D-MIP-BagOfChannels tunings:
- bs=256/nw=4 (smaller batches, more workers each holding less queue)
- prefetch_factor=1, buffer_size=1
- cache_pool_bytes=0
- file_io_concurrency=32
The orchestrated LC trainer used sklearn.train_test_split at the cell
level, which puts cells from the same track in both train and val. For
temporal-contrastive SSL embeddings (DynaCLR) this inflates val AUROC
by 1–20 points per channel because the SSL was trained to pull
same-track embeddings together.

split_groups_by takes a list of obs columns (default
[experiment, fov_name, track_id]); when set, the splitter switches to
GroupShuffleSplit so no group lands in both halves. Default is now
baked into the infectomics LC recipe.
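The group-aware split can be sketched without sklearn (a minimal stand-in for GroupShuffleSplit; names and the group encoding are illustrative):

```python
import numpy as np

def group_shuffle_split(groups, val_frac=0.25, seed=0):
    """Track-aware split: whole groups land on one side only.

    Groups (e.g. "experiment/fov_name/track_id" strings) go entirely
    to train or entirely to val, so temporally-correlated cells from
    one track cannot leak across the boundary and inflate val AUROC.
    """
    rng = np.random.default_rng(seed)
    uniq = rng.permutation(np.unique(groups))
    val_groups = set(uniq[: max(1, round(val_frac * len(uniq)))])
    val_mask = np.array([g in val_groups for g in groups])
    return np.where(~val_mask)[0], np.where(val_mask)[0]
```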

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
compare_evals.py gains _load_infection_kinetics + _plot_infection_kinetics:
percent infected over time per model, with a single shared annotation
curve (canonical bundle picked by lowest pairwise disagreement) and one
solid colored curve per channel restricted to annotated cells for 1:1
amplitude comparison. Outputs infection_kinetics.{pdf,csv}.

Adds time_window: [lo, hi] to the registry YAML; when set, the kinetics
plot clips before binning so the cohort composition stays constant
across time bins (ZIKV cohort ends ~25h, DENV runs to 66h — without
clipping, late bins are dominated by the long DENV runs).

Adds palette_anchor for stable model→color mapping across registries.

eval_registry_infectomics_grouped.yaml is the companion registry that
reads from each model's <eval_dir>_grouped/ sibling and uses
time_window: [4.0, 25.0].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>