
fix: compute micro_auroc correctly and fix CPU-bound label tensor conversion #165

Open

PuneetKumar1790 wants to merge 1 commit into ML4SCI:main from PuneetKumar1790:fix/eval-micro-auroc-and-cpu-label-tensor

Conversation


PuneetKumar1790 commented Mar 1, 2026

Fixes #164

Summary

This PR fixes two bugs in the Classification Transformers sub-project (DeepLense_Classification_Transformers_Archil_Srivastava):

Bug 1: micro_auroc always returns NaN

micro_auroc was initialized as an empty list and never populated. np.mean([]) always returned nan, corrupting W&B experiment logs.

Fix: Compute the score via auroc_fn(logits, y, num_classes=NUM_CLASSES, average="weighted") and rename the metric to weighted_auroc, since torchmetrics' multiclass AUROC does not support average="micro".

Bug 2: CPU-bound label tensor conversion

labels.type(torch.LongTensor) always creates a CPU tensor, regardless of the current device. In train.py it was immediately moved back with .to(device), wasting an allocation. In eval.py it was never moved to device at all.

Fix: Use labels.to(device, dtype=torch.long) which converts dtype and moves to device in one step.
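A small device-agnostic sketch of the difference (the tensor values are illustrative; on a CPU-only machine both tensors end up on CPU, but the buggy variant would still force a CPU copy if labels lived on a GPU):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
labels = torch.tensor([0.0, 1.0, 2.0], device=device)

# Buggy: .type(torch.LongTensor) always materializes a CPU tensor,
# regardless of where `labels` currently lives.
buggy = labels.type(torch.LongTensor)

# Fixed: convert dtype and place on the target device in one call,
# avoiding the CPU round trip and the extra allocation.
fixed = labels.to(device, dtype=torch.long)
```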

Files Changed

  • eval.py — Fixed micro_auroc computation and label device placement
  • train.py — Fixed label device placement
  • tests/test_eval_and_train.py — New verification script with 5 test cases

Verification

All 5 tests pass locally (CPU, Python 3.10, PyTorch 2.10.0, torchmetrics 1.8.2):

TEST 1a: Bug confirmed: original code produces nan        [PASS]
TEST 1b: micro_auroc (fixed) = 0.5284                    [PASS]
TEST 2:  labels correctly placed on target device         [PASS]
TEST 3:  train_step loss = 1.3436                         [PASS]
TEST 4:  Full evaluate integration test                   [PASS]
RESULTS: 5 passed, 0 failed

Note on diff size: The +291 additions come mostly from the new verification test script (tests/test_eval_and_train.py) with setup, before/after comparisons, and integration tests. The actual fixes in eval.py and train.py are small (~15–20 lines changed total).

…conversion

Bug 1: micro_auroc was initialized as an empty list and never computed. np.mean([]) always returned nan, corrupting W&B experiment logs. Renamed to weighted_auroc and computed using auroc_fn with average='weighted' (torchmetrics multiclass AUROC does not support average='micro').

Bug 2: labels.type(torch.LongTensor) always creates a CPU tensor regardless of current device. In train.py it was immediately moved back via .to(device), wasting an allocation. In eval.py it was never moved to device at all. Fixed by using labels.to(device, dtype=torch.long).

Added tests/test_eval_and_train.py with 5 test cases verifying both fixes.
PuneetKumar1790 force-pushed the fix/eval-micro-auroc-and-cpu-label-tensor branch from 212507f to eb055f5 on March 1, 2026 at 10:42.


Development

Successfully merging this pull request may close these issues.

[Bug] micro_auroc always NaN and CPU-bound label tensor in Classification Transformers
