Fixes PyTorch 2.12-related incompatibility in fengwu model #1677

Merged
coreyjadams merged 1 commit into NVIDIA:main from peterdsharpe:psharpe/more-torch-212-fixes
May 27, 2026

@peterdsharpe
Collaborator

PhysicsNeMo Pull Request

Description

Update fengwu_output.pth binary file to fix backwards-incompatible changes with PyTorch 2.12. See #1648 for similar issues; this one has the same failure mode.

Previous error:

FAILED test/models/fengwu/test_fengwu.py::test_fengwu_forward[cpu] - AssertionError: assert False
 +  where False = <function validate_forward_accuracy at 0x7fa3bd12f560>(Fengwu(\n  (encoder_surface): EncoderLayer(\n    (patchembed2d): PatchEmbed2D(\n      (pad): ZeroPad2d((0, 0, 0, 0))\n    ...atchrecovery2d): PatchRecovery2D(\n      (conv): ConvTranspose2d(384, 37, kernel_size=(4, 4), stride=(4, 4))\n    )\n  )\n), (tensor([[[[-8.7801e-01,  6.3541e-02,  3.9758e-01,  ...,  4.0209e-01,\n            6.2170e-01,  5.6378e-01],\n          ...1e+00],\n          [ 6.9760e-01,  2.6290e-01, -1.6820e-01,  ...,  4.2989e-01,\n           -1.6377e-01,  5.3725e-01]]]]),), atol=0.005, file_name='models/fengwu/data/fengwu_output.pth')
 +    where <function validate_forward_accuracy at 0x7fa3bd12f560> = common.validate_forward_accuracy
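
The assertion above comes from a reference-output check: the model's forward output is compared against a stored file within an absolute tolerance. The following is a minimal, dependency-free sketch of that idea, not the actual `common.validate_forward_accuracy` implementation. The helper names (`within_atol`, `save_reference`, `load_reference`) and all numbers are hypothetical, and plain Python lists with `pickle` stand in for torch tensors and the `.pth` file.

```python
import io
import pickle


def within_atol(output, reference, atol):
    """Return True if every element differs from its reference by at most atol."""
    if len(output) != len(reference):
        return False
    return all(abs(o - r) <= atol for o, r in zip(output, reference))


def save_reference(values):
    """Serialize reference values to bytes (stand-in for writing a .pth file)."""
    buffer = io.BytesIO()
    pickle.dump(list(values), buffer)
    return buffer.getvalue()


def load_reference(blob):
    """Deserialize reference values previously written by save_reference."""
    return pickle.loads(blob)


ATOL = 0.005  # same absolute tolerance as in the failing assertion above

# Hypothetical numbers: a reference recorded under an older library version.
old_reference_blob = save_reference([0.4021, 0.6217, 0.5638])

# Hypothetical output after an upgrade changed the numerics.
new_output = [0.4101, 0.6217, 0.5638]

# The first element drifted by about 0.008, which exceeds ATOL.
stale_check = within_atol(new_output, load_reference(old_reference_blob), ATOL)

# Regenerating the reference from the new output makes the check pass again.
new_reference_blob = save_reference(new_output)
fresh_check = within_atol(new_output, load_reference(new_reference_blob), ATOL)

print(stale_check, fresh_check)  # False True
```

Regenerating the reference only makes sense once the new output has been judged correct, since the check then passes by construction.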

Checklist

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

@copy-pr-bot

copy-pr-bot Bot commented May 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@peterdsharpe
Collaborator Author

/blossom-ci

peterdsharpe requested a review from ktangsali May 27, 2026 17:28
@greptile-apps
Contributor

greptile-apps Bot commented May 27, 2026

Greptile Summary

This PR updates the binary reference output file fengwu_output.pth used by test_fengwu_forward to reflect numerical differences introduced by PyTorch 2.12's backwards-incompatible changes, mirroring a fix already applied for similar models in #1648.

  • The .pth file is a serialized tensor used as the ground-truth comparison in validate_forward_accuracy (with atol=5e-3). The file size is identical before and after, confirming tensor shapes are unchanged — only the numerical values differ due to PyTorch internals.
  • No model code, test logic, or tolerance thresholds are modified; the fix is entirely contained in the reference data file.
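
The file-size observation in the first bullet can be illustrated with a small sketch: for fixed-width data, the serialized size depends on the element count and dtype, not on the values. This is only an analogy using `struct` from the standard library; a real `.pth` file also stores metadata around the raw data, and the helper name and numbers here are hypothetical.

```python
import struct


def serialize_float32(values):
    """Pack values as little-endian float32, 4 bytes per element."""
    return struct.pack(f"<{len(values)}f", *values)


old_values = [0.4021, 0.6217, 0.5638, -0.1638]
new_values = [0.4101, 0.6199, 0.5702, -0.1655]  # hypothetical regenerated values

old_blob = serialize_float32(old_values)
new_blob = serialize_float32(new_values)

same_size = len(old_blob) == len(new_blob)  # element count and dtype match
same_bytes = old_blob == new_blob           # contents differ

print(len(old_blob), same_size, same_bytes)  # 16 True False
```

Equal size is consistent with unchanged shapes but is not strict proof on its own: a 2x2 and a 4x1 array of the same dtype hold the same number of bytes.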

Important Files Changed

| Filename | Overview |
| --- | --- |
| test/models/fengwu/data/fengwu_output.pth | Binary reference output file regenerated for PyTorch 2.12 compatibility; file size unchanged (1551261 bytes), indicating tensor shapes are preserved. |

Reviews (1): Last reviewed commit: "Update fengwu_output.pth binary file"

Collaborator

ktangsali left a comment

LGTM

peterdsharpe enabled auto-merge May 27, 2026 17:43
@peterdsharpe
Collaborator Author

/blossom-ci

@peterdsharpe
Collaborator Author

/blossom-ci

@peterdsharpe
Collaborator Author

/ok to test d62add6

peterdsharpe added this pull request to the merge queue May 27, 2026
coreyjadams removed this pull request from the merge queue due to a manual request May 27, 2026
coreyjadams merged commit bed3b89 into NVIDIA:main May 27, 2026
6 checks passed
peterdsharpe deleted the psharpe/more-torch-212-fixes branch May 27, 2026 20:25
3 participants