Add AMAX, AVG, NORM1, NORM2, MUL, MUL_NO_ZEROS reduction modes by rsuderman · Pull Request #325 · iree-org/fusilli

rsuderman · 2026-04-08T18:04:12Z

Enable the remaining cuDNN reduction modes in ReductionAttr and add the corresponding MLIR schemas to the asm emitter:

NORM1 lowers to abs + sum.dim_IntList.
AMAX lowers to abs + amax.
AVG lowers to mean.dim (float dtypes only — torch.aten.mean.dim is not defined on integer tensors, so the sample skips int32 for AVG).
NORM2 lowers to mul + sum.dim_IntList + sqrt.
MUL lowers directly to torch.prims.prod.
MUL_NO_ZEROS uses aten.ne.Scalar to build an i1 mask, then aten.where.ScalarOther to substitute 1 for zero entries before feeding the result to torch.prims.prod, so zero inputs are excluded from the product.
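
To make the intended semantics of each mode concrete, here is a minimal scalar reference sketch in C++. It reduces a flat vector to a single value and mirrors the decompositions listed above. The `Mode` enum and `referenceReduce` function are hypothetical names chosen for illustration; they are not Fusilli's `ReductionAttr` API or the emitter's actual output.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical enum for illustration only; it mirrors the mode names in this
// PR but is not Fusilli's ReductionAttr API.
enum class Mode { NORM1, AMAX, AVG, NORM2, MUL, MUL_NO_ZEROS };

// Reduces every element of `x` to one scalar, following the decomposition
// described for each mode.
inline double referenceReduce(Mode mode, const std::vector<double> &x) {
  assert(!x.empty() && "this sketch does not handle empty inputs");
  double acc = 0.0;
  switch (mode) {
  case Mode::NORM1: // abs, then sum
    for (double v : x)
      acc += std::fabs(v);
    return acc;
  case Mode::AMAX: // abs, then max
    for (double v : x)
      acc = std::fmax(acc, std::fabs(v));
    return acc;
  case Mode::AVG: // mean (floating point only)
    for (double v : x)
      acc += v;
    return acc / static_cast<double>(x.size());
  case Mode::NORM2: // elementwise square, then sum, then sqrt
    for (double v : x)
      acc += v * v;
    return std::sqrt(acc);
  case Mode::MUL: // plain product
    acc = 1.0;
    for (double v : x)
      acc *= v;
    return acc;
  case Mode::MUL_NO_ZEROS: // mask zeros, substitute 1, then product
    acc = 1.0;
    for (double v : x) {
      const bool nonZero = (v != 0.0); // plays the role of the i1 mask
      acc *= nonZero ? v : 1.0;        // where(mask, v, 1)
    }
    return acc;
  }
  return acc; // not reached for valid modes
}
```

For the input `{1, -2, 0, 3}`, MUL collapses to zero because of the zero entry, while MUL_NO_ZEROS treats that entry as 1 and yields -6.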

Extend samples/reduction/reduction_ops.cpp to exercise every new mode. Input data is built by a per-mode generateReductionInputData helper so MUL/MUL_NO_ZEROS get a non-trivial pattern (mostly 1s with a 2 and a 3, plus injected zeros for MUL_NO_ZEROS) that stays in range for fp16/int32, and the expected value is computed by the existing reference reduction loop rather than hardcoded.
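
The input pattern described above could look roughly like the following sketch. It is not the PR's `generateReductionInputData`; the name `makeReductionInput`, its signature, the `SampleMode` enum, and the positions of the 2, the 3, and the zeros are all assumptions made for illustration. The point is that the product of the nonzero entries is always 6, so it stays representable in fp16 and int32 regardless of tensor size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical subset of modes, for illustration only.
enum class SampleMode { ADD, MUL, MUL_NO_ZEROS };

// Builds `n` input values. Product modes get mostly 1s with a single 2 and a
// single 3; MUL_NO_ZEROS additionally gets two injected zeros. Element
// positions are arbitrary choices made for this sketch.
template <typename T>
std::vector<T> makeReductionInput(SampleMode mode, std::size_t n) {
  std::vector<T> data(n, T(1));
  if (mode != SampleMode::MUL && mode != SampleMode::MUL_NO_ZEROS)
    return data; // other modes: all ones in this sketch
  assert(n >= 4 && "need room for the 2, the 3, and two zeros");
  data[0] = T(2);
  data[1] = T(3);
  if (mode == SampleMode::MUL_NO_ZEROS) {
    data[2] = T(0);     // zeros that the reduction is expected to skip
    data[n - 1] = T(0);
  }
  return data;
}
```

Because the expected value comes from a reference loop rather than a constant, a generator like this can change its pattern without the test expectations needing to be updated by hand.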

Add lit tests for each new mode under tests/lit/ and register them in tests/CMakeLists.txt.

sjain-stanford

Might need rebase / CI seems unclean.

Enable the remaining cuDNN reduction modes in ReductionAttr and add the corresponding MLIR schemas to the asm emitter:

- NORM1 lowers to abs + sum.dim_IntList.
- AMAX lowers to abs + amax.
- AVG lowers to mean.dim (float dtypes only: torch.aten.mean.dim is not defined on integer tensors, so the sample skips int32 for AVG).
- NORM2 lowers to mul + sum.dim_IntList + sqrt.
- MUL lowers directly to torch.prims.prod.
- MUL_NO_ZEROS uses aten.ne.Scalar to build an i1 mask, then aten.where.ScalarOther to substitute 1 for zero entries before feeding the result to torch.prims.prod, so zero inputs are excluded from the product.

Extend samples/reduction/reduction_ops.cpp to exercise every new mode. Input data is built by a per-mode generateReductionInputData helper so MUL/MUL_NO_ZEROS get a non-trivial pattern (mostly 1s with a 2 and a 3, plus injected zeros for MUL_NO_ZEROS) that stays in range for fp16/int32, and the expected value is computed by the existing reference reduction loop rather than hardcoded.

Add lit tests for each new mode under tests/lit/ and register them in tests/CMakeLists.txt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Rob Suderman <rob.suderman@gmail.com>

Signed-off-by: Rob Suderman <rob.suderman@gmail.com>

sjain-stanford

LGTM modulo two more comments.

rsuderman requested a review from IanWood1 April 8, 2026 18:25

IanWood1 mentioned this pull request Apr 8, 2026

[NFC] Refactor reduction emitter to be macro-based #320

Merged

rsuderman force-pushed the reduction_rest branch from 05be4e2 to 2a5541c Compare April 9, 2026 20:58

sjain-stanford requested changes Apr 16, 2026

Comment thread include/fusilli/attributes/reduction_attributes.h

Comment thread include/fusilli/support/asm_emitter.h Outdated

Comment thread include/fusilli/support/asm_emitter.h

rsuderman and others added 4 commits May 1, 2026 12:00

Match the templating approach

db4f135

Signed-off-by: Rob Suderman <rob.suderman@gmail.com>

Add ADD back in

f10122f

Update comment convention

4d00ff7

rsuderman force-pushed the reduction_rest branch from 6f488ad to 4d00ff7 Compare May 1, 2026 19:05

rsuderman added 2 commits May 1, 2026 12:06

pre-commit

1ed25fd

Fix duplicate switch statement

7989fe9

rsuderman requested a review from sjain-stanford May 1, 2026 19:42

rsuderman added 2 commits May 1, 2026 12:49

remove debug work

18ebb82

Merge remote-tracking branch 'origin/main' into reduction_rest

736cbfb

sjain-stanford reviewed May 6, 2026

Address comments

c19f547

rsuderman requested a review from sjain-stanford May 7, 2026 19:05

sjain-stanford approved these changes May 7, 2026

Comment thread include/fusilli/attributes/reduction_attributes.h Outdated

Comment thread samples/reduction/reduction_ops.cpp

Comment thread include/fusilli/support/asm_emitter.h

rsuderman added 2 commits May 7, 2026 14:33

Nits

62df4bb

Merge remote-tracking branch 'origin/main' into reduction_rest

a969d1f

rsuderman merged commit 90053d0 into iree-org:main May 7, 2026
10 checks passed

sjain-stanford mentioned this pull request May 8, 2026

[NFC] Add TODO to surface pending work #405

Merged

sjain-stanford added a commit that referenced this pull request May 8, 2026

[NFC] Add TODO to surface pending work (#405)

3460060

I usually grep for `TODO` to quickly scan open items, so adding this here. Nit cleanup from #325.

Signed-off-by: Sambhav Jain <sambhav@alumni.stanford.edu>

rsuderman merged 11 commits into iree-org:main from rsuderman:reduction_rest
