
[Issue]: nonmaxsuppression node produces incorrect outputs when nonmaxsuppression::use_dyn_output == false #4679

@mferencevic

Description


Problem description

When nonmaxsuppression::use_dyn_output == false the node will return max_num_boxes boxes (equal to num_batches * total_boxes_in_batch_element).
If the nonmaxsuppression algorithm selects fewer than max_num_boxes boxes, the rest of the output will be padded with zeros.
Output size is determined here:

std::vector<std::size_t> out_lens = {max_num_boxes, 3};
Padding with 0 happens here:
std::fill(output.begin(), output.end(), 0);

This means that the first box of the first class in the first batch element can appear in the output up to max_num_boxes times, because every padded (0, 0, 0) row refers to exactly that box, while it should appear at most once.

This behavior breaks compatibility with the ONNX NonMaxSuppression node (https://onnx.ai/onnx/operators/onnx__NonMaxSuppression.html), whose output contains only the selected boxes and isn't padded. This makes common post-NMS steps, like Top-K filtering based on the scores of the kept boxes, completely wrong.
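To make the Top-K breakage concrete, here is a small numpy sketch with made-up selections and scores (not the actual MIGraphX data): the padded (0, 0, 0) rows all point at box 0, so its score floods the Top-K.

```python
import numpy as np

# Hypothetical example: NMS kept 2 of 6 candidate boxes in batch 0,
# class 0, but the output is zero-padded to max_num_boxes == 6 rows.
# Each row is a (batch, class, box) triplet; padded rows read (0, 0, 0).
selected = np.array([
    [0, 0, 3],
    [0, 0, 5],
    [0, 0, 0],  # padding
    [0, 0, 0],  # padding
    [0, 0, 0],  # padding
    [0, 0, 0],  # padding
])

# Per-box scores for batch 0, class 0 (hypothetical values).
scores = np.array([0.9, 0.1, 0.2, 0.75, 0.3, 0.8])

# Top-K (K = 2) over the scores of the "kept" boxes:
kept_scores = scores[selected[:, 2]]           # [0.75, 0.8, 0.9, 0.9, 0.9, 0.9]
topk = selected[np.argsort(kept_scores)[::-1][:2]]
# Both Top-K winners are padded (0, 0, 0) rows: box 0 (score 0.9)
# dominates even though NMS never selected it.
```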

When nonmaxsuppression::use_dyn_output == true you correctly shrink the output to only include the selected boxes:

return result.reshape({output_shape.type(), {num_selected, 3}});
This doesn't happen when nonmaxsuppression::use_dyn_output == false.

There are a couple more problems with nonmaxsuppression::use_dyn_output:

  1. use_dyn_output is never set when using the Python bindings for MIGraphX. nonmaxsuppression::use_dyn_output is set from onnx_parser::use_dyn_output:
    op.from_value({{"use_dyn_output", parser.use_dyn_output}});
    which is in turn set from onnx_options::use_dyn_output:
    parser.use_dyn_output = options.use_dyn_output;
    This option is never set when using the Python bindings:
    migraphx::onnx_options options;
  2. Even if use_dyn_output were set to true, nonmaxsuppression::compute_shape would fail unless the node's input shapes are also dynamic, which isn't the case when the entire model is compiled with static shapes. If we just change onnx_options::use_dyn_output (while keeping static input shapes), we get the following error: parse: PARSE_EXPAND: dynamic input tensor with fixed dims input not supported. We often have to compile the whole graph with static shapes, as we'd otherwise hit dynamic-shape-related compilation errors.

I'd suggest completely removing the nonmaxsuppression::use_dyn_output == false code paths, as this option always produces wrong outputs.
This would also require changing the node so that the nonmaxsuppression::use_dyn_output == true path works with static input shapes.
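Until then, consumers can only approximate correct behavior by stripping the padding after the fact. A minimal workaround sketch (my own code, not part of MIGraphX) that also shows why the padding is inherently lossy:

```python
import numpy as np

def strip_nms_padding(selected: np.ndarray) -> np.ndarray:
    """Drop the zero-padded tail of a [max_num_boxes, 3] NMS output.

    Caveat: a genuine (0, 0, 0) selection (batch 0, class 0, box 0) is
    indistinguishable from padding, so this can silently drop a real
    box -- which is exactly why the padded output is lossy.
    """
    is_pad = ~np.any(selected != 0, axis=1)   # rows that are all zeros
    if not is_pad.any():
        return selected
    return selected[: np.argmax(is_pad)]      # truncate at first all-zero row

padded = np.array([[0, 0, 3], [0, 0, 5], [0, 0, 0], [0, 0, 0]])
stripped = strip_nms_padding(padded)          # keeps only the two genuine rows
```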

Steps to reproduce

This behavior is already encoded in the reference tests, in the nms_test test case:

TEST_CASE(nms_test)

As you can see, the expected outputs have been padded with (0, 0, 0) triplets, which doesn't conform to the ONNX operator specification.

Environment

OS: Debian GNU/Linux 12 (bookworm)
GPU: AMD Radeon AI PRO R9700
ROCm version: 7.2.0
MIGraphX version: 2.16.0.dev+20250912-17-299-g0681c63ab
