Test coverage gaps: recent MIGraphX lowering passes and CI flakiness (Mar 2026)

## Context

This is an automated test-coverage analysis triggered by PR [#2295](https://github.com/ROCm/rocMLIR/pull/2295), which marks `large-kernel-no-scavenge.mlir` as `XFAIL` because of an intermittent lowering failure. That PR is a CI workaround, but it and the surrounding recent merges expose several test coverage gaps across new lowering passes and bug fixes. The areas most worth hardening are listed below, ordered by blast radius.

---

## 1. 🔥 `arith.maximumf/minimumf` — No behavioral test for the "don't expand" pipeline option

**Commit:** `e40a31807f51` — *[EXTERNAL] Stop expanding float min/max ops*

**What changed:** `Pipelines.cpp` now sets `includeFloatMinMax = false` so that `arith.maximumf / arith.minimumf / arith.maxnumf / arith.minnumf` are **not** expanded into compare-and-select sequences, relying instead on the AMDGPU backend's native `v_max_*/v_min_*` instructions.

**Current test coverage:** `mlir/test/rocmlir-driver/pipelines.mlir` only checks that the printed pipeline string contains `include-float-min-max=false`. There is no test that:
- Verifies `arith.maximumf` is **preserved** (not expanded) when flowing through the rocMLIR full pipeline.
- Validates the NaN semantics difference: the old expand path propagates NaN from either operand; the native instruction may have different IEEE handling on specific GFX targets.
- Checks `arith.minnumf / arith.maxnumf`, which have "num" semantics: when exactly one operand is NaN they return the other operand, and they return NaN only when both operands are NaN.

**Risk:** If the backend does not handle these ops for a target, compilation may fail, or wrong code may be emitted without any diagnostic. NaN-in/NaN-out behavior is a correctness concern for attention masking and fusion kernels.

**Suggested tests to add:**

*File: `mlir/test/rocmlir-driver/large-kernel-float-minmax.mlir`* (new)
```mlir
// RUN: rocmlir-gen ... | rocmlir-driver --kernel-pipeline=full | FileCheck %s
// CHECK-NOT: arith.cmpf
// CHECK-NOT: arith.select
// CHECK: arith.maximumf
```

*File: `mlir/test/Dialect/Arith/expand-ops-amdgpu.mlir`* (new)
- A test running `arith-expand="include-float-min-max=false"` on a function containing all four float min/max ops and verifying they pass through unchanged.
- A test with a NaN input verifying `arith.maximumf(NaN, x) == NaN` vs `arith.maxnumf(NaN, x) == x` when running with `include-float-min-max=false`.
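
The NaN expectations in the last bullet can be pinned down with a small reference model. The sketch below is a plain-Python model of the two semantics as described in the MLIR `arith` dialect documentation. It is not rocMLIR code, and the function names are only illustrative. It could be used to derive expected values for a NaN test, assuming the native instructions are meant to match these semantics.

```python
import math


def maximumf(a: float, b: float) -> float:
    """Model of arith.maximumf: a NaN in either operand yields NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b:
        # Covers signed zeros: +0.0 is treated as larger than -0.0.
        return a if math.copysign(1.0, a) > 0 else b
    return max(a, b)


def maxnumf(a: float, b: float) -> float:
    """Model of arith.maxnumf: a single NaN operand is ignored."""
    if math.isnan(a):
        return b  # still NaN when both operands are NaN
    if math.isnan(b):
        return a
    return max(a, b)
```

With this model, `maximumf(nan, 1.0)` is NaN while `maxnumf(nan, 1.0)` is `1.0`, which is the distinction the suggested test should observe.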

---

## 2. 🔥 Flaky `large-kernel-no-scavenge` test — Root cause untested

**Commit:** `5e908079c7da` — *Mark large-kernel-no-scavenge as XFAIL*

**What happened:** `rocmlir-driver --kernel-pipeline=full` intermittently produces empty output (printing `Lowering failed` to stderr) for a specific `conv_bwd_data` configuration with `--perf_config 'v3:128,64,8,128,64,16,1,1,2,1,1'`. The test was XFAIL'd rather than fixed.

**Risk:** The flakiness signal is being suppressed, not resolved. If the underlying lowering failure is non-deterministic (e.g. a resource race, register pressure corner case, or unhandled fallback in the scavenger-disabled path), it could affect other large convolution or attention kernels in production.

**Suggested tests to add:**

*File: `mlir/test/rocmlir-driver/large-kernel-no-scavenge-error.mlir`* (new)
- A test that explicitly invokes the same gen command and verifies it does **not** print `Lowering failed` to stderr (using `FileCheck --implicit-check-not`).
- A deterministic stress test that runs the same command 3 times and checks all three succeed (a shell `RUN` loop), to surface flakiness early in CI rather than masking it.

Additionally, the `Lowering failed` error path in `rocmlir-driver` itself should be tested:
- Verify that when lowering fails, the driver exits with a non-zero code **and** prints a diagnostic that includes the operation that failed (not just `Lowering failed` with no context).
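
As an alternative to a shell `RUN` loop, the repeated-run check could be scripted. The following is a minimal sketch, not an existing rocMLIR utility: `failing_runs` is a hypothetical helper, and treating empty stdout as a failure is an assumption based on the symptom described above.

```python
import subprocess
import sys
from typing import List, Sequence


def failing_runs(cmd: Sequence[str], runs: int = 3,
                 marker: str = "Lowering failed") -> List[int]:
    """Run `cmd` `runs` times and return the 1-based indices of failing runs.

    A run counts as failed if it exits non-zero, prints `marker` on stderr,
    or produces empty stdout (the symptom described in this issue).
    """
    failures = []
    for i in range(1, runs + 1):
        proc = subprocess.run(list(cmd), capture_output=True, text=True)
        if (proc.returncode != 0 or marker in proc.stderr
                or not proc.stdout.strip()):
            failures.append(i)
    return failures


if __name__ == "__main__":
    # Stand-in command for demonstration; substitute the real pipeline.
    demo = [sys.executable, "-c", "print('module {}')"]
    print(failing_runs(demo))
```

Because the real invocation is a `rocmlir-gen ... | rocmlir-driver ...` pipe, it would need to be wrapped as `["sh", "-c", "<pipeline>"]`; in that case the exit code observed is that of the last command in the pipe.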

---

## 3. `migraphx.shaped` parser crash fix — Parse-level errors untested

**Commit:** `839eb350e187` — *[AIROCMLIR-546] Fixed parser crash from invalid `!migraphx.shaped`*

**What changed:** `MIXRShapedType::parse()` in `MIGraphX.cpp` now calls `parser.emitError()` in three places instead of crashing via `get()` when stride/shape counts mismatch.

**Current test coverage:** `mlir/test/Dialect/MIGraphX/invalid.mlir` tests only the **verifier** (`verify()`), not the **parser** (`parse()`). The three new `emitError()` call sites are completely untested:
1. Failure to parse `<`, dimension list, or element type.
2. Failure to parse the stride dimension list in a non-scalar shaped type.
3. Failure to parse the closing `>`.

**Suggested tests to add in `mlir/test/Dialect/MIGraphX/invalid.mlir`:**

```mlir
// -----
// expected-error @+1 {{expected shaped dimension list with type}}
func.func @bad_parse_missing_gt(%arg: !migraphx.shaped<1xf32) { func.return }

// -----
// expected-error @+1 {{expected `,` and a `x`-separated list}}
func.func @bad_parse_missing_stride(%arg: !migraphx.shaped<1xf32>) { func.return }

// -----
// expected-error @+1 {{expected shaped dimension list with type}}
func.func @bad_parse_garbage(%arg: !migraphx.shaped<garbage>) { func.return }
```

**Why it matters:** Without parsing tests, a refactor of `parse()` could silently remove the error handling and restore the crashing behavior.

---

## 4. Broadcasting Linalg lowering — Error paths and edge cases untested

**Commit:** `a8ae8acacbd0` — *[AIROCMLIR-552] Added Broadcasting Linalg Lowering Path*

**What changed:** `BroadcastConverter` and `MultiBroadcastConverter` were rewritten in `MIGraphXToLinalg.cpp` to use `linalg.broadcast` instead of TOSA.

**Current test coverage:** Four tests in `mixr-to-linalg-ops.mlir`: axis=0, 4D multibroadcast, scalar multibroadcast, scalar broadcast.

**Gaps:**

| Missing case | Why it matters |
|---|---|
| `broadcastDimensions.empty()` in `MultiBroadcastConverter` (reshape-only, no broadcast needed) | This branch is taken when the input and output have the same non-unit dims; never tested |
| `arith::ConstantOp` + `DenseElementsAttr::isSplat()` fast path in `MultiBroadcastConverter` | Splat-constant optimization silently broken if `isSplat()` ever returns false for a constant |
| `BroadcastConverter` with `axis > 0` and multi-dimensional input | Code conditionally strips trailing `1` dims; only axis=0 and scalar tested |
| Error path: `"cannot convert output type to ranked tensor type"` | No negative test |

**Suggested file to update:** `mlir/test/Conversion/MIGraphXToLinalg/mixr-to-linalg-ops.mlir`
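
To pick shapes that hit the empty-`broadcastDimensions` branch, it helps to model which output dimensions actually need broadcasting. The sketch below is a hypothetical model, not the converter's code: it assumes right-aligned, NumPy-style shape matching, and `broadcast_dims` is an illustrative name.

```python
from typing import List, Sequence


def broadcast_dims(in_shape: Sequence[int],
                   out_shape: Sequence[int]) -> List[int]:
    """Return the output dims that need broadcasting (hypothetical model)."""
    if len(in_shape) > len(out_shape):
        raise ValueError("input rank exceeds output rank")
    # Right-align the input shape by padding with leading unit dims.
    padded = [1] * (len(out_shape) - len(in_shape)) + list(in_shape)
    dims = []
    for i, (src, dst) in enumerate(zip(padded, out_shape)):
        if src == dst:
            continue  # dim already matches, nothing to broadcast
        if src != 1:
            raise ValueError(f"dim {i}: cannot broadcast {src} to {dst}")
        dims.append(i)
    return dims
```

Under this model, `[4]` to `[1, 4]` and `[4, 3]` to `[4, 3]` both yield an empty list, so either would be a candidate input for the reshape-only case, while `[4, 1, 8]` to `[4, 3, 8]` yields `[1]`.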

---

## 5. `migraphx.greater` / `migraphx.equal` — Missing type and error coverage

**Commit:** `712f49ed5447` — *[AIROCMLIR-446] Lower `migraphx.greater/equal` into `linalg.generic`*

**What changed:** New `BooleanElementwiseConverter<Greater>` and `BooleanElementwiseConverter<Equal>` in `MIGraphXToLinalg.cpp`.

**Current test coverage:** 5 tests in `migraphx-to-linalg-boolean.mlir` covering `i32`, `si32` (greater only), `f32` for both ops.

**Gaps:**

- `f16` and `bf16` types: these are the dominant compute types in rocMLIR attention and GEMM pipelines; no test verifies `arith.cmpf ogt` + `arith.uitofp` works correctly for `f16` output.
- `migraphx.equal` with `si32` input.
- No test for mismatched input types (should the converter reject or convert?). The code assumes operands share the same element type; if they don't, the `linalg.generic` body would emit a type error deep in lowering rather than a clear diagnostic.
- No rank variation tests (rank-1 and rank-4 tensors).

**Suggested file to update:** `mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-boolean.mlir`
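
Expected outputs for new float test cases can be derived from a small reference model. This is a sketch under assumptions: it models `migraphx.greater` as an ordered compare (`ogt`, as mentioned above) followed by an unsigned bool-to-float conversion, and it assumes `migraphx.equal` uses the analogous ordered predicate, which this issue does not confirm. The function names are illustrative.

```python
import math
from typing import List, Sequence


def _check_lengths(lhs: Sequence[float], rhs: Sequence[float]) -> None:
    if len(lhs) != len(rhs):
        raise ValueError("operands must have the same number of elements")


def greater_ref(lhs: Sequence[float], rhs: Sequence[float]) -> List[float]:
    """Elementwise lhs > rhs as 0.0/1.0; any NaN compares false (ordered)."""
    _check_lengths(lhs, rhs)
    # An unsigned conversion maps true to 1.0; a signed one would give -1.0.
    return [1.0 if a > b else 0.0 for a, b in zip(lhs, rhs)]


def equal_ref(lhs: Sequence[float], rhs: Sequence[float]) -> List[float]:
    """Elementwise lhs == rhs as 0.0/1.0; NaN is never equal (assumed oeq)."""
    _check_lengths(lhs, rhs)
    return [1.0 if a == b else 0.0 for a, b in zip(lhs, rhs)]
```

Python's `>` and `==` already return `False` when either operand is NaN, so they match ordered comparison without extra handling.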

---

## 6. Reshape helper — No-op and collapse-only paths untested

**Commit:** `529789d99c07` — *[AIROCMLIR-564] Lower `migraphx.reshape` using helper function*

**What changed:** The `reshapeValue()` helper in `MIGraphXToLinalg.cpp` has three code paths:
1. Same-shape early return (no-op).
2. Collapse-only (single `CollapseShapeOp`).
3. Collapse + expand (general case, tested).

**Current test coverage:** Only the general collapse + expand path is exercised, by a 2D→3D reshape and a 3D→2D reshape.

**Suggested tests in `mixr-to-linalg-ops.mlir`:**
- `migraphx.reshape` with identical input/output shape — should return the input value unchanged (no new ops).
- `migraphx.reshape` that only requires a `tensor.collapse_shape` (e.g., `4x4xf32` → `16xf32`).
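
When choosing shapes for these tests, a rough classifier for the three paths can help confirm that each case lands where intended. The sketch below is a guess at how the paths might be told apart, not the logic of `reshapeValue()`: it assumes the collapse-only path applies when the target is rank 1, which matches the `4x4xf32` → `16xf32` example but is not confirmed by the source. `reshape_path` is an illustrative name.

```python
from math import prod
from typing import Sequence


def reshape_path(in_shape: Sequence[int], out_shape: Sequence[int]) -> str:
    """Classify a static reshape into one of three paths (assumed rules)."""
    if prod(in_shape) != prod(out_shape):
        raise ValueError("reshape must preserve the element count")
    if list(in_shape) == list(out_shape):
        return "no-op"
    if len(out_shape) == 1:
        # Assumption: flattening to rank 1 needs only a collapse.
        return "collapse-only"
    return "collapse+expand"
```

For example, `reshape_path([4, 4], [16])` returns `"collapse-only"` and `reshape_path([4, 4], [4, 4])` returns `"no-op"` under these assumed rules.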

---

## Summary Table

| Area | File to Add/Update | Priority |
|---|---|---|
| `arith.maximumf` preservation in full pipeline | `mlir/test/rocmlir-driver/large-kernel-float-minmax.mlir` (new) | High |
| `arith-expand` with `include-float-min-max=false` behavioral test | `external/llvm-project/mlir/test/Dialect/Arith/expand-ops.mlir` | High |
| `large-kernel-no-scavenge` deterministic stress + error path | `mlir/test/rocmlir-driver/large-kernel-no-scavenge-error.mlir` (new) | High |
| Parser crash fix — parse-level error paths | `mlir/test/Dialect/MIGraphX/invalid.mlir` | Medium |
| Broadcasting edge cases (empty broadcastDims, splat const, axis>0) | `mlir/test/Conversion/MIGraphXToLinalg/mixr-to-linalg-ops.mlir` | Medium |
| `migraphx.greater/equal` — f16, bf16, equal si32, negative | `mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-boolean.mlir` | Medium |
| `reshapeValue` same-shape no-op + collapse-only | `mlir/test/Conversion/MIGraphXToLinalg/mixr-to-linalg-ops.mlir` | Low |

---

*Generated by automated regression-test coverage analysis on 2026-03-13, triggered by PR #2295.*


[AIROCMLIR-546]: https://amd-hub.atlassian.net/browse/AIROCMLIR-546?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ
