[Issue]: Significant compilation time degradation when upgrading from eb603d0 to ac23987 #4688

@mferencevic

Problem description

We noticed that model compilation times rose significantly when we upgraded MIGraphX from eb603d0 to 0681c63.
We bisected MIGraphX and narrowed the regression down to one specific MIGraphX commit: ac23987

Specifically, with eb603d0 we were able to compile 10 of our models in ~45 min. That is still not ideal compared to TensorRT, where the same compilation takes around 10 min, but it is usable.
After upgrading MIGraphX to ac23987, compilation is around 10 times slower: we were only able to compile 3 of our 10 models in 60 min, which is unusable.

Because commit ac23987 only bumps the version of the rocMLIR library that MIGraphX is compiled with, we wrote a script that recompiles MIGraphX ac23987 against each rocMLIR revision selected by the bisect.
The bisect log is below (the commit hashes are from the rocMLIR repository):

Bisecting: 8 revisions left to test after this (roughly 3 steps)
Compiling MIGraphX with ac0dcc901b3d9aacda8030ec75ced17b0c35cbd8 ...
Compiling model ...
Compilation duration: 27.16 s
Benchmark duration: 4.62 s -> throughput = 124.1
GOOD!

Bisecting: 4 revisions left to test after this (roughly 2 steps)
Compiling MIGraphX with 11d5c9db8dec5f198e80bab10c59549f401e6c8f ...
Compiling model ...
Compilation duration: 27.26 s
Benchmark duration: 4.68 s -> throughput = 122.6
GOOD!

Bisecting: 2 revisions left to test after this (roughly 1 step)
Compiling MIGraphX with 15ab5c900205d32e6b0591642cc3494ec14c34ef ...
Compiling model ...
Skipped 4 configs for gpu::mlir_op
Skipped 4 configs for gpu::mlir_op
Skipped 4 configs for gpu::mlir_op
Skipped 4 configs for gpu::mlir_op
Skipped 4 configs for gpu::mlir_op
Skipped 4 configs for gpu::mlir_op
Skipped 4 configs for gpu::mlir_op
Skipped 2 configs for gpu::mlir_op
Compilation duration: 212.06 s
Benchmark duration: 4.48 s -> throughput = 128.1
BAD!!!

Bisecting: 0 revisions left to test after this (roughly 0 steps)
Compiling MIGraphX with 51df5f49afac8ec2ed5cdc623affc6685c651c6f ...
Compiling model ...
Compilation duration: 27.39 s
Benchmark duration: 4.66 s -> throughput = 123.0
GOOD!

15ab5c900205d32e6b0591642cc3494ec14c34ef is the first bad commit
commit 15ab5c900205d32e6b0591642cc3494ec14c34ef
Author: Mirza Halilčević <109971222+mirza-halilcevic@users.noreply.github.com>
Date:   Wed Mar 4 17:07:37 2026 +0100

    [AIROCMLIR-44] Update quick-tune lists for gemm and conv (#2212)
    
    * Update gfx950 quick-tune lists for gemm and conv.
    
    * Update gfx942 quick-tune lists for gemm and conv.
    
    * Update gfx90a quick-tune lists for gemm and conv.
    
    * Update gfx908 quick-tune lists for gemm and conv.
    
    * Update gfx1201 quick-tune lists for gemm and conv.
    
    * Update gfx1101 quick-tune lists for gemm and conv, and delete old
    gfx1100 lists.
    
    * Update gfx1150 quick-tune lists for gemm.
    
    * Update gfx1150 quick-tune lists for conv.

 .../Dialect/Rock/Tuning/QuickTuningPerfconfigs.inc | 2525 +++++++++++++-------
 mlir/test/CAPI/mixr_full.c                         |    2 +-
 mlir/test/Dialect/Rock/affix_tuning_params.mlir    |   76 +-
 .../noTransA-noTransB/broadcasted-k-e2e.mlir       |    2 +-
 .../noTransA-transB/broadcasted-k-e2e.mlir         |    2 +-
 .../gemm-layouts/transA-noTransB/gemm-k-e2e.mlir   |    2 +-
 .../gemm-layouts/transA-noTransB/sliced-k-e2e.mlir |    2 +-
 .../transA-noTransB/unitdim-m-e2e.mlir             |    2 +-
 .../gemm-layouts/transA-transB/gemm-k-e2e.mlir     |    2 +-
 .../gemm-layouts/transA-transB/sliced-k-e2e.mlir   |    2 +-
 10 files changed, 1770 insertions(+), 847 deletions(-)

The bisect points to this rocMLIR commit as the cause: ROCm/rocMLIR@15ab5c9
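
The bisect script itself is not included here. As a rough illustration only, the GOOD/BAD decision in the log above could be automated along these lines. The function name `bisect_exit_code` and the 60 s threshold are assumptions made for this sketch, not part of the original script; the exit codes follow the `git bisect run` convention (0 = good, 1 = bad, 125 = skip this revision).

```python
import re

# Matches log lines such as "Compilation duration: 27.16 s".
DURATION_RE = re.compile(r"Compilation duration:\s*([0-9]+(?:\.[0-9]+)?)\s*s")


def bisect_exit_code(log_text, threshold_s=60.0):
    """Map a build/compile log to a `git bisect run` exit code.

    threshold_s is a hypothetical cut-off chosen for this sketch; it sits
    between the ~27 s (good) and ~212 s (bad) durations seen in the log.
    """
    match = DURATION_RE.search(log_text)
    if match is None:
        # No duration found (e.g. the build failed): ask bisect to skip.
        return 125
    duration_s = float(match.group(1))
    return 0 if duration_s <= threshold_s else 1


print(bisect_exit_code("Compiling model ...\nCompilation duration: 27.16 s"))   # 0
print(bisect_exit_code("Compiling model ...\nCompilation duration: 212.06 s"))  # 1
```

A wrapper that builds MIGraphX, compiles the model, and exits with this code could then be passed to `git bisect run` inside the rocMLIR checkout.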

As you can see, the compilation time has increased roughly 8 times (from 27.16 s to 212.06 s), while the throughput of the compiled model has barely changed (from 124.1 to 128.1).
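
For reference, the slowdown and the throughput change can be computed directly from the figures in the bisect log. This is a small sketch; the helper names are made up for illustration.

```python
def slowdown(before_s, after_s):
    """Ratio of the new compile time to the old one."""
    return after_s / before_s


def relative_change_pct(before, after):
    """Relative change in percent, positive when the value increased."""
    return (after - before) / before * 100.0


# Durations and throughputs taken from the first GOOD step and the BAD step.
print(f"compile time: {slowdown(27.16, 212.06):.1f}x")            # compile time: 7.8x
print(f"throughput:   {relative_change_pct(124.1, 128.1):+.1f}%")  # throughput:   +3.2%
```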

As a side note, this is also when the logging issues tracked in #4665 began.

Environment

OS: Debian GNU/Linux 12 (bookworm)
CPU: AMD Ryzen 9 9950X 16-Core Processor
GPU: AMD Radeon AI PRO R9700
ROCm version: 7.2.0
MIGraphX version: 2.16.0.dev+20250912-17-291-gac2398773
