python target override for expand phase & combined sdfg pipeline code…#595

Merged
ramonwirsch merged 2 commits into main from expand-override-hook on Mar 19, 2026

@ramonwirsch
Member

… of mlir and python frontends

  • moved DOCC_CI handling into shared code, so it now also applies to the mlir frontend

@ramonwirsch ramonwirsch enabled auto-merge March 17, 2026 17:34
@daisytuner
daisytuner Bot commented Mar 17, 2026

Daisytuner Report - mlir_torch (chamomile)

@@                                   Benchmarks                                   @@
=====================================================================================
  Benchmark              Time        ΔTime       Thr         Energy      ΔEnergy     
=====================================================================================
# linear_torch           6.16 s      -0.36%      N/A         1463.10 J   +1.26%      
# linear_none            15.89 s     -0.27%      N/A         3957.20 J   +2.33%      
# linear_sequential      15.84 s     +0.33%      N/A         3837.01 J   +3.16%      
# linear_openmp          15.79 s     +0.51%      N/A         3827.20 J   +3.39%      
# linear_cuda            13.58 s     +0.80%      N/A         2569.93 J   +3.83%      
# matmul_torch           6.09 s      -1.68%      N/A         1446.92 J   +0.33%      
# matmul_none            10.72 s     +0.46%      N/A         2829.17 J   +2.03%      
# matmul_sequential      10.58 s     +0.31%      N/A         2847.36 J   +3.17%      
# matmul_openmp          10.52 s     +0.17%      N/A         2822.97 J   +2.94%      
# matmul_cuda            10.23 s     +0.37%      N/A         1944.93 J   +4.20%      

@daisytuner
daisytuner Bot commented Mar 17, 2026

Daisytuner Report - python_npbench (zinnia)

@@                                   Benchmarks                                   @@
=====================================================================================
  Benchmark              Time        ΔTime       Thr         Energy      ΔEnergy     
=====================================================================================
# adi_numpy              1.31 s      -0.28%      N/A         130.52 J    -0.37%      
# adi_omp                15.93 s     -0.57%      N/A         1503.83 J   -0.66%      
# adi_cuda               4.77 s      -0.33%      N/A         463.25 J    -0.13%      
# adi_seq_tuning         15.98 s     -0.24%      N/A         1507.69 J   -0.37%      
# atax_numpy             2.17 s      +0.61%      N/A         225.78 J    +1.09%      
# atax_omp               2.46 s      -0.48%      N/A         258.88 J    -0.45%      
# atax_cuda              4.13 s      +0.23%      N/A         425.48 J    +0.32%      
# atax_seq_tuning        3.71 s      -0.67%      N/A         375.20 J    -0.66%      
# gemm_numpy             1.23 s      -0.65%      N/A         198.51 J    -0.97%      
# gemm_omp               1.11 s      -0.34%      N/A         162.31 J    -0.44%      
# gemm_cuda              10.58 s     -0.47%      N/A         1005.41 J   -0.45%      
# gemm_seq_tuning        1.12 s      -0.19%      N/A         161.87 J    -0.14%      
# gesummv_numpy          1.73 s      -1.68%      N/A         247.47 J    -1.62%      
# gesummv_omp            5.29 s      -0.92%      N/A         686.29 J    -1.04%      
# gesummv_cuda           8.27 s      -1.09%      N/A         992.76 J    -0.84%      
# gesummv_seq_tuning     6.53 s      -0.90%      N/A         801.03 J    -0.85%      
# gemver_numpy           1.08 s      -0.39%      N/A         165.87 J    -0.56%      
# gemver_omp             712.31 ms   -0.15%      N/A         81.36 J     -0.32%      
# gemver_cuda            3.88 s      -0.03%      N/A         388.41 J    -0.06%      
# gemver_seq_tuning      4.46 s      +0.37%      N/A         431.59 J    +0.32%      
# k2mm_numpy             1.20 s      -0.49%      N/A         197.30 J    -0.52%      
# k2mm_omp               3.61 s      -0.82%      N/A         467.49 J    -0.54%      
# k2mm_cuda              13.54 s     -0.50%      N/A         1280.93 J   -0.57%      
# k2mm_seq_tuning        3.60 s      -0.19%      N/A         463.96 J    -0.42%      
# k3mm_numpy             1.03 s      -0.42%      N/A         183.86 J    -0.57%      
# k3mm_omp               5.73 s      -0.14%      N/A         794.64 J    -0.29%      
# k3mm_cuda              19.81 s     -0.34%      N/A         1864.61 J   -0.54%      
# k3mm_seq_tuning        5.72 s      -0.17%      N/A         791.24 J    -0.44%      
# mvt_numpy              2.42 s      -0.32%      N/A         247.56 J    -0.54%      
# mvt_omp                2.74 s      -0.01%      N/A         284.58 J    -0.04%      
# mvt_cuda               3.36 s      +0.06%      N/A         342.32 J    -0.16%      
# mvt_seq_tuning         2.74 s      -0.05%      N/A         284.54 J    -0.12%      
# symm_numpy             785.92 ms   -0.03%      N/A         80.92 J     -0.12%      
# symm_omp               8.41 s      +0.07%      N/A         801.59 J    +0.01%      
# symm_seq_tuning        8.41 s      +0.02%      N/A         800.97 J    -0.06%      
# syr2k_numpy            891.15 ms   -0.40%      N/A         90.56 J     -0.36%      
# syr2k_omp              9.85 s      -0.08%      N/A         936.46 J    -0.05%      
# syr2k_cuda             1.65 s      -0.89%      N/A         170.78 J    -0.84%      
# syr2k_seq_tuning       9.81 s      -0.22%      N/A         932.93 J    -0.20%      
# syrk_numpy             772.36 ms   -1.61%      N/A         79.57 J     -1.27%      
# syrk_omp               5.93 s      -0.05%      N/A         570.55 J    -0.04%      
# syrk_cuda              1.52 s      -1.04%      N/A         158.54 J    -1.07%      
# syrk_seq_tuning        5.91 s      -0.93%      N/A         567.95 J    -0.96%      
# trmm_numpy             878.71 ms   -1.00%      N/A         89.45 J     -1.01%      
# trmm_omp               3.10 s      -0.88%      N/A         306.26 J    -0.92%      
# trmm_seq_tuning        3.39 s      -2.02%      N/A         322.89 J    -1.45%      

 ~ each test case must set its own global options (register_target..., set_backend_options)
 + fixtures clean up global state after every test function, so that tests cannot accidentally rely on it
 + force_rebuild option on torch_compile to skip reloading from the file cache in tests where we want to see the actual compile process
@ramonwirsch ramonwirsch merged commit 5fb101d into main Mar 19, 2026
20 checks passed
@ramonwirsch ramonwirsch deleted the expand-override-hook branch March 19, 2026 07:46