eb_hooks modifications for {2025.06}[2024a] TensorFlow v2.18.1 #171

Draft
TopRichard wants to merge 5 commits into EESSI:main from TopRichard:TensorFlow-v2-18-1

Conversation

@TopRichard
Collaborator

No description provided.

@TopRichard TopRichard marked this pull request as draft February 28, 2026 07:13
@TopRichard
Collaborator Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2
bot: build repo:eessi.io-2025.06-software instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace

@eessi-bot-aws

eessi-bot-aws bot commented Feb 28, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2026.02/pr_171/135255

date job status comment
Feb 28 07:13:35 UTC 2026 submitted job id 135255 awaits release by job manager
Feb 28 07:13:43 UTC 2026 released job awaits launch by Slurm scheduler
Feb 28 07:19:46 UTC 2026 running job 135255 is running
Feb 28 07:51:23 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-135255.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-x86_64-amd-zen2-17722649140.tar.zst
size: 244 MiB (256885174 bytes)
entries: 1752
modules under 2025.06/software/linux/x86_64/amd/zen2/modules/all
Bazel/6.5.0-GCCcore-13.3.0-Java-11.lua
dill/0.3.9-GCCcore-13.3.0.lua
flatbuffers/24.3.25-GCCcore-13.3.0.lua
flatbuffers-python/24.3.25-GCCcore-13.3.0.lua
h5py/3.12.1-foss-2024a.lua
Java/11.0.27.lua
Java/.modulerc.lua
JsonCpp/1.9.5-GCCcore-13.3.0.lua
ml_dtypes/0.5.0-gfbf-2024a.lua
mpi4py/4.0.1-gompi-2024a.lua
nsync/1.29.2-GCCcore-13.3.0.lua
Zip/3.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/x86_64/amd/zen2/software
Bazel/6.5.0-GCCcore-13.3.0-Java-11
dill/0.3.9-GCCcore-13.3.0
flatbuffers/24.3.25-GCCcore-13.3.0
flatbuffers-python/24.3.25-GCCcore-13.3.0
h5py/3.12.1-foss-2024a
Java/11.0.27
JsonCpp/1.9.5-GCCcore-13.3.0
ml_dtypes/0.5.0-gfbf-2024a
mpi4py/4.0.1-gompi-2024a
nsync/1.29.2-GCCcore-13.3.0
Zip/3.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/x86_64/amd/zen2/reprod
Bazel/6.5.0-GCCcore-13.3.0-Java-11/20260228_074012UTC
dill/0.3.9-GCCcore-13.3.0/20260228_072039UTC
flatbuffers/24.3.25-GCCcore-13.3.0/20260228_072931UTC
flatbuffers-python/24.3.25-GCCcore-13.3.0/20260228_074028UTC
h5py/3.12.1-foss-2024a/20260228_072824UTC
Java/11.0.27/20260228_073101UTC
JsonCpp/1.9.5-GCCcore-13.3.0/20260228_074059UTC
ml_dtypes/0.5.0-gfbf-2024a/20260228_074245UTC
mpi4py/4.0.1-gompi-2024a/20260228_072352UTC
nsync/1.29.2-GCCcore-13.3.0/20260228_074312UTC
Zip/3.0-GCCcore-13.3.0/20260228_072020UTC
other under 2025.06/software/linux/x86_64/amd/zen2
2025.06/init/easybuild/eb_hooks.py
Feb 28 07:51:23 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:x86-64-zen2+default
P: latency: 1.38 us (r:0, l:None, u:None)
[ OK ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:x86-64-zen2+default
P: latency: 2.06 us (r:0, l:None, u:None)
[ OK ] (3/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:x86-64-zen2+default
P: latency: 0.18 us (r:0, l:None, u:None)
[ OK ] (4/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:x86-64-zen2+default
P: bandwidth: 8032.96 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-135255.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@eessi-bot-jsc

eessi-bot-jsc bot commented Feb 28, 2026

New job on instance eessi-bot-jsc for repository eessi.io-2025.06-software
Building on: nvidia-grace
Building for: aarch64/nvidia/grace
Job dir: /p/project1/ceasybuilders/eessibot/jobs/2026.02/pr_171/14527698

date job status comment
Feb 28 07:13:36 UTC 2026 submitted job id 14527698 awaits release by job manager
Feb 28 07:13:50 UTC 2026 released job awaits launch by Slurm scheduler
Feb 28 07:14:54 UTC 2026 running job 14527698 is running
Feb 28 07:46:49 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-14527698.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-aarch64-nvidia-grace-17722644010.tar.gz
size: 249 MiB (262133709 bytes)
entries: 1752
modules under 2025.06/software/linux/aarch64/nvidia/grace/modules/all
Bazel/6.5.0-GCCcore-13.3.0-Java-11.lua
dill/0.3.9-GCCcore-13.3.0.lua
flatbuffers/24.3.25-GCCcore-13.3.0.lua
flatbuffers-python/24.3.25-GCCcore-13.3.0.lua
h5py/3.12.1-foss-2024a.lua
Java/11.0.27.lua
Java/.modulerc.lua
JsonCpp/1.9.5-GCCcore-13.3.0.lua
ml_dtypes/0.5.0-gfbf-2024a.lua
mpi4py/4.0.1-gompi-2024a.lua
nsync/1.29.2-GCCcore-13.3.0.lua
Zip/3.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/aarch64/nvidia/grace/software
Bazel/6.5.0-GCCcore-13.3.0-Java-11
dill/0.3.9-GCCcore-13.3.0
flatbuffers/24.3.25-GCCcore-13.3.0
flatbuffers-python/24.3.25-GCCcore-13.3.0
h5py/3.12.1-foss-2024a
Java/11.0.27
JsonCpp/1.9.5-GCCcore-13.3.0
ml_dtypes/0.5.0-gfbf-2024a
mpi4py/4.0.1-gompi-2024a
nsync/1.29.2-GCCcore-13.3.0
Zip/3.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/aarch64/nvidia/grace/reprod
Bazel/6.5.0-GCCcore-13.3.0-Java-11/20260228_073159UTC
dill/0.3.9-GCCcore-13.3.0/20260228_071755UTC
flatbuffers/24.3.25-GCCcore-13.3.0/20260228_072726UTC
flatbuffers-python/24.3.25-GCCcore-13.3.0/20260228_073205UTC
h5py/3.12.1-foss-2024a/20260228_072437UTC
Java/11.0.27/20260228_072751UTC
JsonCpp/1.9.5-GCCcore-13.3.0/20260228_073219UTC
ml_dtypes/0.5.0-gfbf-2024a/20260228_073318UTC
mpi4py/4.0.1-gompi-2024a/20260228_072125UTC
nsync/1.29.2-GCCcore-13.3.0/20260228_073325UTC
Zip/3.0-GCCcore-13.3.0/20260228_071710UTC
other under 2025.06/software/linux/aarch64/nvidia/grace
2025.06/init/easybuild/eb_hooks.py
Feb 28 07:46:49 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:aarch64-nvidia-grace+default
P: latency: 2.47 us (r:0, l:None, u:None)
[ OK ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:aarch64-nvidia-grace+default
P: latency: 6.15 us (r:0, l:None, u:None)
[ OK ] (3/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:aarch64-nvidia-grace+default
P: latency: 0.25 us (r:0, l:None, u:None)
[ OK ] (4/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:aarch64-nvidia-grace+default
P: bandwidth: 18692.69 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-14527698.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@TopRichard
Collaborator Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2

@eessi-bot-aws

eessi-bot-aws bot commented Feb 28, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2026.02/pr_171/135257

date job status comment
Feb 28 08:06:45 UTC 2026 submitted job id 135257 awaits release by job manager
Feb 28 08:07:30 UTC 2026 released job awaits launch by Slurm scheduler
Feb 28 08:08:32 UTC 2026 running job 135257 is running
Feb 28 08:40:28 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-135257.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-x86_64-amd-zen2-17722678580.tar.zst
size: 245 MiB (257039849 bytes)
entries: 1752
modules under 2025.06/software/linux/x86_64/amd/zen2/modules/all
Bazel/6.5.0-GCCcore-13.3.0-Java-11.lua
dill/0.3.9-GCCcore-13.3.0.lua
flatbuffers/24.3.25-GCCcore-13.3.0.lua
flatbuffers-python/24.3.25-GCCcore-13.3.0.lua
h5py/3.12.1-foss-2024a.lua
Java/11.0.27.lua
Java/.modulerc.lua
JsonCpp/1.9.5-GCCcore-13.3.0.lua
ml_dtypes/0.5.0-gfbf-2024a.lua
mpi4py/4.0.1-gompi-2024a.lua
nsync/1.29.2-GCCcore-13.3.0.lua
Zip/3.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/x86_64/amd/zen2/software
Bazel/6.5.0-GCCcore-13.3.0-Java-11
dill/0.3.9-GCCcore-13.3.0
flatbuffers/24.3.25-GCCcore-13.3.0
flatbuffers-python/24.3.25-GCCcore-13.3.0
h5py/3.12.1-foss-2024a
Java/11.0.27
JsonCpp/1.9.5-GCCcore-13.3.0
ml_dtypes/0.5.0-gfbf-2024a
mpi4py/4.0.1-gompi-2024a
nsync/1.29.2-GCCcore-13.3.0
Zip/3.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/x86_64/amd/zen2/reprod
Bazel/6.5.0-GCCcore-13.3.0-Java-11/20260228_082915UTC
dill/0.3.9-GCCcore-13.3.0/20260228_080914UTC
flatbuffers/24.3.25-GCCcore-13.3.0/20260228_081808UTC
flatbuffers-python/24.3.25-GCCcore-13.3.0/20260228_082930UTC
h5py/3.12.1-foss-2024a/20260228_081701UTC
Java/11.0.27/20260228_081938UTC
JsonCpp/1.9.5-GCCcore-13.3.0/20260228_083001UTC
ml_dtypes/0.5.0-gfbf-2024a/20260228_083148UTC
mpi4py/4.0.1-gompi-2024a/20260228_081228UTC
nsync/1.29.2-GCCcore-13.3.0/20260228_083214UTC
Zip/3.0-GCCcore-13.3.0/20260228_080856UTC
other under 2025.06/software/linux/x86_64/amd/zen2
2025.06/init/easybuild/eb_hooks.py
Feb 28 08:40:28 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:x86-64-zen2+default
P: latency: 1.31 us (r:0, l:None, u:None)
[ OK ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:x86-64-zen2+default
P: latency: 2.05 us (r:0, l:None, u:None)
[ OK ] (3/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:x86-64-zen2+default
P: latency: 0.18 us (r:0, l:None, u:None)
[ OK ] (4/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:x86-64-zen2+default
P: bandwidth: 8013.99 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-135257.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@TopRichard
Collaborator Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=x86_64/amd/zen2

@eessi-bot-aws

eessi-bot-aws bot commented Feb 28, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: amd-zen2
Building for: x86_64/amd/zen2
Job dir: /project/def-users/SHARED/jobs/2026.02/pr_171/135259

date job status comment
Feb 28 09:31:50 UTC 2026 submitted job id 135259 awaits release by job manager
Feb 28 09:32:36 UTC 2026 released job awaits launch by Slurm scheduler
Feb 28 09:33:38 UTC 2026 running job 135259 is running
Feb 28 19:46:51 UTC 2026 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-135259.out
✅ no message matching FATAL:
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-x86_64-amd-zen2-17723078650.tar.zst
size: 572 MiB (600383027 bytes)
entries: 17633
modules under 2025.06/software/linux/x86_64/amd/zen2/modules/all
Bazel/6.5.0-GCCcore-13.3.0-Java-11.lua
dill/0.3.9-GCCcore-13.3.0.lua
flatbuffers/24.3.25-GCCcore-13.3.0.lua
flatbuffers-python/24.3.25-GCCcore-13.3.0.lua
h5py/3.12.1-foss-2024a.lua
Java/11.0.27.lua
Java/.modulerc.lua
JsonCpp/1.9.5-GCCcore-13.3.0.lua
ml_dtypes/0.5.0-gfbf-2024a.lua
mpi4py/4.0.1-gompi-2024a.lua
nsync/1.29.2-GCCcore-13.3.0.lua
TensorFlow/2.18.1-foss-2024a.lua
Zip/3.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/x86_64/amd/zen2/software
Bazel/6.5.0-GCCcore-13.3.0-Java-11
dill/0.3.9-GCCcore-13.3.0
flatbuffers/24.3.25-GCCcore-13.3.0
flatbuffers-python/24.3.25-GCCcore-13.3.0
h5py/3.12.1-foss-2024a
Java/11.0.27
JsonCpp/1.9.5-GCCcore-13.3.0
ml_dtypes/0.5.0-gfbf-2024a
mpi4py/4.0.1-gompi-2024a
nsync/1.29.2-GCCcore-13.3.0
TensorFlow/2.18.1-foss-2024a
Zip/3.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/x86_64/amd/zen2/reprod
Bazel/6.5.0-GCCcore-13.3.0-Java-11/20260228_095414UTC
dill/0.3.9-GCCcore-13.3.0/20260228_093420UTC
flatbuffers/24.3.25-GCCcore-13.3.0/20260228_094314UTC
flatbuffers-python/24.3.25-GCCcore-13.3.0/20260228_095430UTC
h5py/3.12.1-foss-2024a/20260228_094207UTC
Java/11.0.27/20260228_094444UTC
JsonCpp/1.9.5-GCCcore-13.3.0/20260228_095500UTC
ml_dtypes/0.5.0-gfbf-2024a/20260228_095647UTC
mpi4py/4.0.1-gompi-2024a/20260228_093734UTC
nsync/1.29.2-GCCcore-13.3.0/20260228_095713UTC
TensorFlow/2.18.1-foss-2024a/20260228_194314UTC
Zip/3.0-GCCcore-13.3.0/20260228_093402UTC
other under 2025.06/software/linux/x86_64/amd/zen2
2025.06/init/easybuild/eb_hooks.py
Feb 28 19:46:51 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:x86-64-zen2+default
P: latency: 1.38 us (r:0, l:None, u:None)
[ OK ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:x86-64-zen2+default
P: latency: 2.03 us (r:0, l:None, u:None)
[ OK ] (3/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:x86-64-zen2+default
P: latency: 0.18 us (r:0, l:None, u:None)
[ OK ] (4/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:x86-64-zen2+default
P: bandwidth: 7801.26 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-135259.out
✅ no message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@TopRichard
Collaborator Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-jsc for:arch=aarch64/nvidia/grace

@eessi-bot-jsc

eessi-bot-jsc bot commented Feb 28, 2026

New job on instance eessi-bot-jsc for repository eessi.io-2025.06-software
Building on: nvidia-grace
Building for: aarch64/nvidia/grace
Job dir: /p/project1/ceasybuilders/eessibot/jobs/2026.02/pr_171/14528727

date job status comment
Feb 28 11:22:17 UTC 2026 submitted job id 14528727 awaits release by job manager
Feb 28 11:22:54 UTC 2026 released job awaits launch by Slurm scheduler
Feb 28 11:23:58 UTC 2026 running job 14528727 is running
Feb 28 11:50:47 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-14528727.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-aarch64-nvidia-grace-17722790770.tar.gz
size: 251 MiB (263315138 bytes)
entries: 1752
modules under 2025.06/software/linux/aarch64/nvidia/grace/modules/all
Bazel/6.5.0-GCCcore-13.3.0-Java-11.lua
dill/0.3.9-GCCcore-13.3.0.lua
flatbuffers/24.3.25-GCCcore-13.3.0.lua
flatbuffers-python/24.3.25-GCCcore-13.3.0.lua
h5py/3.12.1-foss-2024a.lua
Java/11.0.27.lua
Java/.modulerc.lua
JsonCpp/1.9.5-GCCcore-13.3.0.lua
ml_dtypes/0.5.0-gfbf-2024a.lua
mpi4py/4.0.1-gompi-2024a.lua
nsync/1.29.2-GCCcore-13.3.0.lua
Zip/3.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/aarch64/nvidia/grace/software
Bazel/6.5.0-GCCcore-13.3.0-Java-11
dill/0.3.9-GCCcore-13.3.0
flatbuffers/24.3.25-GCCcore-13.3.0
flatbuffers-python/24.3.25-GCCcore-13.3.0
h5py/3.12.1-foss-2024a
Java/11.0.27
JsonCpp/1.9.5-GCCcore-13.3.0
ml_dtypes/0.5.0-gfbf-2024a
mpi4py/4.0.1-gompi-2024a
nsync/1.29.2-GCCcore-13.3.0
Zip/3.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/aarch64/nvidia/grace/reprod
Bazel/6.5.0-GCCcore-13.3.0-Java-11/20260228_113830UTC
dill/0.3.9-GCCcore-13.3.0/20260228_112630UTC
flatbuffers/24.3.25-GCCcore-13.3.0/20260228_113405UTC
flatbuffers-python/24.3.25-GCCcore-13.3.0/20260228_113835UTC
h5py/3.12.1-foss-2024a/20260228_113258UTC
Java/11.0.27/20260228_113430UTC
JsonCpp/1.9.5-GCCcore-13.3.0/20260228_113848UTC
ml_dtypes/0.5.0-gfbf-2024a/20260228_113943UTC
mpi4py/4.0.1-gompi-2024a/20260228_112959UTC
nsync/1.29.2-GCCcore-13.3.0/20260228_113950UTC
Zip/3.0-GCCcore-13.3.0/20260228_112551UTC
other under 2025.06/software/linux/aarch64/nvidia/grace
2025.06/init/easybuild/eb_hooks.py
Feb 28 11:50:47 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:aarch64-nvidia-grace+default
P: latency: 2.56 us (r:0, l:None, u:None)
[ OK ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:aarch64-nvidia-grace+default
P: latency: 6.31 us (r:0, l:None, u:None)
[ OK ] (3/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:aarch64-nvidia-grace+default
P: latency: 0.26 us (r:0, l:None, u:None)
[ OK ] (4/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:aarch64-nvidia-grace+default
P: bandwidth: 18677.35 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-14528727.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@TopRichard
Collaborator Author

TopRichard commented Feb 28, 2026

The aarch64/nvidia/grace build is failing with:

bazel-out/aarch64-opt/bin/external/KleidiAI/kai/ukernels/matmul/_objs/clamp_f16_f16_f16p16x1biasf16_1x16x8_neon_mla/kai_matmul_clamp_f16_f16_f16p16x1biasf16_1x16x8_neon_mla.pic.o)
# Configuration: f3a3bdee2605ed9cb5f60d801191d7d2ad5fce806555f7ee6fcf9107e4a32e53
# Execution platform: @local_execution_config_platform//:platform
cc1: error: switch '-mcpu=neoverse-v2+crc+sve2-sm4+sve2-aes+sve2-sha3+norng+nossbs+nopauth' conflicts with '-march=armv8.2-a+fp16' switch [-Werror]
cc1: all warnings being treated as errors

@TopRichard
Collaborator Author

bot: build repo:eessi.io-2025.06-software instance:eessi-bot-mc-aws for:arch=aarch64/generic

@eessi-bot-aws

eessi-bot-aws bot commented Feb 28, 2026

New job on instance eessi-bot-mc-aws for repository eessi.io-2025.06-software
Building on: generic
Building for: aarch64/generic
Job dir: /project/def-users/SHARED/jobs/2026.02/pr_171/135272

date job status comment
Feb 28 12:04:39 UTC 2026 submitted job id 135272 awaits release by job manager
Feb 28 12:05:32 UTC 2026 released job awaits launch by Slurm scheduler
Feb 28 12:06:35 UTC 2026 running job 135272 is running
Feb 28 12:42:58 UTC 2026 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-135272.out
✅ no message matching FATAL:
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.* created!
Artefacts
eessi-2025.06-software-linux-aarch64-generic-17722822780.tar.zst
size: 238 MiB (250398238 bytes)
entries: 1752
modules under 2025.06/software/linux/aarch64/generic/modules/all
Bazel/6.5.0-GCCcore-13.3.0-Java-11.lua
dill/0.3.9-GCCcore-13.3.0.lua
flatbuffers/24.3.25-GCCcore-13.3.0.lua
flatbuffers-python/24.3.25-GCCcore-13.3.0.lua
h5py/3.12.1-foss-2024a.lua
Java/11.0.27.lua
Java/.modulerc.lua
JsonCpp/1.9.5-GCCcore-13.3.0.lua
ml_dtypes/0.5.0-gfbf-2024a.lua
mpi4py/4.0.1-gompi-2024a.lua
nsync/1.29.2-GCCcore-13.3.0.lua
Zip/3.0-GCCcore-13.3.0.lua
software under 2025.06/software/linux/aarch64/generic/software
Bazel/6.5.0-GCCcore-13.3.0-Java-11
dill/0.3.9-GCCcore-13.3.0
flatbuffers/24.3.25-GCCcore-13.3.0
flatbuffers-python/24.3.25-GCCcore-13.3.0
h5py/3.12.1-foss-2024a
Java/11.0.27
JsonCpp/1.9.5-GCCcore-13.3.0
ml_dtypes/0.5.0-gfbf-2024a
mpi4py/4.0.1-gompi-2024a
nsync/1.29.2-GCCcore-13.3.0
Zip/3.0-GCCcore-13.3.0
reprod directories under 2025.06/software/linux/aarch64/generic/reprod
Bazel/6.5.0-GCCcore-13.3.0-Java-11/20260228_122849UTC
dill/0.3.9-GCCcore-13.3.0/20260228_120647UTC
flatbuffers/24.3.25-GCCcore-13.3.0/20260228_122016UTC
flatbuffers-python/24.3.25-GCCcore-13.3.0/20260228_122859UTC
h5py/3.12.1-foss-2024a/20260228_121914UTC
Java/11.0.27/20260228_122104UTC
JsonCpp/1.9.5-GCCcore-13.3.0/20260228_122924UTC
ml_dtypes/0.5.0-gfbf-2024a/20260228_123107UTC
mpi4py/4.0.1-gompi-2024a/20260228_121355UTC
nsync/1.29.2-GCCcore-13.3.0/20260228_123120UTC
Zip/3.0-GCCcore-13.3.0/20260228_120634UTC
other under 2025.06/software/linux/aarch64/generic
2025.06/init/easybuild/eb_hooks.py
Feb 28 12:42:58 UTC 2026 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ OK ] (1/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_allreduce %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /e4bf9965 @BotBuildTests:aarch64-generic+default
P: latency: 1.96 us (r:0, l:None, u:None)
[ OK ] (2/4) EESSI_OSU_coll %benchmark_info=mpi.collective.osu_alltoall %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node %device_type=cpu /3da4890b @BotBuildTests:aarch64-generic+default
P: latency: 5.52 us (r:0, l:None, u:None)
[ OK ] (3/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_latency %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /3255009a @BotBuildTests:aarch64-generic+default
P: latency: 0.29 us (r:0, l:None, u:None)
[ OK ] (4/4) EESSI_OSU_pt2pt_CPU %benchmark_info=mpi.pt2pt.osu_bw %module_name=OSU-Micro-Benchmarks/7.5-gompi-2025a %scale=1_node /59f4b331 @BotBuildTests:aarch64-generic+default
P: bandwidth: 15557.24 MB/s (r:0, l:None, u:None)
[ PASSED ] Ran 4/4 test case(s) from 4 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-135272.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@TopRichard
Collaborator Author

The aarch64/generic build fails with:

bazel-out/aarch64-opt/bin/external/KleidiAI/kai/ukernels/matmul/_objs/clamp_f16_f16_f16p16x1biasf16_1x16x8_neon_mla/kai_matmul_clamp_f16_f16_f16p16x1biasf16_1x16x8_neon_mla.pic.o)
# Configuration: 5b18db14864c9ea13c32823efcbe155e2697e8548c4b61ecfacbf10dac5ae98f
# Execution platform: @local_execution_config_platform//:platform
cc1: error: switch '-mcpu=generic' conflicts with '-march=armv8.2-a+fp16' switch [-Werror]
cc1: all warnings being treated as errors
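
Both failures have the same shape: GCC reports that the `-mcpu` value passed for the target conflicts with a `-march` value, and `-Werror` turns that warning into an error. As a small sketch for comparing such lines across build logs, the snippet below extracts the two switches. The regular expression and the helper name `parse_switch_conflict` are assumptions made for illustration and are not part of any EESSI or EasyBuild tooling.

```python
import re

# Matches the cc1 message shape seen in the two build logs above.
CONFLICT_RE = re.compile(
    r"switch '(-mcpu=[^']+)' conflicts with '(-march=[^']+)' switch"
)


def parse_switch_conflict(line):
    """Return (mcpu_switch, march_switch) from a cc1 conflict line, or None."""
    match = CONFLICT_RE.search(line)
    return match.groups() if match else None


generic_msg = (
    "cc1: error: switch '-mcpu=generic' conflicts with "
    "'-march=armv8.2-a+fp16' switch [-Werror]"
)
grace_msg = (
    "cc1: error: switch '-mcpu=neoverse-v2+crc+sve2-sm4+sve2-aes+sve2-sha3"
    "+norng+nossbs+nopauth' conflicts with '-march=armv8.2-a+fp16' switch [-Werror]"
)

for msg in (generic_msg, grace_msg):
    mcpu, march = parse_switch_conflict(msg)
    print(mcpu, "vs", march)
```

In both logs the `-march` side is identical (`-march=armv8.2-a+fp16`); only the `-mcpu` value differs between the generic and Grace targets.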

@TopRichard
Collaborator Author

TopRichard commented Feb 28, 2026

The aarch64 build failure is related to the following code block in the easyblock:

        # when building on Arm 64-bit we can't just use --copt=-mcpu=native (or likewise for any -mcpu=...),
        # because it breaks the build of XNNPACK;
        # see also https://github.com/easybuilders/easybuild-easyconfigs/issues/18899
        if get_cpu_architecture() == AARCH64:
            regex_subs = [
                # use --per_file_copt instead of --copt to selectively use -mcpu=native (not for XNNPACK),
                # the leading '-' ensures that -mcpu=native is *not* used when building XNNPACK;
                # see https://github.com/google/XNNPACK/issues/5566 + https://bazel.build/docs/user-manual#per-file-copt
                ('--copt=-mcpu=', '--per_file_copt=-.*XNNPACK/.*@-mcpu='),
            ]
            apply_regex_substitutions(tf_conf_bazelrc, regex_subs)
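
As a rough illustration of what this substitution does, and why it may not be enough here, the following sketch applies the same pattern/replacement pair with plain `re.sub` to a sample bazelrc line and then models the `--per_file_copt` include/exclude filter in a simplified way. The sample line, the file paths, and the helpers `rewrite_bazelrc_line` and `flag_applies` are illustrative assumptions, not part of the easyblock or of Bazel. The filter model is a simplification, not Bazel's actual implementation.

```python
import re

# Pattern/replacement pair quoted from the easyblock excerpt above.
REGEX_SUBS = [
    ("--copt=-mcpu=", "--per_file_copt=-.*XNNPACK/.*@-mcpu="),
]


def rewrite_bazelrc_line(line):
    """Apply each (pattern, replacement) pair to one bazelrc line (hypothetical helper)."""
    for pattern, replacement in REGEX_SUBS:
        line = re.sub(pattern, replacement, line)
    return line


def flag_applies(per_file_copt, path):
    """Simplified model of a --per_file_copt filter (assumption, not Bazel code).

    The value has the form '[+-]regex[,[+-]regex]...@option'. A file gets the
    option if it matches no '-' (exclude) regex and, when '+' (include)
    regexes are present, at least one of them.
    """
    filters, _, _option = per_file_copt.partition("@")
    includes, excludes = [], []
    for item in filters.split(","):
        if item.startswith("-"):
            excludes.append(item[1:])
        else:
            includes.append(item.lstrip("+"))
    if any(re.search(rx, path) for rx in excludes):
        return False
    return not includes or any(re.search(rx, path) for rx in includes)


# Sample line; the real tf_conf_bazelrc content is not shown in this thread.
rewritten = rewrite_bazelrc_line("build --copt=-mcpu=generic")
print(rewritten)

value = rewritten.split("--per_file_copt=", 1)[1]
# Illustrative paths, shortened from the kind of path seen in the build log.
print(flag_applies(value, "external/XNNPACK/src/operators.c"))  # XNNPACK is excluded
print(flag_applies(value, "external/KleidiAI/kai/ukernels/matmul/kernel.c"))  # KleidiAI is not
```

Under this model, KleidiAI sources still receive `-mcpu=...`, which would be consistent with the conflict against `-march=armv8.2-a+fp16` (presumably set by KleidiAI's own build rules) seen in the logs. Whether extending the exclusion, for example with a second `-.*KleidiAI/.*` entry, resolves the build has not been tested in this thread.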
