I'm using a Skull Canyon NUC with Iris Pro Graphics 580. Most of the examples run fine on it, but the fig_15_3 example hangs with the current matrixSize=128 setting. If I lower matrixSize to 96, it runs, but very slowly: over 6 seconds per iteration. At matrixSize=100 it seems to stall after a couple of iterations. I modified the code to catch both async and queue exceptions, but no error is caught. I've attached my code, the stack backtraces, and the system monitor showing the hang after a couple of iterations at matrixSize=100. CPU usage goes to 100% and stays there, with only 3 threads running. This is the single-task matrix multiplication example. The parallel versions in the following examples work fine on the GPU, and all examples work fine on the CPU.


fig_15_3_single_task_matrix_multiplication_mod.zip