MPAS hangs intermittently (some runs do not hang at all) when the MPAS global print statements are set to true. The global print statements are config_print_global_minmax_vel, config_print_detailed_minmax_vel, and config_print_global_minmax_sca. Hangs have also been reported with these global print statements set to false.
These print statements use the subroutine mpas_dmpar_maxattributes_real in /src/framework/mpas_dmpar.F, which relies on MPI_Allreduce. There are reports that MPI_Allreduce and MPI_Allgather can hang when older Intel compilers are used.
@SamuelTrahanNOAA recently debugged a hanging version of MPAS and found that one MPI task was stuck in a different part of the code than the other tasks.
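To picture why one task sitting in a different part of the code can stall the whole run: MPI_Allreduce is a collective, so it returns only after every task in the communicator has called it. The toy below is not MPI and is not taken from MPAS. It uses Python threads and a barrier as stand-ins for the ranks and the collective, and the names `run_ranks` and `straggler` and the timeout are hypothetical, added only so the demo terminates. Real MPI has no such timeout, so the waiting tasks would simply hang.

```python
import threading


def run_ranks(n_ranks, straggler=None, timeout=0.5):
    """Toy stand-in for a collective call across n_ranks "tasks".

    Returns one status per rank: "done", "timed out", or "elsewhere".
    """
    barrier = threading.Barrier(n_ranks)
    status = [None] * n_ranks

    def rank(i):
        if i == straggler:
            # This rank is busy in another part of the code and never
            # reaches the collective.
            status[i] = "elsewhere"
            return
        try:
            # Stand-in for MPI_Allreduce: blocks until all ranks arrive.
            barrier.wait(timeout=timeout)
            status[i] = "done"
        except threading.BrokenBarrierError:
            # With real MPI there is no timeout; this rank would hang here.
            status[i] = "timed out"

    threads = [threading.Thread(target=rank, args=(i,)) for i in range(n_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return status


if __name__ == "__main__":
    print(run_ranks(4))               # every rank reaches the collective
    print(run_ranks(4, straggler=2))  # rank 2 never arrives, the rest stall
```

In the second call the three ranks that did reach the barrier are the ones that stall, which matches the symptom of most tasks waiting in one place while a single task is somewhere else.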
Solutions:
- Set the global config print statements to false. This may help, but it does not completely fix the issue.
- Use newer Intel compilers if possible, or gfortran. More testing will follow.
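
For the first workaround, the namelist fragment below is a sketch of what turning the three options off might look like. The option names come from the description above. The `&printout` group name and the file it lives in (for example namelist.atmosphere) are assumptions that should be checked against your own MPAS configuration.

```
&printout
    config_print_global_minmax_vel = false
    config_print_detailed_minmax_vel = false
    config_print_global_minmax_sca = false
/
```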