PETSc overrides Julia signal handler #228

@vchuravy

Multithreaded Julia uses signals for stop-the-world purposes. Loading PETSc.jl in a multithreaded Julia session leads to failures like:

Thread 1 "julia" received signal SIGSEGV, Segmentation fault.
ijl_gc_safepoint () at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/jlapi.c:786
⚠ warning: 786 /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/jlapi.c: No such file or directory
(gdb) bt
#0  ijl_gc_safepoint () at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/jlapi.c:786
#1  0x00007ffff72b387c in ijl_task_get_next (trypoptask=<optimized out>, q=<optimized out>, checkempty=<optimized out>)
    at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/scheduler.c:459
#2  0x00007fffd8717e1f in julia_poptask_5618 () at task.jl:1216
#3  0x00007fffd7e95200 in julia_wait_5555 () at task.jl:1228
#4  0x00007fffd6fd23c0 in julia_#wait#398_6619 () at condition.jl:141
#5  0x00007fffb9567929 in julia_wait_readnb_14431 (x=..., nb=1) at stream.jl:392
#6  0x00007fffb96f21c9 in julia_eof_14434 (s=...) at stream.jl:106
#7  0x00007fffb907c89f in jfptr_eof_14435 () from /home/fferrazzini/.julia/juliaup/julia-1.12.5+0.x64.linux.gnu/share/julia/compiled/v1.12/REPL/u0gqU_ZarBC.so
#8  0x00007fffb9563f2c in julia_eof_14767 (io=...) at io.jl:472
#9  0x00007fffb93499ff in julia_#prompt!##2_18239 ()
    at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/usr/share/julia/stdlib/v1.12/REPL/src/LineEdit.jl:2948
#10 0x00007fffb906d5bc in jfptr_YY.promptNOT.YY.YY.2_18240.1 ()
   from /home/fferrazzini/.julia/juliaup/julia-1.12.5+0.x64.linux.gnu/share/julia/compiled/v1.12/REPL/u0gqU_ZarBC.so
#11 0x00007ffff727bb8a in jl_apply (nargs=1, args=0x7fffd0e5f010) at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/julia.h:2391
#12 start_task () at /cache/build/builder-amdci5-3/julialang/julia-release-1-dot-12/src/task.c:1252

Running under gdb and breaking on sigaction, we can see that PETSc installs its own signal handlers during PetscInitialize:

Thread 1 "julia" hit Breakpoint 1, __GI___sigaction (sig=7,
    act=act@entry=0x7fffffff6a20, oact=oact@entry=0x7fffffff6ac0)
    at sigaction.c:27
27    {
#0  __GI___sigaction (sig=7, act=act@entry=0x7fffffff6a20,
    oact=oact@entry=0x7fffffff6ac0) at sigaction.c:27
#1  0x00007ffff7dcc141 in __bsd_signal (sig=<optimized out>,
    handler=<optimized out>) at ../sysdeps/posix/signal.c:45
#2  0x00007fff01b063b2 in PetscPushSignalHandler ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#3  0x00007fff01b4d49a in PetscOptionsCheckInitial_Private ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#4  0x00007fff01b6c8d1 in PetscInitialize_Common ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#5  0x00007fff01b6ddfc in PetscInitialize.part.0 ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#6  0x00007fff01b6df98 in PetscInitializeNoArguments ()
   from /home/vchuravy/.julia/artifacts/fd874acf28ad612d0af6811612bf7e52c25cc815/lib/petsc/double_real_Int64/lib/libpetsc_double_real_Int64.so.3.22
#7  0x00007fff4a510bb9 in ?? ()
#8  0x0000000000000000 in ?? ()

The PetscPushSignalHandler manual page (https://petsc.org/release/manualpages/Sys/PetscPushSignalHandler/) notes:

"There is no way to return to a signal handler that was set directly by the user with the UNIX signal handler API or by the loader. That information is lost with the first call to [PetscPushSignalHandler](https://petsc.org/release/manualpages/Sys/PetscPushSignalHandler/)()"

Looking at the source code, there seems to be an option that skips installing the handler:

  PetscCall(PetscOptionsGetBool(NULL, NULL, "-no_signal_handler", &flg1, NULL));
  if (!flg1) PetscCall(PetscPushSignalHandler(PetscSignalHandlerDefault, NULL));

So perhaps PETSc.jl can simply set -no_signal_handler when initializing PETSc.
