Can we keep all your changes in a |
brentyates-swx pushed a commit that referenced this pull request Feb 24, 2023
Deferring oo_exit_hook() fixes a stuck C++ application:
#0 0x00007fd2d7afb87b in ioctl () from /lib64/libc.so.6
#1 0x00007fd2d80c0621 in oo_resource_op (cmd=3221510722, io=0x7ffd15be696c, fp=<optimized out>) at /home/iteterev/lab/onload_internal/src/include/onload/mmap.h:104
#2 __oo_eplock_lock (timeout=<synthetic pointer>, maybe_wedged=0, ni=0x20c8480) at /home/iteterev/lab/onload_internal/src/lib/transport/ip/eplock_slow.c:35
#3 __ef_eplock_lock_slow (ni=ni@entry=0x20c8480, timeout=timeout@entry=-1, maybe_wedged=maybe_wedged@entry=0) at /home/iteterev/lab/onload_internal/src/lib/transport/ip/eplock_slow.c:72
#4 0x00007fd2d80d7dbf in ef_eplock_lock (ni=0x20c8480) at /home/iteterev/lab/onload_internal/src/include/onload/eplock.h:61
#5 __ci_netif_lock_count (stat=0x7fd2d5c5b62c, ni=0x20c8480) at /home/iteterev/lab/onload_internal/src/include/ci/internal/ip_shared_ops.h:79
#6 ci_tcp_setsockopt (ep=ep@entry=0x20c8460, fd=6, level=level@entry=1, optname=optname@entry=9, optval=optval@entry=0x7ffd15be6acc, optlen=optlen@entry=4) at /home/iteterev/lab/onload_internal/src/lib/transport/ip/tcp_sockopts.c:580
#7 0x00007fd2d8010da7 in citp_tcp_setsockopt (fdinfo=0x20c8420, level=1, optname=9, optval=0x7ffd15be6acc, optlen=4) at /home/iteterev/lab/onload_internal/src/lib/transport/unix/tcp_fd.c:1594
#8 0x00007fd2d7fde088 in onload_setsockopt (fd=6, level=1, optname=9, optval=0x7ffd15be6acc, optlen=4) at /home/iteterev/lab/onload_internal/src/lib/transport/unix/sockcall_intercept.c:737
#9 0x00007fd2d7dcb7dd in ?? ()
#10 0x00007fd2d83392e0 in ?? () from /home/iteterev/lab/onload_internal/build/gnu_x86_64/lib/transport/unix/libcitransport0.so
#11 0x000000000060102c in data_start ()
#12 0x00007fd2d8339540 in ?? () from /home/iteterev/lab/onload_internal/build/gnu_x86_64/lib/transport/unix/libcitransport0.so
#13 0x00000001d85426c0 in ?? ()
#14 0x00007fd2d7fcbe08 in ?? ()
#15 0x00007fd2d7a433c7 in __cxa_finalize () from /lib64/libc.so.6
#16 0x00007fd2d7dcb757 in ?? ()
#17 0x00007ffd15be6be0 in ?? ()
#18 0x00007fd2d834f2a6 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
Here, _fini() is a function that calls all library destructors. The
problem is that _fini() decides to run the C++ library destructor
*after* Onload and makes it operate on an invalid Onload state.
The patch leverages the fact that Glibc sets up _fini() after running
the last library constructor, so by manually installing the exit handler
(instead of providing a library destructor), Onload wins the race with
_fini().
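The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the Onload code: demo_exit_hook(), demo_lib_init() and the two state variables are hypothetical names, and the real oo_exit_hook() body is not shown in the commit message. The sketch registers teardown with atexit() from a library constructor instead of declaring a library destructor.

```c
#include <stdlib.h>

/* Hypothetical library state; stands in for whatever the real
 * library tears down at exit. */
static volatile int demo_lib_active = 0;
static int demo_hook_installed = 0;

/* Hypothetical stand-in for oo_exit_hook(); the real body is not
 * shown in the commit message. */
static void demo_exit_hook(void)
{
    demo_lib_active = 0;
}

/* Runs when the object is loaded (GCC/Clang constructor attribute).
 * Teardown is registered with atexit() here rather than being
 * declared with __attribute__((destructor)). */
__attribute__((constructor))
static void demo_lib_init(void)
{
    demo_lib_active = 1;
    demo_hook_installed = (atexit(demo_exit_hook) == 0);
}
```

By the time main() starts, the constructor has already run, so demo_hook_installed is 1 if atexit() succeeded. Where the hook then runs relative to other libraries' destructors depends on the glibc behaviour described above.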
There's still an issue if the user library sets a custom exit handler
with atexit() or on_exit() and makes intercepted system calls from
there.
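One way to see why handler ordering matters: exit() calls atexit() handlers in reverse order of registration, so a handler registered before the library's hook runs after it. The sketch below demonstrates only that ordering rule. All names in it (lib_hook, user_handler, atexit_order) are hypothetical, and treating "user handler registered first" as the problematic case is an assumption about the caveat above, not something the commit message spells out.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1;

/* Write a one-byte tag to the pipe so the parent can see the order. */
static void report(char tag)
{
    if (write(report_fd, &tag, 1) != 1) { /* best effort */ }
}

static void lib_hook(void)     { report('L'); }  /* hypothetical library hook */
static void user_handler(void) { report('U'); }  /* hypothetical user handler */

/* Fork a child that registers user_handler and then lib_hook, exits,
 * and reports through a pipe the order in which the handlers ran.
 * Returns 0 on success and stores the tags in buf as a string. */
static int atexit_order(char *buf, size_t len)
{
    int fds[2];
    size_t n = 0;
    ssize_t r;
    pid_t pid;

    if (len < 3 || pipe(fds) != 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(user_handler);  /* registered first, runs last */
        atexit(lib_hook);      /* registered last, runs first */
        exit(0);
    }
    close(fds[1]);
    while (n < 2 && (r = read(fds[0], buf + n, 2 - n)) > 0)
        n += (size_t)r;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    buf[n] = '\0';
    return 0;
}
```

Calling atexit_order() fills buf with "LU": the hook registered last runs first, and the handler registered earlier runs after it.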
Tested:
* RHEL 7.9/glibc 2.17
* RHEL 8.2/glibc 2.28
* RHEL 9.1/glibc 2.34
Thanks-to: Richard Hughes <rhughes@xilinx.com>
Thanks-to: Siân James <sian.james@xilinx.com>
I tried my best to come up with a .clang-format file that sorta matched what they had, but their style was all over the place.
There are still bugs I'm ironing out, but this PR provides the basic functionality of the TCP/UDP stack communicating through DPDK.
TODO:
* Disable XDP program loading completely so that I don't need to make any modifications to the NIC with ethtool
* Figure out why the available mbufs are running out
* Socket leak?