[docker-wait-any]: Exit worker thread if main thread is expected to exit #12255

Merged
merged 1 commit into from
Oct 6, 2022

Conversation

saiarcot895
Contributor

@saiarcot895 saiarcot895 commented Oct 3, 2022

Signed-off-by: Saikrishna Arcot sarcot@microsoft.com

Why I did it

There's an odd crash that intermittently happens during config reload or config load_minigraph. The swss container has a python script at /usr/bin/docker-wait-any that waits for either the swss, syncd, or teamd containers to exit (one container being monitored per thread). When a container exits, a signal is sent to the main thread to exit the python script, which then tells systemd to stop the container.
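
The one-thread-per-container pattern described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual /usr/bin/docker-wait-any source: `fake_containers`, `wait_for_container`, `run_monitor`, and the use of `threading.Event` as a stand-in for a blocking Docker wait are all hypothetical names chosen for the example.

```python
import threading

# Hypothetical stand-ins: each "container" is an Event that is set when it exits.
fake_containers = {name: threading.Event() for name in ("swss", "syncd", "teamd")}

# Set by a worker thread to tell the main thread that the script should exit.
exit_event = threading.Event()
exited = []  # names of containers observed to have exited, in order


def wait_for_container(name):
    """Block until the named (simulated) container exits, then notify main."""
    fake_containers[name].wait()   # stands in for a blocking Docker wait call
    exited.append(name)
    exit_event.set()               # wake up the main thread


def run_monitor():
    """Start one daemon thread per container and wait for the first exit."""
    for name in fake_containers:
        threading.Thread(
            target=wait_for_container, args=(name,), daemon=True
        ).start()
    exit_event.wait()              # main thread sleeps until a worker signals
    return exited[0]
```

Simulating a teamd exit with `threading.Timer(0.05, fake_containers["teamd"].set).start()` before calling `run_monitor()` makes the call return `"teamd"`; in the real script, the main thread would then exit so that systemd stops the container.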

Snippet of the backtrace:

#0  0x00007f2a01296ce1 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f2a01280537 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f29ff3147ec in __gnu_cxx::__verbose_terminate_handler () at ../../../../src/libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007f29ff31f966 in __cxxabiv1::__terminate (handler=<optimized out>) at ../../../../src/libstdc++-v3/libsupc++/eh_terminate.cc:48
#4  0x00007f29ff31f9d1 in std::terminate () at ../../../../src/libstdc++-v3/libsupc++/eh_terminate.cc:58
#5  0x00007f29ff31f3cc in __cxxabiv1::__gxx_personality_v0 (version=<optimized out>, actions=10, exception_class=0, ue_header=0x7f29fdc9cd70, context=<optimized out>)
    at ../../../../src/libstdc++-v3/libsupc++/eh_personality.cc:673
#6  0x00007f29ff2708a4 in _Unwind_ForcedUnwind_Phase2 (exc=0x7f29fdc9cd70, context=0x7f29fdc9b0c0, frames_p=0x7f29fdc9afc8) at ../../../src/libgcc/unwind.inc:182
#7  0x00007f29ff270f4e in _Unwind_ForcedUnwind (exc=0x7f29fdc9cd70, stop=<optimized out>, stop_argument=0x7f29fdc9bf10) at ../../../src/libgcc/unwind.inc:217
#8  0x00007f2a015e1c30 in __pthread_unwind () from /lib/x86_64-linux-gnu/libpthread.so.0
#9  0x00007f2a015d918c in pthread_exit () from /lib/x86_64-linux-gnu/libpthread.so.0
#10 0x0000000000645df5 in PyThread_exit_thread () at ../Python/thread_pthread.h:373
#11 0x00000000004262ae in take_gil (tstate=0x1144670) at ../Python/ceval_gil.h:224
#12 0x00000000005327b2 in PyEval_RestoreThread (tstate=tstate@entry=0x1144670) at ../Python/ceval.c:467
#13 0x00007f29ffb6ec1a in PyThreadStateGuard::~PyThreadStateGuard (this=<synthetic pointer>, __in_chrg=<optimized out>) at swsscommon_wrap.cpp:24543
#14 _wrap_SonicV2Connector_Native_connect (args=<optimized out>, kwargs=<optimized out>) at swsscommon_wrap.cpp:24545
#15 0x000000000053f350 in cfunction_call (func=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>, args=<optimized out>, kwargs=<optimized out>)
    at ../Objects/methodobject.c:539
#16 0x000000000051d89b in _PyObject_MakeTpCall (tstate=0x1144670, callable=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>, args=<optimized out>,
    nargs=<optimized out>, keywords=<optimized out>) at ../Objects/call.c:191
#17 0x00000000005175ba in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7f29fecb8e10,
    callable=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>, tstate=0x1144670) at ../Include/cpython/abstract.h:116
#18 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7f29fecb8e10, callable=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>,
    tstate=0x1144670) at ../Include/cpython/abstract.h:103
#19 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7f29fecb8e10, callable=<built-in method SonicV2Connector_Native_connect of module object at remote 0x7f29ffc97770>)
    at ../Include/cpython/abstract.h:127
#20 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x1144670) at ../Python/ceval.c:5072
#21 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at ../Python/ceval.c:3487
#22 0x00000000005106ed in _PyEval_EvalFrame (throwflag=0,
    f=Frame 0x7f29fecb8c80, for file /usr/lib/python3/dist-packages/swsscommon/swsscommon.py, line 1651, in connect (self=<SonicV2Connector(this=<SwigPyObject at remote 0x7f29fecb7cf0>, STATE_DB='STATE_DB', APPL_DB='APPL_DB', GB_FLEX_COUNTER_DB='GB_FLEX_COUNTER_DB', APPL_STATE_DB='APPL_STATE_DB', ASIC_DB='ASIC_DB', CONFIG_DB='CONFIG_DB', COUNTERS_DB='COUNTERS_DB', LOGLEVEL_DB='LOGLEVEL_DB', GB_ASIC_DB='GB_ASIC_DB', GB_COUNTERS_DB='GB_COUNTERS_DB', PFC_WD_DB='PFC_WD_DB', FLEX_COUNTER_DB='FLEX_COUNTER_DB', RESTAPI_DB='RESTAPI_DB', SNMP_OVERLAY_DB='SNMP_OVERLAY_DB') at remote 0x7f2a0051d040>, db_name='STATE_DB', retry_on=False), tstate=0x1144670) at ../Include/internal/pycore_ceval.h:40

What's happening is that after the teamd container exits, the worker thread watching it signals the main thread to exit. However, because there's no return or exit out of the while True loop that the worker thread is in, it loops around and waits for the teamd container to exit again. Because the container isn't running, this wait is effectively a no-op, and execution moves on to the device_info.is_warm_restart_enabled(container_name) and device_info.is_fast_reboot_enabled() function calls (which call into C++ code). Meanwhile, the main thread has called sys.exit(0), and Python is tearing down its references and data structures, and (more importantly here) is telling the other threads to exit.

For the teamd thread, when it returns from calling the C++ function, the wrapper code generated by SWIG is destructing a C++ object that it has created (for the purposes of saving/restoring the Python thread state). It calls PyEval_RestoreThread() in the destructor, which sees that the thread is supposed to exit, and proceeds to call pthread_exit(). This is shown in frames 12 and 13 above.

pthread_exit() then calls a function that will unwind the stack, so that any cleanup or other handler functions can be called. This is so that there's a graceful exit to the thread. However, the way that unwinding works is that a special exception is thrown (abi::__forced_unwind, also known as __cxxabiv1::__forced_unwind) that is expected to be propagated all the way to the first frame. As the unwinder works frame by frame, one of these things happens for each frame on the stack:

  • If there are no cleanup handlers or exception handlers registered for that frame, then it just moves on.
  • If there's a cleanup handler registered, then that gets called.
  • If there's an exception handler registered (i.e. try/catch block), and there's a matching catch block for this exception, then that will get called.

For most frames, probably nothing happens. However, one of the frames on the stack here is a C++ destructor (which had called PyEval_RestoreThread() earlier). In C++11 and newer, destructors are implicitly noexcept unless declared otherwise, so an exception is not allowed to propagate out of a destructor; if one does, std::terminate() gets called. In other words, any exceptions that could be caused by functions that the destructor calls must be handled within the destructor, and must not be propagated up the stack. If the destructor specifies noexcept(false) to signify that exceptions could be propagated up, then maybe it's fine (I'm not entirely certain about this). Because the unwinder essentially uses a special exception to go up the stack, it runs into this restriction at the destructor's frame, and std::terminate() gets called, which then results in a SIGABRT for the process. Because of this SIGABRT, systemd appears to treat the service as stopped, and doesn't call the ExecStop= command, which means the containers don't actually go down.
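
The frame-by-frame reasoning above can be modeled with a small toy simulation. This is only a sketch of the explanation, not how libgcc or libstdc++ actually implement unwinding: `Frame`, `forced_unwind`, and the flags on each frame are hypothetical, and matching catch blocks are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    has_cleanup: bool = False   # e.g. a registered cleanup handler
    noexcept: bool = False      # e.g. the body of a C++11 destructor


def forced_unwind(stack):
    """Walk frames innermost-first; return the outcome and the cleanups run."""
    cleanups_run = []
    for frame in stack:
        if frame.noexcept:
            # The unwind "exception" may not leave a noexcept frame.
            return "std::terminate", cleanups_run
        if frame.has_cleanup:
            cleanups_run.append(frame.name)
        # Otherwise there is nothing to do for this frame; move on.
    return "thread exited", cleanups_run


# Innermost frame first, loosely following frames 11-14 of the backtrace.
crashing_stack = [
    Frame("take_gil"),
    Frame("PyEval_RestoreThread"),
    Frame("~PyThreadStateGuard", noexcept=True),
    Frame("_wrap_SonicV2Connector_Native_connect", has_cleanup=True),
]
```

In this model, `forced_unwind(crashing_stack)` stops at the destructor frame and reports `std::terminate` before any outer cleanup runs, whereas a stack without a noexcept frame unwinds to the end and the thread exits normally.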

All of this is a timing issue; if it's unlucky enough that the thread exiting check is done around the call to C++ code, then a SIGABRT could happen. This, unfortunately, appears to be happening sufficiently often in some cases, as well as some forced cases (see below).

How I did it

A quick workaround is that if we know the main thread needs to exit, the worker thread just returns after sending the signal to the main thread, and doesn't continue execution. This at least tries to keep the thread from getting into the problematic code path. However, it's still possible to get a SIGABRT because of the above, depending on thread/process timings (e.g. teamd exits and signals the main thread to exit, and then syncd exits, and the syncd thread calls one of the two C++ functions, potentially hitting the issue).
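
The effect of the workaround can be sketched with a simulated worker loop. The names here (`watch`, `wait_for_exit`, `native_check`, `return_after_signal`, `max_loops`) are hypothetical and do not come from the actual script; `native_check` stands in for the device_info.is_warm_restart_enabled() and device_info.is_fast_reboot_enabled() calls into C++ code.

```python
import threading


def watch(wait_for_exit, native_check, exit_event, return_after_signal,
          max_loops=3):
    """Simulated worker loop; returns how many times native code was entered.

    max_loops exists only so that the un-fixed variant terminates in this demo;
    the real loop is described as `while True`.
    """
    native_calls = 0
    for _ in range(max_loops):
        wait_for_exit()            # returns at once when the container is gone
        native_calls += 1
        if native_check():         # warm restart or fast reboot in progress
            continue               # keep monitoring; do not signal main
        exit_event.set()           # tell the main thread to exit
        if return_after_signal:
            return native_calls    # the workaround: stop here
    return native_calls


# Container already gone, no warm restart or fast reboot in progress.
before = watch(lambda: None, lambda: False, threading.Event(), False)
after = watch(lambda: None, lambda: False, threading.Event(), True)
```

Without the early return, the simulated thread keeps re-entering the native call after it has already signaled the main thread (`before` is 3 here, bounded only by `max_loops`); with it, the thread enters native code once and then returns (`after` is 1). A different thread can still be inside a native call when the main thread exits, which is why this narrows the window rather than closing it.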

A proper fix would likely be to make sure PyEval_RestoreThread() (and, in turn, pthread_exit()) gets called from a regular C++ function, and not from the destructor. The SWIG wrapper code generated with the -threads option does this, but there's still a gap there where it might get called from the destructor, so it's not immune. (Currently, the swsscommon wrapper code is not using the -threads option, and is manually adding support for multithreading.)

How to verify it

This was tested with the following Bash script. On my dev VM, at least, with this script, the core file was repro'ed in 1-2 iterations. With my fix, 90+ iterations were successfully done with no core file:

#!/bin/bash

set -euo pipefail

ITERATION=0

while [ -n "$(find "/var/core" -maxdepth 0 -type d -empty 2>/dev/null)" ]; do
        ITERATION=$(( $ITERATION + 1 ))
        echo "Starting iteration ${ITERATION}"
        python3 /usr/bin/docker-wait-any -s swss -d syncd teamd &
        python3 /usr/bin/docker-wait-any -s swss -d syncd teamd &
        python3 /usr/bin/docker-wait-any -s swss -d syncd teamd &
        python3 /usr/bin/docker-wait-any -s swss -d syncd teamd &
        sleep 3
        config load_minigraph -y
        sleep 90
done

echo "Core file found on iteration ${ITERATION}!"

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205

Description for the changelog

Ensure to add label/tag for the feature raised.

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

There's an odd crash that intermittently happens after the teamd container
exits, and a signal is raised to the main thread to exit. This thread (watching
teamd) continues execution because it's in a `while True`. The subsequent wait
call on the teamd container very likely returns immediately, and it calls
`is_warm_restart_enabled` and `is_fast_reboot_enabled`. In either of these
cases, sometimes, there is a crash in the transition from C code to Python code
(after the function gets executed).  Python sees that this thread got a signal
to exit, because the main thread is exiting, and tells pthread to exit the
thread.  However, during the stack unwinding, _something_ is telling the
unwinder to call `std::terminate`.  The reason is unknown.

This then results in a python3 SIGABRT, and systemd then doesn't call the stop
script to actually stop the container (possibly because the main process exited
with a SIGABRT, so it's a hard crash). This means that the container doesn't
actually get stopped or restarted, resulting in an inconsistent state
afterwards.

The workaround appears to be that if we know the main thread needs to exit,
just return here, and don't continue execution. This at least tries to avoid it
from getting into the problematic code path. However, it's still feasible to
get a SIGABRT, depending on thread/process timings (i.e. teamd exits, signals
the main thread to exit, and then syncd exits, and syncd calls one of the two C
functions, potentially hitting the issue).

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
@saiarcot895
Contributor Author

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@saiarcot895 saiarcot895 marked this pull request as ready for review October 6, 2022 01:13
@yxieca yxieca merged commit 9251d4b into sonic-net:master Oct 6, 2022
@saiarcot895 saiarcot895 deleted the fix-python-crash branch October 6, 2022 01:14
yxieca pushed a commit that referenced this pull request Oct 6, 2022
…xit (#12255)
