Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions bazel/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -26,3 +26,8 @@ config_setting(
name = "disable_tcmalloc",
values = {"define": "tcmalloc=disabled"},
)

config_setting(
name = "disable_signal_trace",
values = {"define": "signal_trace=disabled"},
)
17 changes: 12 additions & 5 deletions bazel/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -93,11 +93,11 @@ bazel test //test/common/http:async_client_impl_test --strategy=TestRunner=stand
```
# Stack trace symbol resolution

Envoy can produce backtraces on demand or from assertions and other activity.
The stack traces written in the log or to stderr contain addresses rather than
resolved symbols. The `tools/stack_decode.py` script exists to process the output
and do symbol resolution to make the stack traces useful. Any log lines not
relevant to the backtrace capability are passed through the script unchanged
Envoy can produce backtraces on demand and from assertions and other fatal
actions like segfaults. The stack traces written in the log or to stderr contain
addresses rather than resolved symbols. The `tools/stack_decode.py` script exists
to process the output and do symbol resolution to make the stack traces useful. Any
log lines not relevant to the backtrace capability are passed through the script unchanged
(it acts like a filter).

The script runs in one of two modes. If passed no arguments it anticipates
Expand All @@ -116,6 +116,13 @@ bazel test -c dbg //test/server:backtrace_test
You will need to use either a `dbg` build type or the `--define
debug_symbols=yes` option to get symbol information in the binaries.

By default main.cc will install signal handlers to print backtraces at the
location where a fatal signal occurred. The signal handler will re-raise the
fatal signal with the default handler so a core file will still be dumped after
the stack trace is logged. To inhibit this behavior use
`--define=signal_trace=disabled` on the Bazel command line. No signal handlers will
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to set this in the tools/bazel.rc for the ASAN runs. I'm not sure where ASAN installs its signal handlers (based on your experiences with the tests, it seems after this), but this might be safer in general to avoid conflict.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done now

be installed.

# Running a single Bazel test under GDB

```
Expand Down
3 changes: 3 additions & 0 deletions bazel/envoy_build_system.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,9 @@ def envoy_copts(repository, test = False):
}) + select({
repository + "//bazel:disable_tcmalloc": [],
"//conditions:default": ["-DTCMALLOC"],
}) + select({
repository + "//bazel:disable_signal_trace": [],
"//conditions:default": ["-DENVOY_HANDLE_SIGNALS"],
})

# References to Envoy external dependencies should be wrapped with this function.
Expand Down
16 changes: 15 additions & 1 deletion source/exe/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,10 @@ envoy_cc_library(
deps = [
":envoy_common_lib",
":hot_restart_lib",
],
] + select({
"//bazel:disable_signal_trace": [],
"//conditions:default": [":sigaction_lib"],
}),
)

envoy_cc_library(
Expand All @@ -72,3 +75,14 @@ envoy_cc_library(
"//source/common/stats:stats_lib",
],
)

envoy_cc_library(
name = "sigaction_lib",
srcs = ["signal_action.cc"],
hdrs = ["signal_action.h"],
deps = [
"//source/common/common:assert_lib",
"//source/common/common:non_copyable",
"//source/server:backtrace_lib",
],
)
8 changes: 8 additions & 0 deletions source/exe/main.cc
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,10 @@

#include "exe/hot_restart.h"

#ifdef ENVOY_HANDLE_SIGNALS
#include "exe/signal_action.h"
#endif

#include "server/drain_manager_impl.h"
#include "server/options_impl.h"
#include "server/server.h"
Expand All @@ -34,6 +38,10 @@ class ProdComponentFactory : public ComponentFactory {
} // Server

int main(int argc, char** argv) {
#ifdef ENVOY_HANDLE_SIGNALS
// Enabled by default. Control with "bazel --define=signal_trace=disabled"
SignalAction handle_sigs;
#endif
ares_library_init(ARES_LIB_INIT_ALL);
Event::Libevent::Global::initialize();
OptionsImpl options(argc, argv, Server::SharedMemory::version(), spdlog::level::warn);
Expand Down
95 changes: 95 additions & 0 deletions source/exe/signal_action.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
#include "exe/signal_action.h"

#include <signal.h>
#include <sys/mman.h>

#include "common/common/assert.h"

constexpr int SignalAction::FATAL_SIGS[];

void SignalAction::sigHandler(int sig, siginfo_t* info, void* context) {
void* error_pc = 0;

const ucontext_t* ucontext = reinterpret_cast<const ucontext_t*>(context);
if (ucontext != nullptr) {
#ifdef REG_RIP
// x86_64
error_pc = reinterpret_cast<void*>(ucontext->uc_mcontext.gregs[REG_RIP]);
#else
#warning "Please enable and test PC retrieval code for your arch in signal_action.cc"
// x86 Classic: reinterpret_cast<void*>(ucontext->uc_mcontext.gregs[REG_EIP]);
// ARM: reinterpret_cast<void*>(ucontext->uc_mcontext.arm_pc);
// PPC: reinterpret_cast<void*>(ucontext->uc_mcontext.regs->nip);
#endif
}

BackwardsTrace tracer;
tracer.logFault(strsignal(sig), info->si_addr);
if (error_pc != 0) {
tracer.captureFrom(error_pc);
} else {
tracer.capture();
}
tracer.logTrace();

signal(sig, SIG_DFL);
raise(sig);
}

void SignalAction::installSigHandlers() {
stack_t stack;
stack.ss_sp = altstack_ + guard_size_; // Guard page at one end ...
stack.ss_size = altstack_size_; // ... guard page at the other
stack.ss_flags = 0;

RELEASE_ASSERT(sigaltstack(&stack, nullptr) == 0);

for (const auto& sig : FATAL_SIGS) {
struct sigaction saction;
std::memset(&saction, 0, sizeof(saction));
sigemptyset(&saction.sa_mask);
saction.sa_flags = (SA_SIGINFO | SA_ONSTACK | SA_RESETHAND | SA_NODEFER);
saction.sa_sigaction = sigHandler;
RELEASE_ASSERT(sigaction(sig, &saction, nullptr) == 0);
}
}

void SignalAction::removeSigHandlers() {
for (const auto& sig : FATAL_SIGS) {
signal(sig, SIG_DFL);
}
}

void SignalAction::mapAndProtectStackMemory() {
// Per docs MAP_STACK doesn't actually do anything today but provides a
// library hint that might be used in the future.
altstack_ = static_cast<char*>(mmap(nullptr, mapSizeWithGuards(), PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0));
RELEASE_ASSERT(altstack_);
RELEASE_ASSERT(mprotect(altstack_, guard_size_, PROT_NONE) == 0);
RELEASE_ASSERT(mprotect(altstack_ + guard_size_ + altstack_size_, guard_size_, PROT_NONE) == 0);
}

void SignalAction::unmapStackMemory() { munmap(altstack_, mapSizeWithGuards()); }

void SignalAction::doGoodAccessForTest() {
volatile char* altaltstack = altstack_;
for (size_t i = 0; i < altstack_size_; ++i) {
*(altaltstack + guard_size_ + i) = 42;
}
for (size_t i = 0; i < altstack_size_; ++i) {
ASSERT(*(altaltstack + guard_size_ + i) == 42);
}
}

void SignalAction::tryEvilAccessForTest(bool end) {
volatile char* altaltstack = altstack_;
if (end) {
// One byte past the valid region
// http://oeis.org/A001969
*(altaltstack + guard_size_ + altstack_size_) = 43;
} else {
// One byte before the valid region
*(altaltstack + guard_size_ - 1) = 43;
}
}
117 changes: 117 additions & 0 deletions source/exe/signal_action.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
#pragma once

#include <signal.h>
#include <unistd.h>

#include "common/common/non_copyable.h"

#include "server/backtrace.h"

/**
* This class installs signal handlers for fatal signal types.
*
* These signals are handled:
* SIGABRT
* SIGBUS
* SIGFPE
* SIGILL
* SIGSEGV
*
* Upon intercepting the signal the following actions are taken:
*
* A Backtrace is printed from the address the signal was encountered at, if
* it is possible to retrieve.
*
* The signal handler is reset to the default handler (which is expected to
* terminate the process).
*
* The signal is raised again (which ultimately kills the process)
*
* The signal handler must run on an alternative stack so that we can do the
* stack unwind on the original stack. Memory is allocated for this purpose when
* this object is constructed. When this object goes out of scope the memory
* used for the alternate signal stack is destroyed and the default signal handler
* is restored.
*
* NOTE: Existing non-default signal handlers are overridden and will not be
* restored. If this behavior is ever required it can be implemented.
*
* It is recommended that this object be instantiated at the highest possible
* scope, eg, in main(). This enables fatal signal handling for almost all code
* executed.
*/
class SignalAction : NonCopyable {
public:
SignalAction()
: guard_size_(sysconf(_SC_PAGE_SIZE)), altstack_size_(guard_size_ * 4), altstack_(nullptr) {
mapAndProtectStackMemory();
installSigHandlers();
}
~SignalAction() {
removeSigHandlers();
unmapStackMemory();
}
/**
* Helpers for testing guarded stack memory
*/
void doGoodAccessForTest();
void tryEvilAccessForTest(bool end);

private:
/**
* Allocate this many bytes on each side of the area used for alt stack.
*
* Set to system page size.
*
* The memory will be protected from read and write.
*/
const size_t guard_size_;
/**
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: we don't need Doxygen documentation for internal implementation variables, you can use regular C++ comments.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason NOT to use Doxygen formatted comments here? Doxygen can still produce useful documentation for private member vars depending on how it is configured to run. I've found this information to be useful in past projects.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None other than inconsistency - if we don't have this documented as a convention across the code base in our style guide, then any Doxygen documentation on internal implementation details is a bit special snowflake. For example, unless we do have widespread agreement to use Doxygen for all documents, I plan on continuing to use // comments for internal comments.

* Use this many bytes for the alternate signal handling stack.
*
* Initialized as a multiple of page size (although sigaltstack will
* do alignment as needed).
*
* Additionally, two guard pages will be allocated to bookend the usable area.
*/
const size_t altstack_size_;
/**
* This constant array defines the signals we will insert handlers for.
*
* Essentially this is the list of signals that would cause a core dump.
*/
static constexpr int FATAL_SIGS[] = {SIGABRT, SIGBUS, SIGFPE, SIGILL, SIGSEGV};
/**
* Return the memory size we actually map including two guard pages.
*/
size_t mapSizeWithGuards() const { return altstack_size_ + guard_size_ * 2; }
/**
* The actual signal handler function with prototype matching signal.h
*/
static void sigHandler(int sig, siginfo_t* info, void* context);
/**
* Install all signal handlers and setup signal handling stack.
*/
void installSigHandlers();
/**
* Remove all signal handlers.
*
* Must be executed before the alt stack memory goes away.
*
* Signal handlers will be reset to the default, NOT back to any signal
* handler existing before InstallSigHandlers().
*/
void removeSigHandlers();
/**
* Use mmap to map anonymous memory for the alternative stack.
*
* GUARD_SIZE on either end of the memory will be marked PROT_NONE, protected
* from all access.
*/
void mapAndProtectStackMemory();
/**
* Unmap alternative stack memory.
*/
void unmapStackMemory();
char* altstack_;
};
Loading