Remove static global context_handler_ variable #322
Conversation
Codecov Report
```
@@            Coverage Diff             @@
##           master     #322      +/-   ##
==========================================
+ Coverage   94.62%   94.70%   +0.08%
==========================================
  Files         148      148
  Lines        6655     6729      +74
==========================================
+ Hits         6297     6373      +76
+ Misses        358      356       -2
```
```cpp
 *
 * @param storage a custom runtime context storage
 */
static void SetRuntimeContextStorage(nostd::shared_ptr<RuntimeContextStorage> storage) noexcept
```
Is this method typically gonna be the very first method called before any telemetry is emitted by the process?
Hard to say. I think it is, but I'm not sure about all the different scenarios.
> Is this method typically gonna be the very first method called before any telemetry is emitted by the process?
I documented that this method has to be called before any spans are created.
```cpp
private:
  static nostd::shared_ptr<RuntimeContextStorage> GetRuntimeContextStorage() noexcept
  {
    while (GetLock().test_and_set(std::memory_order_acquire))
```
I'm trying to understand the purpose of the lock here:

- `&GetStorage()` at line 178 is already atomic, since the static at line 180 is a *magic static*, aka a function-local static. Since C++11, the initialization of magic statics is guaranteed to be thread safe...
- whereas `storage` at line 172 is on the stack...
Could you please clarify - what is the race we are trying to address? Is this a race between getting storage for the first time vs. SetRuntimeContextStorage potentially being called at the same time?
I was thinking about a slightly less elegant but shorter solution. It won't require a lock if we declare that the context storage customization MUST be done before any other telemetry API call:
https://github.com/maxgolov/opentelemetry-cpp/blob/4004c2d1ebc6bd9b1e55eaf9310726fc939f9de5/api/include/opentelemetry/context/runtime_context.h#L28
(Sorry, I didn't refactor it much... it's somewhat similar to the previous implementation - same method names as before... I did this in my fork to alleviate the issue.)
I like your solution overall. Just have doubts if we really need the locking here.
The race is between GetRuntimeContextStorage and SetRuntimeContextStorage. Without the lock, we might partially overwrite the shared_ptr, thus scrambling up the counter and causing memory corruption.
I went the safer way here. But I'm fine with adding a requirement that context storage customization MUST be done before any other telemetry API call. Then we can remove the lock and win a tiny bit of performance.
Agree, there is a race condition and possible corruption if GetRuntimeContextStorage and SetRuntimeContextStorage are called simultaneously, as line 163 is not an atomic operation. Although the lock is only a tiny bit of performance overhead, it can become significant if GetRuntimeContextStorage gets called for each span processed. Adding a requirement that the context storage is initialized before any telemetry API calls would be good in that case.
I removed the locking around GetRuntimeContextStorage and SetRuntimeContextStorage. I added a note to SetRuntimeContextStorage that the behavior is undefined when it's called after any spans have been created.
```cpp
 */
static void SetRuntimeContextStorage(nostd::shared_ptr<RuntimeContextStorage> storage) noexcept
{
  while (GetLock().test_and_set(std::memory_order_acquire))
```
Wrap GetLock().test_and_set() and GetLock().clear() into an object ctor and dtor to make sure clear() is always called here?
Good point. I'll wait for the result of the discussion whether we'll need this lock at all. If so, I can wrap the calls in some kind of lock guard.
I removed the locks altogether.
```cpp
// Pops the top Context off the stack and returns it.
Context Pop() noexcept
{
  if (size_ <= 0)
```
size_t is unsigned, so just check size_ == 0 here?
```cpp
  }
  int index = size_ - 1;
  size_--;
  return base_[index];
```
Alternatively, `return base_[--size_];`?
If size_ is of size_t type, then should we check before we decrement? (or is it going to be too much of a paranoid check?)
I simplified the code. The check is already there (a few lines above).
```cpp
static RuntimeContextStorage *GetDefaultStorage() noexcept
{
  return new ThreadLocalContextStorage();
```
Wondering whether it is necessary to new ThreadLocalContextStorage() at every call, or just keep one global instance?
That call is used to initialize the one global instance. It's only called once to initialize a static variable.
Force-pushed from f1024a0 to 00cecbb
I removed all locks around the runtime context storage accessors. This is ready for another review round @open-telemetry/cpp-approvers.
Force-pushed from 5448bd8 to 934322b
The main intention of this PR is to remove the static global `context_handler_` variable, which caused linking problems in several instances, as it would imply a need to have the `context_handler_` symbol present in the translation units linked, which conflicts with the header-only API approach. To fix this, I added a singleton approach similar to what we have in place with the `TracerProvider`.

While working on that, I also cleaned things up a bit, and I separated the `RuntimeContext` into `RuntimeContext` and a `RuntimeContextStorage`. The `RuntimeContextStorage` is the part that needs to be implemented for providing a custom context management strategy. I think having this separation in place reduces complexity.

Closes #321.