Adding the scope API and isolating our own interfaces#28
Conversation
palazzem
left a comment
There was a problem hiding this comment.
Left some comments about the API. We already run through most of the code together, so in the overall it's fine. Let's discuss about a couple of changes and we're good.
|
|
||
| internal Span(IDatadogTracer tracer, SpanContext parent, string operationName, string serviceName, DateTimeOffset? start) | ||
| { | ||
| // TODO:bertrand should we throw an exception if operationName is null or empty? |
There was a problem hiding this comment.
if we throw exceptions, we should handle it and prevent from bubbling the exception to users code.
There was a problem hiding this comment.
What should happen when a user creates a span with a null OperationName then? We create the span and let it be discarded by the backend?
| private readonly AsyncLocalScopeManager _scopeManager; | ||
| private bool _finishOnClose; | ||
|
|
||
| internal Scope(Scope parent, Span span, AsyncLocalScopeManager scopeManager, bool finishOnClose) |
| /// <summary> | ||
| /// Gets the active scope | ||
| /// </summary> | ||
| public Scope ActiveScope => _scopeManager.Active; |
There was a problem hiding this comment.
fine to keep the ScopeManager internal, while exposing his methods through the Tracer interface. We'll see where OpenTracing is going, but until then let's make our choices.
| /// <param name="startTime">An explicit start time for that span</param> | ||
| /// <param name="ignoreActiveScope">If set the span will not be a child of the currently active span</param> | ||
| /// <returns>The newly created span</returns> | ||
| public Span StartManual(string operationName, SpanContext parent = null, string serviceName = null, DateTimeOffset? startTime = null, bool ignoreActiveScope = false) |
There was a problem hiding this comment.
If we're using this shorthand, we should name it StartSpan. On the other hand, if we're using the builder pattern, the method should be called just Start. We had a conversation for the Python API and we ended making these choices: https://github.com/opentracing/opentracing-python/pull/63/files#r150893180
| return scope; | ||
| } | ||
|
|
||
| public void Close(Scope scope) |
There was a problem hiding this comment.
The ScopeManager interface should have just Active and Activate as per Java API: https://github.com/opentracing/opentracing-java/blob/master/opentracing-api/src/main/java/io/opentracing/ScopeManager.java#L26-L51
Is there a reason (or a shorthand) to have a public Close here? isn't responsibility of the Scope knowing how it has to be closed? as implementation detail, we inject the ScopeManager in the Scope, so we can do this action from there correct?
(Reference for Scope interface: https://github.com/opentracing/opentracing-java/blob/master/opentracing-api/src/main/java/io/opentracing/Scope.java#L27-L43)
There was a problem hiding this comment.
This is an internal class so none of these methods is actually public. But you're right, if it was public the Close method should indeed be internal.
|
|
||
| public ISpan SetTag(string key, string value) | ||
| { | ||
| // TODO:bertrand do we want this behavior on the Span object too ? |
There was a problem hiding this comment.
for now let's keep only this TODO. In another PR we can keep it consistent in our implementation too.
…5808) ## Summary of changes Prevent deadlock betwen signal-based profilers (walltime/manual cpu profilers) and non-signal based profilers (exception, contention....) ## Reason for change When an exception occurs, the thread can be interrupted by a signal-based profiler (walltime/manual cpu). It can be interrupted while holding the lock used to update the `dl-iterate-phdr` cache. ``` Thread 18 (LWP 995): #0 __syscall_cp_c (nr=202, u=140244538814536, v=128, w=-1, x=0, y=0, z=0) at ./arch/x86_64/syscall_arch.h:61 #1 0x00007f8dba343ccd in __futex4_cp (to=0x0, val=-1, op=128, addr=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at src/thread/__timedwait.c:24 #2 __timedwait_cp (addr=addr@entry=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>, val=val@entry=-1, clk=clk@entry=0, at=at@entry=0x0, priv=priv@entry=128) at src/thread/__timedwait.c:52 #3 0x00007f8dba343d74 in __timedwait (addr=addr@entry=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>, val=-1, clk=clk@entry=0, at=at@entry=0x0, priv=128) at src/thread/__timedwait.c:68 #4 0x00007f8dba3463e6 in __pthread_rwlock_timedrdlock (at=<optimized out>, rw=<optimized out>) at src/thread/pthread_rwlock_timedrdlock.c:18 #5 __pthread_rwlock_timedrdlock (rw=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>, at=0x0) at src/thread/pthread_rwlock_timedrdlock.c:3 #6 0x00007f8d398f3ca8 in std::__glibcxx_rwlock_rdlock (__rwlock=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:73 #7 std::__shared_mutex_pthread::lock_shared (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:224 #8 std::shared_mutex::lock_shared (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:421 #9 std::shared_lock<std::shared_mutex>::shared_lock (this=0x7f4ca05a2ac0, __m=...) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:722 #10 LibrariesInfoCache::DlIteratePhdrImpl (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>, callback=0x7f8d3997d900 <_Ux86_64_dwarf_callback>, data=0x7f4ca05a2b20) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LibrariesInfoCache.cpp:104 #11 0x00007f8d3997e4ee in _Ux86_64_dwarf_find_proc_info (as=0x7f8d39eb2a00 <local_addr_space>, ip=140246691112115, pi=0x7f4ca05a3170, need_unwind_info=1, arg=0x7f4ca05a3411) at /project/obj/libunwind-prefix/src/libunwind/src/dwarf/Gfind_proc_info-lsb.c:807 #12 0x00007f8d3997e690 in fetch_proc_info (c=0x7f4ca05a3018, ip=140246691112115) at /project/obj/libunwind-prefix/src/libunwind/src/dwarf/Gparser.c:473 #13 0x00007f8d3998113d in find_reg_state (sr=0x7f4ca05a2dc0, c=0x7f4ca05a3018) at /project/obj/libunwind-prefix/src/libunwind/src/dwarf/Gparser.c:1024 #14 _Ux86_64_dwarf_step (c=c@entry=0x7f4ca05a3018) at /project/obj/libunwind-prefix/src/libunwind/src/dwarf/Gparser.c:1069 #15 0x00007f8d3997d13a in _Ux86_64_step (cursor=0x7f4ca05a3018) at /project/obj/libunwind-prefix/src/libunwind/src/x86_64/Gstep.c:75 #16 0x00007f8d398f55c8 in LinuxStackFramesCollector::CollectStackManually (this=this@entry=0x7f8d392dc6d0, ctx=ctx@entry=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LinuxStackFramesCollector.cpp:288 #17 0x00007f8d398f53dc in LinuxStackFramesCollector::CollectCallStackCurrentThread (this=this@entry=0x7f8d392dc6d0, ctx=ctx@entry=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LinuxStackFramesCollector.cpp:227 #18 0x00007f8d398f4672 in LinuxStackFramesCollector::CollectStackSampleSignalHandler (signal=<optimized out>, info=<optimized out>, context=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LinuxStackFramesCollector.cpp:373 #19 0x00007f8d398fb871 in ProfilerSignalManager::CallCustomHandler (this=0x7f8d39eaf928 <ProfilerSignalManager::Get(int)::signalManagers+1944>, signal=10, info=0x7f4ca05a39b0, context=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/ProfilerSignalManager.cpp:197 #20 ProfilerSignalManager::SignalHandler (signal=10, info=0x7f4ca05a39b0, context=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/ProfilerSignalManager.cpp:188 #21 <signal handler called> #22 __pthread_rwlock_unlock (rw=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at src/thread/pthread_rwlock_unlock.c:5 #23 0x00007f8d398f3bf9 in std::__glibcxx_rwlock_unlock (__rwlock=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:77 #24 std::__shared_mutex_pthread::unlock (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:208 #25 std::shared_mutex::unlock (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:417 #26 std::unique_lock<std::shared_mutex>::unlock (this=0x7f4ca05a3e20) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/bits/unique_lock.h:194 #27 std::unique_lock<std::shared_mutex>::~unique_lock (this=0x7f4ca05a3e20) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/bits/unique_lock.h:103 #28 LibrariesInfoCache::UpdateCache (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LibrariesInfoCache.cpp:88 #29 0x00007f8d398f4e59 in LinuxStackFramesCollector::CollectStackSampleImplementation (this=0x7f8d3b91bc90, pThreadInfo=0x7f4ca06b9900, pHR=0x7f8d3a63c510, selfCollect=true) at /p--Type <RET> for more, q to quit, c to continue without paging-- roject/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LinuxStackFramesCollector.cpp:100 #30 0x00007f8d399637ba in StackFramesCollectorBase::CollectStackSample (this=0x7f8d3b91bc90, pThreadInfo=0x7f4ca06b9900, pHR=0x7f4ca05a3fdc) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native/StackFramesCollectorBase.cpp:185 #31 0x00007f8d3992acb9 in ExceptionsProvider::OnExceptionThrown (this=0x7f8d392a7160, thrownObjectId=139969739182080) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native/ExceptionsProvider.cpp:149 #32 0x00007f8d39917045 in CorProfilerCallback::ExceptionThrown (this=0x7f8d392c0d20, thrownObjectId=139969739182080) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native/CorProfilerCallback.cpp:1734 ``` ## Implementation details - move the call which updates the cache after acquiring the thread lock - call Update before sending signal ## Test coverage ## Other details <!-- Fixes #{issue} --> <!--⚠️ Note: where possible, please obtain 2 approvals prior to merging. Unless CODEOWNERS specifies otherwise, for external teams it is typically best to have one review from a team member, and one review from apm-dotnet. Trivial changes do not require 2 reviews. -->
This PR adds the notion of Scope to manage automatic context propagation in the tracer. It also extracts a Datadog specific tracer distinct from the OpenTracing APIs.
The datadog tracer is different from the OpenTracing APIs in several ways:
Things that this PR is missing: