Skip to content

Adding the scope API and isolating our own interfaces#28

Merged
palazzem merged 14 commits intomasterfrom
bmermet/extractinterfaces
Jan 18, 2018
Merged

Adding the scope API and isolating our own interfaces#28
palazzem merged 14 commits intomasterfrom
bmermet/extractinterfaces

Conversation

@bmermet
Copy link
Copy Markdown
Contributor

@bmermet bmermet commented Jan 12, 2018

This PR adds the notion of Scope to manage automatic context propagation in the tracer. It also extracts a Datadog specific tracer distinct from the OpenTracing APIs.

The datadog tracer is different from the OpenTracing APIs in several ways:

  • There is no SpanBuilder (it can be added on top of the tracer though)
  • Spans are created by the StartActive and StartManual methods of the tracer
  • Context propagation is managed with the Scope type

Things that this PR is missing:

  • Add a global tracer object

@bmermet bmermet requested a review from palazzem January 12, 2018 16:13
@palazzem palazzem added this to the 0.1.2 milestone Jan 15, 2018
@palazzem palazzem added type:enhancement Improvement to an existing feature area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) breaking-change labels Jan 15, 2018
Copy link
Copy Markdown
Contributor

@palazzem palazzem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments about the API. We already run through most of the code together, so in the overall it's fine. Let's discuss about a couple of changes and we're good.


internal Span(IDatadogTracer tracer, SpanContext parent, string operationName, string serviceName, DateTimeOffset? start)
{
// TODO:bertrand should we throw an exception if operationName is null or empty?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we throw exceptions, we should handle it and prevent from bubbling the exception to users code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should happen when a user creates a span with a null OperationName then? We create the span and let it be discarded by the backend?

private readonly AsyncLocalScopeManager _scopeManager;
private bool _finishOnClose;

internal Scope(Scope parent, Span span, AsyncLocalScopeManager scopeManager, bool finishOnClose)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

/// <summary>
/// Gets the active scope
/// </summary>
public Scope ActiveScope => _scopeManager.Active;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine to keep the ScopeManager internal, while exposing his methods through the Tracer interface. We'll see where OpenTracing is going, but until then let's make our choices.

/// <param name="startTime">An explicit start time for that span</param>
/// <param name="ignoreActiveScope">If set the span will not be a child of the currently active span</param>
/// <returns>The newly created span</returns>
public Span StartManual(string operationName, SpanContext parent = null, string serviceName = null, DateTimeOffset? startTime = null, bool ignoreActiveScope = false)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're using this shorthand, we should name it StartSpan. On the other hand, if we're using the builder pattern, the method should be called just Start. We had a conversation for the Python API and we ended making these choices: https://github.com/opentracing/opentracing-python/pull/63/files#r150893180

return scope;
}

public void Close(Scope scope)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ScopeManager interface should have just Active and Activate as per Java API: https://github.com/opentracing/opentracing-java/blob/master/opentracing-api/src/main/java/io/opentracing/ScopeManager.java#L26-L51

Is there a reason (or a shorthand) to have a public Close here? isn't responsibility of the Scope knowing how it has to be closed? as implementation detail, we inject the ScopeManager in the Scope, so we can do this action from there correct?

(Reference for Scope interface: https://github.com/opentracing/opentracing-java/blob/master/opentracing-api/src/main/java/io/opentracing/Scope.java#L27-L43)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an internal class so none of these methods is actually public. But you're right, if it was public the Close method should indeed be internal.


public ISpan SetTag(string key, string value)
{
// TODO:bertrand do we want this behavior on the Span object too ?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now let's keep only this TODO. In another PR we can keep it consistent in our implementation too.

@palazzem palazzem merged commit d8ff1e1 into master Jan 18, 2018
@palazzem palazzem deleted the bmermet/extractinterfaces branch January 22, 2018 13:00
macrogreg added a commit that referenced this pull request Aug 20, 2021
andrewlock pushed a commit that referenced this pull request Jul 26, 2024
…5808)

## Summary of changes

Prevent deadlock betwen signal-based profilers (walltime/manual cpu
profilers) and non-signal based profilers (exception, contention....)

## Reason for change

When an exception occurs, the thread can be interrupted by a
signal-based profiler (walltime/manual cpu). It can be interrupted while
holding the lock used to update the `dl-iterate-phdr` cache.

```
Thread 18 (LWP 995):
#0  __syscall_cp_c (nr=202, u=140244538814536, v=128, w=-1, x=0, y=0, z=0) at ./arch/x86_64/syscall_arch.h:61
#1  0x00007f8dba343ccd in __futex4_cp (to=0x0, val=-1, op=128, addr=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at src/thread/__timedwait.c:24
#2  __timedwait_cp (addr=addr@entry=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>, val=val@entry=-1, clk=clk@entry=0, at=at@entry=0x0, priv=priv@entry=128) at src/thread/__timedwait.c:52
#3  0x00007f8dba343d74 in __timedwait (addr=addr@entry=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>, val=-1, clk=clk@entry=0, at=at@entry=0x0, priv=128) at src/thread/__timedwait.c:68
#4  0x00007f8dba3463e6 in __pthread_rwlock_timedrdlock (at=<optimized out>, rw=<optimized out>) at src/thread/pthread_rwlock_timedrdlock.c:18
#5  __pthread_rwlock_timedrdlock (rw=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>, at=0x0) at src/thread/pthread_rwlock_timedrdlock.c:3
#6  0x00007f8d398f3ca8 in std::__glibcxx_rwlock_rdlock (__rwlock=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:73
#7  std::__shared_mutex_pthread::lock_shared (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:224
#8  std::shared_mutex::lock_shared (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:421
#9  std::shared_lock<std::shared_mutex>::shared_lock (this=0x7f4ca05a2ac0, __m=...) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:722
#10 LibrariesInfoCache::DlIteratePhdrImpl (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>, callback=0x7f8d3997d900 <_Ux86_64_dwarf_callback>, data=0x7f4ca05a2b20) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LibrariesInfoCache.cpp:104
#11 0x00007f8d3997e4ee in _Ux86_64_dwarf_find_proc_info (as=0x7f8d39eb2a00 <local_addr_space>, ip=140246691112115, pi=0x7f4ca05a3170, need_unwind_info=1, arg=0x7f4ca05a3411) at /project/obj/libunwind-prefix/src/libunwind/src/dwarf/Gfind_proc_info-lsb.c:807
#12 0x00007f8d3997e690 in fetch_proc_info (c=0x7f4ca05a3018, ip=140246691112115) at /project/obj/libunwind-prefix/src/libunwind/src/dwarf/Gparser.c:473
#13 0x00007f8d3998113d in find_reg_state (sr=0x7f4ca05a2dc0, c=0x7f4ca05a3018) at /project/obj/libunwind-prefix/src/libunwind/src/dwarf/Gparser.c:1024
#14 _Ux86_64_dwarf_step (c=c@entry=0x7f4ca05a3018) at /project/obj/libunwind-prefix/src/libunwind/src/dwarf/Gparser.c:1069
#15 0x00007f8d3997d13a in _Ux86_64_step (cursor=0x7f4ca05a3018) at /project/obj/libunwind-prefix/src/libunwind/src/x86_64/Gstep.c:75
#16 0x00007f8d398f55c8 in LinuxStackFramesCollector::CollectStackManually (this=this@entry=0x7f8d392dc6d0, ctx=ctx@entry=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LinuxStackFramesCollector.cpp:288
#17 0x00007f8d398f53dc in LinuxStackFramesCollector::CollectCallStackCurrentThread (this=this@entry=0x7f8d392dc6d0, ctx=ctx@entry=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LinuxStackFramesCollector.cpp:227
#18 0x00007f8d398f4672 in LinuxStackFramesCollector::CollectStackSampleSignalHandler (signal=<optimized out>, info=<optimized out>, context=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LinuxStackFramesCollector.cpp:373
#19 0x00007f8d398fb871 in ProfilerSignalManager::CallCustomHandler (this=0x7f8d39eaf928 <ProfilerSignalManager::Get(int)::signalManagers+1944>, signal=10, info=0x7f4ca05a39b0, context=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/ProfilerSignalManager.cpp:197
#20 ProfilerSignalManager::SignalHandler (signal=10, info=0x7f4ca05a39b0, context=0x7f4ca05a3880) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/ProfilerSignalManager.cpp:188
#21 <signal handler called>
#22 __pthread_rwlock_unlock (rw=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at src/thread/pthread_rwlock_unlock.c:5
#23 0x00007f8d398f3bf9 in std::__glibcxx_rwlock_unlock (__rwlock=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:77
#24 std::__shared_mutex_pthread::unlock (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:208
#25 std::shared_mutex::unlock (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/shared_mutex:417
#26 std::unique_lock<std::shared_mutex>::unlock (this=0x7f4ca05a3e20) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/bits/unique_lock.h:194
#27 std::unique_lock<std::shared_mutex>::~unique_lock (this=0x7f4ca05a3e20) at /usr/lib/gcc/x86_64-alpine-linux-musl/10.3.1/../../../../include/c++/10.3.1/bits/unique_lock.h:103
#28 LibrariesInfoCache::UpdateCache (this=0x7f8d39eaf048 <LibrariesInfoCache::Get()::Instance>) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LibrariesInfoCache.cpp:88
#29 0x00007f8d398f4e59 in LinuxStackFramesCollector::CollectStackSampleImplementation (this=0x7f8d3b91bc90, pThreadInfo=0x7f4ca06b9900, pHR=0x7f8d3a63c510, selfCollect=true) at /p--Type <RET> for more, q to quit, c to continue without paging--
roject/profiler/src/ProfilerEngine/Datadog.Profiler.Native.Linux/LinuxStackFramesCollector.cpp:100
#30 0x00007f8d399637ba in StackFramesCollectorBase::CollectStackSample (this=0x7f8d3b91bc90, pThreadInfo=0x7f4ca06b9900, pHR=0x7f4ca05a3fdc) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native/StackFramesCollectorBase.cpp:185
#31 0x00007f8d3992acb9 in ExceptionsProvider::OnExceptionThrown (this=0x7f8d392a7160, thrownObjectId=139969739182080) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native/ExceptionsProvider.cpp:149
#32 0x00007f8d39917045 in CorProfilerCallback::ExceptionThrown (this=0x7f8d392c0d20, thrownObjectId=139969739182080) at /project/profiler/src/ProfilerEngine/Datadog.Profiler.Native/CorProfilerCallback.cpp:1734
```
## Implementation details

- move the call which updates the cache after acquiring the thread lock
- call Update before sending signal

## Test coverage

## Other details
<!-- Fixes #{issue} -->

<!-- ⚠️ Note: where possible, please obtain 2 approvals prior to
merging. Unless CODEOWNERS specifies otherwise, for external teams it is
typically best to have one review from a team member, and one review
from apm-dotnet. Trivial changes do not require 2 reviews. -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:tracer The core tracer library (Datadog.Trace, does not include OpenTracing, native code, or integrations) breaking-change type:enhancement Improvement to an existing feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants