Skip to content
This repository was archived by the owner on Sep 26, 2023. It is now read-only.

Conversation

@igorbernstein2
Copy link
Contributor

@igorbernstein2 igorbernstein2 commented Oct 22, 2018

DON'T MERGE

This is a rough sketch of opencensus integration. I wanted to get some early feedback on the direction I'm taking.

The general idea here is to create an opencensus span for each toplevel operation the user invokes. Then to annotate that span with various interesting events like retry attempts.

All of the logic to create spans and annotations is centralized in a Tracer. A Tracer wraps an opencensus Span. The Tracer is generally created by outer callables like TracedUnaryCallable and is passed through the callable chains via the ApiCallContext.

The main things I would like to get some feedback on are:

  1. Tracer interface: it centralizes all of the annotations as opposed to exposing the raw opencensus apis.
  2. The statefulness of the OpencensusTracer impl. It's has some smarts like counting requests and responses that it will use in a future annotation.
  3. Keeping all tracing related classes in the tracing package (including the Callable impls)
  4. The addition of couldRetry to the RetryAlgorithm` to disambiguate between running out of retry attempts and the error not being retryable

TODO

  • Finish implementation
  • Add tracing for HTTP
  • Add credentials refresh annotations
  • Figure out what to do about traces generated by opentracing stubs

ClientContext gets a TracerFactory, which (for the time being) is set to a dummy factory. To enable tracing OpenCensusTracerFactory can be used. In the future, OpenCensusTracerFactory will become the default. TracerFactory will be used by outer Callables to create new Tracers

ApiCallContext will be used to carry the Tracers through the callable chains to annotate the spans represented by the Tracers
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Oct 22, 2018
@codecov-io
Copy link

codecov-io commented Oct 22, 2018

Codecov Report

Merging #613 into master will decrease coverage by 5.02%.
The diff coverage is 19.75%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #613      +/-   ##
============================================
- Coverage     75.03%   70.01%   -5.03%     
- Complexity      935      957      +22     
============================================
  Files           177      190      +13     
  Lines          4090     4492     +402     
  Branches        323      334      +11     
============================================
+ Hits           3069     3145      +76     
- Misses          869     1192     +323     
- Partials        152      155       +3
Impacted Files Coverage Δ Complexity Δ
...java/com/google/api/gax/tracing/TraceFinisher.java 0% <0%> (ø) 0 <0> (?)
...api/gax/tracing/TracedClientStreamingCallable.java 0% <0%> (ø) 0 <0> (?)
...oogle/api/gax/tracing/OpencensusTracerFactory.java 0% <0%> (ø) 0 <0> (?)
...api/gax/tracing/TracedServerStreamingCallable.java 0% <0%> (ø) 0 <0> (?)
...a/com/google/api/gax/tracing/OpencensusTracer.java 0% <0%> (ø) 0 <0> (?)
...main/java/com/google/api/gax/tracing/SpanName.java 0% <0%> (ø) 0 <0> (?)
...oogle/api/gax/tracing/TracedOperationCallable.java 0% <0%> (ø) 0 <0> (?)
...om/google/api/gax/tracing/TracedUnaryCallable.java 0% <0%> (ø) 0 <0> (?)
...om/google/api/gax/tracing/TracedBatchCallable.java 0% <0%> (ø) 0 <0> (?)
...com/google/api/gax/tracing/TracedBidiCallable.java 0% <0%> (ø) 0 <0> (?)
... and 27 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0eb9430...df1c6b5. Read the comment docs.

@vam-google vam-google self-requested a review October 31, 2018 19:58
@vam-google vam-google added the do not merge Indicates a pull request not ready for merge, due to either quality or timing. label Oct 31, 2018
Copy link
Contributor

@vam-google vam-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not review the actual implementations for TracedCallables (as it is probably too early to do it). Please let me know if you want those to be carefully reviewed as well.

About the 4 specific questions:

  1. Tracer interface is fine, maybe except Type enum in it (please see the corresponding comment for details).
  2. Please avoid the duplication of TimedAttemptSettings stuff at all cost (please check the corresponding comment in OpencensusTracer class). I'm not strictly opposed to the statefulness of the tracer, but I'm opposed to duplicating in it the state of other objects, because there would be no guarantee that the reported values are the actual values.
  3. gax.tracing. package is fine (we are already structured as package-by-feature). As for Traced*Callables, please check the comment in GrpcCallableFactory (I think it is ok to just not have them at all, and merge all the tracing stuff into existing callables).
  4. couldRetry - I really don't like it (especially when there is shouldRetry next to it). It looks like you can avoid declaring this method (and in a way, which will give you more flexibility in terms of reporting). Please check the corresponding comments for details.

setAttemptResult(throwable, response, true);
// a new attempt will be (must be) scheduled by an external executor
} else if (throwable != null) {
if (retryAlgorithm.couldRetry(throwable, response)) {

This comment was marked as spam.

* @param prevResponse response returned by the previous attempt or null if an exception was
* thrown instead
*/
public boolean couldRetry(Throwable prevThrowable, ResponseT prevResponse) {

This comment was marked as spam.


import org.threeten.bp.Duration;

public interface Tracer {

This comment was marked as spam.


import org.threeten.bp.Duration;

public interface Tracer {

This comment was marked as spam.


import org.threeten.bp.Duration;

public interface Tracer {

This comment was marked as spam.


void connectionSelected(int id);

void startAttempt();

This comment was marked as spam.

}

enum Type {
Unary(false, false),

This comment was marked as spam.

try {
context = context.withTracer(tracer);
ApiFuture<ResponseT> future = innerCallable.futureCall(request, context);
ApiFutures.addCallback(future, finisher, MoreExecutors.directExecutor());

This comment was marked as spam.

UnaryCallable<RequestT, ResponseT> callable =
createBaseUnaryCallable(grpcCallSettings, callSettings, clientContext);

callable =

This comment was marked as spam.

private int attempts = 0;
private long attemptRequests = 0;
private long attemptResponses = 0;
private long totalAttemptRequests = 0;

This comment was marked as spam.

@igorbernstein2
Copy link
Contributor Author

Thanks for reviewing, I will incorporate most of the suggestions over the coarse of the week. However there are couple things that I think I need to keep:

  1. I need to retain the Traced callables to actually start the operation spans. There is currently no existing callable that I can modify to insert that functionality. Also I think it's good practice to separate concerns (ie. having a ExceptionCallable separate from direct callable)
  2. I would like to keep the wrapping instrumentation approach. The only place where I couldn't use that approach was the retry logic. I would prefer to update the retrying control flow to allow it to be wrapped instead. Maybe pushing the decision to RetryAlgorithm and have it return descriptive state that can be intercepted.
  3. I would like to be keep the ApiTracer abstraction instead of exposing the opencensus Tracer directly. The interface of the ApiTracer will be a good reference to understand the kind of things being traced
  4. I think I need to keep the enum to distinguish between retry attempts vs polling attempts in my annotations.

igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Dec 27, 2018
This is extracted from googleapis#613 (comment).
Currently for streaming RPCs TimedAttemptSettings only tracks how many attempts occurred trying to receive the next message. Once the message is received, the attempt count is reset.
This adds a new field to TimedAttemptSettings that will track overall count so that it can be used in a future integration with opencensus (or be used in a new retry algorithm)
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Dec 27, 2018
This has been extracted from googleapis#613

This doesn't provide any extra functionality. Instead it just sets up the foundation for integrating tracing in to gax.
The general idea is to add a tracing abstraction layer that all parts of gax can call into. This is accomplished by introducing
- ApiTracer interface that has methods for all annotations that a span should contain
- ApiTracerFactory interface: since an ApiTracer is stateful and has a 1:1 mapping with operations, a factory is introduced to cleanly switch implementations of the tracer

By default a NoopApiTracerFactory is configured that will return NoopApiTracers. In the future a parallel OpenCensusApiTracerFactory will be added. A client implementation will be able to opt into tracing by setting the factory in StubSettings. The actual ApiTracer will be propagated by CallContext.
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Dec 28, 2018
This depends on googleapis#633 and is extracted from  googleapis#613.

This introduces TracedUnaryCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Jan 2, 2019
This depends on googleapis#633 and is extracted from googleapis#613.

This introduces TracedServerStreamingCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Jan 3, 2019
This depends on googleapis#633 and is extracted from googleapis#613.

This introduces TracedBidiCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Jan 3, 2019
This depends on googleapis#633 and is extracted from googleapis#613.

This introduces TracedBidiCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Jan 3, 2019
This depends on googleapis#633 and is extracted from googleapis#613.

This introduces TracedClientStreamCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit that referenced this pull request Jan 5, 2019
* Add plumbing for tracing

This has been extracted from #613

This doesn't provide any extra functionality. Instead it just sets up the foundation for integrating tracing in to gax.
The general idea is to add a tracing abstraction layer that all parts of gax can call into. This is accomplished by introducing
- ApiTracer interface that has methods for all annotations that a span should contain
- ApiTracerFactory interface: since an ApiTracer is stateful and has a 1:1 mapping with operations, a factory is introduced to cleanly switch implementations of the tracer

By default a NoopApiTracerFactory is configured that will return NoopApiTracers. In the future a parallel OpenCensusApiTracerFactory will be added. A client implementation will be able to opt into tracing by setting the factory in StubSettings. The actual ApiTracer will be propagated by CallContext.

* rename tracer method to subject action

* address feedback

* address feedback
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Jan 5, 2019
This depends on googleapis#633 and is extracted from  googleapis#613.

This introduces TracedUnaryCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Jan 5, 2019
This depends on googleapis#633 and is extracted from googleapis#613.

This introduces TracedServerStreamingCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Jan 5, 2019
This depends on googleapis#633 and is extracted from googleapis#613.

This introduces TracedBidiCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit to igorbernstein2/gax-java that referenced this pull request Jan 5, 2019
This depends on googleapis#633 and is extracted from googleapis#613.

This introduces TracedClientStreamCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.
igorbernstein2 added a commit that referenced this pull request Jan 11, 2019
* Start tracing unary callable.

This depends on #633 and is extracted from  #613.

This introduces TracedUnaryCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.

* trace http json calls as well

* fix test

* oops

* a bit more work

* improve method name parsing

* address feedback

* trace operation cancellation and use static mockito imports in tests

* address feedback
igorbernstein2 added a commit that referenced this pull request Feb 2, 2019
* Start tracing bidi streaming callables.

This depends on #633 and is extracted from #613.

This introduces TracedBidiCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.

* fixes

* cancellation & static imports

* address some feedback

* address more feedback
igorbernstein2 added a commit that referenced this pull request Feb 20, 2019
* Start tracing client streaming callables.

This depends on #633 and is extracted from #613.

This introduces TracedClientStreamCallables that will create a trace and complete it. It is inserted as one of the outermost links in the callable chain to allow all other callables to contribute their annotations.
google-cloud-java that extend the chains (like cloud bigtable) will also use this class to trace their extensions.

* fixes

* handle operation cancellation & use static imports

* address feedbac

* fix merge

* address feedback

* address feedback
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla: yes This human has signed the Contributor License Agreement. do not merge Indicates a pull request not ready for merge, due to either quality or timing.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants