Integrating cosmos diagnostics with open telemetry tracer#22202
simplynaveen20 merged 18 commits into Azure:main
At a high level this looks good. Great work, Naveen.
My comments:
- Why are we reintroducing pkRangeId? Can't we rely on pkRange/feedRange instead?
- On the public API we expose periods of time as Duration elsewhere. We should try to be consistent.
- We should avoid adding more BridgeInternal methods if possible.
mbhaskar
left a comment
Thanks for taking the time to discuss this and for keeping the feed range in the metrics. LGTM.
moderakh
left a comment
Thanks @simplynaveen20, great work. Some minor comments/questions; other than that, LGTM.
        Arrays.asList(schedulingTimeSpanMap)
    ), pageResult.getActivityId());
    BridgeInternal.putQueryMetricsIntoMap(pageResult, feedRange.getRange().toString(), qm);
    String pkrId = pageResult.getResponseHeaders().get(HttpConstants.HttpHeaders.PARTITION_KEY_RANGE_ID);
Why can't we rely on the targetRange? Please see here:
https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/DocumentProducer.java#L93
Is that going to be different from the one in the response header?
    BridgeInternal.putQueryMetricsIntoMap(pageResult, feedRange.getRange().toString(), qm);
    String pkrId = pageResult.getResponseHeaders().get(HttpConstants.HttpHeaders.PARTITION_KEY_RANGE_ID);
    if (StringUtils.isEmpty(pkrId)) {
        pkrId = "0";
In which scenario do we get here? Is this code reachable?
Can't we rely on the targetRange? See my comment above.
        Arrays.asList(schedulingTimeSpanMap)),
    tFeedResponse.getActivityId());
    BridgeInternal.putQueryMetricsIntoMap(tFeedResponse, DEFAULT_PARTITION_RANGE, qm);
    String pkrId = tFeedResponse.getResponseHeaders().get(HttpConstants.HttpHeaders.PARTITION_KEY_RANGE_ID);
Why can't we rely on the targetRange? Please see here:
https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/query/DocumentProducer.java#L93
Is that going to be different from the one in the response header?
We need both the feed range and the range ID in the query metric key; it will make issues easier to diagnose for on-call. Discussed with @mbhaskar; please check this comment: #22202 (comment)
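To make the point about the metric key concrete, here is a minimal sketch of a key that carries both the feed range and the partition key range ID, including the "0" fallback from the quoted diff. The class name, method name, and key format are hypothetical; the thread does not show the format the SDK actually uses.

```java
// Hypothetical sketch; not the SDK's implementation.
final class QueryMetricsKeySketch {
    // Fallback from the quoted diff, used when the response header is missing or empty.
    static final String DEFAULT_PK_RANGE_ID = "0";

    // Builds a map key that contains both identifiers.
    // The "feedRange=...,pkRangeId=..." layout is an assumption made for illustration.
    static String metricsKey(String feedRange, String pkRangeIdHeader) {
        String pkrId = (pkRangeIdHeader == null || pkRangeIdHeader.isEmpty())
            ? DEFAULT_PK_RANGE_ID
            : pkRangeIdHeader;
        return "feedRange=" + feedRange + ",pkRangeId=" + pkrId;
    }
}
```

With a key like this, two pages from the same feed range but different physical ranges stay distinguishable in the metrics map, which is the on-call benefit described above.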
End-to-end tracing for the Cosmos public API was already added last year in PR #10265.
This PR adds diagnostics information to the span event when an operation exceeds a threshold, which can be configured via CosmosItemRequestOptions and CosmosQueryRequestOptions through setThresholdForDiagnosticsOnTracerInMS(). The default is 100 ms (CRUD) and 500 ms (query). Metadata/script CRUD request options do not have an option to set the threshold and use the default. If needed, we can add the option there as well.
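The threshold behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `SketchSpan`, `DiagnosticsThresholdSketch`, and the event name are hypothetical, the threshold is modeled as a `java.time.Duration` rather than the millisecond setter named above, and the strictly-greater-than comparison is a guess.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tracer span; not the OpenTelemetry or azure-core API.
final class SketchSpan {
    final List<String> events = new ArrayList<>();

    void addEvent(String name, String payload) {
        events.add(name + ": " + payload);
    }
}

final class DiagnosticsThresholdSketch {
    // Defaults quoted in the PR description: 100 ms for CRUD, 500 ms for queries.
    static final Duration CRUD_DEFAULT = Duration.ofMillis(100);
    static final Duration QUERY_DEFAULT = Duration.ofMillis(500);

    // Uses the caller-supplied threshold, or the per-operation default when none is set.
    static Duration effectiveThreshold(Duration configured, boolean isQuery) {
        if (configured != null) {
            return configured;
        }
        return isQuery ? QUERY_DEFAULT : CRUD_DEFAULT;
    }

    // Adds diagnostics to the span only when latency exceeds the threshold
    // (assumption: strictly greater than). Returns true if an event was added.
    static boolean recordIfSlow(SketchSpan span, Duration latency, Duration threshold, String diagnostics) {
        if (latency.compareTo(threshold) > 0) {
            span.addEvent("CosmosDiagnostics", diagnostics);
            return true;
        }
        return false;
    }
}
```

The design intent is that fast operations produce no extra span payload, while slow ones carry their diagnostics string for later inspection in a tracing UI.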
Note : -
Screenshots from the Jaeger UI (images not preserved): point operation and query operation.