@dprotaso dprotaso commented Jun 25, 2025

This stacks onto #3188

@knative-prow knative-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 25, 2025
@knative-prow knative-prow bot requested review from Cali0707 and skonto June 25, 2025 13:11
@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 25, 2025
@dprotaso dprotaso force-pushed the observability-sharedmain branch from 8fcfcd7 to 174f42a Compare June 25, 2025 13:39
@dprotaso dprotaso force-pushed the observability-sharedmain branch from 174f42a to 412b5ee Compare June 25, 2025 14:18
@dprotaso dprotaso mentioned this pull request Jun 25, 2025
@dprotaso dprotaso force-pushed the observability-sharedmain branch 2 times, most recently from 1135ba7 to b7cc628 Compare June 25, 2025 15:08
codecov bot commented Jun 25, 2025

Codecov Report

Attention: Patch coverage is 62.96296% with 10 lines in your changes missing coverage. Please review.

Project coverage is 75.91%. Comparing base (5e2512c) to head (db62f56).
Report is 1 commits behind head on main.

Files with missing lines            Patch %   Lines
observability/resource/default.go   77.27%    3 Missing and 2 partials ⚠️
system/env.go                       0.00%     5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3190      +/-   ##
==========================================
- Coverage   75.96%   75.91%   -0.05%     
==========================================
  Files         205      205              
  Lines       11690    11709      +19     
==========================================
+ Hits         8880     8889       +9     
- Misses       2539     2547       +8     
- Partials      271      273       +2     

Member

@evankanderson evankanderson left a comment

A few comments about usage, but this makes sense to me overall.

logger.Fatalw("Failed to setup trace provider", zap.Error(err))
}

otel.SetTextMapPropagator(tracing.DefaultTextMapPropagator())
Member

This feels like it should be part of tracing.NewTracerProvider, even though it's actually a global spooky-action-at-a-distance manipulation. WDYT?

Member Author

> tracing.NewTracerProvider

We don't actually set the global tracer in NewTracerProvider. We do that in sharedmain, after setting the text map propagator.

I prefer the observability package to not set any OTel globals.
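The preference stated above (constructors return values, and only the entry point installs globals) can be sketched as follows. This is a minimal illustration, not the PR's code: `TracerProvider`, `globalProvider`, and `SetGlobalProvider` are hypothetical stand-ins for the OTel types and global setters, so the example runs with the standard library alone.

```go
package main

import "fmt"

// TracerProvider is a hypothetical stand-in for an OTel tracer provider.
type TracerProvider struct{ name string }

// globalProvider models process-wide state such as the OTel globals.
var globalProvider *TracerProvider

// NewTracerProvider only constructs and returns a value. It deliberately
// leaves globalProvider untouched, so library code has no hidden side effects.
func NewTracerProvider(name string) *TracerProvider {
	return &TracerProvider{name: name}
}

// SetGlobalProvider is meant to be called by the entry point only
// (sharedmain in the PR), keeping the global mutation in one visible place.
func SetGlobalProvider(tp *TracerProvider) {
	globalProvider = tp
}

func main() {
	tp := NewTracerProvider("controller")
	fmt.Println("global set after construction:", globalProvider != nil)

	SetGlobalProvider(tp)
	fmt.Println("global set after entry point call:", globalProvider != nil)
}
```

The trade-off raised in the review is that callers who bypass the entry point must remember to install the globals themselves; in exchange, importing the library never changes process-wide state.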

if _, err := client.Get(ctx, cmName, metav1.GetOptions{}); err == nil {
cmw.Watch(cmName, observers...)
} else if !apierrors.IsNotFound(err) {
logger.Fatalw("Error reading ConfigMap "+cmName, zap.Error(err))
Member

I'm assuming we don't have a way to notice if the ConfigMap shows up later?

Member Author

This file could use a cleanup - I've just changed the 'Watch' here but the diff shows a larger change.

Generally, I think the controllers should halt and wait if a ConfigMap is not present. Otherwise, I could imagine a split-brain scenario occurring.

I'd rather not make those edits in this PR.

Member

That's fine. I'm slightly worried about "halt and catch fire" as a behavior when a ConfigMap we ship has been removed. The idea of shipping them has mostly been about documentation, not about requiring them to have specific contents (or even be present) for things to function.

@dprotaso dprotaso force-pushed the observability-sharedmain branch from b7cc628 to db3e763 Compare June 26, 2025 04:09
@dprotaso dprotaso changed the title [wip] OTel injection/sharedmain changes OTel injection/sharedmain changes Jun 26, 2025
@knative-prow knative-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 26, 2025
@dprotaso
Member Author

I'm marking this ready. We can merge this and it will 'break' metrics/tracing @ HEAD, but I think that's worth it since it'll be easier to review and make follow-up changes.

Plus, metrics/tracing would already have been 'broken' downstream if we merged pkg changes.


// Ignore the error because it complains about semconv
// schema version differences
resource, _ := resource.Merge(
Member Author

There shouldn't be an error here anymore, since the telemetry SDK attributes and the resource now use the same semconv version.

Previously, if the semconv packages were different, it would return this error along with a resource without a schema:
https://github.com/open-telemetry/opentelemetry-go/blob/957f4417b7719eeff17f051228520176fd2d5b78/sdk/resource/resource.go#L218

There was a bug in v1.36.0 where they didn't match up, which was odd.

Member Author

@dprotaso dprotaso Jun 26, 2025

I added a test and now we panic if there is an error.

We can always change this in the future if we find it annoying.

@dprotaso
Member Author

/hold

Actually I want to make another related change

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 26, 2025
Member

@Cali0707 Cali0707 left a comment

Just a few nits, but overall lgtm

dprotaso added 3 commits June 26, 2025 14:28
Given Views in OTel are overrides for end-users this should really be a
sharedmain option instead of a dedicated package. In OpenCensus views
were required but in OTel they are not
@dprotaso
Member Author

After talking to the OTel folks: views are an 'end-user' override. Unlike in OpenCensus, they are not mandatory to get metrics export to work.

So it seems more fitting to remove the global package I created and just make it a sharedmain option that controller authors can change.
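One common way to expose such an override is the functional-option pattern, sketched below. The names `View`, `Option`, `WithViews`, and `newConfig` are assumptions made for illustration; the option actually added to sharedmain may be named and shaped differently.

```go
package main

import "fmt"

// View is a hypothetical stand-in for a metric view (an end-user override).
type View struct{ Name string }

// config holds the settings a main-like entry point would consume.
type config struct{ views []View }

// Option customizes the hypothetical entry point.
type Option func(*config)

// WithViews is an assumed option name. It appends views rather than
// registering them in a package-level global.
func WithViews(v ...View) Option {
	return func(c *config) { c.views = append(c.views, v...) }
}

// newConfig applies the options in order. With no options there are no
// views, reflecting that views are optional in OTel.
func newConfig(opts ...Option) config {
	var c config
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	fmt.Println(len(newConfig().views))
	fmt.Println(len(newConfig(WithViews(View{Name: "latency"})).views))
}
```

Compared with a dedicated global package, an option keeps the override local to each controller's entry point, which fits the point that views are not required for export to work.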

@dprotaso
Member Author

/hold cancel

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 26, 2025
@dprotaso dprotaso changed the title OTel injection/sharedmain changes [injection/sharedmain] OTel Support Jun 26, 2025
Member

@Cali0707 Cali0707 left a comment

/lgtm
/hold in case @dprotaso wants more feedback from others

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 26, 2025
@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jun 26, 2025
knative-prow bot commented Jun 26, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Cali0707, dprotaso

Member

@evankanderson evankanderson left a comment

I'm okay with this as-is, but I had a couple comments that didn't get sent this morning.

@knative-prow knative-prow bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 26, 2025
Member

@evankanderson evankanderson left a comment

/lgtm

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jun 27, 2025
@dprotaso
Member Author

/hold cancel

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 27, 2025
@knative-prow knative-prow bot merged commit 5abfb10 into knative:main Jun 27, 2025
36 of 38 checks passed
@dprotaso dprotaso deleted the observability-sharedmain branch June 27, 2025 00:19