fix(tracing): use noop spans when tracing disabled#5118
fix(tracing): use noop spans when tracing disabled#5118siavashs merged 1 commit intoprometheus:mainfrom
Conversation
This change ensures that when tracing is disabled, we use noop spans instead of creating actual spans that are then discarded. This avoids allocating memory for spans that will never be used. Signed-off-by: Siavash Safi <siavash@cloudflare.com>
📝 WalkthroughWalkthroughThis PR refactors OpenTelemetry tracer initialization across multiple Alertmanager modules by replacing direct Changes
Sequence DiagramsequenceDiagram
participant Config as Configuration
participant Manager as tracing.Manager
participant Flag as tracingEnabled Flag
participant Factory as NewTracer()
participant Tracer as conditionalTracer
participant OTel as OTel Provider
participant Module as Alertmanager Module
Config->>Manager: ApplyConfig(cfg)
alt cfg.Endpoint != ""
Manager->>Flag: Set true
Manager->>OTel: Install OTel Tracer Provider
else cfg.Endpoint == ""
Manager->>Flag: Set false
Manager->>OTel: Install noop Tracer Provider
end
Module->>Factory: NewTracer("module/name")
Factory->>Tracer: Create conditionalTracer
Tracer-->>Module: Return tracer instance
Module->>Tracer: Start(ctx, spanName, ...)
alt tracingEnabled == true
Tracer->>OTel: Delegate to OTel tracer
OTel-->>Tracer: Return real span + context
else tracingEnabled == false
Tracer-->>Module: Return noopSpan + context
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tracing/tracing.go (1)
103-127:⚠️ Potential issue | 🟠 MajorClear the tracing state before rebuilding the provider.
After
m.shutdownFunc()returns, the old shutdown function and the previoustracingEnabledvalue stay live until the new provider is fully installed. IfbuildTracerProviderfails,conditionalTracer.Startstill behaves as if tracing is enabled, and the next reload will try to shut the already-torn-down provider down again.Proposed fix
func (m *Manager) ApplyConfig(cfg TracingConfig) error { // Update only if a config change is detected. If TLS configuration is // set, we have to restart the manager to make sure that new TLS // certificates are picked up. var blankTLSConfig commoncfg.TLSConfig if reflect.DeepEqual(m.config, cfg) && (m.config.TLSConfig == nil || *m.config.TLSConfig == blankTLSConfig) { return nil } + wasEnabled := tracingEnabled.Load() + tracingEnabled.Store(false) if m.shutdownFunc != nil { if err := m.shutdownFunc(); err != nil { + tracingEnabled.Store(wasEnabled) return fmt.Errorf("failed to shut down the tracer provider: %w", err) } + m.shutdownFunc = nil } // If no endpoint is set, assume tracing should be disabled. if cfg.Endpoint == "" { - tracingEnabled.Store(false) m.config = cfg - m.shutdownFunc = nil otel.SetTracerProvider(noop.NewTracerProvider()) m.logger.Info("Tracing provider uninstalled.") return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tracing/tracing.go` around lines 103 - 127, After calling and awaiting m.shutdownFunc() in Reload (or the function that reinstalls the tracer), immediately clear the previous tracing state so a failed rebuild won't leave tracing marked enabled or keep an old shutdown func: set tracingEnabled.Store(false), set m.shutdownFunc = nil, and install a noop tracer provider via otel.SetTracerProvider(noop.NewTracerProvider()) (and update m.config accordingly). Then proceed to call buildTracerProvider(ctx, cfg) and only on success assign m.shutdownFunc, m.config and tracingEnabled.Store(true). This ensures conditionalTracer.Start and subsequent reloads see tracing as disabled while the new provider is being built.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tracing/tracing.go`:
- Around line 43-60: The Start method in conditionalTracer returns the original
ctx when tracing is disabled, which violates trace.Tracer.Start's contract;
update conditionalTracer.Start (the method on type conditionalTracer that checks
tracingEnabled.Load()) to return a context that has noopSpan attached (use the
OpenTelemetry helper that derives a new context with the provided Span) and
return noopSpan as the Span so downstream callers (e.g.,
provider/mem.Alerts.Put) get a context carrying the noop span.
---
Outside diff comments:
In `@tracing/tracing.go`:
- Around line 103-127: After calling and awaiting m.shutdownFunc() in Reload (or
the function that reinstalls the tracer), immediately clear the previous tracing
state so a failed rebuild won't leave tracing marked enabled or keep an old
shutdown func: set tracingEnabled.Store(false), set m.shutdownFunc = nil, and
install a noop tracer provider via
otel.SetTracerProvider(noop.NewTracerProvider()) (and update m.config
accordingly). Then proceed to call buildTracerProvider(ctx, cfg) and only on
success assign m.shutdownFunc, m.config and tracingEnabled.Store(true). This
ensures conditionalTracer.Start and subsequent reloads see tracing as disabled
while the new provider is being built.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f0fcc991-7cf6-41e7-955f-b0f1adc28167
📒 Files selected for processing (7)
api/v2/api.godispatch/dispatch.goinhibit/inhibit.gonotify/notify.goprovider/mem/mem.gosilence/silence.gotracing/tracing.go
* [CHANGE] `go get github.com/prometheus/alertmanager/ui` will now fail as compiled UI assets are no longer checked into the repository. Downstream builds that rely on these assets being present in the source tree must now build the UI from source. prometheus#5113 * [CHANGE] The '--enable-feature=auto-gomaxprocs' option is deprecated and will be removed in v0.33. This flag currently has no effect and can be safely removed from any startup scripts. prometheus#5090 * [CHANGE] Update internal function signatures across multiple packages. This affects any project that integrates `Alertmanager` code. * [ENHANCEMENT] Add static asset caching. prometheus#5113 * [ENHANCEMENT] Reduce memory allocations through pre-sizing collections and batch allocation. prometheus#5020 * [ENHANCEMENT] Replace help with documentation in navigation bar. prometheus#4943 * [ENHANCEMENT] docs(ha): Update high availability documentation. prometheus#5136 * [ENHANCEMENT] docs: Add `auth_secret_file` for smtp in document. prometheus#5036 * [ENHANCEMENT] docs: Add description for global `telegram_bot_token`. prometheus#5114 * [ENHANCEMENT] docs: Add note about notifier timeouts. prometheus#5077 * [ENHANCEMENT] docs: Fix `force_implicit_tls` config field name. prometheus#5030 * [ENHANCEMENT] docs: Link community supported integrations. prometheus#4978 * [ENHANCEMENT] docs: Remove duplicate header. prometheus#5034 * [ENHANCEMENT] docs: Update mutual tls reference in high availability documentation. prometheus#5120 * [ENHANCEMENT] tracing: Use noop spans when tracing disabled. prometheus#5118 * [FEATURE] Add silence annotations. prometheus#4965 * [FEATURE] Add silence logging option. prometheus#4163 * [FEATURE] Add support for multiple matcher set silences. prometheus#4957 * [FEATURE] Add the reason for notifying in dedup stage. prometheus#4971 * [FEATURE] mattermost: Flatten attachments into top-level config. prometheus#5009 * [FEATURE] mattermost: Support global webhook url. prometheus#4998 * [FEATURE] slack: Add default color from template. prometheus#5014 * [FEATURE] slack: Allow receiver to edit existing messages. prometheus#5007 * [FEATURE] template: Add dict, map and append functions. prometheus#5093 * [FEATURE] webhook: Add full payload templating support for notifier. prometheus#5011 * [BUGFIX] config: Check for empty cluster tls client config. prometheus#5126 * [BUGFIX] config: Don't crash upon reading empty config for notifier. prometheus#4979 * [BUGFIX] config: Fix ipv6 address handling in hostport.string(). prometheus#5040 * [BUGFIX] mattermost: Omit empty text field in notifications. prometheus#4985 * [BUGFIX] telegram: Send fallback message when notification exceeds character limit. prometheus#5074 * [BUGFIX] ui: Fix escaping for matcher values with quotes. prometheus#4862 * [BUGFIX] ui: Handle special chars in silence regex-matchers. prometheus#4942 * [BUGFIX] ui: Support utf-8 label names in matchers. prometheus#5089 Signed-off-by: Solomon Jacobs <solomonjacobs@protonmail.com>
* [CHANGE] `go get github.com/prometheus/alertmanager/ui` will now fail as compiled UI assets are no longer checked into the repository. Downstream builds that rely on these assets being present in the source tree must now build the UI from source. #5113 * [CHANGE] The '--enable-feature=auto-gomaxprocs' option is deprecated and will be removed in v0.33. This flag currently has no effect and can be safely removed from any startup scripts. #5090 * [CHANGE] Update internal function signatures across multiple packages. This affects any project that integrates `Alertmanager` code. * [ENHANCEMENT] Add static asset caching. #5113 * [ENHANCEMENT] Reduce memory allocations through pre-sizing collections and batch allocation. #5020 * [ENHANCEMENT] Replace help with documentation in navigation bar. #4943 * [ENHANCEMENT] docs(ha): Update high availability documentation. #5136 * [ENHANCEMENT] docs: Add `auth_secret_file` for smtp in document. #5036 * [ENHANCEMENT] docs: Add description for global `telegram_bot_token`. #5114 * [ENHANCEMENT] docs: Add note about notifier timeouts. #5077 * [ENHANCEMENT] docs: Fix `force_implicit_tls` config field name. #5030 * [ENHANCEMENT] docs: Link community supported integrations. #4978 * [ENHANCEMENT] docs: Remove duplicate header. #5034 * [ENHANCEMENT] docs: Update mutual tls reference in high availability documentation. #5120 * [ENHANCEMENT] tracing: Use noop spans when tracing disabled. #5118 * [FEATURE] Add silence annotations. #4965 * [FEATURE] Add silence logging option. #4163 * [FEATURE] Add support for multiple matcher set silences. #4957 * [FEATURE] Add the reason for notifying in dedup stage. #4971 * [FEATURE] mattermost: Flatten attachments into top-level config. #5009 * [FEATURE] mattermost: Support global webhook url. #4998 * [FEATURE] slack: Add default color from template. #5014 * [FEATURE] slack: Allow receiver to edit existing messages. #5007 * [FEATURE] template: Add dict, map and append functions. #5093 * [FEATURE] webhook: Add full payload templating support for notifier. #5011 * [BUGFIX] config: Check for empty cluster tls client config. #5126 * [BUGFIX] config: Don't crash upon reading empty config for notifier. #4979 * [BUGFIX] config: Fix ipv6 address handling in hostport.string(). #5040 * [BUGFIX] mattermost: Omit empty text field in notifications. #4985 * [BUGFIX] telegram: Send fallback message when notification exceeds character limit. #5074 * [BUGFIX] ui: Fix escaping for matcher values with quotes. #4862 * [BUGFIX] ui: Handle special chars in silence regex-matchers. #4942 * [BUGFIX] ui: Support utf-8 label names in matchers. #5089 Signed-off-by: Solomon Jacobs <solomonjacobs@protonmail.com>
This change ensures that when tracing is disabled, we use noop spans instead of creating actual spans that are then discarded. This avoids allocating memory for spans that will never be used.
Pull Request Checklist
Please check all the applicable boxes.
benchstatto compare benchmarksWhich user-facing changes does this PR introduce?
Summary by CodeRabbit