Skip to content

xds: update trace log to only print non-nullptr exceptions#44026

Merged
botengyao merged 6 commits into
envoyproxy:mainfrom
adisuissa:error_message_print_fix
Mar 26, 2026
Merged

xds: update trace log to only print non-nullptr exceptions#44026
botengyao merged 6 commits into
envoyproxy:mainfrom
adisuissa:error_message_print_fix

Conversation

@adisuissa
Copy link
Copy Markdown
Contributor

Commit Message: xds: update trace log to only print non-nullptr exceptions
Additional Description:
Upon an xDS-error, trace-logs output the exception details. However there are a few cases where an exception is created with nullptr passed.
This PR introduces defensive coding in the cases where the exception details are printed out to the trace logs.

Risk Level: low - only trace-logs are impacted
Testing: Added unit test to avoid regression in this case.
Docs Changes: N/A
Release Notes: N/A - not user facing (trace logs)
Platform Specific Features: N/A

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
@repokitteh-read-only
Copy link
Copy Markdown

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #44026 was opened by adisuissa.

see: more, trace.

@adisuissa adisuissa marked this pull request as ready for review March 19, 2026 14:44
@adisuissa
Copy link
Copy Markdown
Contributor Author

cc @paul-r-gall

botengyao
botengyao previously approved these changes Mar 21, 2026
Copy link
Copy Markdown
Member

@botengyao botengyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the fix!

/wait

ASSERT(reason != Envoy::Config::ConfigUpdateFailureReason::ConnectionFailure);
ENVOY_LOG(trace, "ODCDS-manager: error while fetching a single resource {}: {}",
resource_name_, e->what());
resource_name_, e == nullptr ? "exception not provided" : e->what());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

orz, is this causing crashes?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After re-inspecting the code the assertion above ASSERT(reason != Envoy::Config::ConfigUpdateFailureReason::ConnectionFailure) seems to be incorrect, and was removed.

disableInitFetchTimeoutTimer();
stats_.update_rejected_.inc();
ENVOY_LOG(warn, "gRPC config for {} rejected: {}", type_url_, e->what());
ENVOY_LOG(warn, "gRPC config for {} rejected: {}", type_url_,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above

// We expect Envoy exception to be thrown when update is rejected.
    ASSERT(e != nullptr);

is asserting not nullptr, and if there are cases that e can be nullptr it needs to be removed, here and other places.

I then wonder if we should continue to use the raw pointer other than OptRef.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE OptRef vs. pointers, I'm not sure what the difference is in terms of dereferencing a nullptr vs. dereferencing a non-existent optional value.

I'll look into whether e can be null in this case or not.

Copy link
Copy Markdown
Contributor Author

@adisuissa adisuissa Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After further validation, the passed exception can only be nullptr if it is invoked with a reason that is ConnectionFailure (example).
The error-handling code typically distinguishes between the different reasons, for example:

void GrpcSubscriptionImpl::onConfigUpdateFailed(ConfigUpdateFailureReason reason,
.
Not a very safe pattern, but this is what there is at the moment. I've reverted the changes in all the files other than the od-cds subscriptions.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that the exception can be nullptr even with other reasons (e.g., FetchTimedout):

onConfigUpdateFailed(Envoy::Config::ConfigUpdateFailureReason::FetchTimedout, nullptr);

I still think that the pointer access pattern should be defensive, but I can see the point of having a clear understanding of the code paths.
Updated the comment accordingly.

@botengyao botengyao dismissed their stale review March 21, 2026 04:54

Clicked the wrong button, it is clearly not an approval ;-)

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Signed-off-by: Adi Suissa-Peleg <adip@google.com>
…_fix

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
wbpcode
wbpcode previously approved these changes Mar 25, 2026
Copy link
Copy Markdown
Member

@wbpcode wbpcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for me. Thanks.

@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Mar 25, 2026

Defer this to @botengyao

@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Mar 25, 2026

Could you check the CI also?

@wbpcode
Copy link
Copy Markdown
Member

wbpcode commented Mar 25, 2026

/wait

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
…_fix

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Copy link
Copy Markdown
Member

@botengyao botengyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@botengyao botengyao merged commit ede8074 into envoyproxy:main Mar 26, 2026
29 checks passed
TAOXUY pushed a commit to TAOXUY/envoy that referenced this pull request Apr 1, 2026
…y#44026)

Commit Message: xds: update trace log to only print non-nullptr
exceptions
Additional Description:
Upon an xDS-error, trace-logs output the exception details. However
there are a few cases where an exception is created with nullptr passed.
This PR introduces defensive coding in the cases where the exception
details are printed out to the trace logs.

Risk Level: low - only trace-logs are impacted
Testing: Added unit test to avoid regression in this case.
Docs Changes: N/A
Release Notes: N/A - not user facing (trace logs)
Platform Specific Features: N/A

---------

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Signed-off-by: Xuyang Tao <taoxuy@google.com>
citrus7 pushed a commit to citrus7/envoy that referenced this pull request Apr 1, 2026
…y#44026)

Commit Message: xds: update trace log to only print non-nullptr
exceptions
Additional Description:
Upon an xDS-error, trace-logs output the exception details. However
there are a few cases where an exception is created with nullptr passed.
This PR introduces defensive coding in the cases where the exception
details are printed out to the trace logs.

Risk Level: low - only trace-logs are impacted
Testing: Added unit test to avoid regression in this case.
Docs Changes: N/A
Release Notes: N/A - not user facing (trace logs)
Platform Specific Features: N/A

---------

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Signed-off-by: Jonathan Wu <jtwu@google.com>
nshipilov pushed a commit to nshipilov/envoy that referenced this pull request Apr 13, 2026
…y#44026)

Commit Message: xds: update trace log to only print non-nullptr
exceptions
Additional Description:
Upon an xDS-error, trace-logs output the exception details. However
there are a few cases where an exception is created with nullptr passed.
This PR introduces defensive coding in the cases where the exception
details are printed out to the trace logs.

Risk Level: low - only trace-logs are impacted
Testing: Added unit test to avoid regression in this case.
Docs Changes: N/A
Release Notes: N/A - not user facing (trace logs)
Platform Specific Features: N/A

---------

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Signed-off-by: Nick Shipilov <nick.shipilov.n@gmail.com>
krinkinmu pushed a commit to grnmeira/envoy that referenced this pull request Apr 20, 2026
…y#44026)

Commit Message: xds: update trace log to only print non-nullptr
exceptions
Additional Description:
Upon an xDS-error, trace-logs output the exception details. However
there are a few cases where an exception is created with nullptr passed.
This PR introduces defensive coding in the cases where the exception
details are printed out to the trace logs.

Risk Level: low - only trace-logs are impacted
Testing: Added unit test to avoid regression in this case.
Docs Changes: N/A
Release Notes: N/A - not user facing (trace logs)
Platform Specific Features: N/A

---------

Signed-off-by: Adi Suissa-Peleg <adip@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants