Ext-proc: Change the default logging status onGrpcClose() #41383

Closed
melginaldi wants to merge 3 commits into envoyproxy:main from melginaldi:extproc-logging-grpcstatus

Conversation

@melginaldi
Contributor

Commit Message: Ext-proc: Change the default logging status from ABORTED to OK for onGrpcClose()

Additional Description: onGrpcClose() is called when the external processing server half-closes with an OK status. Even though the stream is being closed, the correct status to report is OK, not ABORTED. If there was an error, the ABORTED status can be propagated with onGrpcError().
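
For readers following along, the split described above can be sketched as below. This is a minimal illustration, not the Envoy source: `GrpcStatus`, `StreamCallbacks`, `dispatchRemoteClose`, and `RecordingFilter` are hypothetical stand-ins, and the routing rule (OK goes to onGrpcClose(), anything else goes to onGrpcError()) is the assumption stated in this description.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified stand-ins; not the real Envoy types.
enum class GrpcStatus { Ok = 0, Aborted = 10, Unavailable = 14 };

class StreamCallbacks {
public:
  virtual ~StreamCallbacks() = default;
  virtual void onGrpcClose() = 0;
  virtual void onGrpcError(GrpcStatus status) = 0;
};

// Assumed routing: an OK remote close reaches onGrpcClose(),
// every other status reaches onGrpcError().
inline void dispatchRemoteClose(GrpcStatus status, StreamCallbacks& callbacks) {
  if (status == GrpcStatus::Ok) {
    callbacks.onGrpcClose();
  } else {
    callbacks.onGrpcError(status);
  }
}

// Records the status that would end up in the log.
class RecordingFilter : public StreamCallbacks {
public:
  // Under the routing above, a close can only mean OK.
  void onGrpcClose() override { logged_status_ = GrpcStatus::Ok; }
  void onGrpcError(GrpcStatus status) override { logged_status_ = status; }
  std::optional<GrpcStatus> loggedStatus() const { return logged_status_; }

private:
  std::optional<GrpcStatus> logged_status_;
};
```

Under that assumption, logging ABORTED from onGrpcClose() would misreport a stream that actually ended with OK.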

Risk Level: Low
Docs Changes: N/A
Platform Specific Features: N/A

Release Notes: When onGrpcClose() is called with status=OK, save the gRPC status code OK instead of ABORTED.

/assign @yanjunxiang-google

…Close()

Signed-off-by: Melissa Ginaldi <mginaldi@google.com>
Signed-off-by: Melissa Ginaldi <mginaldi@google.com>
@repokitteh-read-only

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #41383 was opened by melginaldi.

Signed-off-by: Melissa Ginaldi <mginaldi@google.com>
@melginaldi melginaldi marked this pull request as ready for review October 6, 2025 20:12
}

-void Filter::onGrpcClose() { onGrpcCloseWithStatus(Grpc::Status::Aborted); }
+void Filter::onGrpcClose() { onGrpcCloseWithStatus(Grpc::Status::Ok); }
Member

What you are doing here is confusing. The parameter is only used by clearAsyncState, which used Aborted before https://github.com/envoyproxy/envoy/pull/40808/files. @yanjunxiang-google, was clearing with Aborted the correct behavior for onGrpcClose() before?

Contributor Author

You are correct that onGrpcClose() used to return status ABORTED, just like all of the fail-open cases. However, I think this is the wrong status to propagate. Tracing back, onGrpcClose() is called here: https://github.com/envoyproxy/envoy/blob/main/source/extensions/filters/common/ext_proc/grpc_client_impl.h#L201 where the status is OK. I did not find any other call site where the status was not OK, since onGrpcError is called in that case.

I should have included this in my previous PR (#40808) when I changed the fail-open statuses but I missed it.

Member

yes, I meant the

void ProcessorState::clearAsyncState() {
  onFinishProcessorCall(Grpc::Status::Aborted);

It will always clear the state and call onFinishProcessorCall(Grpc::Status::Aborted), even for an onGrpcClose() call. So my question now is: was that behavior correct before your previous PR? If yes, changing to OK will change the behavior of ext_proc.
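
To make the concern concrete, here is a small sketch contrasting a hard-coded status with a caller-supplied one. It is only an illustration: `ProcessorStateSketch`, `clearAsyncStateFixed`, and the parameterized `clearAsyncState(GrpcStatus)` are hypothetical names, and the assumption that the status simply flows through to onFinishProcessorCall() comes from the snippet quoted above rather than from the full source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in; not the real Envoy type.
enum class GrpcStatus { Ok = 0, Aborted = 10 };

class ProcessorStateSketch {
public:
  // Shape quoted in the comment above: the recorded status is always Aborted.
  void clearAsyncStateFixed() { onFinishProcessorCall(GrpcStatus::Aborted); }

  // Assumed alternative: the caller decides which status is recorded.
  void clearAsyncState(GrpcStatus status) { onFinishProcessorCall(status); }

  const std::vector<GrpcStatus>& recorded() const { return recorded_; }

private:
  // Stand-in for the bookkeeping done when a processor call finishes.
  void onFinishProcessorCall(GrpcStatus status) { recorded_.push_back(status); }

  std::vector<GrpcStatus> recorded_;
};
```

With the fixed variant, a clean close and a real abort are indistinguishable in the recorded status. With the parameterized variant, the recorded value changes, which is the behavior change being asked about.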

Contributor

IIUC, before the previous PR, Envoy set the call status to ABORTED in some scenarios regardless of the actual gRPC status. This caused some logging issues. This series of PRs is trying to log the right gRPC call status.

Contributor Author

+1 to Yanjun. These PRs are trying to fix the old functionality, which would only ever report ABORTED.

From my understanding, any server half-close that has an error will call onGrpcError(), so onGrpcClose() will only happen when the server half-closed with an OK status. However, if we feel the correct status is ABORTED because it was a server half-close, I can close and delete this PR. I also sent out #41691, which will add a boolean to extProcLogging that tracks when server half-closes take place. I still feel logging OK is the correct action, since that was the actual gRPC status, but I do not want to break any backward compatibility.
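
One way to picture the boolean mentioned above is sketched below. This is a guess at the idea, not the change in #41691: `LoggingInfoSketch`, its field names, and the two `record...` helpers are hypothetical.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified stand-in; not the real Envoy type.
enum class GrpcStatus { Ok = 0, Aborted = 10, Unavailable = 14 };

// Hypothetical logging record: the status and the half-close are kept as
// separate facts, so neither has to be encoded in the other.
struct LoggingInfoSketch {
  std::optional<GrpcStatus> grpc_status;
  bool server_half_closed = false;
};

// Called on the onGrpcClose() path: the stream ended cleanly from the server side.
inline void recordServerHalfClose(LoggingInfoSketch& info) {
  info.grpc_status = GrpcStatus::Ok;
  info.server_half_closed = true;
}

// Called on the onGrpcError() path: keep the status that was reported.
inline void recordError(LoggingInfoSketch& info, GrpcStatus status) {
  info.grpc_status = status;
}
```

Keeping the two fields apart would let a log consumer see both that the call ended with OK and that the server closed its side of the stream.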

}

-void Filter::onGrpcClose() { onGrpcCloseWithStatus(Grpc::Status::Aborted); }
+void Filter::onGrpcClose() { onGrpcCloseWithStatus(Grpc::Status::Ok); }
Contributor

Thanks for working on this!

I am wondering how difficult it would be to add the gRPC status as a parameter of onGrpcClose(), i.e. change it to onGrpcClose(Grpc::Status::GrpcStatus status). That way it's much safer to do this: Filter::onGrpcClose(Grpc::Status::GrpcStatus status) { onGrpcCloseWithStatus(status); }

Contributor Author

I could, but in order to change this method onGrpcClose() I would need to change the original definition since it is overriding the method: https://github.com/envoyproxy/envoy/blob/main/source/extensions/filters/common/ext_proc/grpc_client.h#L22

This would also require changing any files that override the method. I am happy to do this if we find this is the best way forward. In my previous PR I created onGrpcCloseWithStatus() to avoid this change, though.

Contributor

IMHO, adding the gRPC status parameter like Filter::onGrpcClose(Grpc::Status::GrpcStatus status) is the right way to go. It's error-proof. You can also remove onGrpcCloseWithStatus().
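
The suggested signature can be sketched roughly as follows. This is an assumption-laden illustration: `StreamCallbacksWithStatus` and `FilterSketch` are hypothetical names, and the real interface in grpc_client.h has more methods than shown here.

```cpp
#include <cassert>
#include <optional>

// Hypothetical, simplified stand-in; not the real Envoy type.
enum class GrpcStatus { Ok = 0, Aborted = 10, Unavailable = 14 };

// Assumed shape of the callback interface once the status is a parameter.
class StreamCallbacksWithStatus {
public:
  virtual ~StreamCallbacksWithStatus() = default;
  virtual void onGrpcClose(GrpcStatus status) = 0;
};

class FilterSketch : public StreamCallbacksWithStatus {
public:
  // `override` makes the compiler reject an implementation that still uses
  // the old zero-argument signature, so no overrider is silently missed.
  void onGrpcClose(GrpcStatus status) override { logged_status_ = status; }

  std::optional<GrpcStatus> loggedStatus() const { return logged_status_; }

private:
  std::optional<GrpcStatus> logged_status_;
};
```

The trade-off raised earlier in this thread still applies: every class that overrides the method has to be updated in the same change. In exchange, the filter no longer has to guess the status, and a separate onGrpcCloseWithStatus() helper is no longer needed.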

@botengyao
Member

will defer to @yanjunxiang-google for final check.

@yanjunxiang-google
Contributor

will defer to @yanjunxiang-google for final check.

Yeah, let's wait for @melginaldi to address the open comments.

@botengyao
Member

/wait-any

@melginaldi
Contributor Author

will defer to @yanjunxiang-google for final check.

Yeah, let's wait for @melginaldi to address the open comments.

Apologies. I forgot about this PR and am only now getting back to it.

@agrawroh
Member

agrawroh commented Nov 3, 2025

/wait

@github-actions

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions Bot added the stale label (stalebot believes this issue/PR has not been touched recently) Dec 24, 2025
@github-actions

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions Bot closed this Dec 31, 2025
Labels

stale (stalebot believes this issue/PR has not been touched recently), waiting

4 participants