-
Notifications
You must be signed in to change notification settings - Fork 3.7k
PIP-55: Refresh Authentication Credentials #6074
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| // In case of proxy, if the authentication credentials are forwardable, | ||
| // it will hold the credentials of the original client | ||
| AuthenticationState originalAuthState; | ||
| private boolean pendingAuthChallengeResponse = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be marked volatile?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable will be always accessed only the connection IO thread so there won't be any race condition.
|
@merlimat awesome feature! Overall LGTM. One small comment. |
|
@jiazhai can you review this change since you were implementing SASL authentication before? |
31fc3c2 to
f959116
Compare
f959116 to
d07a926
Compare
ivankelly
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment needs to be updated. Otherwise looks good.
| if (jwt.getBody().getExpiration() != null) { | ||
| this.expiration = jwt.getBody().getExpiration().getTime(); | ||
| } else { | ||
| this.expiration = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Why not set to Long.MAX_VALUE and make the check simpler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| private final NettySslContextBuilder sslCtxRefresher; | ||
| private final ServiceConfiguration brokerConf; | ||
|
|
||
| private final Cache<SocketAddress, ServerCnx> connections = Caffeine.newBuilder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Why not Cache<ServerCnx, ServerCnx> and avoid possibly allocating the SocketAddress? We don't care about the keys in any case. Maybe add a comment that the you're using Cache to avoid having to clean stuff up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The socket address is already kept in the ServerCnx itself, so it will not be a new allocation here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment
| try { | ||
| cnx.refreshAuthenticationCredentials(); | ||
| } catch (Throwable t) { | ||
| log.warn("[{}] Failed to refresh auth credentials", cnx.getRemoteAddress()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're using Cache to hold the connections, and the lifetime of the entry in connections is based on whether it's been GC'd, this will likely spam the logs. Maybe only log if after throwing, the cnx is still in connected state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we get the ref to cnx, it will be a strong ref to it, so we're not going to loose it during the iteration. Then, inside that method, i'm checking for
if (getState() != State.Connected || !isActive) {
// Connection is either still being established or already closed.
return;
}to avoid getting errors on a connections that's already gone
| String authRole = useOriginalAuthState ? originalPrincipal : this.authRole; | ||
| AuthData brokerData = authState.authenticate(clientData); | ||
|
|
||
| // authentication has completed, will send newConnected command. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is now wrong. I think the flow should change a little here, to only call completeConnect if it's the initial auth. There seems to be enough difference in the flows to have them completely separate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed the comment and tried to make the flow a bit clearer
| authState = authenticationProvider.newAuthState(clientData, remoteAddress, sslSession); | ||
| authenticationData = authState.getAuthDataSource(); | ||
| doAuthentication(clientData, clientProtocolVersion, clientVersion); | ||
| state = doAuthentication(clientData, clientProtocolVersion, clientVersion); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not for this patch, but it seems strange that this is a synchronous call, given that authentication could be calling out to a remote system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, initially all auth methods were completely stateless and therefore didn't require any asynchronousness in the provider API. This is not true now in general and we'll have to make a bit of a refactor to achieve that.
| optional FeatureFlags feature_flags = 10; | ||
| } | ||
|
|
||
| message FeatureFlags { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why a feature flag rather than bumping the version? Because some auth will not support reauth?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bumping the version is a very primitive way of checking the supported features. It worked well when the client implementation was only Java and started to be problematic after that.
The reality is that all the clients out there will have a different set of supported features. Also, if someone just updates thePulsarApi.proto it looks like that client is now at the latest version, while in reality it might not be able to understand a specific command.
Exchanging the list of supported features will allow client to selectively implement features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding FeatureFlags makes sense. 👍
rdhabalia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's also close #3705 once this PR is merged. No one reviewed that PR when it was introduced last year when we didn't have other auth-mechanism implemented.
| category = CATEGORY_AUTHENTICATION, | ||
| doc = "Interval of time for checking for expired authentication credentials" | ||
| ) | ||
| private int authenticationRefreshCheckSeconds = 60; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should disable it by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will "disabled by default" in the sense that the current credentials are not expiring as of now.
| optional FeatureFlags feature_flags = 10; | ||
| } | ||
|
|
||
| message FeatureFlags { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding FeatureFlags makes sense. 👍
|
|
||
| if (!supportsAuthenticationRefresh()) { | ||
| log.warn("[{}] Closing connection because client doesn't support auth credentials refresh", remoteAddress); | ||
| ctx.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if client doesn't support then let's not change the behavior and don't close the connection, because closing connection will interrupt the client side processing and it requires them to upgrade client lib immediately if client wants to fix auto disconnect. So, we don't want our client to complain for such changed behavior which requires them to upgrade client lib.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't close the connection for older clients, it becomes a security loophole.
Also, keep in mind that this is only apply if the credentials are expiring. In that case, a client will typically be reading the credentials each time are needed (eg: from a file or through a supplier function).
If that was not the case, it would have trouble even without this change. (eg: when a broker gets restarted it will not be able to reconnect anymore).
After the connection gets closed, an older client will be anyway able to immediately reconnect and refresh the authentication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change will not work for us because we have usecase where application creates large number of producers on one connection and if the client library is not upgraded then it closes the connection and disrupts all producers connection. this application is latency sensitive and we would like to avoid reconnection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the client library is not upgraded then it closes the connection and disrupts all producers connection.
This is only if the credentials are expiring and if the authentication provider enforces that (which right now will only be the token auth provider).
Avoiding to challenge older client defeats the purpose of this feature in that it opens a backdoor to completely avoid the check.
If older clients are a concern, I'd suggest to either:
- not use this feature (or similarly working feature)
- use non-expiring credentials for older clients
- use credentials with long expire time to minimize the number of reconnections
- force clients to upgrade version
I honestly don't see any better solution for that. I'd categorically exclude the "ignore old clients" for the false sense of security and the massive sec issue it opens.
|
@jerrypeng @rdhabalia can you review this pull request again? Matteo has addressed your comments. |
|
@rdhabalia any thoughts on Matteo's last comment? |
@sijie I think he's on vacation now. I'll sync-up with him once he's back. |
|
Add label release-2.5.1, due to #6178 dependency |
* PIP-55: Refresh Authentication Credentials * Fixed import order * Do not check for original client credential if it's not coming through proxy * Fixed import order * Fixed mocked test assumption * Addressed comments * Avoid to print NPE on auth refresh check if auth is disabled (cherry picked from commit 4af5223)
* PIP-55: Refresh Authentication Credentials * Fixed import order * Do not check for original client credential if it's not coming through proxy * Fixed import order * Fixed mocked test assumption * Addressed comments * Avoid to print NPE on auth refresh check if auth is disabled (cherry picked from commit 4af5223)
* PIP-55: Refresh Authentication Credentials * Fixed import order * Do not check for original client credential if it's not coming through proxy * Fixed import order * Fixed mocked test assumption * Addressed comments * Avoid to print NPE on auth refresh check if auth is disabled (cherry picked from commit 4af5223)
Motivation This doc PR is updated for configurations for PRs: #6716 #6853 #6074 1: The broker configuration (for #6716) is updated by Jia Zhai. 2: Add other supported configurations to the client, standlone and proxy configuration docs based on the client.config, standlone.config and proxy.config files. Modifications 1: Add TLS with keystore type config in standlone and proxy configuration file. 2: update reference > pulsar configuration > client for PIP-55: Refresh Authentication Credentials Add other supported configurations to the standlone and proxy configuration files based on the standlone.config and proxy.config files.
This PR is to update docs for PIP-55: #6074 ### Motivation provide general doc description about implementing the authentication refreshing functionality. ### Modifications Update the Security overview for PIP 55. the `authenticationRefreshCheckSeconds` config has been added through the PR: #6074 (cherry picked from commit 29b81ed)
* PIP-55: Refresh Authentication Credentials * Fixed import order * Do not check for original client credential if it's not coming through proxy * Fixed import order * Fixed mocked test assumption * Addressed comments * Avoid to print NPE on auth refresh check if auth is disabled
This PR is to update docs for PIP-55: apache#6074 ### Motivation provide general doc description about implementing the authentication refreshing functionality. ### Modifications Update the Security overview for PIP 55. the `authenticationRefreshCheckSeconds` config has been added through the PR: apache#6074
Motivation This doc PR is updated for configurations for PRs: apache#6716 apache#6853 apache#6074 1: The broker configuration (for apache#6716) is updated by Jia Zhai. 2: Add other supported configurations to the client, standlone and proxy configuration docs based on the client.config, standlone.config and proxy.config files. Modifications 1: Add TLS with keystore type config in standlone and proxy configuration file. 2: update reference > pulsar configuration > client for PIP-55: Refresh Authentication Credentials Add other supported configurations to the standlone and proxy configuration files based on the standlone.config and proxy.config files.
Motivation
Implemented the authentication refreshing functionality.
Notes: