Skip to content

Conversation

@merlimat
Copy link
Contributor

Motivation

Implemented the authentication refreshing functionality.

Notes:

  • Works with mutual auth providers
  • Works through Pulsar proxy (when client credentials are forwarded to broker)

@merlimat merlimat added the type/feature The PR added a new feature or issue requested a new feature label Jan 17, 2020
@merlimat merlimat added this to the 2.6.0 milestone Jan 17, 2020
@merlimat merlimat self-assigned this Jan 17, 2020
// In case of proxy, if the authentication credentials are forwardable,
// it will hold the credentials of the original client
AuthenticationState originalAuthState;
private boolean pendingAuthChallengeResponse = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be marked volatile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable will be always accessed only the connection IO thread so there won't be any race condition.

@jerrypeng
Copy link
Contributor

@merlimat awesome feature! Overall LGTM. One small comment.

@sijie sijie requested a review from jiazhai January 17, 2020 05:55
@sijie
Copy link
Member

sijie commented Jan 17, 2020

@jiazhai can you review this change since you were implementing SASL authentication before?

@merlimat merlimat force-pushed the token-refresh-auth-master branch from f959116 to d07a926 Compare January 17, 2020 19:29
Copy link
Contributor

@ivankelly ivankelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment needs to be updated. Otherwise looks good.

if (jwt.getBody().getExpiration() != null) {
this.expiration = jwt.getBody().getExpiration().getTime();
} else {
this.expiration = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Why not set to Long.MAX_VALUE and make the check simpler?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

private final NettySslContextBuilder sslCtxRefresher;
private final ServiceConfiguration brokerConf;

private final Cache<SocketAddress, ServerCnx> connections = Caffeine.newBuilder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Why not Cache<ServerCnx, ServerCnx> and avoid possibly allocating the SocketAddress? We don't care about the keys in any case. Maybe add a comment that the you're using Cache to avoid having to clean stuff up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The socket address is already kept in the ServerCnx itself, so it will not be a new allocation here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment

try {
cnx.refreshAuthenticationCredentials();
} catch (Throwable t) {
log.warn("[{}] Failed to refresh auth credentials", cnx.getRemoteAddress());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're using Cache to hold the connections, and the lifetime of the entry in connections is based on whether it's been GC'd, this will likely spam the logs. Maybe only log if after throwing, the cnx is still in connected state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we get the ref to cnx, it will be a strong ref to it, so we're not going to loose it during the iteration. Then, inside that method, i'm checking for

   if (getState() != State.Connected || !isActive) {
            // Connection is either still being established or already closed.
            return;
        }

to avoid getting errors on a connections that's already gone

String authRole = useOriginalAuthState ? originalPrincipal : this.authRole;
AuthData brokerData = authState.authenticate(clientData);

// authentication has completed, will send newConnected command.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is now wrong. I think the flow should change a little here, to only call completeConnect if it's the initial auth. There seems to be enough difference in the flows to have them completely separate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the comment and tried to make the flow a bit clearer

authState = authenticationProvider.newAuthState(clientData, remoteAddress, sslSession);
authenticationData = authState.getAuthDataSource();
doAuthentication(clientData, clientProtocolVersion, clientVersion);
state = doAuthentication(clientData, clientProtocolVersion, clientVersion);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not for this patch, but it seems strange that this is a synchronous call, given that authentication could be calling out to a remote system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, initially all auth methods were completely stateless and therefore didn't require any asynchronousness in the provider API. This is not true now in general and we'll have to make a bit of a refactor to achieve that.

optional FeatureFlags feature_flags = 10;
}

message FeatureFlags {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why a feature flag rather than bumping the version? Because some auth will not support reauth?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bumping the version is a very primitive way of checking the supported features. It worked well when the client implementation was only Java and started to be problematic after that.

The reality is that all the clients out there will have a different set of supported features. Also, if someone just updates thePulsarApi.proto it looks like that client is now at the latest version, while in reality it might not be able to understand a specific command.

Exchanging the list of supported features will allow client to selectively implement features.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding FeatureFlags makes sense. 👍

Copy link
Contributor

@rdhabalia rdhabalia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's also close #3705 once this PR is merged. No one reviewed that PR when it was introduced last year when we didn't have other auth-mechanism implemented.

category = CATEGORY_AUTHENTICATION,
doc = "Interval of time for checking for expired authentication credentials"
)
private int authenticationRefreshCheckSeconds = 60;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should disable it by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will "disabled by default" in the sense that the current credentials are not expiring as of now.

optional FeatureFlags feature_flags = 10;
}

message FeatureFlags {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding FeatureFlags makes sense. 👍


if (!supportsAuthenticationRefresh()) {
log.warn("[{}] Closing connection because client doesn't support auth credentials refresh", remoteAddress);
ctx.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if client doesn't support then let's not change the behavior and don't close the connection, because closing connection will interrupt the client side processing and it requires them to upgrade client lib immediately if client wants to fix auto disconnect. So, we don't want our client to complain for such changed behavior which requires them to upgrade client lib.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't close the connection for older clients, it becomes a security loophole.

Also, keep in mind that this is only apply if the credentials are expiring. In that case, a client will typically be reading the credentials each time are needed (eg: from a file or through a supplier function).
If that was not the case, it would have trouble even without this change. (eg: when a broker gets restarted it will not be able to reconnect anymore).

After the connection gets closed, an older client will be anyway able to immediately reconnect and refresh the authentication.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change will not work for us because we have usecase where application creates large number of producers on one connection and if the client library is not upgraded then it closes the connection and disrupts all producers connection. this application is latency sensitive and we would like to avoid reconnection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the client library is not upgraded then it closes the connection and disrupts all producers connection.

This is only if the credentials are expiring and if the authentication provider enforces that (which right now will only be the token auth provider).

Avoiding to challenge older client defeats the purpose of this feature in that it opens a backdoor to completely avoid the check.

If older clients are a concern, I'd suggest to either:

  • not use this feature (or similarly working feature)
  • use non-expiring credentials for older clients
  • use credentials with long expire time to minimize the number of reconnections
  • force clients to upgrade version

I honestly don't see any better solution for that. I'd categorically exclude the "ignore old clients" for the false sense of security and the massive sec issue it opens.

@sijie
Copy link
Member

sijie commented Feb 4, 2020

@jerrypeng @rdhabalia can you review this pull request again? Matteo has addressed your comments.

@sijie sijie added the doc-required Your PR changes impact docs and you will update later. label Feb 4, 2020
@sijie
Copy link
Member

sijie commented Feb 10, 2020

@rdhabalia any thoughts on Matteo's last comment?

@merlimat
Copy link
Contributor Author

@rdhabalia any thoughts on Matteo's last comment?

@sijie I think he's on vacation now. I'll sync-up with him once he's back.

@merlimat merlimat merged commit 4af5223 into apache:master Feb 10, 2020
@tuteng
Copy link
Member

tuteng commented Mar 21, 2020

Add label release-2.5.1, due to #6178 dependency

tuteng pushed a commit to AmateurEvents/pulsar that referenced this pull request Mar 21, 2020
* PIP-55: Refresh Authentication Credentials

* Fixed import order

* Do not check for original client credential if it's not coming through proxy

* Fixed import order

* Fixed mocked test assumption

* Addressed comments

* Avoid to print NPE on auth refresh check if auth is disabled

(cherry picked from commit 4af5223)
tuteng pushed a commit that referenced this pull request Apr 13, 2020
* PIP-55: Refresh Authentication Credentials

* Fixed import order

* Do not check for original client credential if it's not coming through proxy

* Fixed import order

* Fixed mocked test assumption

* Addressed comments

* Avoid to print NPE on auth refresh check if auth is disabled

(cherry picked from commit 4af5223)
jiazhai pushed a commit to jiazhai/pulsar that referenced this pull request May 18, 2020
* PIP-55: Refresh Authentication Credentials

* Fixed import order

* Do not check for original client credential if it's not coming through proxy

* Fixed import order

* Fixed mocked test assumption

* Addressed comments

* Avoid to print NPE on auth refresh check if auth is disabled

(cherry picked from commit 4af5223)
jiazhai pushed a commit that referenced this pull request Jun 27, 2020
This PR is to update docs for PIP-55: #6074

### Motivation

provide general doc description about implementing the authentication refreshing functionality.

### Modifications

Update the Security overview for PIP 55.

the `authenticationRefreshCheckSeconds` config has been added through the PR: #6074
jiazhai pushed a commit that referenced this pull request Jul 10, 2020
Motivation
This doc PR is updated for configurations for PRs:
#6716
#6853
#6074

1: The broker configuration (for #6716) is updated by Jia Zhai.

2: Add other supported configurations to the client, standlone and proxy configuration docs based on the client.config, standlone.config and proxy.config files.

Modifications
1: Add TLS with keystore type config in standlone and proxy configuration file.
2: update reference > pulsar configuration > client for PIP-55: Refresh Authentication Credentials
Add other supported configurations to the standlone and proxy configuration files based on the standlone.config and proxy.config files.
wolfstudy pushed a commit that referenced this pull request Jul 29, 2020
This PR is to update docs for PIP-55: #6074

### Motivation

provide general doc description about implementing the authentication refreshing functionality.

### Modifications

Update the Security overview for PIP 55.

the `authenticationRefreshCheckSeconds` config has been added through the PR: #6074 

(cherry picked from commit 29b81ed)
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
* PIP-55: Refresh Authentication Credentials

* Fixed import order

* Do not check for original client credential if it's not coming through proxy

* Fixed import order

* Fixed mocked test assumption

* Addressed comments

* Avoid to print NPE on auth refresh check if auth is disabled
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
This PR is to update docs for PIP-55: apache#6074

### Motivation

provide general doc description about implementing the authentication refreshing functionality.

### Modifications

Update the Security overview for PIP 55.

the `authenticationRefreshCheckSeconds` config has been added through the PR: apache#6074
huangdx0726 pushed a commit to huangdx0726/pulsar that referenced this pull request Aug 24, 2020
Motivation
This doc PR is updated for configurations for PRs:
apache#6716
apache#6853
apache#6074

1: The broker configuration (for apache#6716) is updated by Jia Zhai.

2: Add other supported configurations to the client, standlone and proxy configuration docs based on the client.config, standlone.config and proxy.config files.

Modifications
1: Add TLS with keystore type config in standlone and proxy configuration file.
2: update reference > pulsar configuration > client for PIP-55: Refresh Authentication Credentials
Add other supported configurations to the standlone and proxy configuration files based on the standlone.config and proxy.config files.
@Anonymitaet Anonymitaet added doc-complete Your PR changes impact docs and the related docs have been already added. and removed doc-required Your PR changes impact docs and you will update later. labels Feb 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-complete Your PR changes impact docs and the related docs have been already added. release/2.5.1 type/feature The PR added a new feature or issue requested a new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants