Extension points for authentication/authorization by jon-wei · Pull Request #4271 · apache/druid

jon-wei · 2017-05-13T01:02:37Z

Extension points for authentication/authorization

This PR implements several enhancements to the Druid security system with the following goals in mind:

Decouple the authentication logic from the authorization logic
Continue to allow extensibility for the decoupled authentication and authorization logic
Allow multiple authentication mechanisms to be used simultaneously

Existing Design

The following section describes the existing procedure for creating an extension that handles authentication and authorization:

Define a ServletFilterHolder in the extension that performs authentication checks on intercepted HTTP requests, in the Filter returned by getFilter().
After authenticating the request, the Filter attaches an AuthorizationInfo object to the HTTP request. The AuthorizationInfo object is assumed to contain the authenticated identity and any other information needed from the request to perform authorization checks.
The endpoint for the request retrieves the AuthorizationInfo object and calls isAuthorized() on the Resource/Action pairs that describe the access request.

Some limitations of the current system are:

The authentication filter is currently responsible for creating the AuthorizationInfo decision object, so changing which authorization implementation is used requires code changes to any authentication filter being used.
Only one authentication method can be active at a given time. (A user could enable multiple authentication filters at once, but a single request would have to pass all of the enabled filters.)
Fail-open behavior: if a security-sensitive endpoint fails to perform authorization checks due to a bug or design omission, requests to that endpoint are allowed by default.

PR Change Summary

Adds two new extensible interfaces for security logic, Authenticator and Authorizer
Refactors existing security checks in various endpoints to use these new interfaces instead of the old AuthorizationInfo
Adds security checks to endpoints that did not have them (e.g., KafkaIndexTask)
Adds logic to create a chain of Authenticators for managing multiple authentication schemes
Adds two security sanity check filters (described in "Authorization Validation" below)

Example Implementation

An extension that uses these interfaces to provide support for HTTP Basic authentication and a simple RBAC authorization system can be found here:
jon-wei#1

Authentication and Authorization

Authenticator

This interface is essentially a ServletFilterHolder with additional requirements on the getFilter() method contract, plus:

A method that returns a WWW-Authenticate challenge header appropriate for the authentication mechanism.
A method for creating a wrapped HTTP client that can authenticate using the Authenticator's authentication scheme, used for internal Druid node communications (e.g., broker -> historical messages)
A method for authenticating credentials contained in a JDBC connection context, used for authenticating Druid SQL requests received via JDBC

  /**
   * @return The type name of this authenticator. Should be identical to the JsonTypeInfo type.
   */
  public String getTypeName();

  /**
   * Create a Filter that performs authentication checks on incoming HTTP requests.
   *
   * If the authentication succeeds, the Filter should set the "Druid-Auth-Token" attribute in the request,
   * containing a String that represents the authenticated identity of the requester.
   *
   * If the "Druid-Auth-Token" attribute is already set (i.e., request has been authenticated by an earlier Filter),
   * this Filter should skip any authentication checks and proceed to the next Filter.
   *
   * If the authentication fails, the Filter should not send an error response. The error response will be sent
   * after all Filters in the authentication filter chain have been checked.
   *
   * If an anonymous request is received, the Filter should continue on to the next Filter; the challenge response
   * will be sent after the filter chain is exhausted.
   *
   * @return Filter that authenticates HTTP requests
   */
  public Filter getFilter();

  /**
   * Return a WWW-Authenticate challenge scheme string appropriate for this Authenticator's authentication mechanism.
   *
   * For example, a Basic HTTP implementation should return "Basic", while a Kerberos implementation would return
   * "Negotiate".
   *
   * @return Authentication scheme
   */
  public String getAuthChallengeHeader();

  /**
   * Given a JDBC connection context, authenticate the identity represented by the information in the context.
   * This is used to secure JDBC access for Druid SQL.
   *
   * For example, a Basic HTTP auth implementation could read the "user" and "password" fields from the JDBC context.
   *
   * The expected contents of the context are left to the implementation.
   *
   * @param context JDBC connection context
   * @return true if the identity represented by the context is successfully authenticated
   */
  public boolean authenticateJDBCContext(Map<String, Object> context);

  /**
   * Return a client that sends requests with the format/information necessary to authenticate successfully
   * against this Authenticator's authentication scheme using the identity of the internal system user.
   *
   * This HTTP client is used for internal communications between Druid nodes, such as when a broker communicates
   * with a historical node during query processing.
   *
   * @param baseClient Base HTTP client for internal Druid communications
   * @return HttpClient that sends requests with the credentials of the internal system user
   */
  public HttpClient createInternalClient(HttpClient baseClient);
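
As a rough illustration of the authenticateJDBCContext() contract described above, the following sketch reads the "user" and "password" fields from the context map, as the javadoc suggests a Basic HTTP auth implementation could. The class name InMemoryBasicJdbcAuthenticator and the in-memory credential map are hypothetical and not part of this PR; a real implementation would presumably compare against hashed credentials from a proper store.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in, not a class from this PR. */
final class InMemoryBasicJdbcAuthenticator
{
  // Assumption: plaintext passwords held in memory, for illustration only.
  private final Map<String, String> passwordsByUser;

  InMemoryBasicJdbcAuthenticator(Map<String, String> passwordsByUser)
  {
    this.passwordsByUser = new HashMap<>(passwordsByUser);
  }

  /** Mirrors the shape of Authenticator.authenticateJDBCContext(). */
  public boolean authenticateJDBCContext(Map<String, Object> context)
  {
    Object user = context.get("user");
    Object password = context.get("password");
    // Reject contexts that are missing either field or carry non-String values.
    if (!(user instanceof String) || !(password instanceof String)) {
      return false;
    }
    String expected = passwordsByUser.get(user);
    return expected != null && expected.equals(password);
  }
}
```

A context that lacks either field is treated as unauthenticated rather than as an error, which matches the boolean return type of the interface method.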

Authentication Chain

To determine what authentication mechanisms are to be used, the user should specify a list of Authenticator type names in the config, which will be used to instantiate a chain of filters by calling getFilter() on the registered Authenticators.

A Filter that handles failed authentication checks will always be placed at the end of the filter chain. If the Druid-Auth-Token attribute is not set, but the request was not anonymous (had an authentication header), a 403 Forbidden error response will be sent. If an anonymous request is received, this filter will build a WWW-Authenticate header for each Authenticator by calling getAuthChallengeHeader() and add these to the 401 Unauthorized error response that will be sent, providing the client with a list of supported HTTP authentication schemes. If authentication succeeded, this filter will do nothing.
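
The three outcomes of the final filter described above can be sketched as a small decision function. This is only a sketch under stated assumptions: AuthFailureDecision and its fields are hypothetical names, the servlet API is left out so the logic stands alone, and a status of 0 is used here to mean "let the request through".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the decision made by the last filter in the chain. */
final class AuthFailureDecision
{
  final int status;               // 0 = pass through, otherwise an HTTP status code
  final List<String> challenges;  // WWW-Authenticate values to attach to a 401

  private AuthFailureDecision(int status, List<String> challenges)
  {
    this.status = status;
    this.challenges = challenges;
  }

  /**
   * @param authToken        value of the Druid-Auth-Token attribute, or null if unset
   * @param hadAuthHeader    true if the request carried an authentication header
   * @param challengeSchemes results of getAuthChallengeHeader() for each Authenticator
   */
  static AuthFailureDecision decide(String authToken, boolean hadAuthHeader, List<String> challengeSchemes)
  {
    if (authToken != null) {
      // Some Authenticator in the chain succeeded: do nothing.
      return new AuthFailureDecision(0, new ArrayList<>());
    }
    if (hadAuthHeader) {
      // Credentials were presented but no Authenticator accepted them.
      return new AuthFailureDecision(403, new ArrayList<>());
    }
    // Anonymous request: advertise every supported scheme.
    return new AuthFailureDecision(401, new ArrayList<>(challengeSchemes));
  }
}
```

With Authenticators that return "Basic" and "Negotiate", an anonymous request would receive a 401 carrying both challenge values.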

A Filter that performs a sanity check on requests will always be placed at the start of the filter chain. This filter will check that the Druid-Auth-Token attribute is not already set in the request (which would indicate that the client is trying to fake an authentication result). The filter will also check for a "Druid-Auth-Token-Checked" attribute (described later in the proposal). An error response will be sent if either of these attributes is present.

Authorizer

An Authorizer is responsible for performing authorization checks for resource accesses.

public interface Authorizer
{
  /**
   * Check if the entity represented by `identity` is authorized to perform `action` on `resource`.
   *
   * @param identity The identity of the requester
   * @param resource The resource to be accessed
   * @param action The action to perform on the resource
   * @return An Access object representing the result of the authorization check.
   */
  public Access authorize(String identity, Resource resource, Action action);
}

This interface is intended to replace the current AuthorizationInfo. A single instance of each Authorizer will be created per node.

Security-sensitive endpoints will need to extract the identity string contained in the request's Druid-Auth-Token attribute, previously set by an Authenticator. Each endpoint will pass this identity String to the Authorizer's authorize() method along with any Resource/Action pairs created for the request being handled.

The endpoint can use these checks to filter out resources or deny the request as needed.
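
A minimal sketch of how an endpoint might use these checks to filter out resources follows. The Resource, Action, Access, and Authorizer types below are simplified stand-ins written for this example (the real classes carry more information), and DatasourceListEndpointSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the types named in the proposal.
enum Action { READ, WRITE }

final class Resource
{
  final String name;

  Resource(String name)
  {
    this.name = name;
  }
}

final class Access
{
  final boolean allowed;

  Access(boolean allowed)
  {
    this.allowed = allowed;
  }
}

interface Authorizer
{
  Access authorize(String identity, Resource resource, Action action);
}

/** Hypothetical endpoint helper: returns only the datasources the identity may read. */
final class DatasourceListEndpointSketch
{
  static List<String> visibleDatasources(String identity, List<String> allDatasources, Authorizer authorizer)
  {
    List<String> visible = new ArrayList<>();
    for (String datasource : allDatasources) {
      // One Resource/Action pair is checked per datasource.
      if (authorizer.authorize(identity, new Resource(datasource), Action.READ).allowed) {
        visible.add(datasource);
      }
    }
    return visible;
  }
}
```

An endpoint that must deny the whole request instead of filtering would check a single Resource/Action pair and return an error when the Access result is not allowed.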

After a request is authorized, a new attribute, "Druid-Auth-Token-Checked", should be set on the request with the result of the authorization decision.

Authorization Validation

Another servlet filter will be defined to check that all requests undergo authorization if security features are enabled.

This filter will be applied after a request has been processed by an endpoint but before the response is sent to the client. If the "Druid-Auth-Token-Checked" attribute is not set in the request, then an error response will be sent instead of the actual response.

This helps to reduce the instances of fail-open behavior. However, this mechanism is imperfect, as any state changes that took place during request handling will still take effect even if the response is killed. This feature is intended mainly to help detect authorization bugs or design omissions.
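
The validation step can be sketched as a function applied to the endpoint's result just before the response is written. PreResponseCheckSketch is a hypothetical name, and the 403 status is an assumption made for this example; the proposal only says that an error response is sent.

```java
import java.util.Map;

/** Hypothetical sketch of the post-endpoint validation described above. */
final class PreResponseCheckSketch
{
  static final String CHECKED_ATTRIBUTE = "Druid-Auth-Token-Checked";

  // Assumption: the proposal does not name the error status that is sent.
  static final int ERROR_STATUS = 403;

  /**
   * @param requestAttributes server-side attributes of the request after the endpoint ran
   * @param endpointStatus    status code the endpoint intended to send
   * @return the status that should actually be sent to the client
   */
  static int finalStatus(Map<String, Object> requestAttributes, int endpointStatus)
  {
    if (requestAttributes.get(CHECKED_ATTRIBUTE) == null) {
      // No authorization check was recorded: replace the real response.
      return ERROR_STATUS;
    }
    return endpointStatus;
  }
}
```

Note that the function only changes what the client sees; as stated above, any side effects of the endpoint have already happened by the time it runs.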

Namespaces

Authenticator and Authorizer implementations are linked through a namespace string. Authenticators tag an authenticated request with a namespace, which is used to route the authenticated request to the Authorizer implementation that registered itself with a matching namespace.

This is to support cases where an Authorizer implementation is only intended to authorize requests from a specific authenticator (an implementation may have assumptions about the user name format, for example).

The details of namespace configuration are left for implementors of Authenticator and Authorizer to decide.

The namespace header field is "Druid-Auth-Token-Namespace" and contains a String. Namespace mapping is handled by the AuthorizerMapper class.
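
The routing described in this section could look roughly like the sketch below. It is not the AuthorizerMapper from this PR: NamespaceRouterSketch and NamespacedAuthorizer are hypothetical names, the authorizer is reduced to a boolean check, and denying requests whose namespace has no registered Authorizer is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified stand-in for an Authorizer, reduced to a yes/no answer. */
interface NamespacedAuthorizer
{
  boolean isAllowed(String identity, String resource);
}

/** Hypothetical sketch of namespace-based routing. */
final class NamespaceRouterSketch
{
  private final Map<String, NamespacedAuthorizer> authorizersByNamespace = new HashMap<>();

  void register(String namespace, NamespacedAuthorizer authorizer)
  {
    authorizersByNamespace.put(namespace, authorizer);
  }

  /**
   * @param namespace value of the Druid-Auth-Token-Namespace field set by the Authenticator
   */
  boolean authorize(String namespace, String identity, String resource)
  {
    NamespacedAuthorizer authorizer = authorizersByNamespace.get(namespace);
    // Assumption: fail closed when no Authorizer registered this namespace.
    return authorizer != null && authorizer.isAllowed(identity, resource);
  }
}
```

Keeping the lookup keyed on the namespace string lets an Authorizer rely on assumptions about the identity format of the one Authenticator it is paired with.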

pjain1 · 2017-05-13T16:24:34Z

@jon-wei I just read the proposal and it sounds great. I am not sure if I missed it but I do not find much information about the Basic HTTP Authentication mechanism that would be built-in, can you provide some details on that ? A side note that the RBAC system that we use with Druid was open sourced this week at www.athenz.io in case you are interested to have a look.

nishantmonu51 · 2017-05-15T06:37:30Z

Haven't looked at the code changes yet, but I went through the proposal and it seems good.
A few comments:

AuthorizationManager interface - usesBuiltInTables looks like exposing details for one of the implementations, is it possible to remove this ?
Row/Column based authorization - the complete feature implementation may be out of scope for this, but we should think about how we are proposing to achieve that going forward, since we are introducing the new authorization interface to be able to support that.
User/Role/Permission Management - Would be nice if this could also be pluggable, where one default implementation is to store the users/roles in the Druid metadata store, which can later be extended to manage these in other projects like Apache Ranger. Hive has added these methods to the Authorizer interface itself -
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/HiveAuthorizer.java
We can choose to do the same or add them in a separate interface which custom authorization providers can opt-in to implement.

himanshug · 2017-05-15T19:45:07Z

like the proposal and I agree with @nishantmonu51 on not having usesBuiltInTable() method or any other hint of the implementation if possible and do the default implementation in a core extension, that will also serve as a good reference to others if they want to integrate druid with their own authentication and authorization infrastructure.

i loved the idea of enforcing the check that the request passed through authorization and didn't silently succeed.

jon-wei · 2017-05-16T04:47:19Z

Thanks for the feedback so far!

@pjain1 I'll add more documentation for that part

@nishantmonu51 I'll add some thoughts to the "Future Work" section on row/column level authorization.

Regarding point 3, I think that's reasonable to support for cases where the external security store's model aligns closely enough with the core security model, that would allow such similar systems to reuse the coordinator API endpoints

@himanshug Moving this into a core extension sounds good. I'll need to take another look at the MySQL/Postgres support and see if there are any complications from that.

I'll think about some ways to get rid of useBuiltInTables().

gianm · 2017-05-16T05:28:08Z

@nishantmonu51

Row/Column based authorization - the complete feature implementation may be out of scope for this, but we should think about how we are proposing to achieve that going forward, since we are introducing the new authorization interface to be able to support that.

Maybe we could do this through views. Like, create a view with a particular set of columns and a particular filter, and then grant access to that view but not to the base dataSource.

jon-wei · 2017-05-17T00:17:33Z

@pjain1 I updated the proposal with some more details on the Basic HTTP auth implementation.

@nishantmonu51 I added a section to Future Work regarding row/column-level authorization. I'm leaning towards supporting that through a "View" system.

Regarding extensibility of the RBAC model, after thinking more on this, my current stance is that someone wishing to plug in their own RBAC system should implement their own AuthorizationManager vs. providing additional extension points within the built-in AuthorizationManager.

The data model used by other systems would generally be more sophisticated than the basic RBAC model proposed here (for example, the Athenz system linked by @pjain1 has more concepts like domain namespaces and services).

I don't think we could capture every data model variation if the extension point is at the User/Role/Permission level, and it would be cleaner/more useful to not impose such assumptions on an extension implementer.

@himanshug I revised the proposal to mention that the built-in implementations will be contained inside a core extension.

@nishantmonu51 @himanshug re: useBuiltInTables(), it has been eliminated from the proposed interfaces. The coordinator APIs will now be disabled unless the core extension's implementations are in use.

gianm · 2017-05-17T00:29:20Z

The data model used by other systems would generally be more sophisticated than the basic RBAC model proposed here (for example, the Athenz system linked by @pjain1 has more concepts like domain namespaces and services).

I don't think we could capture every data model variation if the extension point is at the User/Role/Permission level, and it would be cleaner/more useful to not impose such assumptions on an extension implementer.

I think I agree with this -- that way we aren't forcing people to use an identical RBAC model to our "standard" model for authorization.

Although, this is apparently the road Hive went down. @nishantmonu51, do you have a sense about how successful that has been? Do implementations usually implement all the HiveAuthorizer methods and do you know if there have been issues mapping that onto systems whose authorization models aren't the same as Hive's?

moumny · 2017-05-18T06:31:18Z

Regarding having a modular RBAC system, an example of that would be Apache Ranger.

gianm

Reviewed changes since the last commit.

gianm · 2017-09-15T17:42:06Z

+ * take care of sending the response.
 */
-public class SystemAuthorizationInfo implements AuthorizationInfo
+public class ForbiddenException extends SecurityException


This should extend RuntimeException directly, rather than SecurityException. This is because SecurityException is an exception type reserved for the JDK (it's "thrown by the security manager to indicate a security violation").

gianm · 2017-09-15T17:42:11Z

+    catch (ForbiddenException e) {
+      // don't do anything for an authorization failure, ForbiddenExceptionMapper will catch this later and
+      // send an error response if this is thrown.
+      Throwables.propagate(e);


The return context.gotError(e) here is ignored, because Throwables.propagate always throws an exception. From docs for Throwables.propagate:

This method always throws an exception. The RuntimeException return type is only for client code to make Java type system happy in case a return value is required by the enclosing method.

Also, see this doc for some common pitfalls of using Throwables.propagate: https://github.com/google/guava/wiki/Why-we-deprecated-Throwables.propagate

In this case, since ForbiddenExceptionMapper is going to do all of the handling of sending an error message, the try/catch isn't necessary at all. So it's better to skip the catch block entirely.

gianm · 2017-09-15T17:54:54Z

+    log.error(errorMsg);
+
+    // Send out an alert so there's a centralized collection point for seeing errors of this nature
+    log.makeAlert(errorMsg);


This needs a couple of tweaks.

There must be an .emit() after makeAlert(...) or else the alert will not go anywhere.

The log.error is not useful since it will just double-log (log.makeAlert(...).emit() will also do an error log).

gianm · 2017-09-15T18:16:43Z

+        loginContext.logout();
+      }
+      catch (LoginException ex) {
+        log.warn(ex.getMessage(), ex);


This is backwards. It should be log.warn(ex, ex.getMessage()).

gianm · 2017-09-15T18:17:03Z

+          loginContext.login();
+        }
+        catch (LoginException le) {
+          log.warn("Failed to login as [{}]", spnegoPrincipal, le);


This is backwards too.

gianm

Noticed some minor doc and annotations issues.

gianm · 2017-09-15T18:46:57Z

-   * @param authorizationInfo authorization info from the request; or null if none is present. This must be non-null
-   *                          if security is enabled, or the request will be considered unauthorized.
+   * @param user              authentication token from the request
+   * @param namespace         authentication namespace of the request


Some of these parameters don't exist, please make sure the javadoc is up to date.

gianm · 2017-09-15T18:47:39Z

  public <T> Sequence<T> runSimple(
      final Query<T> query,
-      @Nullable final AuthorizationInfo authorizationInfo,
+      @Nullable final AuthenticationResult authenticationResult,


It seems like authenticationResult is not actually nullable, so remove the annotation.

gianm · 2017-09-15T18:48:45Z

-  public Access authorize(@Nullable final AuthorizationInfo authorizationInfo)
+   * */
+  public Access authorize(
+      @Nullable final AuthenticationResult authenticationResult


It seems like authenticationResult is not actually nullable, so remove the annotation.

gianm · 2017-09-15T18:48:56Z

-   * @param authorizationInfo authorization info from the request; or null if none is present. This must be non-null
-   *                          if security is enabled, or the request will be considered unauthorized.
+   * @param token authentication token from the request
+   * @param namespace namespace of the authentication token


The params here look out of date.

gianm · 2017-09-15T21:05:33Z

  {
    for (Authenticator authenticator : authenticators) {
-      FilterHolder holder = new FilterHolder(authenticator.getFilter());
+      FilterHolder holder = new FilterHolder(


The AuthenticationWrappingFilter change should involve a small doc change to how the authentication chain works (first one to identify the user is respected, and then others are skipped).

gianm · 2017-09-15T21:09:45Z

+  {
    // Send out an alert so there's a centralized collection point for seeing errors of this nature
-    log.makeAlert(errorMsg);
+    log.makeAlert(errorMsg).emit();


This should be split into something like log.makeAlert(errorMsg).addData("uri", uri).addData("method", method).emit(). The reason is that having a consistent message helps identify these after the alerts have been collected. They're easy to search for and then the "uri" field can be inspected to discover which endpoint is bad. "method" is important too so we can differentiate get/post/delete.

jihoonson · 2017-09-16T00:45:33Z

Looks good to me. Nice work!

gianm

Latest patch LGTM. @jon-wei can you please confirm that you tested this patch on a live cluster too, and then I'll merge it!

jon-wei · 2017-09-16T05:03:22Z

@gianm I've verified this patch on our test cluster

@himanshug @nishantmonu51 @pjain1 @gianm @jihoonson Thanks a lot for the reviews!

leventov · 2017-11-15T18:44:31Z

      }
    }

+    // Since we can't see the request object on the remote side, we can't check whether the remote side actually


@jon-wei Does "remote" in this comment mean the upstream remote (something that sends a request to the router) or the downstream remote (the broker)? Also, in the sentence

If the remote node failed to perform an authorization check, will log that on the remote node.

apparently the same "remote node" is referred, but in both past tense: "failed" and future tense: "will log", like this code (clientRequest.setAttribute(AuthConfig.DRUID_AUTHORIZATION_CHECKED, true);) happens between something else happening on "remote".

I couldn't really understand what is going on here

Also, don't understand what does "we can't see the request object on the remote side" mean.

"remote" in that comment refers to the proxy forwarding target (the brokers)

If the remote node failed to perform an authorization check, will log that on the remote node.

Suppose the router forwards a request to a broker, and due to a bug, the broker does not actually perform any authorization checks for that request.

The request will eventually go through the PreResponseAuthorizationCheckFilter on that broker, which will see that no authorization check was performed, and log an error.

"we can't see the request object on the remote side"

This refers to how the router has no visibility into whether the broker has set the DRUID_AUTHORIZATION_CHECKED attribute on the request object that the broker is handling.

clientRequest.setAttribute(AuthConfig.DRUID_AUTHORIZATION_CHECKED, true);

This is there so that "clientRequest" does not fail the validation checks in PreResponseAuthorizationCheckFilter on the router. The "real" authorization check occurs on the broker that receives a forwarded request.

This refers to how the router has no visibility into whether the broker has set the DRUID_AUTHORIZATION_CHECKED attribute on the request object that the broker is handling.

I don't understand, how this is relevant? If router would magically be able to know, how it would change anything?

This is there so that "clientRequest" does not fail the validation checks in PreResponseAuthorizationCheckFilter on the router. The "real" authorization check occurs on the broker that receives a forwarded request.

Does this mean that on the query, that arrives to the broker, DRUID_AUTHORIZATION_CHECKED is false again, despite this clientRequest.setAttribute(AuthConfig.DRUID_AUTHORIZATION_CHECKED, true); line?

Is it right that this line effectively just suppresses the auth check on the router, because it's done on the broker? If so, why even add PreResponseAuthorizationCheckFilter to the pipeline on the router?

I don't understand, how this is relevant? If router would magically be able to know, how it would change anything?

If the router could magically know, it could conceivably have clientRequest's authorization check status match the authorization check status of the proxy request (not that crucial, it would give you some extra information on the router side that there is a potential authorization bug on the forwarding target)

The main point of the comment is just to indicate that the real authorization check occurs at the forwarding destination.

Does this mean that on the query, that arrives to the broker, DRUID_AUTHORIZATION_CHECKED is false again, despite this clientRequest.setAttribute(AuthConfig.DRUID_AUTHORIZATION_CHECKED, true); line?

Yes, the broker sees the request represented by "proxyRequest", not "clientRequest". Attributes are not transmitted in HTTP requests on the wire anyway, they're only for keeping internal server-side state related to the request.

Is it right that this line effectively just suppresses the auth check on the router, because it's done on the broker? If so, why even add PreResponseAuthorizationCheckFilter to the pipeline on the router?

Yes, it's there to suppress auth checks on the router side for forwarded requests.

There are other endpoints on the router that do need authorization, like /status and /druid/v1/brokers (which is actually missing authorization checks as of this comment, and should be fixed)
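
The point made in this thread, that request attributes are server-side state and do not travel with the forwarded request, can be illustrated with a toy model. Everything here is hypothetical: SketchRequest is a plain object rather than a servlet request, the attribute key is a placeholder rather than the real value of AuthConfig.DRUID_AUTHORIZATION_CHECKED, and forward() only mimics what the router's proxy logic is described as doing.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy request model: headers go over the wire, attributes stay on the local server. */
final class SketchRequest
{
  final Map<String, String> headers = new HashMap<>();
  final Map<String, Object> attributes = new HashMap<>();
}

final class RouterForwardingSketch
{
  // Placeholder key, not necessarily the value of AuthConfig.DRUID_AUTHORIZATION_CHECKED.
  static final String AUTHORIZATION_CHECKED = "authorization-checked";

  /** Builds the request that the broker will see and marks the router-side request as checked. */
  static SketchRequest forward(SketchRequest clientRequest)
  {
    SketchRequest proxyRequest = new SketchRequest();
    // Only wire-level data is copied to the forwarded request.
    proxyRequest.headers.putAll(clientRequest.headers);
    // Suppresses the router's own pre-response validation for forwarded requests.
    clientRequest.attributes.put(AUTHORIZATION_CHECKED, true);
    return proxyRequest;
  }
}
```

In this model the broker receives proxyRequest with no attributes set, so its own authorization check and pre-response validation still have to run.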

leventov · 2018-10-11T17:38:25Z

+        authorizerMapper
+    );

-    if (authConfig.isEnabled()) {


Since this change, the authConfig field in MetadataResource is unused

jon-wei added the Feature label May 13, 2017

nishantmonu51 self-requested a review May 15, 2017 06:37

nishantmonu51 self-assigned this May 15, 2017

jon-wei force-pushed the new_security branch 2 times, most recently from 057814c to badcd34 Compare May 17, 2017 21:51

jon-wei force-pushed the new_security branch 8 times, most recently from 068473e to 1846033 Compare May 19, 2017 20:45

jon-wei force-pushed the new_security branch 7 times, most recently from df533ad to 7d48800 Compare June 7, 2017 01:28

jon-wei force-pushed the new_security branch 2 times, most recently from cc58cfe to 83f4dff Compare July 14, 2017 22:13

jon-wei force-pushed the new_security branch 3 times, most recently from 13f5b40 to 880e6b4 Compare September 15, 2017 08:52

PR comments

eff666c

jon-wei force-pushed the new_security branch from 880e6b4 to eff666c Compare September 15, 2017 09:06

jon-wei added 3 commits September 15, 2017 02:17

Merge remote-tracking branch 'upstream/master' into new_security

8f3ae31

Fix test

baae14c

Fix IT

99a4972

gianm reviewed Sep 15, 2017

More PR comments

e833bef

jon-wei force-pushed the new_security branch from c653a28 to e833bef Compare September 15, 2017 20:53

gianm reviewed Sep 15, 2017

PR comments

0f44de3

jon-wei force-pushed the new_security branch from b27cb1f to 0f44de3 Compare September 15, 2017 23:39

jihoonson approved these changes Sep 16, 2017

gianm approved these changes Sep 16, 2017

SSL fix

9724acc

gianm merged commit c2a0e75 into apache:master Sep 16, 2017

jon-wei mentioned this pull request Sep 17, 2017

Remove check for multiple authorization #4816

Closed

a2l007 mentioned this pull request Sep 17, 2017

Add druid-basic-security extension jon-wei/druid#1

Closed

jon-wei mentioned this pull request Sep 19, 2017

Basic auth extension #4823

Closed

jon-wei mentioned this pull request Sep 28, 2017

Druid 0.11.0 release notes #4876

Closed

jon-wei deleted the new_security branch October 6, 2017 21:40

leventov reviewed Nov 15, 2017

jon-wei mentioned this pull request Nov 17, 2017

Basic auth extension #5099

Merged

clambertus unassigned nishantmonu51 Jul 6, 2018

leventov reviewed Oct 11, 2018
