Enabling datasource level authorization in Druid by pjain1 · Pull Request #2424 · apache/druid

pjain1 · 2016-02-09T16:53:59Z

Fixes #2355. This PR adds the abstractions Druid needs in order to enable authorization, as discussed in #2355.

- Introduce the `AuthorizationInfo` interface; specific implementations are provided by extensions.
- If `druid.auth.enabled` is set to `true`, the `isAuthorized` method of `AuthorizationInfo` is called to perform authorization checks.
- The `AuthorizationInfo` object is created in the servlet filters of the specific extension and passed as a request attribute named `AuthConfig.DRUID_AUTH_TOKEN`.
- Within the scope of this PR, all resources that need to be secured are divided into 3 types: `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, the possible actions are `READ` or `WRITE`.
- Specific ResourceFilters perform the auth checks for all endpoints that correspond to a specific resource type. This avoids duplicating logic and removes the need to inject `HttpServletRequest` into each endpoint. For example:
  - `DatasourceResourceFilter` is used for endpoints where the datasource information follows the "datasources" segment in the request path, such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/`
  - `RulesResourceFilter` is used where the datasource information follows the "rules" segment in the request path, such as `/druid/coordinator/v1/rules/`
  - `TaskResourceFilter` is used for endpoints where the datasource information follows the "task" segment in the request path, such as `/druid/indexer/v1/task`
  - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc.
  - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc.
- For endpoints that return a list of resources, like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc., the list is filtered to return only the resources to which the requesting user has access. In these cases, an `HttpServletRequest` instance needs to be injected into the endpoint method.
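The path-based lookup and the list filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the three-argument `isAuthorized` signature, the `segmentAfter` helper and the `filterReadable` helper are all assumptions made for the example, and only the names `AuthorizationInfo`, `DATASOURCE`, `CONFIG`, `STATE`, `READ` and `WRITE` come from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AuthSketch {
  enum ResourceType { DATASOURCE, CONFIG, STATE }
  enum Action { READ, WRITE }

  // Hypothetical shape of the extension-provided interface; the real signature may differ.
  interface AuthorizationInfo {
    boolean isAuthorized(ResourceType type, String resourceName, Action action);
  }

  // Hypothetical helper: returns the path segment that follows `marker`, or null if there is none.
  static String segmentAfter(String requestPath, String marker) {
    String[] segments = requestPath.split("/");
    for (int i = 0; i < segments.length - 1; i++) {
      if (segments[i].equals(marker)) {
        return segments[i + 1];
      }
    }
    return null;
  }

  // Hypothetical helper: keeps only the datasources the caller is allowed to read.
  static List<String> filterReadable(List<String> dataSources, AuthorizationInfo auth) {
    List<String> visible = new ArrayList<>();
    for (String dataSource : dataSources) {
      if (auth.isAuthorized(ResourceType.DATASOURCE, dataSource, Action.READ)) {
        visible.add(dataSource);
      }
    }
    return visible;
  }

  public static void main(String[] args) {
    // Toy policy standing in for an extension: read access to "wikipedia" only.
    AuthorizationInfo auth =
        (type, name, action) -> name.equals("wikipedia") && action == Action.READ;

    String dataSource = segmentAfter("/druid/coordinator/v1/datasources/wikipedia", "datasources");
    System.out.println(auth.isAuthorized(ResourceType.DATASOURCE, dataSource, Action.READ));
    System.out.println(filterReadable(Arrays.asList("wikipedia", "secret"), auth));
  }
}
```

A filter built this way only needs the request path, which is why endpoints whose resource name sits in the path do not need `HttpServletRequest` injected, while list-returning endpoints do.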

Note:
The JAX-RS specification provides an interface called `SecurityContext`. However, we did not use it and instead provided our own interface, `AuthorizationInfo`, mainly because that gives more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks. If it were used, the mapping of which roles can access which resources would have to be modeled inside Druid, either by convention or by some other means, which is not very flexible because Druid has dynamic resources like datasources.

drcrallen · 2016-02-09T17:27:52Z

There is an interesting mix of standardized and non standardized auth methods here. On one hand the authorization tries to provide a standard resource related framework, but on the other hand it completely relies on the endpoint to do all the auth requesting and logic.

I suspect this is because the resource of interest is contained within the body rather than as part of the http request metadata.

IMHO a "cleaner" solution would be to architect the requests such that any endpoint that touches a sensitive resource must have the appropriate annotations on what resources it touches. Then the auth layer transparently handles the auth based on request metadata, and the duty within the endpoint itself is simply to wire up the authorized resource to the resource actually being used.

Another way to say it is that it would be awesome if by the time the method is called, the auth system has already performed its function, and it is simply the job of the endpoint method to ensure compliance with the auth system's expectations on resource usage.
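As a rough illustration of the annotation idea suggested here, the sketch below declares the required resource type and action on each endpoint method and lets a small dispatcher check them before the method runs. Everything in it is hypothetical: `RequiresAccess`, `Authorizer`, `CoordinatorEndpoints` and `dispatch` are names invented for the example and are not part of Druid or JAX-RS.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class AnnotatedAuthSketch {
  enum ResourceType { DATASOURCE, CONFIG, STATE }
  enum Action { READ, WRITE }

  // Hypothetical annotation declaring what an endpoint touches.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface RequiresAccess {
    ResourceType type();
    Action action();
  }

  // Hypothetical stand-in for whatever the auth extension provides.
  interface Authorizer {
    boolean isAuthorized(ResourceType type, Action action);
  }

  public static class CoordinatorEndpoints {
    @RequiresAccess(type = ResourceType.CONFIG, action = Action.WRITE)
    public String setConfig() {
      return "config updated";
    }

    @RequiresAccess(type = ResourceType.STATE, action = Action.READ)
    public String getLeader() {
      return "leader";
    }
  }

  // The auth layer: reads the annotation and decides before the endpoint body executes.
  static String dispatch(Object endpoint, String methodName, Authorizer authorizer) {
    try {
      Method method = endpoint.getClass().getMethod(methodName);
      RequiresAccess required = method.getAnnotation(RequiresAccess.class);
      if (required != null && !authorizer.isAuthorized(required.type(), required.action())) {
        return "403 Forbidden";
      }
      return (String) method.invoke(endpoint);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Authorizer readOnly = (type, action) -> action == Action.READ;
    CoordinatorEndpoints endpoints = new CoordinatorEndpoints();
    System.out.println(dispatch(endpoints, "getLeader", readOnly));
    System.out.println(dispatch(endpoints, "setConfig", readOnly));
  }
}
```

With this arrangement the endpoint method contains no auth logic at all; by the time it is invoked the check has already passed.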

pjain1 · 2016-02-09T17:52:12Z

Yes, the auth checks are performed inside the endpoints because many of the Druid endpoints do not follow REST conventions and the resource information is inside the body.
So the cleaner approach that you are suggesting is to change the endpoints to follow REST conventions? Is that what you meant when you said the requests must have appropriate annotations?

himanshug · 2016-02-09T17:53:16Z

@drcrallen your suggestion is valid; however, it is something that can be done independently of this PR and would probably require druid client API changes.

himanshug · 2016-02-09T17:57:18Z

can we just call it Access ?

drcrallen · 2016-02-09T18:48:43Z

why does this need to be enum?

@drcrallen I think the set of valid resource types would be fixed by druid-core, because druid-core calls into the action and provides those as arguments, so it makes sense to use an enum to represent resource types.
In any case, since we are calling the API experimental and potentially changeable in the near future, I wouldn't worry too much about it either way.

drcrallen · 2016-02-09T18:52:19Z

@himanshug / @pjain1 would you guys feel comfortable calling auth an experimental feature that may change significantly in the near-ish future?

If that's the case then a stop-gap that solves your main pain points should be ok as long as it's not intrusive in other scenarios (which I don't think this PR is)

drcrallen · 2016-02-09T18:52:43Z

General PR comments:

Headers missing on some of the files
Why are enums needed instead of just strings?

himanshug · 2016-02-09T18:54:23Z

@drcrallen I'm fine calling it experimental.
sounds like high level approach is ok.
@pjain1 can you make necessary add/updates and remove the "Discuss" when done?

drcrallen · 2016-02-09T18:54:24Z

why not just return boolean?

drcrallen · 2016-02-09T18:56:32Z

@pjain1 please comment in the code where you're adding the auth checks that auth is experimental and reference this PR. ((so future developers know why it is there))

pjain1 · 2016-02-09T20:03:14Z

@drcrallen sure, I will add the necessary comments and the headers.
About Enum vs String: I chose enums because they give more control over what values can be passed in and show the developer all the available options. I also see the point that, since AuthorizationInfo will be implemented by extensions anyway, they can pass in whatever they want. Personally I would prefer enum, but I am OK with changing it to String if you have a strong opinion against using enum. BTW, what is your reason for not having enums?

drcrallen · 2016-02-09T20:24:34Z

@pjain1 good points. My reasons for not favoring enums are:

- Enums are not very extensible, and IMHO make more sense when either A) there is an explicit requirement for ordering or B) there is an explicit need to limit the options available
- I'm not sure how well enums play with classloaders. I'm guessing they don't any more so than other classes. Getting an error message like "DATASOURCE cannot be assigned to type ResourceType" (or similar) is pretty irritating. This can occur if the two enum classes were loaded through different classLoaders.

So basically, if you're trying to force any extensions to ONLY use the values presented, and not intending the security set to be extendable, then there can be a good argument for enums.
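The "limit the options available" trade-off can be seen in a small sketch. Here `ResourceType` mirrors the three values named in the PR description, while `LOOKUP` is a made-up value standing in for something an extension might want to add; `isKnownType` is a hypothetical helper written for the example.

```java
final class EnumVsStringSketch {
  enum ResourceType { DATASOURCE, CONFIG, STATE }

  // With an enum, only the values compiled into core are accepted.
  static boolean isKnownType(String name) {
    try {
      ResourceType.valueOf(name);
      return true;
    } catch (IllegalArgumentException e) {
      // Enum.valueOf rejects any name that is not a declared constant.
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isKnownType("DATASOURCE")); // true
    System.out.println(isKnownType("LOOKUP"));     // false: not declared in the enum
  }
}
```

A plain `String` parameter would accept `"LOOKUP"` without any change to core, which is exactly the extensibility that the enum gives up in exchange for a closed set of values.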

drcrallen · 2016-02-09T21:51:20Z

@himanshug I think from #2424 (comment) you're supporting B) from my list above, where you are purposefully limiting the options available. Is that correct?

himanshug · 2016-02-09T22:38:11Z

@drcrallen yes

fjy · 2016-02-27T01:32:13Z

@pjain1 merge conflicts

pjain1 · 2016-02-29T17:06:47Z

@fjy I am still working on it, I will resolve the conflicts when it is reviewable

drcrallen · 2016-02-29T18:34:42Z

after log4j shutter downer module

pjain1 · 2016-03-07T22:24:12Z

@drcrallen @fjy @himanshug the PR is reviewable. Updated the top level comment to reflect the state of the latest changes. @drcrallen I am now using annotation-based resource filtering for performing access checks.

himanshug · 2016-03-14T15:14:17Z

this and the change in getPendingTasks() are same, can you refactor them into a private helper method?

fjy · 2016-03-14T20:30:51Z

@pjain1 can we finish this up? there's merge conflicts

drcrallen · 2016-04-13T21:46:14Z

Which types break using only this call?

Looking at the getNames call on the DataSource impls here, it looks like you should be able to just use getNames

good catch, dataSource.getNames() already returns appropriate list.

pjain1 · 2016-04-21T18:00:20Z

@drcrallen I think I addressed your comments, can you have a look again ?

drcrallen · 2016-04-26T16:20:06Z

@@ -0,0 +1,115 @@
+package io.druid.indexing.overlord.http.security;


drcrallen · 2016-04-26T23:56:34Z

@pjain1 Please check out https://github.com/pjain1/druid/pull/1 and see if that works for you

pjain1 · 2016-04-27T19:42:42Z

@drcrallen fixed UTs. have a look now

drcrallen · 2016-04-28T00:59:30Z

(optional) would this be more appropriate as druid.request.auth or something else a little more descriptive?

Not sure if request should be included in the property name, as the scope is much more than just requests to Druid. It could be called druid.security or something like that, or it can be kept as it is.

keep as is then.

drcrallen · 2016-04-28T01:07:14Z

few minor comments but looking very good overall.

pjain1 · 2016-04-28T15:22:19Z

@drcrallen btw can you please explain what this is for?

I thought I was having trouble with threading, but it might not be right now.

Sometimes if a mocked object is accessed by multiple threads, the threads see an inconsistent rule set if the expectations are set in a different thread than the object is used. Memory barriers make the memory consistent at least to the point you pass the barrier.

You can tell you are hitting this kind of mocking race condition if you get an error like "expected 1 actual 1"
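The visibility problem described here can be illustrated without any mocking library. In the sketch below, `FakeMock` is a hypothetical stand-in for a mock whose expectations live in a plain, non-volatile field; one thread records the expectation and another reads it. The `CountDownLatch` is just one way to obtain the memory barrier (a happens-before edge) and is not necessarily what the test in this PR uses.

```java
import java.util.concurrent.CountDownLatch;

final class BarrierSketch {
  // Stand-in for a mock's recorded expectations; deliberately not volatile.
  static final class FakeMock {
    private int expectedCalls;

    void expectCalls(int n) {
      expectedCalls = n;
    }

    int expectedCalls() {
      return expectedCalls;
    }
  }

  static int readAfterBarrier() {
    FakeMock mock = new FakeMock();
    CountDownLatch ready = new CountDownLatch(1);
    int[] seen = new int[1];

    // Thread that sets the expectation, then publishes it by counting down.
    Thread setup = new Thread(() -> {
      mock.expectCalls(1);
      ready.countDown();
    });

    // Thread that uses the mock; await() guarantees it sees the write made before countDown().
    Thread worker = new Thread(() -> {
      try {
        ready.await();
        seen[0] = mock.expectedCalls();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    worker.start();
    setup.start();
    try {
      setup.join();
      worker.join(); // join() makes the worker's write to seen[0] visible here
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return seen[0];
  }

  public static void main(String[] args) {
    System.out.println(readAfterBarrier());
  }
}
```

Without the latch (or some other synchronization point) the worker would have no guarantee of observing the expectation set by the other thread, which is the kind of inconsistency the comment describes.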

drcrallen · 2016-04-28T15:27:19Z

@pjain1 can you comment in the master comment why you opted for this route instead of going through
https://jersey.java.net/documentation/latest/security.html ?

pjain1 · 2016-04-28T17:21:06Z

@drcrallen updated the master comment with the info you asked for. Please, see if it makes sense.

drcrallen · 2016-04-28T17:28:02Z

@pjain1 yes thanks!

drcrallen · 2016-04-28T17:28:54Z

Please ping me when #2424 (comment) is resolved and I think this should be ready to go

pjain1 · 2016-04-28T21:24:12Z

@drcrallen done

drcrallen · 2016-04-28T21:33:29Z

Cool 👍

pjain1 · 2016-04-28T22:18:57Z

@drcrallen squashed the commits


jihoonson · 2018-11-30T02:05:15Z

+    return action;
+  }
+
+  public abstract boolean isApplicable(String requestPath);


Does anyone know what this method is for? This method is used in only unit tests.

pjain1 added Feature Discuss labels Feb 9, 2016

himanshug reviewed Feb 9, 2016

himanshug added this to the 0.9.1 milestone Feb 9, 2016

drcrallen reviewed Feb 9, 2016

drcrallen reviewed Feb 29, 2016

pjain1 changed the title ~~[Discuss] [WIP] Enabling datasource level authorization in Druid~~ Enabling datasource level authorization in Druid Mar 7, 2016

pjain1 removed the Discuss label Mar 7, 2016

himanshug reviewed Mar 14, 2016

pjain1 closed this Mar 15, 2016

pjain1 reopened this Mar 15, 2016

drcrallen reviewed Apr 13, 2016

drcrallen reviewed Apr 26, 2016

drcrallen reviewed Apr 28, 2016

pjain1 reviewed Apr 28, 2016

fjy merged commit 0d745ee into apache:master Apr 28, 2016

fjy mentioned this pull request May 20, 2016

[WIP] Druid 0.9.1 Release Notes #2999

Closed

gianm mentioned this pull request Jun 1, 2016

MetadataResource: Fix handling of includeDisabled. #3042

Merged

b-slim mentioned this pull request Sep 20, 2016

Show candidate hosts for the given query #2282

Merged

pjain1 deleted the security branch May 6, 2017 16:42

clambertus unassigned drcrallen Jul 6, 2018

jihoonson reviewed Nov 30, 2018


clintropolis mentioned this pull request Nov 30, 2018

remove AbstractResourceFilter.isApplicable because it is not #6691

Merged
