Enabling datasource level authorization in Druid#2424
Enabling datasource level authorization in Druid#2424fjy merged 1 commit intoapache:masterfrom pjain1:security
Conversation
|
There is an interesting mix of standardized and non standardized auth methods here. On one hand the authorization tries to provide a standard resource related framework, but on the other hand it completely relies on the endpoint to do all the auth requesting and logic. I suspect this is because the resource of interest is contained within the body rather than as part of the http request metadata. IMHO a "cleaner" solution would be to architect the requests such that any endpoint that touches a sensitive resource must have the appropriate annotations on what resources it touches. Then the auth layer transparently handles the auth based on request metadata, and the duty within the endpoint itself is simply to wire up the authorized resource to the resource actually being used. Another way to say it is that it would be awesome if by the time the method is called, the auth system has already performed its function, and it is simply the job of the endpoint method to ensure compliance with the auth system's expectations on resource usage. |
|
Yes the auth checks are performed inside endpoints as many of the Druid endpoints does not follow REST conventions and resource information is inside the body. |
|
@drcrallen your suggestion is valid, however something that can be done independently of this PR and would probably require druid client API changes. |
There was a problem hiding this comment.
can we just call it Access ?
There was a problem hiding this comment.
why does this need to be enum?
There was a problem hiding this comment.
@drcrallen i think, the set of valid resource types would be fixed by druid-core because druid-core calls in to the action and provides those as arguments, so it makes sense to use enum to represent resource types.
in any case, since we are calling the api experimental and potentially changeable in near future. I wouldn't worry too much about it either ways.
|
@himanshug / @pjain1 would you guys feel comfortable calling auth an experimental feature that may change significantly in the near-ish future? If that's the case then a stop-gap that solves your main pain points should be ok as long as its not intrusive in other scenarios (which I don't think this PR is) |
|
General PR comments:
|
|
@drcrallen I'm fine calling it experimental. |
There was a problem hiding this comment.
why not just return boolean?
|
@pjain1 please comment in the code where you're adding the auth checks that auth is experimental and reference this PR. ((so future developers know why it is there)) |
|
@drcrallen sure I will put the necessary comments. I will put the headers. |
|
@pjain1 good points. My reasons for not favoring enums are:
So basically, if you're trying to force any extensions to ONLY use the values presented, and not intending the security set to be extendable, then there can be a good argument for enums. |
|
@himanshug I think from #2424 (comment) you're supporting B) from my list above, where you are purposefully limiting the options available. Is that correct? |
|
@drcrallen yes |
|
@pjain1 merge conflicts |
|
@fjy I am still working on it, I will resolve the conflicts when it is reviewable |
There was a problem hiding this comment.
after log4j shutter downer module
|
@drcrallen @fjy @himanshug the PR is reviewable. Updated the top level comment to reflect the state of latest changes. @drcrallen using annotation based resource filtering for performing access checks. |
There was a problem hiding this comment.
this and the change in getPendingTasks() are same, can you refactor them into a private helper method?
|
@pjain1 can we finish this up? there's merge conflicts |
There was a problem hiding this comment.
Which types break using only this call?
There was a problem hiding this comment.
Looking at the getNames call on the DataSource impls here, it looks like you should be able to just use getNames
There was a problem hiding this comment.
good catch, dataSource.getNames() already returns appropriate list.
|
@drcrallen I think I addressed your comments, can you have a look again ? |
| @@ -0,0 +1,115 @@ | |||
| package io.druid.indexing.overlord.http.security; | |||
|
@pjain1 Please check out https://github.com/pjain1/druid/pull/1 and see if that works for you |
|
@drcrallen fixed UTs. have a look now |
There was a problem hiding this comment.
(optional) would this be more appropriate as druid.request.auth or something else a little more descriptive?
There was a problem hiding this comment.
Not sure if request should be included in the property name as the scope is much more than just requests to Druid. It can be called druid.security or something like that or can be kept as it is..
|
few minor comments but looking very good overall. |
There was a problem hiding this comment.
@drcrallen btw can you please explain what it this for ?
There was a problem hiding this comment.
I thought I was having trouble with threading, but it might not be right now.
Sometimes if a mocked object is accessed by multiple threads, the threads see an inconsistent rule set if the expectations are set in a different thread than the object is used. Memory barriers make the memory consistent at least to the point you pass the barrier.
You can tell you are hitting this kind of mocking race condition if you get an error like "expected 1 actual 1"
|
@pjain1 can you comment in the master comment why you opted for this route instead of going through |
|
@drcrallen updated the master comment with the info you asked for. Please, see if it makes sense. |
|
@pjain1 yes thanks! |
|
Please ping me when #2424 (comment) is resolved and I think this should be ready to go |
|
@drcrallen done |
|
Cool 👍 |
|
@drcrallen squashed the commits |
- Introduce `AuthorizationInfo` interface, specific implementations of which would be provided by extensions - If the `druid.auth.enabled` is set to `true` then the `isAuthorized` method of `AuthorizationInfo` will be called to perform authorization checks - `AuthorizationInfo` object will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name as `AuthConfig.DRUID_AUTH_TOKEN` - As per the scope of this PR, all resources that needs to be secured are divided into 3 types - `DATASOURCE`, `CONFIG` and `STATE`. For any type of resource, possible actions are - `READ` or `WRITE` - Specific ResourceFilters are used to perform auth checks for all endpoints that corresponds to a specific resource type. This prevents duplication of logic and need to inject HttpServletRequest inside each endpoint. For example - `DatasourceResourceFilter` is used for endpoints where the datasource information is present after "datasources" segment in the request Path such as `/druid/coordinator/v1/datasources/`, `/druid/coordinator/v1/metadata/datasources/`, `/druid/v2/datasources/` - `RulesResourceFilter` is used where the datasource information is present after "rules" segment in the request Path such as `/druid/coordinator/v1/rules/` - `TaskResourceFilter` is used for endpoints is used where the datasource information is present after "task" segment in the request Path such as `druid/indexer/v1/task` - `ConfigResourceFilter` is used for endpoints like `/druid/coordinator/v1/config`, `/druid/indexer/v1/worker`, `/druid/worker/v1` etc - `StateResourceFilter` is used for endpoints like `/druid/broker/v1/loadstatus`, `/druid/coordinator/v1/leader`, `/druid/coordinator/v1/loadqueue`, `/druid/coordinator/v1/rules` etc - For endpoints where a list of resources is returned like `/druid/coordinator/v1/datasources`, `/druid/indexer/v1/completeTasks` etc. the list is filtered to return only the resources to which the requested user has access. In these cases, `HttpServletRequest` instance needs to be injected in the endpoint method. Note - JAX-RS specification provides an interface called `SecurityContext`. However, we did not use this but provided our own interface `AuthorizationInfo` mainly because it provides more flexibility. For example, `SecurityContext` has a method called `isUserInRole(String role)` which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources. Fixes #2355 with PR #2424
| return action; | ||
| } | ||
|
|
||
| public abstract boolean isApplicable(String requestPath); |
There was a problem hiding this comment.
Does anyone know what this method is for? This method is used in only unit tests.
Fixes #2355. This PR is meant to put necessary abstractions inside Druid for enabling authorization as discussed in #2355
AuthorizationInfointerface, specific implementations of which would be provided by extensionsdruid.auth.enabledis set totruethen theisAuthorizedmethod ofAuthorizationInfowill be called to perform authorization checksAuthorizationInfoobject will be created in the servlet filters of specific extension and will be passed as a request attribute with attribute name asAuthConfig.DRUID_AUTH_TOKENDATASOURCE,CONFIGandSTATE. For any type of resource, possible actions are -READorWRITEDatasourceResourceFilteris used for endpoints where the datasource information is present after "datasources" segment in the request Path such as/druid/coordinator/v1/datasources/,/druid/coordinator/v1/metadata/datasources/,/druid/v2/datasources/RulesResourceFilteris used where the datasource information is present after "rules" segment in the request Path such as/druid/coordinator/v1/rules/TaskResourceFilteris used for endpoints is used where the datasource information is present after "task" segment in the request Path such asdruid/indexer/v1/taskConfigResourceFilteris used for endpoints like/druid/coordinator/v1/config,/druid/indexer/v1/worker,/druid/worker/v1etcStateResourceFilteris used for endpoints like/druid/broker/v1/loadstatus,/druid/coordinator/v1/leader,/druid/coordinator/v1/loadqueue,/druid/coordinator/v1/rulesetc/druid/coordinator/v1/datasources,/druid/indexer/v1/completeTasksetc. the list is filtered to return only the resources to which the requested user has access. In these cases,HttpServletRequestinstance needs to be injected in the endpoint method.Note -
JAX-RS specification provides an interface called
SecurityContext. However, we did not use this but provided our own interfaceAuthorizationInfomainly because it provides more flexibility. For example,SecurityContexthas a method calledisUserInRole(String role)which would be used for auth checks and if used then the mapping of what roles can access what resource needs to be modeled inside Druid either using some convention or some other means which is not very flexible as Druid has dynamic resources like datasources.