(alternative proposal to #11675, thinking out loud about if there is another/better way)
Motivation
Currently Druid authorization is handled by Authorizer implementations checking if a Resource, which is composed of a type, name, and action can be authorized given some authentication context information. For Druid datasources, we have ResourceType.DATASOURCE, and the name is the datasource name. Additionally, #10812 added ResourceType.VIEW to allow SQL view implementations to also be authorized. The system and lookup schemas are not directly authorized at all, though the system schema is filtered by other permissions such as DATASOURCE:READ for tables and STATE:READ for servers.
However, Druid SQL also has additional schemas, sys, lookup, INFORMATION_SCHEMA, and extensions can define any number of schemas. Adding a new ResourceType for each schema doesn't feel like a scalable solution.
Proposed changes
A new ResourceType will be added, SQL_TABLE, which will be used exclusively to authorize SQL queries. By convention, this ResourceType will be expected to authorize fully qualified SQL table names. We already have access to fully qualified table identifiers during SQL validation, we are just discarding the schema information when constructing the Resource.
SqlResourceCollectorShuttle will be modified to collect two alternate sets of Resource to use for authorization instead of the single set it collects today. The first set will be identical to what we currently collect, composed of DATASOURCE and VIEW permissions. The other set will be composed entirely of SQL_TABLE typed permissions. Authorization in SqlLifecycle.validateAndAuthorize will be modified to make two attempts to authorize the query, the first using the set of SQL_TABLE permissions, and if that fails, it will make a 2nd using the set of DATASOURCE/VIEW permissions.
A new option will be added, druid.sql.planner.requireSqlTablePermissions, which will allow operators to require that all SQL queries require SQL_TABLE permissions, but will default to false. This means that all SQL schemas other than druid.* and view.* cannot be forced to be authorized unless requireSqlTablePermissions is set to true, because the fall-back authorization does not produce any Resource to validate for these other schemas. On the other hand, this new permission allows operators to only provide SQL query access if they desire, since SQL_TABLE permissions will not be considered at all for native JSON queries.
Rationale
See #11675
This is a less disruptive version to authorizer extension writers since it just adds a new ResourceType, and has a lower test burden.
This version of the proposal doesn't currently solve:
Adding a new ResourceType for each schema doesn't feel like a scalable solution.
for native JSON queries, as authorization here is unchanged. We could probably consider using SQL_TABLE for native query authorization by inferring the schema by the type of Datasource, but it is currently out of scope of this proposal.
It is also possible that the native query set of resource types might also be much smaller potentially than what is possible in SQL schemas, so maybe there just aren't that many possible different resource types for native queries and we should make a separate ResourceType for each (e.g. LOOKUP for lookups, etc).
Operational impact
This has lower operational impact than #11675, as there are no upgrade concerns, and the only downgrade concern is that if operators configure users with SQL_TABLE permissions, those users will not be able to be authorized in older versions of Druid unless separately granted DATASOURCE and/or VIEW permissions.
Test plan
Testing scope is much more limited than #11675 as authorizer implementations do not need to change at all. SQL_TABLE will need to be well documented that the name of the Resource will be the fully qualified table name, e.g. a Druid datasource named wikipedia will authorize a resource with the name druid.wikipedia, likewise sys.segments will be authorized as exactly that. Testing can then focus primarily on unit and functional tests to simply prove the functionality of the two operation modes (druid.sql.planner.requireSqlTablePermissions set to true and false)
(alternative proposal to #11675, thinking out loud about if there is another/better way)
Motivation
Currently Druid authorization is handled by
Authorizerimplementations checking if aResource, which is composed of a type, name, and action can be authorized given some authentication context information. For Druid datasources, we haveResourceType.DATASOURCE, and the name is the datasource name. Additionally, #10812 addedResourceType.VIEWto allow SQL view implementations to also be authorized. The system and lookup schemas are not directly authorized at all, though the system schema is filtered by other permissions such asDATASOURCE:READfor tables andSTATE:READfor servers.However, Druid SQL also has additional schemas, sys, lookup, INFORMATION_SCHEMA, and extensions can define any number of schemas. Adding a new
ResourceTypefor each schema doesn't feel like a scalable solution.Proposed changes
A new
ResourceTypewill be added,SQL_TABLE, which will be used exclusively to authorize SQL queries. By convention, this ResourceType will be expected to authorize fully qualified SQL table names. We already have access to fully qualified table identifiers during SQL validation, we are just discarding the schema information when constructing theResource.SqlResourceCollectorShuttlewill be modified to collect two alternate sets ofResourceto use for authorization instead of the single set it collects today. The first set will be identical to what we currently collect, composed ofDATASOURCEandVIEWpermissions. The other set will be composed entirely ofSQL_TABLEtyped permissions. Authorization inSqlLifecycle.validateAndAuthorizewill be modified to make two attempts to authorize the query, the first using the set ofSQL_TABLEpermissions, and if that fails, it will make a 2nd using the set ofDATASOURCE/VIEWpermissions.A new option will be added,
druid.sql.planner.requireSqlTablePermissions, which will allow operators to require that all SQL queries requireSQL_TABLEpermissions, but will default to false. This means that all SQL schemas other thandruid.*andview.*cannot be forced to be authorized unlessrequireSqlTablePermissionsis set to true, because the fall-back authorization does not produce anyResourceto validate for these other schemas. On the other hand, this new permission allows operators to only provide SQL query access if they desire, sinceSQL_TABLEpermissions will not be considered at all for native JSON queries.Rationale
See #11675
This is a less disruptive version to authorizer extension writers since it just adds a new
ResourceType, and has a lower test burden.This version of the proposal doesn't currently solve:
for native JSON queries, as authorization here is unchanged. We could probably consider using
SQL_TABLEfor native query authorization by inferring the schema by the type ofDatasource, but it is currently out of scope of this proposal.It is also possible that the native query set of resource types might also be much smaller potentially than what is possible in SQL schemas, so maybe there just aren't that many possible different resource types for native queries and we should make a separate
ResourceTypefor each (e.g.LOOKUPfor lookups, etc).Operational impact
This has lower operational impact than #11675, as there are no upgrade concerns, and the only downgrade concern is that if operators configure users with
SQL_TABLEpermissions, those users will not be able to be authorized in older versions of Druid unless separately grantedDATASOURCEand/orVIEWpermissions.Test plan
Testing scope is much more limited than #11675 as authorizer implementations do not need to change at all.
SQL_TABLEwill need to be well documented that thenameof theResourcewill be the fully qualified table name, e.g. a Druid datasource namedwikipediawill authorize a resource with the namedruid.wikipedia, likewisesys.segmentswill be authorized as exactly that. Testing can then focus primarily on unit and functional tests to simply prove the functionality of the two operation modes (druid.sql.planner.requireSqlTablePermissionsset to true and false)