Motivation
Currently, Druid authorization is handled by Authorizer implementations, which check whether a Resource (composed of a type, a name, and an action) can be authorized given some authentication context information. For Druid datasources we have ResourceType.DATASOURCE, and the name is the datasource name. Additionally, #10812 added ResourceType.VIEW so that SQL view implementations can also be authorized.
However, Druid SQL has additional schemas (sys, lookup, and INFORMATION_SCHEMA), and extensions can define any number of further schemas. Adding a new ResourceType for each schema is not a scalable solution.
Proposed changes
Resource will be extended with a nullable schema string property to capture the schema that a name belongs to. For SQL query authorization, we already have access to the schema, because table identifiers are fully qualified during SQL validation; we currently just discard the schema information when constructing the Resource. For native JSON query authorization, every non-druid schema can be inferred from the type of datasource (e.g. lookups) and set accordingly. Query paths will be lightly modified to include datasource schema information for all schemas when authorizing resources, instead of only the druid and view schemas that are currently checked. The possible exception is INFORMATION_SCHEMA, which I think should remain always accessible, with its contents filtered according to authorization on the other schemas and tables it describes, because of its special role in telling users how to construct queries.
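As a rough illustration of the shape described above, here is a minimal sketch of a resource carrying a nullable schema. It is not Druid's actual Resource class: the class name SchemaAwareResource, the forNativeDataSource helper, and the mapping from a "lookup" datasource type to the lookup schema (with everything else falling back to druid) are all assumptions made for illustration, and the action is omitted for brevity.

```java
import java.util.Objects;

// Hypothetical sketch only; this is not Druid's actual Resource class.
final class SchemaAwareResource {
    private final String name;
    private final String type;   // e.g. "DATASOURCE"
    private final String schema; // nullable; null means the caller supplied no schema

    SchemaAwareResource(String name, String type, String schema) {
        this.name = Objects.requireNonNull(name, "name");
        this.type = Objects.requireNonNull(type, "type");
        this.schema = schema;
    }

    // Mirrors a schema-less constructor so that older callers keep working.
    SchemaAwareResource(String name, String type) {
        this(name, type, null);
    }

    // Assumed mapping for native JSON queries: infer the schema from the
    // datasource type. Only "lookup" is special-cased in this sketch.
    static SchemaAwareResource forNativeDataSource(String dataSourceType, String name) {
        String schema = "lookup".equals(dataSourceType) ? "lookup" : "druid";
        return new SchemaAwareResource(name, "DATASOURCE", schema);
    }

    String getName() { return name; }
    String getType() { return type; }
    String getSchema() { return schema; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SchemaAwareResource)) return false;
        SchemaAwareResource that = (SchemaAwareResource) o;
        return name.equals(that.name)
            && type.equals(that.type)
            && Objects.equals(schema, that.schema);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, type, schema);
    }
}
```

Keeping the schema nullable means a resource built by older code is distinguishable from one explicitly placed in the druid schema, which leaves the interpretation of null to each authorizer.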
Authorizer implementations will need to be updated to consider this parameter. Druid only supplies the Resource to validate, so it is up to each specific Authorizer implementation to act on this additional information. In addition, a new global configuration parameter, druid.auth.enableResourceSchemaAuthroization, will be added to make authorization of the resources that are newly authorized as DATASOURCE resources, such as system tables and lookups, opt-in. I think this should default to true, but it can be disabled on existing clusters whose role permissions do not easily map to the resources users were allowed to view prior to this change, until the permissions can be migrated to include the newly authorized resources.
For druid-basic-security, this means BasicAuthorizerPermission will be updated to allow specifying a schema matching pattern. A new configuration parameter, druid.auth.authorizer.{authorizerName}.defaultResourceSchemaPattern, will set the default schema pattern for permissions that are defined with a null schema (which will include all existing permissions); it should default to druid to match the existing behavior.
I am less familiar with druid-ranger-security, but I think it would make the most sense to add an additional property to the request carrying the schema value of the Resource, along with an extension configuration that controls whether this schema is included on the request.
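The default-pattern behavior for druid-basic-security could be sketched roughly as follows. This is not the real BasicAuthorizerPermission: the class SchemaAwarePermission and its matches method are hypothetical, and treating a resource with a null schema as belonging to druid is an assumption of this sketch rather than something the proposal specifies.

```java
import java.util.regex.Pattern;

// Hypothetical sketch only; this is not the real BasicAuthorizerPermission.
final class SchemaAwarePermission {
    private final Pattern namePattern;
    private final Pattern schemaPattern; // null for permissions created before the change

    SchemaAwarePermission(String nameRegex, String schemaRegex) {
        this.namePattern = Pattern.compile(nameRegex);
        this.schemaPattern = schemaRegex == null ? null : Pattern.compile(schemaRegex);
    }

    // defaultSchemaRegex stands in for the value of
    // druid.auth.authorizer.{authorizerName}.defaultResourceSchemaPattern.
    boolean matches(String resourceName, String resourceSchema, String defaultSchemaRegex) {
        Pattern effectiveSchema =
            schemaPattern != null ? schemaPattern : Pattern.compile(defaultSchemaRegex);
        // Assumption: a resource without a schema is treated as part of "druid".
        String schema = resourceSchema != null ? resourceSchema : "druid";
        return effectiveSchema.matcher(schema).matches()
            && namePattern.matcher(resourceName).matches();
    }
}
```

With a default of druid, an existing permission whose name pattern is `.*` keeps matching every datasource in the druid schema but does not silently start matching sys or lookup tables; access to those requires a permission with an explicit schema pattern such as `sys|lookup`.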
As far as I can tell, this does not conflict with or contradict #9380, though a separate LOOKUP resource type may no longer be necessary. This modification could also allow the VIEW resource type to be deprecated.
Rationale
I considered whether the schema could simply be captured in the name property of the Resource. However, that does not offer a good upgrade/downgrade path, because authorizer implementations such as druid-basic-security use pattern matching against the name. Having a separate schema property avoids encoding this information in the name in a backwards-incompatible way, and allows authorizer extensions to handle the schema property unambiguously and in whatever way is most appropriate for them.
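To make the upgrade concern concrete, the following small sketch shows what could happen if the schema were folded into the name. The "schema.name" encoding and the example patterns are assumptions chosen for illustration; the only library call used is java.util.regex.Pattern.matches.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of why encoding the schema into the name is ambiguous.
final class NameEncodingAmbiguity {
    // Same style of check an authorizer might apply to a resource name.
    static boolean nameMatches(String permissionRegex, String resourceName) {
        return Pattern.matches(permissionRegex, resourceName);
    }

    public static void main(String[] args) {
        // A permission written for datasources whose names start with "sys".
        String existingPermission = "sys.*";

        // Intended match: a regular datasource in the druid schema.
        System.out.println(nameMatches(existingPermission, "sys_metrics"));

        // Unintended match if system tables were encoded as "sys.servers".
        System.out.println(nameMatches(existingPermission, "sys.servers"));

        // A catch-all permission would also start covering every other schema.
        System.out.println(nameMatches(".*", "lookup.countries"));
    }
}
```

All three checks succeed, so an existing permission would quietly widen after an upgrade; a separate schema property keeps the name pattern's meaning unchanged.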
Operational impact
Cluster operators will gain additional means to secure datasources, including queries on lookup tables and system tables. Operators will, however, need to take their authorizer implementation into account and plan upgrades accordingly.
Test plan (optional)
Testing should focus on upgrade and downgrade paths, to ensure that authorization functions correctly under configurations of both the new and the existing behavior.