Redesign QueryContext class #13071
Merged
abhishekagarwal87 merged 10 commits into apache:master on Oct 15, 2022
Query context revision which builds on the work started in PR #13022.
Release Notes
Most of the work in this PR is transparent to users. This PR includes Issue #13120, which introduces a user-visible change. We introduce two new configuration keys that refine the query context security model controlled by druid.auth.authorizeQueryContextParams. When that value is set to true, two other configuration options become available:

- druid.auth.unsecuredContextKeys: The set of query context keys that do not require a security check. Use this for the "white-list" of keys to allow. All other keys go through the existing context key security checks.
- druid.auth.securedContextKeys: The set of query context keys that do require a security check. Use this when you want to allow all but a specific set of keys: only these keys go through the existing context key security checks.

Both are set using JSON list format.

You generally set one value or the other. If both are set, unsecuredContextKeys acts as a set of exceptions to securedContextKeys.

In addition, Druid defines two query context keys which always bypass checks because Druid uses them internally:

- sqlQueryId
- sqlStringifyArrays

Backward Compatibility
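To make the rules from the release notes concrete, here is a minimal sketch of how the set of keys that need a security check could be selected. It is an illustration only, not Druid's AuthConfig code: the class name ContextKeySecuritySketch, the method keysToAuthorize, and the treatment of a null or empty set as "not configured" are all assumptions.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the key-selection rules; not Druid's AuthConfig. */
final class ContextKeySecuritySketch
{
  // The two keys the description says always bypass checks.
  static final Set<String> ALWAYS_ALLOWED = Set.of("sqlQueryId", "sqlStringifyArrays");

  private final boolean authorizeQueryContextParams;
  private final Set<String> unsecuredContextKeys;
  private final Set<String> securedContextKeys;

  ContextKeySecuritySketch(boolean authorize, Set<String> unsecured, Set<String> secured)
  {
    this.authorizeQueryContextParams = authorize;
    // Assumption: null means the property was not set.
    this.unsecuredContextKeys = unsecured == null ? Set.of() : Set.copyOf(unsecured);
    this.securedContextKeys = secured == null ? Set.of() : Set.copyOf(secured);
  }

  /** Returns the subset of the given context keys that still need a security check. */
  Set<String> keysToAuthorize(Set<String> contextKeys)
  {
    if (!authorizeQueryContextParams) {
      // Context key security is disabled: nothing to check.
      return Collections.emptySet();
    }
    Set<String> toCheck = new HashSet<>(contextKeys);
    toCheck.removeAll(ALWAYS_ALLOWED);
    // Unsecured keys are exceptions, even when they also appear in the secured set.
    toCheck.removeAll(unsecuredContextKeys);
    if (!securedContextKeys.isEmpty()) {
      // Assumption: a non-empty secured set means only those keys are checked.
      toCheck.retainAll(securedContextKeys);
    }
    return toCheck;
  }
}
```

With neither new property set, every key other than the two internal ones is still checked, which matches the statement that behavior does not change on upgrade.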
When upgrading Druid, if query context security is disabled (druid.auth.authorizeQueryContextParams=false, the default), you will see no change.

If query context security is enabled (druid.auth.authorizeQueryContextParams=true), the two keys listed above will always be allowed, regardless of any security rules that may have been set. You can remove these two keys from your rules if you have added them.

The two new config values (unsecuredContextKeys and securedContextKeys) will not be set in your configuration (or in the default configurations), and thus behavior after an upgrade will not change. You can, however, shift context key security rules you have set for all users into the new unsecuredContextKeys config property.

Background
Recent work uncovered two issues with the way Druid handles query contexts:
- Avoid ClassCastException when getting values from QueryContext #13022 fixes type-unsafe value conversions

During the investigation of the above, it became clear that the semantics around the query context are a bit unclear and could be improved.

In particular, the current QueryContext class promises more than it can deliver. It groups context values into three groups: default, system values and user values. Splitting up the values was done to allow authorizing just the user-provided keys. However, this didn't quite work as intended for several reasons. First, most code does not recognize this format, and instead "merges" all three maps before making additional changes. Second, the only use of the user-provided values is to authorize them, something that can be done more simply via other means. Third, the concept of "user-set" keys is incorrect: keys set may come from the user, or may come from an application or the Router.

As a result, the QueryContext class creates confusion rather than tidying things up as was intended. The tidying up requires that we pull in Issue #13120 to resolve a context key security ambiguity.

Overview of Revisions
This PR achieves the desired goals in a simpler way, while unifying the many ways that the code currently works with the query context.
- QueryContext is replaced by a "facade" on top of a single map.
- QueryContext was created to allow authorizing user context keys. This PR solves that issue by making a copy of the keys separate from the context, allowing the SQL engine to modify the context without creating conflicts with the security feature.
- [...] QueryContext to fetch values are replaced with a reference to the new QueryContext class as a facade.
- [...] QueryContext class since there is no real value in creating a layer on top of the map.
- Avoid ClassCastException when getting values from QueryContext #13022 adds a number of type-safe methods to the Query interface. However, doing so clutters up that interface, and we still need the QueryContexts methods to get at specific context values. See the discussion in Avoid ClassCastException when getting values from QueryContext #13022 for details. This PR, by contrast, puts all context access methods on the revised QueryContext facade, making for cleaner code.
- [...] Avoid ClassCastException when getting values from QueryContext #13022. They are replaced again here by the new QueryContext type-safe methods.
- [...] QueryContexts which take a Query or context map parameter are replaced by methods on the revised QueryContext class.
- QueryContexts methods which rewrote a query moved to the Queries class.
- [...] BadQueryContextException with a clear explanation of the problem. Previously, we threw both BadQueryContextException and ISE for such errors, and the errors had variations in format and wording.
- get methods are standardized.
  - get<Type>() without a default returns a boxed value, or null if the value is not set. This allows code to implement logic of the form "if the value is set, do something with the value, else do something else."
  - get(defaultValue) methods return an unboxed value, since they will return the default if the value is unset.
- DruidPlanner is refactored a bit to better handle context key security.
- AuthConfig implements the enhanced context key security model.

Specifics
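As a concrete reference for the getter conventions listed in the overview, the following sketch shows one way the two forms could behave. It is not the code from this PR: ContextGetterSketch and BadContextValueException are hypothetical stand-ins (the latter for BadQueryContextException), and the error message wording is an assumption.

```java
import java.util.Map;

/** Hypothetical stand-in for BadQueryContextException. */
final class BadContextValueException extends RuntimeException
{
  BadContextValueException(String key, Object value, String expectedType)
  {
    super("Expected key [" + key + "] to be in " + expectedType
          + " format, but got [" + value + "]");
  }
}

/** Hypothetical sketch of the standardized getters over a single context map. */
final class ContextGetterSketch
{
  private final Map<String, Object> context;

  ContextGetterSketch(Map<String, Object> context)
  {
    this.context = context;
  }

  /** No default: returns a boxed value, or null when the key is not set. */
  Long getLong(String key)
  {
    Object value = context.get(key);
    if (value == null) {
      return null;
    }
    if (value instanceof Number) {
      return ((Number) value).longValue();
    }
    if (value instanceof String) {
      try {
        return Long.parseLong((String) value);
      }
      catch (NumberFormatException e) {
        // One exception type, one message format, for every conversion failure.
        throw new BadContextValueException(key, value, "Long");
      }
    }
    throw new BadContextValueException(key, value, "Long");
  }

  /** With a default: returns an unboxed value, since the result is never null. */
  long getLong(String key, long defaultValue)
  {
    Long value = getLong(key);
    return value == null ? defaultValue : value;
  }
}
```

A caller that needs "if set, do X, else do Y" logic tests the boxed form for null; a caller that only needs a value uses the form with a default.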
The original QueryContext class was first removed: all references to that class were modified to refer to the context map instead.

A new class of the same name was created. The only creator of the new class is Query.queryContext(). The new class holds onto the context map associated with the query, and provides the many type-safe and value-specific methods. Since QueryContext is simply a facade, it is cheap to create an instance as needed. This lets us convert code that passes the query to the static QueryContexts methods into code that calls the equivalent methods on the facade.

When a bit of code makes multiple references to context values, we can also obtain the facade once and reuse it for each reference.
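A minimal sketch of this conversion, under stated assumptions: QuerySketch, ContextFacadeSketch, StaticHelpersSketch and CacheDecisionSketch are hypothetical stand-ins for Query, the new QueryContext, QueryContexts and a caller. The useCache and populateCache keys, with defaults of true, are used only for illustration.

```java
import java.util.Map;

/** Hypothetical facade: holds a reference to the query's single context map. */
final class ContextFacadeSketch
{
  private final Map<String, Object> context;

  ContextFacadeSketch(Map<String, Object> context)
  {
    this.context = context;
  }

  boolean getBoolean(String key, boolean defaultValue)
  {
    Object value = context.get(key);
    if (value == null) {
      return defaultValue;
    }
    if (value instanceof Boolean) {
      return (Boolean) value;
    }
    if (value instanceof String) {
      return Boolean.parseBoolean((String) value);
    }
    throw new IllegalArgumentException("Bad boolean value for key [" + key + "]: " + value);
  }

  // Value-specific methods live on the facade rather than on the query.
  boolean isUseCache()
  {
    return getBoolean("useCache", true);
  }

  boolean isPopulateCache()
  {
    return getBoolean("populateCache", true);
  }
}

/** Hypothetical stand-in for a native query. */
final class QuerySketch
{
  private final Map<String, Object> context;

  QuerySketch(Map<String, Object> context)
  {
    this.context = context;
  }

  // Cheap to call: wraps the existing map without copying it.
  ContextFacadeSketch queryContext()
  {
    return new ContextFacadeSketch(context);
  }
}

/** Old style: static helpers that take the query as a parameter. */
final class StaticHelpersSketch
{
  static boolean isUseCache(QuerySketch query)
  {
    return query.queryContext().isUseCache();
  }

  static boolean isPopulateCache(QuerySketch query)
  {
    return query.queryContext().isPopulateCache();
  }
}

final class CacheDecisionSketch
{
  // Before: each reference passes the query to a static helper.
  static boolean oldStyle(QuerySketch query)
  {
    return StaticHelpersSketch.isUseCache(query) && StaticHelpersSketch.isPopulateCache(query);
  }

  // After: fetch the facade once and reuse it for each reference.
  static boolean newStyle(QuerySketch query)
  {
    ContextFacadeSketch queryContext = query.queryContext();
    return queryContext.isUseCache() && queryContext.isPopulateCache();
  }
}
```

Because the facade only wraps the map, creating it per call or holding it in a local variable gives the same result; the local variable simply reads better when several values are needed.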
The following methods in Query are marked deprecated:

[...]

Although the above methods are deprecated, they are reimplemented as default methods. As a result, the methods of the same names are removed from subclasses since they are redundant.

The Query.getQueryContext() method is removed as a way of marking that the old QueryContext is no longer available. The old method didn't really even work: the different context types were lost when merging into a single map. A new queryContext() method returns the facade version.

Alternatives
The discussions in #13049 and #13022 suggested other alternatives:
- [...] QueryContexts methods as shown above. This works fine, and is one of the steps taken in Fix QueryContext race condition #13049 that led to this PR, but is a bit cumbersome.
- [...] Query method as in an early draft of Avoid ClassCastException when getting values from QueryContext #13022. Doing so clutters the Query interface, and does not help in the SQL layer where we work with contexts before the native Query is created. Unless we really want to clutter the Query interface, we'd still need the value-specific methods in QueryContexts.

Conclusion
The result is a solution that:
- [...] QueryContext
- [...] QueryContext by avoiding multiple maps.
- [...] Avoid ClassCastException when getting values from QueryContext #13022 by shifting the methods to QueryContext.