
Redesign QueryContext class #13071

Merged
abhishekagarwal87 merged 10 commits into apache:master from paul-rogers:220911-context
Oct 15, 2022

@paul-rogers paul-rogers commented Sep 11, 2022

Query context revision which builds on the work started in PR #13022.

Release Notes

Most of the work in this PR is transparent to users. This PR includes Issue #13120 which introduces a user-visible change. We introduce two new configuration keys that refine the query context security model controlled by druid.auth.authorizeQueryContextParams. When that value is set to true then two other configuration options become available:

  • druid.auth.unsecuredContextKeys: The set of query context keys that do not require a security check. Use this for the "white-list" of keys to allow. All other keys go through the existing context key security checks.
  • druid.auth.securedContextKeys: The set of query context keys that do require a security check. Use this when you want to allow all but a specific set of keys: only these keys go through the existing context key security checks.

Both are set using JSON list format:

druid.auth.securedContextKeys=["secretKey1", "secretKey2"]

You generally set one value or the other. If both are set, unsecuredContextKeys acts as a list of exceptions to securedContextKeys.

In addition, Druid defines two query context keys which always bypass checks because Druid uses them internally:

  • sqlQueryId
  • sqlStringifyArrays
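The rules above can be sketched as a small filter that decides which keys of a request still need a security check. This is a minimal illustration based only on the description in this PR, not the actual AuthConfig code: the class name ContextKeyPolicySketch, the method keysToAuthorize, and the use of null to mean "not set" are all assumptions made for the example.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the context key security rules described above.
// It is not Druid's AuthConfig; names and structure are assumptions.
final class ContextKeyPolicySketch
{
  // Keys that, per the description above, always bypass checks.
  static final Set<String> ALWAYS_ALLOWED = Set.of("sqlQueryId", "sqlStringifyArrays");

  private final boolean authorizeQueryContextParams;
  private final Set<String> unsecuredContextKeys; // null means "not set"
  private final Set<String> securedContextKeys;   // null means "not set"

  ContextKeyPolicySketch(boolean authorize, Set<String> unsecured, Set<String> secured)
  {
    this.authorizeQueryContextParams = authorize;
    this.unsecuredContextKeys = unsecured;
    this.securedContextKeys = secured;
  }

  // Returns the subset of request keys that must go through the security check.
  Set<String> keysToAuthorize(Set<String> requestKeys)
  {
    Set<String> result = new TreeSet<>();
    if (!authorizeQueryContextParams) {
      return result; // context key security disabled: nothing to check
    }
    for (String key : requestKeys) {
      if (ALWAYS_ALLOWED.contains(key)) {
        continue; // internal keys are never checked
      }
      if (unsecuredContextKeys != null && unsecuredContextKeys.contains(key)) {
        continue; // explicitly allowed; also acts as an exception to the secured list
      }
      if (securedContextKeys != null && !securedContextKeys.contains(key)) {
        continue; // a secured list is set and this key is not on it
      }
      result.add(key);
    }
    return result;
  }
}
```

With both lists set, for example unsecured = ["a"] and secured = ["a", "b"], a request carrying a, b, c and sqlQueryId would need a check for b only under this sketch.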

Backward Compatibility

When upgrading Druid, if query context security is disabled (druid.auth.authorizeQueryContextParams=false, the default) then you will see no change.

If query context security is enabled (druid.auth.authorizeQueryContextParams=true), then the two keys listed above will always be allowed, regardless of any security rules that may have been set. You can remove these two keys from your rules if you have added them.

The two new config values (unsecuredContextKeys and securedContextKeys) will not be set in your configuration (or in the default configurations), and thus behavior after an upgrade will not change. You can, however, shift context key security rules you have set for all users into the new unsecuredContextKeys config property.

Background

Recent work uncovered two issues with the way Druid handles query contexts:

During the investigation of the above it became clear that the semantics around the query context are a bit unclear and could be improved.

In particular, the current QueryContext class promises more than it can deliver. It groups context values into three groups: default, system values and user values. Splitting up the values was done to allow authorizing just the user-provided keys. However, this didn't quite work as intended for several reasons. First, most code does not recognize this format, and instead "merges" all three maps before making additional changes. Second, the only use of the user-provided values is to authorize them, something that can be done more simply via other means. Third, the concept of "user-set" keys is incorrect: keys set may come from the user, or may come from an application or the Router.

As a result, the QueryContext class creates confusion rather than tidying things up as was intended. The tidying up requires that we pull in Issue #13120 to resolve a context key security ambiguity.

Overview of Revisions

This PR achieves the desired goals in a simpler way, while unifying the many ways that the code currently works with the query context.

  • The original three-map QueryContext is replaced by a "facade" on top of a single map.
  • The original QueryContext was created to allow authorizing user context keys. This PR solves that issue by making a copy of the keys separate from the context, allowing the SQL engine to modify the context without creating conflicts with the security feature.
  • Uses of QueryContext to fetch values are replaced with a reference to the new QueryContext class as a facade.
  • Modification to the context in the planning layer occurs directly on the context map; it is no longer done via the old QueryContext class since there is no real value in creating a layer on top of the map.
  • #13022 (Avoid ClassCastException when getting values from QueryContext) adds a number of type-safe methods to the Query interface. However, doing so clutters up that interface, and we still need the QueryContexts methods to get at specific context values. See the discussion in #13022 for details. This PR, by contrast, puts all context access methods on the revised QueryContext facade, making for cleaner code.
  • All type-unsafe context value accesses were replaced in Avoid ClassCastException when getting values from QueryContext #13022. They are replaced again here by the new QueryContext type-safe methods.
  • The old type-unsafe methods are marked deprecated. No code in Druid-proper references them, though we must assume that extensions may continue to use them.
  • All specific methods in QueryContexts which take a Query or context map parameter are replaced by methods on the revised QueryContext class.
  • QueryContexts methods which rewrote a query moved to the Queries class.
  • Validation of a context value now reliably throws a BadQueryContextException with a clear explanation of the problem. Previously, we threw both BadQueryContextException and ISE for such errors, and the error messages varied in format and wording.
  • get methods are standardized. get<Type>() without a default returns a boxed value, or null if the value is not set. This allows code to implement logic of the form "if the value is set, do something with the value, else do something else."
  • get(defaultValue) methods return an unboxed value, since they will return the default if the value is unset.
  • DruidPlanner is refactored a bit to better handle context key security.
  • AuthConfig implements the enhanced context key security model.
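To make the getter conventions in the list above concrete, here is a minimal sketch of a facade over a single map. It is not the real QueryContext class: ContextFacadeSketch and BadContextValueSketch are hypothetical stand-ins (the latter for BadQueryContextException), and the conversion rules for numbers and strings are assumptions chosen for the example.

```java
import java.util.Map;

// Hypothetical stand-in for BadQueryContextException.
class BadContextValueSketch extends RuntimeException
{
  BadContextValueSketch(String key, String expected, Object value)
  {
    super("Expected key [" + key + "] to be " + expected + ", but got [" + value + "]");
  }
}

// Hypothetical facade over one context map; it holds no state of its own,
// so creating an instance on demand is cheap.
final class ContextFacadeSketch
{
  private final Map<String, Object> context;

  ContextFacadeSketch(Map<String, Object> context)
  {
    this.context = context;
  }

  // No default: returns a boxed value, or null if the key is not set.
  Long getLong(String key)
  {
    Object value = context.get(key);
    if (value == null) {
      return null;
    }
    if (value instanceof Number) {
      return ((Number) value).longValue();
    }
    if (value instanceof String) {
      try {
        return Long.parseLong((String) value);
      }
      catch (NumberFormatException e) {
        throw new BadContextValueSketch(key, "a long", value);
      }
    }
    // Any other type is reported with the same exception and message format.
    throw new BadContextValueSketch(key, "a long", value);
  }

  // With a default: returns an unboxed value, falling back to the default when unset.
  long getLong(String key, long defaultValue)
  {
    Long value = getLong(key);
    return value == null ? defaultValue : value;
  }
}
```

The boxed form supports the "if the value is set, do something, else do something else" pattern with a null check, while the defaulted form can be assigned straight to a primitive.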

Specifics

The original QueryContext class was first removed: all references to that class were modified to refer to the context map instead.

A new class of the same name was created. The only creator of the new class is Query.queryContext(). The new class holds onto the context map associated with the query, and provides the many type-safe and value-specific methods. Since QueryContext is simply a facade, it is cheap to create an instance as needed. This lets us convert code of the form:

long foo = QueryContexts.getAsLong(query, FOO_NAME, FOO_DEFAULT);
long bar = QueryContexts.getBar(query);

To

long foo = query.queryContext().getLong(FOO_NAME, FOO_DEFAULT);
long bar = query.queryContext().getBar();

When a bit of code makes multiple references to context values (as above), we can also do:

QueryContext queryContext = query.queryContext();
long foo = queryContext.getLong(FOO_NAME, FOO_DEFAULT);
long bar = queryContext.getBar();

The following methods in Query are marked deprecated:

interface Query... {
  @Deprecated
  <ContextType> ContextType getContextValue(String key);
  @Deprecated
  <ContextType> ContextType getContextValue(String key, ContextType defaultValue);
  @Deprecated
  default boolean getContextBoolean(String key, boolean defaultValue);
}

Although the above methods are deprecated, they are reimplemented as default methods. As a result, the methods of the same names are removed from subclasses since they are redundant.
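As a rough illustration of how deprecated accessors can live on the interface as default methods, so that implementations no longer need their own copies, consider the sketch below. QuerySketch and MapBackedQuerySketch are hypothetical names, and the method bodies are assumptions for the example rather than the code in this PR.

```java
import java.util.Map;

// Hypothetical, cut-down stand-in for the Query interface.
interface QuerySketch
{
  Map<String, Object> getContext();

  @Deprecated
  @SuppressWarnings("unchecked")
  default <ContextType> ContextType getContextValue(String key)
  {
    Map<String, Object> context = getContext();
    // Unchecked cast: this is the type-unsafe behavior the deprecation warns about.
    return context == null ? null : (ContextType) context.get(key);
  }

  @Deprecated
  default <ContextType> ContextType getContextValue(String key, ContextType defaultValue)
  {
    ContextType value = getContextValue(key);
    return value == null ? defaultValue : value;
  }

  @Deprecated
  default boolean getContextBoolean(String key, boolean defaultValue)
  {
    Object value = getContextValue(key);
    if (value == null) {
      return defaultValue;
    }
    if (value instanceof Boolean) {
      return (Boolean) value;
    }
    // Assumption for this sketch: other values are parsed from their string form.
    return Boolean.parseBoolean(value.toString());
  }
}

// An implementation only supplies the map; the accessors are inherited.
final class MapBackedQuerySketch implements QuerySketch
{
  private final Map<String, Object> context;

  MapBackedQuerySketch(Map<String, Object> context)
  {
    this.context = context;
  }

  @Override
  public Map<String, Object> getContext()
  {
    return context;
  }
}
```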

The Query.getQueryContext() method is removed as a way of marking that the old QueryContext is no longer available. The old method didn't really even work: the different context types were lost when merging into a single map. A new queryContext() method returns the facade version.

Alternatives

The discussions in #13049 and #13022 suggested other alternatives:

  • Work only with the context map and the QueryContexts methods as shown above. This works fine, and is one of the steps taken in Fix QueryContext race condition #13049 that led to this PR, but is a bit cumbersome.
  • Add methods to the Query interface as in an early draft of Avoid ClassCastException when getting values from QueryContext #13022. Doing so clutters the Query interface, and does not help in the SQL layer where we work with contexts before the native Query is created. Unless we really want to clutter the Query interface, we'd still need the value-specific methods in QueryContexts.
  • Leave well enough alone: status quo. While the existing code works, as noted above, the semantics are a bit messy which creates confusion for readers. Tidying up the code has no benefit to the computer, but is helpful for us poor humans who have to understand the code.

Conclusion

The result is a solution that:


This PR has:

  • been self-reviewed.
  • been validated using existing tests.
  • modified unit tests as needed to reflect changed methods, exceptions and semantics.
  • been tested in a test Druid cluster.
