Avoid ClassCastException when getting values from `QueryContext` by FrankChen021 · Pull Request #13022 · apache/druid

FrankChen021 · 2022-09-03T04:30:29Z

The problem was first reported by #12760 , and #12833 solves that specific issue.
This PR solves all potential problems by providing a group of type safe methods.

Description

QueryContext holds values passed from user side in JSON format.
And currently Druid requires values in JSON must be strictly type matched.
For example, a property maxOnDiskStorage in Druid is defined as integer, it must be in the format as

"context": {"maxOnDiskStorage": 100}

If the number 100 is serialized as string as

"context": {"maxOnDiskStorage": "100"}

Druid throws a ClassCastException to reject the input query.

Actually, string-formatted number "100" can be parsed as number at Druid server side.

Key changes

#12833 adds a method getContextHumanReadableBytes to get context value from any acceptable format safely. And this PR adds a group of methods to do so and changes all existing code reference to the new APIs.

Newly added methods are:

getContextAsInt
getContextAsLong
getContextAsFloat
getContextAsString
getContextAsEnum
getContextAsBoolean

Original unsafe method getContextValue is still kept to allow us get objects which are stored by Druid server side temporarily.
Some javadoc is added to tell callers know which methods should be used instead.

This PR has:

been self-reviewed.
- using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
added documentation for new or modified features or behaviors.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added or updated version, license, or notice information in licenses.yaml
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
added integration tests.
been tested in a test Druid cluster.

lgtm-com · 2022-09-05T03:33:51Z

This pull request introduces 1 alert when merging 93db082 into 6805a7f - view on LGTM.com

new alerts:

1 for Missing format argument

FrankChen021 · 2022-09-07T13:43:39Z

@clintropolis Could you review this?

clintropolis

seems reasonable to me 👍

clintropolis · 2022-09-07T18:52:01Z

    return null;
  }

-  <ContextType> ContextType getContextValue(String key);


hmm, actually I guess this is an @ExtensionPoint, we should probably leave the old methods here to not be backwards incompatible

Let me revert this.

clintropolis · 2022-09-07T18:53:08Z

  {
  }

-  static <T, E extends Enum<E>> E parseEnum(Query<T> query, String key, Class<E> clazz, E defaultValue)


this class is also marked @PublicApi and so we should ideally leave the signatures in place

But this method is not declared as public. Do we need to keep it?

paul-rogers

@FrankChen021, thanks for helping clean up this area! It is in much need of tidying up. A few comments for your consideration.

paul-rogers · 2022-09-07T20:04:12Z

+
+  default int getContextAsInt(String key, int defaultValue)
+  {
+    if (getQueryContext() != null) {


Nit: use == and revers the then and else clauses. Makes the code just a bit easier to read.

paul-rogers · 2022-09-07T20:05:15Z

+  }
+
+  @Nullable
+  default Boolean getContextAsBoolean(String key)


Nit: use the same name for both "flavors" of methods: easier to remember.

I think you mean this method getContextBoolean?
I didn't change this method to use the same name pattern as 'getContextAs...' because there're lots reference to this method which might cause this PR so large. But indeed, it should be the same name pattern.

paul-rogers · 2022-09-07T20:06:09Z

+  }
+
+  @Nullable
+  public Boolean getAsBoolean(String parameter)


Nit: best to not call query context values "parameters" since a query can have "parameters" which are not in the context. Suggestion: "key".

paul-rogers · 2022-09-07T20:08:38Z

    return QueryContexts.getAsBoolean(parameter, get(parameter), defaultValue);
  }

+  public Integer getAsInt(final String parameter)


I wonder, since we have methods for all of these on QueryContexts, and we use QueryContexts to get each specific value type, should we just omit these methods and standard on always using QueryContexts to get context values? Thoughts?

paul-rogers · 2022-09-07T20:09:24Z

  public static <T> long getMaxScatterGatherBytes(Query<T> query)
  {
-    return parseLong(query, MAX_SCATTER_GATHER_BYTES_KEY, Long.MAX_VALUE);
+    return query.getContextAsLong(MAX_SCATTER_GATHER_BYTES_KEY, Long.MAX_VALUE);


Thanks for cleaning this up! I'd wondered why we have two different ways to get values.

after this PR, only getContextAsxxxx series methods are used to get values.

paul-rogers · 2022-09-07T20:20:09Z

+  @Nullable
+  public static Boolean getAsBoolean(
+      final String parameter,
+      final Object value


This set of methods has gotten a bit muddled.

This version takes the name of a context key (not "parameter") just so it can throw an error. But, there are places where we (or at least I) want to use this method where the map is not a context.

There are versions that take a map and a default, and versions, like this, that take a value (not parameter) and convert it.

Maybe have the following set:

// Check if a key exists for code that wants to do something special for this case public boolean hasValue(Map<..> context, String key) ... // Generic method to convert objects to boolean public boolean parseBoolean(Object value) { // if can't parse, throw an exception } // Specific method for boolean context values public boolean getBoolean(Map<...> context, String key, boolean defaultValue) { value = context.get(key); if (value == null) { return defaultValue; } try { return parseBoolean(value); } catch (Exception e) { // Throw the "Expected context value [%s] to be a ... exception } }

A parseBoolean is still kept at line 474 in this class to allow to parse a value from a map object.

paul-rogers · 2022-09-07T20:20:47Z

    }
+
+    throw new ISE(
+        "Expected parameter [%s] must be type of [%s], actual type is [%s].",


Again, context values are not "parameters".

paul-rogers · 2022-09-10T12:49:25Z

@FrankChen021, as it turns out, the QueryContext class introduced several problems. PR 13049 proposes to rip out QueryContext and go back to using a map (along with a few adjustments to allow authorizing the context.) That PR will, unfortunately, conflict with this one.

While using maps solves some problems, it requires awkward code such as:

x = QueryContexts.getSomething(context, key, default)

One advantage of the old QueryContext was that we could do:

x = queryContext.getSomething(key, default);

Here, you propose a solution: add those values to the query class. That's OK, but it makes the query class pretty complex: we have methods for many tasks, and now we add multiple methods for context values. Plus, there are places (such as the SQL layer) where we work with the context without also having a native query. So, we'd have to duplicate these methods in several places. This suggests that methods to work with the context belong on the context itself, not on the users of the context (such as Query).

One solution, which I didn't include in PR 13049, is to define a new class: a typed wrapper on top of the map. Something like:

class QueryContext {
  Object get(String key);
  String getString(String key);
  String getString(String key, String defaultValue);
  int getInt(String key);
  int getInt(String key, int defaultValue);
  ...
}

For each of our supported types: String, Int, Long, etc. We'd then store an instance of QueryContext where we currently store the map.

QueryContexts has a zillion getSomething() methods for the various parameters. The logical step would be to move those into the above QueryContext class, else we'll find ourselves either duplicating code, or still using QueryContexts methods as shown earlier.

The problem that PR 13049 tries to solve is that there are times when the context is immutable, other times when it is mutable. And, those times trade off: the user's request is immutable, a mutable copy is made while planning the query, then the context must become immutable again once we start executing in multiple threads. So, we could have two forms: the above immutable form with only "getters", and a mutable form:

class MutableQueryContext {
  Object put(String key, Object value);
  Object remove(String key);
  ...
}

To do this, we'd actually want QueryContext to be an interface, with two concrete implementations: ImmutableQueryContext and MutableQueryContext. One reason for the two classes is that the classes will clearly label the intent of when when the context is mutable or not.

This may still lead to a muddle, however. A Query will have a mutable context during the "planning" (QueryRunner-based) part of the native query execution, but immutable once we move to the execution (Sequence-based) part of the process.

There is also the issue of compatibility with custom Query classes where folks have implemented the existing interfaces using a map. It is not entirely clear how often that is done and what we can change in this area. Any ideas?

paul-rogers · 2022-09-10T13:52:47Z

Here's a follow-on idea. Modifying a context map can perhaps be left to working directly with the map, since it is mostly just calling put() along with a few other items. So, perhaps we can just focus on reading the context. Maybe we define:

interface Query<...> {
   default QueryContext getQueryContext() { return new QueryContext(getContext()); }

Then, QueryContext has all the many accessor methods mentioned above. That is, QueryContext is a transient facade on top of the map.

With this approach, all the myriad getFoo and parseFoo methods in QueryContexts can move to QueryContext. In this approach, the goal of this PR (type safety) would be provided by the typed get{Type} methods on QueryContext.

What do you think? If this will work, I can perhaps modify the PR I mentioned to include this approach.

paul-rogers · 2022-09-10T19:25:15Z

Went ahead and added the QueryContext "facade" in PR #13049 as it is a cleaner solution to the race condition issue which that PR seeks to solve.

FrankChen021 · 2022-09-11T03:20:30Z

Here's a follow-on idea. Modifying a context map can perhaps be left to working directly with the map, since it is mostly just calling put() along with a few other items. So, perhaps we can just focus on reading the context. Maybe we define:
interface Query<...> {
   default QueryContext getQueryContext() { return new QueryContext(getContext()); }
Then, QueryContext has all the many accessor methods mentioned above. That is, QueryContext is a transient facade on top of the map.

With this approach, all the myriad getFoo and parseFoo methods in QueryContexts can move to QueryContext. In this approach, the goal of this PR (type safety) would be provided by the typed get{Type} methods on QueryContext.

What do you think? If this will work, I can perhaps modify the PR I mentioned to include this approach.

Hi @paul-rogers

When I touch this part in this PR, I noticed there are two classes, QueryContext and QueryContexts, the latter one provides a group by getXXXX helpers that QueryContext can be used. And also, another group of getContextXXXX methods are defined in Query to get values from QueryContext.
As you can seen in this PR, if a new getXXX method is added, we have to add it in 3 classes to follow current pattern. These are too complicated. I didn't make changes to this part because it involves a huge refactoring.

Basically, I agree with you that there should be a facade QueryContext there(but I think QueryContext should not expose Map object to callers directly), and a series of getXXXX methods are defined in QueryContext to get values. And at last those helper motheds in Query are removed or deprecated.

My focus in this PR is on the safe type conversion at caller side. And the work has been done. How about let's merge this first and then resolve the conflicts after you stablize the core changes on your branch?

paul-rogers · 2022-09-11T10:39:53Z

@FrankChen021, your goal of type-safe conversion is a good one. The question is how to achieve that goal so that you can get this PR done. Toward that end, PR #13049 was refocused on just the QueryContext concurrency issues so that it won't conflict with the PR. A new PR #13071 was opened to present the query context refactor approach, which we can consider after your PR is merge

We can still ask the question: should we add more methods to the Query interface, or should we move those responsibilities to a new class? In this PR, we've added more getContextX methods to Query. In #13071, we've actually deprecated the existing ones in favor of the new QueryContext facade. #13071 would immediately remove the new methods you've added here as they would become redundant. So, perhaps an alternative solution is to use the existing QueryContexts methods in this PR, then switch to the facade in the next PR. For example, consider this line in this PR:

final String timestampStringFromContext = query.getContextAsString(CTX_KEY_FUDGE_TIMESTAMP, "");

With the suggested "temporary" solution, perhaps write this as:

final String timestampStringFromContext = QueryContexts.getAsString(query, CTX_KEY_FUDGE_TIMESTAMP, "");

In the later PR, this would become:

final String timestampStringFromContext = query.queryContext().getString(CTX_KEY_FUDGE_TIMESTAMP, "");

The alternative approach avoids adding new methods to the Query interface.

The thought is that we can easily add a few more type-safe methods to Query. But, the code also uses a large number of value-specific methods in QueryContexts. With the approach in this PR, we'd have two ways to get values:

final String timestampStringFromContext = query.getContextAsString(CTX_KEY_FUDGE_TIMESTAMP, "");
final boolean flag = QueryContexts.getEnableJoinFilterRewrite(query);

In #13071 we can move all the getters to a single class:

final QueryContext queryContext = query.queryContext();
final String timestampStringFromContext = queryContext.getString(CTX_KEY_FUDGE_TIMESTAMP, "");
final boolean flag = queryContext.enableJoinFilterRewrite();

The result is simpler code, while providing type safety, concurrency safety, and the ability to authorize context values.

Thoughts?

FrankChen021 · 2022-09-12T02:31:01Z

@paul-rogers I think the new APIs you proposed is what I want -- that is there's only one place for these getXXX methods.

final QueryContext queryContext = query.queryContext();
final String timestampStringFromContext = queryContext.getString(CTX_KEY_FUDGE_TIMESTAMP, "");
final boolean flag = queryContext.enableJoinFilterRewrite();

Let me remove all newly addde getContextXXXX methods in this PR from Query interface, and replace all reference to these methods to what you suggest as

final String timestampStringFromContext = QueryContexts.getAsString(query, CTX_KEY_FUDGE_TIMESTAMP, "");

I will do this ASAP so that it won't block other PRs for long time.

FrankChen021 · 2022-09-12T04:07:59Z

@paul-rogers I've replace all reference from calling getContextAsXXXX to getQueryContext().getAsXXXX, which is more similar to the final APIs that you propose.

old one:

final String timestampStringFromContext = query.getContextAsString(CTX_KEY_FUDGE_TIMESTAMP, "");

new one:

final String timestampStringFromContext = query.getQueryContext().getAsString(CTX_KEY_FUDGE_TIMESTAMP, "");

FrankChen021 · 2022-09-13T06:49:38Z

@paul-rogers Is there anything that I need to do in this PR? If it looks good to you, I think we can merge it to continue #13071

paul-rogers

LGTM (non-binding)

FrankChen021 · 2022-09-13T09:59:33Z

Thank you @paul-rogers @clintropolis
I will merge this.

clintropolis · 2022-09-13T20:51:59Z

the last commit of this PR didn't actually run through travis and seems to be failing https://github.com/apache/druid/runs/8325253980, looking into the issue

clintropolis · 2022-09-13T20:58:30Z

opened #13083 to fix the test

FrankChen021 · 2022-09-14T03:52:01Z

@clintropolis Thanks for the fix. I didn't notice that travis not started but just saw the CI status of the last commit is OK. So why did the travis not start?

FrankChen021 added 5 commits September 2, 2022 22:54

Use safe conversion methods

4b97c74

Rename method

1344848

Add getContextAsBoolean

63041bf

Update test case

803ea0a

Remove generic from getContextValue

9ae638e

FrankChen021 added Area - Querying Refactoring labels Sep 3, 2022

Update catch-handler

93db082

Add test

5bfe7e7

clintropolis approved these changes Sep 7, 2022

View reviewed changes

clintropolis reviewed Sep 7, 2022

View reviewed changes

paul-rogers reviewed Sep 7, 2022

View reviewed changes

Resolve comments

73202de

FrankChen021 requested a review from paul-rogers September 8, 2022 18:28

paul-rogers mentioned this pull request Sep 10, 2022

Fix QueryContext race condition #13049

Merged

2 tasks

paul-rogers mentioned this pull request Sep 11, 2022

Redesign QueryContext class #13071

Merged

4 tasks

Replace 'getContextXXX' to 'getQueryContext().getAsXXXX'

8f66563

paul-rogers reviewed Sep 13, 2022

View reviewed changes

FrankChen021 merged commit fd6c05e into apache:master Sep 13, 2022

FrankChen021 deleted the query_context branch September 13, 2022 10:00

clintropolis mentioned this pull request Sep 13, 2022

fix JsonParserIteratorTest #13083

Merged

kfaraz added this to the 25.0 milestone Nov 22, 2022

Conversation

FrankChen021 commented Sep 3, 2022

Description

Key changes

Uh oh!

lgtm-com Bot commented Sep 5, 2022

Uh oh!

FrankChen021 commented Sep 7, 2022

Uh oh!

clintropolis left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

paul-rogers left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

paul-rogers commented Sep 10, 2022

Uh oh!

paul-rogers commented Sep 10, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

paul-rogers commented Sep 10, 2022

Uh oh!

FrankChen021 commented Sep 11, 2022

Uh oh!

paul-rogers commented Sep 11, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

FrankChen021 commented Sep 12, 2022

Uh oh!

FrankChen021 commented Sep 12, 2022

Uh oh!

FrankChen021 commented Sep 13, 2022

Uh oh!

paul-rogers left a comment

Choose a reason for hiding this comment

Uh oh!

FrankChen021 commented Sep 13, 2022

Uh oh!

clintropolis commented Sep 13, 2022

Uh oh!

clintropolis commented Sep 13, 2022

Uh oh!

FrankChen021 commented Sep 14, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

paul-rogers commented Sep 10, 2022 •

edited

Loading

paul-rogers commented Sep 11, 2022 •

edited

Loading