Improve the fallback strategy when the broker is unable to materialize the subquery's results as frames for estimating the bytes #16679

Merged
LakshSingla merged 8 commits into apache:master from LakshSingla:fallback-not-working-2
Jul 12, 2024

@LakshSingla
Contributor

@LakshSingla LakshSingla commented Jul 1, 2024

Description

Better fallback strategy when the broker is unable to materialize the subquery's results as frames for estimating the bytes:
a. We don't touch the subquery sequence until we know that we can materialize the results as frames. Otherwise, aggregators holding resources can get closed, and the fallback doesn't work properly. I have added a test case to verify that.
b. Remove the ad-hoc fallback case, which I haven't seen happen. Most queries fall back because insufficient type information is present at runtime to generate the query. But if we are unable to materialize the results as bytes for any unforeseen reason, the current fallback path is incorrect. The alternative is to rerun the whole subquery, but that would degrade subquery performance significantly; we should instead throw an exception in that case so that the user can disable maxSubqueryBytes.
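
The ordering described in (a) can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `SubqueryFallbackSketch`, `SingleUseRows`, `applyLimit`, and the 8-bytes-per-cell estimate are all hypothetical names and assumptions. A `null` column type stands in for a type that is unknown at runtime. The point is only that the strategy is chosen from the signature before the single-use sequence is read.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SubqueryFallbackSketch
{
  /** Hypothetical stand-in for a subquery result sequence that can only be read once. */
  static final class SingleUseRows
  {
    private final List<Object[]> rows;
    private boolean consumed = false;

    SingleUseRows(List<Object[]> rows)
    {
      this.rows = rows;
    }

    Iterator<Object[]> iterate()
    {
      if (consumed) {
        throw new IllegalStateException("sequence already consumed");
      }
      consumed = true;
      return rows.iterator();
    }
  }

  /**
   * Chooses the limiting strategy BEFORE reading any row. A null entry in columnTypes
   * models a column whose type is unknown at runtime, which rules out byte estimation.
   */
  static String applyLimit(SingleUseRows results, List<String> columnTypes, long maxBytes, int maxRows)
  {
    boolean typesKnown = true;
    for (String type : columnTypes) {
      if (type == null) {
        typesKnown = false;
      }
    }

    // The sequence is touched exactly once, after the strategy has been chosen.
    Iterator<Object[]> it = results.iterate();
    long rowCount = 0;
    long estimatedBytes = 0;
    while (it.hasNext()) {
      Object[] row = it.next();
      rowCount++;
      // Assumption for illustration only: every cell costs 8 bytes.
      estimatedBytes += 8L * row.length;
      if (typesKnown && estimatedBytes > maxBytes) {
        throw new IllegalStateException("subquery exceeded the byte limit");
      }
      if (!typesKnown && rowCount > maxRows) {
        throw new IllegalStateException("subquery exceeded the row limit");
      }
    }
    return typesKnown ? "bytes" : "rows";
  }

  public static void main(String[] args)
  {
    List<Object[]> data = Arrays.asList(new Object[]{"a", 1L}, new Object[]{"b", 2L});
    // Complete signature: byte-based limiting is used.
    System.out.println(applyLimit(new SingleUseRows(data), Arrays.asList("STRING", "LONG"), 1000, 10));
    // Unknown type: falls back to row-based limiting without having consumed the rows first.
    System.out.println(applyLimit(new SingleUseRows(data), Arrays.asList("STRING", null), 1000, 10));
  }
}
```

If the type check ran only after a failed attempt to read the rows, the second call would hit the "already consumed" error, which is the analogue of the closed-aggregator problem described in (a).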

Release note



This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

"d0",
ColumnType.STRING
))
.addAggregator(new CardinalityAggregatorFactory(
Member

super nit: doesn't it feel kind of strange to be mixing non-datasketches approx distinct count with datasketches extensions tests?

.putAll(QUERY_CONTEXT_DEFAULT)
.put(QueryContexts.MAX_SUBQUERY_BYTES_KEY, "100000")
// Disallows the fallback to row based limiting
.put(QueryContexts.MAX_SUBQUERY_ROWS_KEY, "10")
Contributor

+1 for the test case. Ideally we would want a similar test in the processing module. Is that possible? Maybe use a test aggregator?

@LakshSingla
Contributor Author

LakshSingla commented Jul 11, 2024

@cryptoe Unfortunately, I couldn't get it to work with the client query segment walker tests, because that portion of the code base heavily mimics the broker-historical interaction. So even with modifications to that portion of the code, we get a new sequence that works both with and without the patch. I have attached the diff I was trying below.

single-serve.patch

@cryptoe
Contributor

cryptoe commented Jul 12, 2024

Since the test framework returns a repeatable sequence, we would need to change the underlying UT framework, which is not in the scope of this PR.

We already have a test, hence going forward with the merge.

@LakshSingla
Contributor Author

The failing coverage is on a defensive check we don't expect to hit.

@LakshSingla LakshSingla merged commit 3a1b437 into apache:master Jul 12, 2024
@LakshSingla LakshSingla deleted the fallback-not-working-2 branch July 12, 2024 16:19
sreemanamala pushed a commit to sreemanamala/druid that referenced this pull request Aug 6, 2024
…e the subquery's results as frames for estimating the bytes (apache#16679)

Better fallback strategy when the broker is unable to materialize the subquery's results as frames for estimating the bytes:
a. We don't touch the subquery sequence till we know that we can materialize the result as frames
@kfaraz kfaraz added this to the 31.0.0 milestone Oct 4, 2024