-
Notifications
You must be signed in to change notification settings - Fork 181
Support composite aggregation paginating #4884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support composite aggregation paginating #4884
Conversation
|
Important Review skippedMore than 25% of the files skipped due to max files limit. The review is being skipped to prevent a low-quality review. 81 files out of 189 files are above the max files limit of 100. Please upgrade to Pro plan to get higher limits. You can disable this status message by setting the WalkthroughThis PR introduces support for pushing down HAVING clauses in Calcite query optimization and implements composite aggregation pagination. It adds a new HavingPushdownRule, modifies request/response handling to support paginating aggregations with afterKey state management, and includes comprehensive test coverage with expected output files for HAVING aggregation scenarios. Changes
Sequence DiagramsequenceDiagram
participant Optimizer
participant HavingPushdown as HavingPushdownRule
participant Scan as CalciteLogicalIndexScan
participant Builder as OpenSearchRequestBuilder
participant Query as OpenSearchQueryRequest
participant Response as OpenSearchResponse
participant Enumerator as OpenSearchIndexEnumerator
Optimizer->>HavingPushdown: onMatch(LogicalFilter → Project → IndexScan)
HavingPushdown->>Scan: pushDownHavingClauseFlag(filter)
alt Conditions Met
Scan->>Scan: Add HAVING to PushDownContext
Scan-->>HavingPushdown: Return new scan with HAVING flag
HavingPushdown-->>Optimizer: Transform plan with HAVING
else Conditions Not Met
Scan-->>HavingPushdown: Return null
end
Optimizer->>Builder: build()
alt Has Aggregate with HAVING
Builder->>Builder: Set paginatingAgg = true
Builder->>Query: Create with paginatingAgg=true
else Regular Path
Builder->>Query: Create with paginatingAgg=false
end
Query->>Query: Execute search()
alt paginatingAgg Flow
Query->>Query: searchPaginatingAgg()
Query->>Response: Capture CompositeAggregation.afterKey
Response-->>Query: Return response with afterKey
alt No More Results
Response->>Response: noCompositeAfterKey()
Response-->>Enumerator: true
Enumerator->>Enumerator: Stop fetching batches
else More Results
Query->>Query: Set afterKey in CompositeAggregationBuilder
Query->>Query: Execute next search
end
else Regular Flow
Query->>Query: Single-page or PIT search
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Areas requiring extra attention:
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
Comment |
d008621 to
f65cac0
Compare
Signed-off-by: Lantao Jin <ltjin@amazon.com>
f65cac0 to
987149c
Compare
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
| { | ||
| "name": "OpenSearchIndexScan", | ||
| "description": { | ||
| "request": "OpenSearchQueryRequest(indexName=accounts, sourceBuilder={\"from\":0,\"size\":10000,\"timeout\":\"1m\",\"query\":{\"bool\":{\"filter\":[{\"range\":{\"age\":{\"from\":null,\"to\":20,\"include_lower\":true,\"include_upper\":false,\"boost\":1.0}}},{\"range\":{\"age\":{\"from\":10,\"to\":null,\"include_lower\":false,\"include_upper\":true,\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"_source\":{\"includes\":[\"age\"],\"excludes\":[]}}, searchDone=false)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove all needClean and searchDone in plan.
|
|
||
| import org.junit.After; | ||
|
|
||
| public class CalcitePPLAggregationPaginatingIT extends CalcitePPLAggregationIT { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add CalcitePPLAggregationPaginatingIT and CalcitePPLTpchPaginatingIT to verifiy the results have no changes with paginating feature.
| CalciteLogicalIndexScan(table=[[OpenSearch, opensearch-sql_test_index_bank]]) | ||
| physical: | | ||
| CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_bank]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0},avg_age=AVG($1)), PROJECT->[avg_age, age_range], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"age_range":{"terms":{"script":{"source":"{\"langType\":\"calcite\",\"script\":\"rO0ABXQGFHsKICAib3AiOiB7CiAgICAibmFtZSI6ICJDQVNFIiwKICAgICJraW5kIjogIkNBU0UiLAogICAgInN5bnRheCI6ICJTUEVDSUFMIgogIH0sCiAgIm9wZXJhbmRzIjogWwogICAgewogICAgICAib3AiOiB7CiAgICAgICAgIm5hbWUiOiAiPCIsCiAgICAgICAgImtpbmQiOiAiTEVTU19USEFOIiwKICAgICAgICAic3ludGF4IjogIkJJTkFSWSIKICAgICAgfSwKICAgICAgIm9wZXJhbmRzIjogWwogICAgICAgIHsKICAgICAgICAgICJkeW5hbWljUGFyYW0iOiAwLAogICAgICAgICAgInR5cGUiOiB7CiAgICAgICAgICAgICJ0eXBlIjogIklOVEVHRVIiLAogICAgICAgICAgICAibnVsbGFibGUiOiB0cnVlCiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICB7CiAgICAgICAgICAiZHluYW1pY1BhcmFtIjogMSwKICAgICAgICAgICJ0eXBlIjogewogICAgICAgICAgICAidHlwZSI6ICJJTlRFR0VSIiwKICAgICAgICAgICAgIm51bGxhYmxlIjogZmFsc2UKICAgICAgICAgIH0KICAgICAgICB9CiAgICAgIF0KICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiAyLAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiBmYWxzZSwKICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgfQogICAgfSwKICAgIHsKICAgICAgIm9wIjogewogICAgICAgICJuYW1lIjogIlNFQVJDSCIsCiAgICAgICAgImtpbmQiOiAiU0VBUkNIIiwKICAgICAgICAic3ludGF4IjogIklOVEVSTkFMIgogICAgICB9LAogICAgICAib3BlcmFuZHMiOiBbCiAgICAgICAgewogICAgICAgICAgImR5bmFtaWNQYXJhbSI6IDMsCiAgICAgICAgICAidHlwZSI6IHsKICAgICAgICAgICAgInR5cGUiOiAiSU5URUdFUiIsCiAgICAgICAgICAgICJudWxsYWJsZSI6IHRydWUKICAgICAgICAgIH0KICAgICAgICB9LAogICAgICAgIHsKICAgICAgICAgICJsaXRlcmFsIjogewogICAgICAgICAgICAicmFuZ2VTZXQiOiBbCiAgICAgICAgICAgICAgWwogICAgICAgICAgICAgICAgImNsb3NlZCIsCiAgICAgICAgICAgICAgICAiMzAiLAogICAgICAgICAgICAgICAgIjQwIgogICAgICAgICAgICAgIF0KICAgICAgICAgICAgXSwKICAgICAgICAgICAgIm51bGxBcyI6ICJVTktOT1dOIgogICAgICAgICAgfSwKICAgICAgICAgICJ0eXBlIjogewogICAgICAgICAgICAidHlwZSI6ICJJTlRFR0VSIiwKICAgICAgICAgICAgIm51bGxhYmxlIjogZmFsc2UKICAgICAgICAgIH0KICAgICAgICB9CiAgICAgIF0KICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiA0LAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiBmYWxzZSwKICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgfQogICAgfSwKICAgIHsKICAgICAgImR5bmFtaWNQYXJhbSI6IDUsCiAgICAgICJ0eXBlIjogewogICAgICAgICJ0eXBlIjogIlZBUkNIQVIiLAogICAgICAgICJudWxsYWJsZSI6IGZhbHNlLAogICAgICAgICJwcmVjaXNpb24iOiAtMQogICAgICB9CiAgICB9CiAgXQp9\"}","lang":"opensearch_compounded_script","params":{"utcTimestamp": 0,"SOURCES":[0,2,2,0,2,2],"DIGESTS":["age",30,"u30","age","u40","u100"]}},"missing_bucket":true,"missing_order":"first","order":"asc"}}}]},"aggregations":{"avg_age":{"avg":{"field":"age"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)]) | ||
| CalciteEnumerableIndexScan(table=[[OpenSearch, opensearch-sql_test_index_bank]], PushDownContext=[[AGGREGATION->rel#:LogicalAggregate.NONE.[](input=RelSubset#,group={0},avg_age=AVG($1)), PROJECT->[avg_age, age_range], LIMIT->10000], OpenSearchRequestBuilder(sourceBuilder={"from":0,"size":0,"timeout":"1m","aggregations":{"composite_buckets":{"composite":{"size":1000,"sources":[{"age_range":{"terms":{"script":{"source":"{\"langType\":\"calcite\",\"script\":\"rO0ABXQGFHsKICAib3AiOiB7CiAgICAibmFtZSI6ICJDQVNFIiwKICAgICJraW5kIjogIkNBU0UiLAogICAgInN5bnRheCI6ICJTUEVDSUFMIgogIH0sCiAgIm9wZXJhbmRzIjogWwogICAgewogICAgICAib3AiOiB7CiAgICAgICAgIm5hbWUiOiAiPCIsCiAgICAgICAgImtpbmQiOiAiTEVTU19USEFOIiwKICAgICAgICAic3ludGF4IjogIkJJTkFSWSIKICAgICAgfSwKICAgICAgIm9wZXJhbmRzIjogWwogICAgICAgIHsKICAgICAgICAgICJkeW5hbWljUGFyYW0iOiAwLAogICAgICAgICAgInR5cGUiOiB7CiAgICAgICAgICAgICJ0eXBlIjogIklOVEVHRVIiLAogICAgICAgICAgICAibnVsbGFibGUiOiB0cnVlCiAgICAgICAgICB9CiAgICAgICAgfSwKICAgICAgICB7CiAgICAgICAgICAiZHluYW1pY1BhcmFtIjogMSwKICAgICAgICAgICJ0eXBlIjogewogICAgICAgICAgICAidHlwZSI6ICJJTlRFR0VSIiwKICAgICAgICAgICAgIm51bGxhYmxlIjogZmFsc2UKICAgICAgICAgIH0KICAgICAgICB9CiAgICAgIF0KICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiAyLAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiBmYWxzZSwKICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgfQogICAgfSwKICAgIHsKICAgICAgIm9wIjogewogICAgICAgICJuYW1lIjogIlNFQVJDSCIsCiAgICAgICAgImtpbmQiOiAiU0VBUkNIIiwKICAgICAgICAic3ludGF4IjogIklOVEVSTkFMIgogICAgICB9LAogICAgICAib3BlcmFuZHMiOiBbCiAgICAgICAgewogICAgICAgICAgImR5bmFtaWNQYXJhbSI6IDMsCiAgICAgICAgICAidHlwZSI6IHsKICAgICAgICAgICAgInR5cGUiOiAiSU5URUdFUiIsCiAgICAgICAgICAgICJudWxsYWJsZSI6IHRydWUKICAgICAgICAgIH0KICAgICAgICB9LAogICAgICAgIHsKICAgICAgICAgICJsaXRlcmFsIjogewogICAgICAgICAgICAicmFuZ2VTZXQiOiBbCiAgICAgICAgICAgICAgWwogICAgICAgICAgICAgICAgImNsb3NlZCIsCiAgICAgICAgICAgICAgICAiMzAiLAogICAgICAgICAgICAgICAgIjQwIgogICAgICAgICAgICAgIF0KICAgICAgICAgICAgXSwKICAgICAgICAgICAgIm51bGxBcyI6ICJVTktOT1dOIgogICAgICAgICAgfSwKICAgICAgICAgICJ0eXBlIjogewogICAgICAgICAgICAidHlwZSI6ICJJTlRFR0VSIiwKICAgICAgICAgICAgIm51bGxhYmxlIjogZmFsc2UKICAgICAgICAgIH0KICAgICAgICB9CiAgICAgIF0KICAgIH0sCiAgICB7CiAgICAgICJkeW5hbWljUGFyYW0iOiA0LAogICAgICAidHlwZSI6IHsKICAgICAgICAidHlwZSI6ICJWQVJDSEFSIiwKICAgICAgICAibnVsbGFibGUiOiBmYWxzZSwKICAgICAgICAicHJlY2lzaW9uIjogLTEKICAgICAgfQogICAgfSwKICAgIHsKICAgICAgImR5bmFtaWNQYXJhbSI6IDUsCiAgICAgICJ0eXBlIjogewogICAgICAgICJ0eXBlIjogIlZBUkNIQVIiLAogICAgICAgICJudWxsYWJsZSI6IGZhbHNlLAogICAgICAgICJwcmVjaXNpb24iOiAtMQogICAgICB9CiAgICB9CiAgXQp9\"}","lang":"opensearch_compounded_script","params":{"utcTimestamp": 0,"SOURCES":[0,2,2,0,2,2],"DIGESTS":["age",30,"u30","age","u40","u100"]}},"missing_bucket":true,"missing_order":"first","order":"asc"}}}]},"aggregations":{"avg_age":{"avg":{"field":"age"}}}}}}, requestedTotalSize=10000, pageSize=null, startFrom=0)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
main change in yarm files is updating requestedTotalSize to limited value for composite aggregation.
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4884-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9930665c372f433eea4aeb04b5a4cfcd51be3e9e
# Push it to GitHub
git push --set-upstream origin backport/backport-4884-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
* Support composite aggregation paginating in HAVING clause Signed-off-by: Lantao Jin <ltjin@amazon.com> * typo Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix doctest and IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * secruity it Signed-off-by: Lantao Jin <ltjin@amazon.com> * revert changes in OpenSearchIndexScan Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix compile error Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix v2 paginationIT Signed-off-by: Lantao Jin <ltjin@amazon.com> * optimize request total size in compoisite agg Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix it Signed-off-by: Lantao Jin <ltjin@amazon.com> * Refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> Signed-off-by: Asif Bashar <asif.bashar@gmail.com>
* Support composite aggregation paginating in HAVING clause Signed-off-by: Lantao Jin <ltjin@amazon.com> * typo Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix doctest and IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * secruity it Signed-off-by: Lantao Jin <ltjin@amazon.com> * revert changes in OpenSearchIndexScan Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix compile error Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix v2 paginationIT Signed-off-by: Lantao Jin <ltjin@amazon.com> * optimize request total size in compoisite agg Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix it Signed-off-by: Lantao Jin <ltjin@amazon.com> * Refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> Signed-off-by: Asif Bashar <asif.bashar@gmail.com>
* Support composite aggregation paginating in HAVING clause Signed-off-by: Lantao Jin <ltjin@amazon.com> * typo Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix doctest and IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * secruity it Signed-off-by: Lantao Jin <ltjin@amazon.com> * revert changes in OpenSearchIndexScan Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix compile error Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix v2 paginationIT Signed-off-by: Lantao Jin <ltjin@amazon.com> * optimize request total size in compoisite agg Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix it Signed-off-by: Lantao Jin <ltjin@amazon.com> * Refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com>
|
The backport to To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4884-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9930665c372f433eea4aeb04b5a4cfcd51be3e9e
# Push it to GitHub
git push --set-upstream origin backport/backport-4884-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-devThen, create a pull request where the |
* Support composite aggregation paginating in HAVING clause Signed-off-by: Lantao Jin <ltjin@amazon.com> * typo Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix doctest and IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * secruity it Signed-off-by: Lantao Jin <ltjin@amazon.com> * revert changes in OpenSearchIndexScan Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix compile error Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix v2 paginationIT Signed-off-by: Lantao Jin <ltjin@amazon.com> * optimize request total size in compoisite agg Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix it Signed-off-by: Lantao Jin <ltjin@amazon.com> * Refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> (cherry picked from commit 9930665)
…4930) * Support composite aggregation paginating (#4884) * Support composite aggregation paginating in HAVING clause Signed-off-by: Lantao Jin <ltjin@amazon.com> * typo Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix doctest and IT Signed-off-by: Lantao Jin <ltjin@amazon.com> * secruity it Signed-off-by: Lantao Jin <ltjin@amazon.com> * revert changes in OpenSearchIndexScan Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix compile error Signed-off-by: Lantao Jin <ltjin@amazon.com> * Fix v2 paginationIT Signed-off-by: Lantao Jin <ltjin@amazon.com> * optimize request total size in compoisite agg Signed-off-by: Lantao Jin <ltjin@amazon.com> * fix it Signed-off-by: Lantao Jin <ltjin@amazon.com> * Refactor Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com> (cherry picked from commit 9930665) * Fix UT Signed-off-by: Lantao Jin <ltjin@amazon.com> --------- Signed-off-by: Lantao Jin <ltjin@amazon.com>
Description
Support composite aggregation paginating
This PR fix the correctness of following clickbench queries:
Q28, Q29: aggregate with having clause
Q42: offset exceeds the bucket size
and other queries such aggregations within join
Q28:
In PPL query,
where c > 100000actually is a HAVING clause:Q42:
In PPL query, the from 10000 exceeds the default bucket size
Query aggregations within join:
SQL
PPL
Related Issues
Resolves #4836
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
Release Notes
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.