[core] Support delete stats in result of scan plan. #4506
a0d5f66 to 1c5dbff
3bc538e to 4e6a4f0
@JingsongLi Hi, please CC. Thanks.
                valueStatsCols);
    }

    public DataFileMeta withoutStats() {
copyWithoutStats
                rowCount,
                minKey,
                maxKey,
                EMPTY_STATS,
Would it be better to keep the key stats?
                deleteRowCount,
                embeddedIndex,
                fileSource,
                valueStatsCols);
empty list
                file.embeddedIndex());
    }

    public ManifestEntry withoutStats() {
copyWithoutStats
    }

    @Override
    public FileStoreScan withoutStatsInPlan() {
dropStats
    FileStoreScan withMetrics(ScanMetrics metrics);

    FileStoreScan withoutStatsInPlan();
dropStats
    }

    @Override
    public AbstractDataTableScan withoutStatsInPlan() {
dropStats
        return this;
    }

    default InnerTableScan withoutStatsInPlan() {
dropStats
    ReadBuilder withShard(int indexOfThisSubtask, int numberOfParallelSubtasks);

    /** Delete stats in scan plan result. */
    ReadBuilder withoutStatsInPlan();
dropStats
4e6a4f0 to 1783216
1783216 to 2f53687
@JingsongLi
JingsongLi left a comment
+1
Purpose
In my company's production environment, when we use a Flink session cluster for OLAP scans against Paimon, we found that the JobManager's memory usage is consistently high.
We plan to optimize this in two ways:
(1) Delete stats in DataSplit.
(2) During data skipping, cut unused stats in ManifestEntry.
This PR covers (1).
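As a rough illustration of (1), here is a minimal sketch of the copy-without-stats idea, following the review suggestions above (keep key stats, drop value stats, use an empty column list). ColumnStats and FileMeta are hypothetical, simplified stand-ins and not Paimon's actual DataFileMeta or its stats classes; the field set and constructor are assumptions made only for this example.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for per-column min/max statistics (not a Paimon class). */
final class ColumnStats {
    /** Shared empty instance, playing the role of EMPTY_STATS in the diff above. */
    static final ColumnStats EMPTY = new ColumnStats(new long[0], new long[0]);

    final long[] min;
    final long[] max;

    ColumnStats(long[] min, long[] max) {
        this.min = min;
        this.max = max;
    }

    boolean isEmpty() {
        return min.length == 0 && max.length == 0;
    }
}

/** Hypothetical, simplified file metadata (not Paimon's DataFileMeta). */
final class FileMeta {
    final String fileName;
    final long rowCount;
    final ColumnStats keyStats;
    final ColumnStats valueStats;
    final List<String> valueStatsCols;

    FileMeta(
            String fileName,
            long rowCount,
            ColumnStats keyStats,
            ColumnStats valueStats,
            List<String> valueStatsCols) {
        this.fileName = fileName;
        this.rowCount = rowCount;
        this.keyStats = keyStats;
        this.valueStats = valueStats;
        this.valueStatsCols = valueStatsCols;
    }

    /**
     * Returns a copy that keeps the key stats but replaces the value stats
     * with the shared empty instance and the column list with an empty list.
     * The original object is left untouched.
     */
    FileMeta copyWithoutStats() {
        return new FileMeta(
                fileName,
                rowCount,
                keyStats,
                ColumnStats.EMPTY,
                Collections.emptyList());
    }
}
```

Because every stripped copy points at the same empty instance, the per-file value stats arrays become unreachable once the plan holds only the stripped copies, which is where a memory saving on the planning side would come from under these assumptions.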
Linked issue: close #xxx
Tests
API and Format
Documentation