-
Notifications
You must be signed in to change notification settings - Fork 3.7k
[fix](OlapScanner)fix bitmap or hll's OOM #8764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
be/src/exec/olap_scanner.cpp
Outdated
| // make sure to reset null indicators since we're overwriting | ||
| // the tuple assembled for the previous row | ||
| tuple->init(_tuple_desc->byte_size()); | ||
| batch->agg_object_pool()->remove_last_one(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give some more detail about the OOM?
If one thread(T2) add an new object into agg_object_pool, just before remove_last_one called by this thread(T1).
I think it may remove object added by T2, not the one added by T1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
batch in Status OlapScanner::get_batch(RuntimeState*, RowBatch*, bool*) was allocated by void OlapScanNode::scanner_thread(OlapScanner*), every scanner_thread has its own batch, it isn't a multi-thread case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your reply.
- Could you give me some detail steps? I want to try to reproduce the problem.
- If one tuple has bitmap and hll column at the same time, remove_last_one should be called twice?
be/src/exec/olap_scanner.cpp
Outdated
| int64_t raw_bytes_threshold = config::doris_scanner_row_bytes; | ||
| { | ||
| SCOPED_TIMER(_parent->_scan_timer); | ||
| ObjectPool tmp_object_pool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add more comments to explain why we need this tmp object pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| // so we set a soft limit, default is 1MB | ||
| CONF_mInt32(string_type_length_soft_limit_bytes, "1048576"); | ||
|
|
||
| CONF_Int32(object_pool_buffer_size, "100"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment for this new config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| _objects.clear(); | ||
| } | ||
|
|
||
| uint64_t size() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need a std::lock_guard<SpinLock> l(_lock);?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
See #9205 |
Proposed changes
Issue Number: close #xxx
Problem Summary:
The _agg_object_pool in row_batch is used to hold the pointer of bitmap or hll data, which will be released later.
When a tuple check direct conjuncts or check pushdown conjuncts fail, the memory holding by _agg_object_pool which belong to the tuple also should be released. Otherwise, it will lead up to OOM。
Describe the overview of changes.
Checklist(Required)
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...