[improvement](memory) fix olap table scan and sink memory usage problem #8451

xiaokang · 2022-03-11T15:43:33Z

Proposed changes

Issue Number: NA

Problem Summary:

Because the queues in OlapScanNode and NodeChannel are unbounded, memory usage can grow very large when reading and writing a large table, e.g. 'insert into tableB select * from tableA'.

Checklist(Required)

Does it affect the original behavior: (No)
Has unit tests been added: (No)
Has document been added or modified: (No Need)
Does it need to update dependencies: (No)
Are there any changes that cannot be rolled back: (No)

Further comments

Add a bytes limit (2.5% of query_options.mem_limit) for the _scan_row_batches queue in OlapScanNode. Previously _scan_row_batches had no limit at all.
Add a bytes limit (2.5% of exec_mem_limit) for the _materialized_row_batches queue in OlapScanNode. Previously _materialized_row_batches had only a queue size limit, which is not enough when the average row is large.
Add a bytes limit for reading a single batch. A new BE config, doris_scanner_row_bytes, is added with a default value of 10MB. Previously there was only a row count limit, which is also not enough when the average row is large.
Add a bytes limit (5% of load_mem_limit) for the _max_pending_batches queue in NodeChannel.
Fix an exceptional zero value of max_thread in OlapScanNode.
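
For readers unfamiliar with the idea, the bytes limits above can be illustrated with a minimal, self-contained sketch of a queue that stops accepting batches once the queued bytes reach a cap. This is not the code from this PR: `Batch`, `ByteBoundedQueue`, and all member names are hypothetical, and the admission rule (admit while the queued bytes are below the cap, so a single oversized batch can still pass) is an assumption chosen to avoid deadlock.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a row batch; only its byte size matters here.
struct Batch {
    std::vector<char> data;
    std::size_t bytes() const { return data.size(); }
};

// Queue bounded by total queued bytes instead of by element count.
class ByteBoundedQueue {
public:
    explicit ByteBoundedQueue(std::size_t max_bytes) : _max_bytes(max_bytes) {
        assert(max_bytes > 0);  // a zero cap would never admit anything
    }

    // Blocks the producer while the queued bytes are at or above the cap.
    void push(Batch b) {
        std::unique_lock<std::mutex> lock(_mu);
        _not_full.wait(lock, [this] { return _bytes < _max_bytes; });
        enqueue(std::move(b));
    }

    // Non-blocking variant: returns false when the cap has been reached.
    bool try_push(Batch b) {
        std::lock_guard<std::mutex> lock(_mu);
        if (_bytes >= _max_bytes) return false;
        enqueue(std::move(b));
        return true;
    }

    // Blocks the consumer until a batch is available.
    Batch pop() {
        std::unique_lock<std::mutex> lock(_mu);
        _not_empty.wait(lock, [this] { return !_q.empty(); });
        Batch b = std::move(_q.front());
        _q.pop_front();
        _bytes -= b.bytes();
        _not_full.notify_one();
        return b;
    }

    std::size_t queued_bytes() const {
        std::lock_guard<std::mutex> lock(_mu);
        return _bytes;
    }

private:
    // Caller must hold _mu.
    void enqueue(Batch b) {
        _bytes += b.bytes();
        _q.push_back(std::move(b));
        _not_empty.notify_one();
    }

    const std::size_t _max_bytes;
    mutable std::mutex _mu;
    std::condition_variable _not_full;
    std::condition_variable _not_empty;
    std::deque<Batch> _q;
    std::size_t _bytes = 0;
};
```

With a count-only limit, a queue of N batches can hold N times the largest batch size; bounding by bytes keeps the worst case near the cap plus one batch, regardless of row width.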

morningman · 2022-03-11T16:42:22Z

LGTM, but there are 2 more things:

It is better to provide some data to illustrate the memory usage before and after the modification in order to give more information to the reviewer.
Also need to modify the src/vec/exec/volap_scan_node.cpp

xiaokang · 2022-03-12T02:57:37Z

Hi @morningman, thanks for your quick response!

It is better to provide some data to illustrate the memory usage before and after the modification in order to give more information to the reviewer.

Test environment:
A single-node cluster with 32g of memory and 8 CPU cores. All configuration is default.

Table schema:
CREATE TABLE tableA (
uuid text NULL,
no text NULL,
f text NULL,
m text NULL,
data text NULL,
time1 datetime NULL,
time2 datetime NULL,
p text NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(uuid)
PARTITION BY LIST(p)
(PARTITION p1 VALUES IN ("1"),
...
PARTITION p98 VALUES IN ("98"),
PARTITION p99 VALUES IN ("99"))
DISTRIBUTED BY HASH(uuid) BUCKETS 10
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2"
)

Test steps:

Create tableA and tableB with 100 partitions.
Insert 1 million rows with a large average size of 100KB into tableA.
Execute the query: 'set batch_size = 128; set exec_mem_limit=8147483648; insert into tableB select * from tableA;'

Result:

master branch: the BE process quickly consumes all 32g of memory in about 20 seconds and is killed by the Linux oom-killer.
fix branch: the memory usage of the BE process peaks at 14g and falls back to 8g when the query finishes. The remaining 8g is the BE storage page cache (32g*20%≈6g) and chunk_reserved_bytes (2g).
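
As a rough worked example of how the new limits apply to this test setup, the sketch below computes the 2.5% queue limit for exec_mem_limit=8147483648 and the number of 100KB rows that fit in one batch before a 10MB bytes limit cuts it off. The helper names are hypothetical, and it assumes that 100KB means 100*1024 bytes, that 10MB means 10*1024*1024 bytes, and that the bytes check happens after each row is added; the real accounting in the BE may differ.

```cpp
#include <cstdint>

// Queue bytes limit as a per-mille share of a memory limit
// (25 per mille = 2.5%). Integer math avoids rounding surprises.
std::int64_t queue_bytes_limit(std::int64_t mem_limit, std::int64_t per_mille) {
    return mem_limit * per_mille / 1000;
}

// Number of rows read into one batch when reading stops at whichever
// comes first: the row count limit or the bytes limit.
int rows_in_batch(int batch_size, std::int64_t row_bytes,
                  std::int64_t batch_bytes_limit) {
    int rows = 0;
    std::int64_t bytes = 0;
    while (rows < batch_size && bytes < batch_bytes_limit) {
        bytes += row_bytes;  // assumed: the limit is checked after each row
        ++rows;
    }
    return rows;
}
```

Under these assumptions, `queue_bytes_limit(8147483648LL, 25)` gives 203687091 bytes (about 194 MiB) per queue, and `rows_in_batch(128, 100 * 1024, 10LL * 1024 * 1024)` gives 103, so the bytes limit rather than batch_size = 128 ends each batch.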

Also need to modify the src/vec/exec/volap_scan_node.cpp

I will do the same for the vectorized code.

xiaokang · 2022-03-13T08:47:22Z

@morningman volap_scan_node.cpp is done. The test result is almost the same as for the non-vectorized version.

morningman

LGTM

github-actions · 2022-03-13T11:50:59Z

PR approved by at least one committer and no changes requested.

github-actions · 2022-03-13T11:51:01Z

PR approved by anyone and no changes requested.

…em (#8451) Due to unlimited queue in OlapScanNode and NodeChannel, memory usage can be very large for reading and writing large table, e.g 'insert into tableB select * from tableA'.

fix olap table scan and sink memory comsumption problem

181cacd

morningman added kind/improvement area/memory-consumption dev/1.0.0-deprecated should be merged into dev-1.0.0 branch labels Mar 11, 2022

morningman changed the title ~~fix olap table scan and sink memory usage problem~~ [improvement](memory) fix olap table scan and sink memory usage problem Mar 11, 2022

transport memory problem fix to vectorized olap table scan

fa86f4d

morningman approved these changes Mar 13, 2022

github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 13, 2022

github-actions bot added the reviewed label Mar 13, 2022

morningman merged commit e807e8b into apache:master Mar 13, 2022

morningman added dev/merged-1.0.0-deprecated PR has been merged into dev-1.0.0 and removed dev/1.0.0-deprecated should be merged into dev-1.0.0 branch labels Mar 13, 2022

xiaokang mentioned this pull request Mar 14, 2022

[feature-wip] (memory tracker) (step1) Refactor impl of MemTracker, and related use #8322

Merged
