@zhiqiang-hhhh
Contributor

cherry pick from #41273

…many queries are running. (apache#41273)

1. Minor refactor of the scanner constructor: the calculation of
`_max_thread_num` is moved to the init method.
2. The expected value of `_max_thread_num` is changed. There is no need to
submit too many scan tasks to the scan scheduler, since the number of
threads is limited.
3. The calculation of `_max_bytes_in_queue` is changed. `_max_bytes_in_queue`
for each scan instance is limited to 100MB by default.
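
As a rough illustration of items 2 and 3, the sketch below caps both values. It is not the Doris implementation: `compute_scan_limits`, `ScanLimits`, `pool_thread_num` and the exact `min`/`max` formula are hypothetical, and 100MB is read here as 100 * 1024 * 1024 bytes, which is an assumption.

```python
from dataclasses import dataclass

# Default per-instance queue cap from the PR description (100MB).
# Interpreting "MB" as 1024 * 1024 bytes is an assumption.
DEFAULT_MAX_BYTES_IN_QUEUE = 100 * 1024 * 1024


@dataclass
class ScanLimits:
    max_thread_num: int
    max_bytes_in_queue: int


def compute_scan_limits(num_scanners: int, pool_thread_num: int,
                        requested_bytes_in_queue: int) -> ScanLimits:
    """Hypothetical helper; the real formula lives in the Doris BE code."""
    # Submitting more concurrent scan tasks than the pool has threads
    # only makes the queue longer, so cap by the pool size (assumption).
    max_thread_num = max(1, min(num_scanners, pool_thread_num))
    # Cap the bytes buffered by one scan instance at the default limit.
    max_bytes = min(requested_bytes_in_queue, DEFAULT_MAX_BYTES_IN_QUEUE)
    return ScanLimits(max_thread_num, max_bytes)
```

With these assumed rules, a scan instance never asks for more threads than the pool can run at once, and its queue never buffers more than the default limit regardless of how many scanners it owns.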

```
mysql [tpch]>select count(*) from supplier;
--------------
select count(*) from supplier
--------------

+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.04 sec)

mysql [tpch]>select count(*) from revenue0;
--------------
select count(*) from revenue0
--------------

+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.19 sec)
```
To illustrate the effect, we need to create many scanners, so we set:
```
set global experimental_parallel_scan_min_rows_per_scanner=29715
```
The default value is `2097152`. With the lower value, the number of scanners
becomes almost equal to `experimental_parallel_scan_max_scanners_count`, which is 48.

Let's use mysqlslap to run a concurrency test.

Current master:
```text
[hezhiqiang@VM-10-8-centos be_1]$ mysqlslap -hxxxx -uroot -Pyyyy  --create-schema=tpch -c 20 -i 5 -q "select     s_suppkey,     s_name,     s_address,     s_phone,     total_revenue from     supplier,     revenue0 where     s_suppkey = supplier_no     and total_revenue = (         select             max(total_revenue)         from             revenue0     ) order by     s_suppkey;"
Benchmark
	Average number of seconds to run all queries: 12.480 seconds
	Minimum number of seconds to run all queries: 12.159 seconds
	Maximum number of seconds to run all queries: 12.843 seconds
	Number of clients running queries: 20
	Average number of queries per client: 1

[hezhiqiang@VM-10-8-centos be_1]$ mysqlslap -hyyyy -uroot -Pyyyy  --create-schema=tpch -c 25 -i 5 -q "select     s_suppkey,     s_name,     s_address,     s_phone,     total_revenue from     supplier,     revenue0 where     s_suppkey = supplier_no     and total_revenue = (         select             max(total_revenue)         from             revenue0     ) order by     s_suppkey;"
mysqlslap: Cannot run query select     s_suppkey,     s_name,     s_address,     s_phone,     total_revenue from     supplier,     revenue0 where     s_suppkey = supplier_no     and total_revenue = (         select             max(total_revenue)         from             revenue0     ) order by     s_suppkey; 
ERROR : errCode = 2, detailMessage = (10.16.10.8)[TOO_MANY_TASKS]Failed to submit scanner to scanner pool reason:Thread pool Scan_normal is at capacity (192/192 tasks running, 102400/102400 tasks queued)|type:0
```
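
The limits in the error message allow a back-of-the-envelope estimate of how many scan tasks were outstanding per client when the pool filled up. This is only a sketch: it assumes that all tasks in the `Scan_normal` pool came from the 25 mysqlslap clients and were spread evenly across them. The function name is made up for illustration.

```python
# Limits quoted in the TOO_MANY_TASKS error message.
RUNNING_SLOTS = 192     # "192/192 tasks running"
QUEUE_SLOTS = 102400    # "102400/102400 tasks queued"


def outstanding_tasks_per_client(clients: int) -> float:
    """Average number of running plus queued tasks per client when the pool is full."""
    return (RUNNING_SLOTS + QUEUE_SLOTS) / clients


# Roughly 4100 outstanding scan tasks per query at 25 clients.
print(round(outstanding_tasks_per_client(25)))
```

Under those assumptions, each query had on the order of thousands of scan tasks submitted at once, far more than the 192 threads could run concurrently.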

After this PR:
```
[hezhiqiang@VM-10-8-centos lib]$ mysqlslap -hxxx -uroot -Pxxx  --create-schema=tpch -c 50 -i 5 -q "select     s_suppkey,     s_name,     s_address,     s_phone,     total_revenue from     supplier,     revenue0 where     s_suppkey = supplier_no     and total_revenue = (         select             max(total_revenue)         from             revenue0     ) order by     s_suppkey;"
Benchmark
	Average number of seconds to run all queries: 31.520 seconds
	Minimum number of seconds to run all queries: 30.164 seconds
	Maximum number of seconds to run all queries: 34.131 seconds
	Number of clients running queries: 50
	Average number of queries per client: 1
```

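The maximum concurrency that completes without error increased: on master, 20 clients succeed and 25 clients fail with `TOO_MANY_TASKS`; after this PR, 50 clients succeed.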
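
The two successful runs can also be compared by throughput. The sketch below assumes that mysqlslap's "Average number of seconds to run all queries" is the wall time of one iteration in which every client runs one query, as the "Average number of queries per client: 1" line suggests. The `throughput` helper is introduced here only for illustration.

```python
def throughput(clients: int, queries_per_client: int, avg_seconds: float) -> float:
    """Queries completed per second in one mysqlslap iteration."""
    return clients * queries_per_client / avg_seconds


master_20 = throughput(20, 1, 12.480)   # master, 20 clients
after_50 = throughput(50, 1, 31.520)    # after this PR, 50 clients

print(f"master, 20 clients: {master_20:.2f} queries/s")
print(f"after PR, 50 clients: {after_50:.2f} queries/s")
```

Both runs come out at roughly 1.6 queries per second, so with these numbers the higher concurrency is sustained at about the same throughput. The runs use different client counts, so this is only an approximate comparison.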

In fact, performance does not decrease in the sequential query test, so
`submit_many_scan_tasks_for_potential_performance_issue` can be removed
in the future.
@zhiqiang-hhhh
Contributor Author

run buildall

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@yiguolei yiguolei merged commit 5a03f85 into apache:branch-3.0 Oct 18, 2024