Skip to content

fix: the average time for clickbench query compute should use new vec to make it compute for each query#15472

Merged
2010YOUY01 merged 1 commit intoapache:mainfrom
zhuqi-lucas:fix_avgtime_clickbench
Mar 28, 2025
Merged

fix: the average time for clickbench query compute should use new vec to make it compute for each query#15472
2010YOUY01 merged 1 commit intoapache:mainfrom
zhuqi-lucas:fix_avgtime_clickbench

Conversation

@zhuqi-lucas
Copy link
Copy Markdown
Contributor

@zhuqi-lucas zhuqi-lucas commented Mar 28, 2025

Which issue does this PR close?

Rationale for this change

The average time for clickbench query compute should use new vec to make it compute for each query

What changes are included in this PR?

The average time for clickbench query compute should use new vec to make it compute for each query

Are these changes tested?

Yes

Q42: SELECT DATE_TRUNC('minute', to_timestamp_seconds("EventTime")) AS M, COUNT(*) AS PageViews FROM hits WHERE "CounterID" = 62 AND "EventDate"::INT::DATE >= '2013-07-14' AND "EventDate"::INT::DATE <= '2013-07-15' AND "IsRefresh" = 0 AND "DontCountHits" = 0 GROUP BY DATE_TRUNC('minute', to_timestamp_seconds("EventTime")) ORDER BY DATE_TRUNC('minute', M) LIMIT 10 OFFSET 1000;
Query 42 iteration 0 took 56.9 ms and returned 10 rows
Query 42 iteration 1 took 54.6 ms and returned 10 rows
Query 42 iteration 2 took 56.5 ms and returned 10 rows
Query 42 iteration 3 took 54.4 ms and returned 10 rows
Query 42 iteration 4 took 50.4 ms and returned 10 rows
Query 42 avg time: 54.57 ms

Before this PR:

Q42: SELECT DATE_TRUNC('minute', to_timestamp_seconds("EventTime")) AS M, COUNT(*) AS PageViews FROM hits WHERE "CounterID" = 62 AND "EventDate"::INT::DATE >= '2013-07-14' AND "EventDate"::INT::DATE <= '2013-07-15' AND "IsRefresh" = 0 AND "DontCountHits" = 0 GROUP BY DATE_TRUNC('minute', to_timestamp_seconds("EventTime")) ORDER BY DATE_TRUNC('minute', M) LIMIT 10 OFFSET 1000;
Query 42 iteration 0 took 52.7 ms and returned 10 rows
Query 42 iteration 1 took 48.1 ms and returned 10 rows
Query 42 iteration 2 took 53.8 ms and returned 10 rows
Query 42 iteration 3 took 52.8 ms and returned 10 rows
Query 42 iteration 4 took 50.2 ms and returned 10 rows
Query 42 avg time: 1121.93 ms

Are there any user-facing changes?

Fix the wrong average time.

@zhuqi-lucas zhuqi-lucas changed the title fix: the average time for clickbench query compute should outside the iterator loop fix: the average time for clickbench query compute should use new vec to make it compute for each query Mar 28, 2025
@zhuqi-lucas zhuqi-lucas force-pushed the fix_avgtime_clickbench branch from bce76ba to a2051e0 Compare March 28, 2025 08:23
@zhuqi-lucas
Copy link
Copy Markdown
Contributor Author

cc @2010YOUY01 @xudong963

Copy link
Copy Markdown
Member

@xudong963 xudong963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

@2010YOUY01 2010YOUY01 merged commit fa452e6 into apache:main Mar 28, 2025
29 of 30 checks passed
qstommyshu pushed a commit to qstommyshu/datafusion that referenced this pull request Mar 28, 2025
nirnayroy pushed a commit to nirnayroy/datafusion that referenced this pull request May 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

The average time compute for clickbench query is wrong

3 participants