@Kikyou1997
Contributor

Proposed changes

Support display of auto analyze jobs

After this PR, users and DBAs can use the following syntax to check the execution status of auto analyze jobs:

SHOW AUTO ANALYZE [tbl_name] [WHERE STATE='SOME STATE']
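
A few ways the statement might be used, as a sketch only. The table name `example_tbl` is hypothetical, and the state values `'FINISHED'` and `'FAILED'` are assumptions; this PR description does not list the accepted states.

```sql
-- Show every recorded auto analyze job.
SHOW AUTO ANALYZE;

-- Restrict the output to jobs for one table (example_tbl is a hypothetical name).
SHOW AUTO ANALYZE example_tbl;

-- Filter by job state (the state value is an assumption).
SHOW AUTO ANALYZE WHERE STATE = 'FINISHED';

-- Combine both optional clauses.
SHOW AUTO ANALYZE example_tbl WHERE STATE = 'FAILED';
```

Both the table name and the WHERE clause are optional, so the bare statement is the quickest way to see whether auto analyze is running at all.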

The number of historical auto analyze job records to keep can be configured with the FE option auto_analyze_job_record_count; the default value is 2000.

Enhance auto analyze

After this PR, automatically created analyze jobs will no longer execute beyond a specific time frame.

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@Kikyou1997
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 50.34 seconds
stream load tsv: 552 seconds loaded 74807831229 Bytes, about 129 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17162337230 Bytes

@Kikyou1997
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 49.22 seconds
stream load tsv: 578 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.3 seconds inserted 10000000 Rows, about 341K ops/s
storage size: 17162333796 Bytes

@Kikyou1997
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.61 seconds
stream load tsv: 577 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17162236039 Bytes

@Kikyou1997
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.75 seconds
stream load tsv: 579 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17162350487 Bytes

@morrySnow morrySnow changed the title [enhancement] Support dispaly of auto analyze jobs [enhancement] (stats) Support display of auto analyze jobs Sep 11, 2023
@Kikyou1997 Kikyou1997 changed the title [enhancement] (stats) Support display of auto analyze jobs [enhancement](optimizer) Support display of auto analyze jobs Sep 12, 2023
@Kikyou1997
Contributor Author

run buildall

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 48.94 seconds
stream load tsv: 602 seconds loaded 74807831229 Bytes, about 118 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17162376487 Bytes

@Kikyou1997
Copy link
Contributor Author

run buildall

@morrySnow morrySnow changed the title [enhancement](optimizer) Support display of auto analyze jobs [opt](stats) Support display of auto analyze jobs Sep 13, 2023
morrySnow
morrySnow previously approved these changes Sep 13, 2023
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 13, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.12 seconds
stream load tsv: 599 seconds loaded 74807831229 Bytes, about 119 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162199830 Bytes

@Kikyou1997
Contributor Author

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Sep 14, 2023
@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 48.29 seconds
stream load tsv: 610 seconds loaded 74807831229 Bytes, about 116 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162338248 Bytes

@englefly
Contributor

Shall we expose the job table schema to users, so that they can write any filter condition they want?

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 14, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@Kikyou1997
Contributor Author

Shall we expose job table schema to user? Let users create any filter condition they want.

We will, but not now.

@morrySnow morrySnow merged commit 0be0b8f into apache:master Sep 14, 2023
morningman pushed a commit that referenced this pull request Oct 13, 2023
…anch 2.0 (#25119)

This PR is composed of the following commits, which have been merged to Doris master:

* #24769
* #24672
* #24599
* #24521
* #24405
* #24237
* #24135
* #24074
* #24026
* #23992
* #23978
* #23622
* #23507
* #23354
* #23103
* #22963
* #22896
* #22775
* #22773
morningman pushed a commit that referenced this pull request Oct 15, 2023
….0 (#25421)

This PR is composed of the following commits, which have been merged to Doris master:

* #24769
* #24672
* #24599
* #24521
* #24405
* #24237
* #24135
* #24074
* #24026
* #23992
* #23978
* #23622
* #23507
* #23354
* #23103
* #22963
* #22896
* #22775
* #22773

After this PR, when a user upgrades Doris from 2.0.2 to 2.0.3, the original info in AnalysisManager will be ignored, and the new module AnalysisManagerV2 will be saved (with more info).

Labels

approved Indicates a PR has been approved by one committer. dev/2.0.3-merged reviewed


5 participants