
@Jibing-Li
Contributor

@Jibing-Li Jibing-Li commented Aug 28, 2023

  1. Fix a bug where auto analyze on external tables recursively loaded the schema cache.
  2. Move some functions from the StatisticsAutoAnalyzer class to TableIf, so that external tables and internal tables can implement the logic separately.
  3. Disable auto analyze for external catalogs by default. It can be enabled by adding the catalog property "enable.auto.analyze"="true".
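
To make items 2 and 3 concrete, here is a minimal, self-contained sketch of the idea, not the actual Doris code. Only the property key `enable.auto.analyze` and the names `TableIf` and `StatisticsAutoAnalyzer` come from this description. The method `needReAnalyze`, the helper `isAutoAnalyzeEnabled`, the `dataChanged` flag, and the two table classes are hypothetical stand-ins chosen for illustration.

```java
import java.util.Map;

class AutoAnalyzeSketch {
    // Property key quoted in the PR description.
    static final String ENABLE_AUTO_ANALYZE = "enable.auto.analyze";

    // Hypothetical, trimmed-down stand-in for TableIf: the analyzer asks the
    // table whether it needs analysis instead of deciding that itself.
    interface TableIf {
        boolean needReAnalyze();
    }

    // Hypothetical internal table: re-analyze whenever its data changed.
    static final class InternalTable implements TableIf {
        private final boolean dataChanged;

        InternalTable(boolean dataChanged) {
            this.dataChanged = dataChanged;
        }

        @Override
        public boolean needReAnalyze() {
            return dataChanged;
        }
    }

    // Hypothetical external table: also requires the catalog-level opt-in.
    static final class ExternalTable implements TableIf {
        private final Map<String, String> catalogProperties;
        private final boolean dataChanged;

        ExternalTable(Map<String, String> catalogProperties, boolean dataChanged) {
            this.catalogProperties = catalogProperties;
            this.dataChanged = dataChanged;
        }

        @Override
        public boolean needReAnalyze() {
            return isAutoAnalyzeEnabled(catalogProperties) && dataChanged;
        }
    }

    // Auto analyze is off unless the property is explicitly set to "true".
    static boolean isAutoAnalyzeEnabled(Map<String, String> catalogProperties) {
        return Boolean.parseBoolean(
                catalogProperties.getOrDefault(ENABLE_AUTO_ANALYZE, "false"));
    }

    public static void main(String[] args) {
        TableIf byDefault = new ExternalTable(Map.of(), true);
        TableIf optedIn = new ExternalTable(Map.of(ENABLE_AUTO_ANALYZE, "true"), true);
        System.out.println("external, default properties: " + byDefault.needReAnalyze());
        System.out.println("external, opted in: " + optedIn.needReAnalyze());
    }
}
```

With the decision behind the interface, a scheduler loop can treat both table kinds uniformly, while an external catalog stays untouched until its property is set.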


@Jibing-Li
Contributor Author

run buildall

@Jibing-Li Jibing-Li marked this pull request as ready for review August 28, 2023 11:24
@hello-stephen
Contributor

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 46.98 seconds
stream load tsv: 540 seconds loaded 74807831229 Bytes, about 132 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.0 seconds inserted 10000000 Rows, about 344K ops/s
storage size: 17162113681 Bytes

@morningman
Contributor

Need test case

@Jibing-Li
Contributor Author

> Need test case

Added a test case for the recursive schema load. Cases for auto analyze will be added in a later PR.

@Jibing-Li
Contributor Author

run buildall

@hello-stephen
Contributor

(From new machine) TeamCity pipeline, ClickBench performance test result:
the sum of best hot time: 47.58 seconds
stream load tsv: 537 seconds loaded 74807831229 Bytes, about 132 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.2 seconds inserted 10000000 Rows, about 342K ops/s
storage size: 17161938104 Bytes

Contributor

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 31, 2023
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit d6450a3 into apache:master Sep 1, 2023
@Jibing-Li Jibing-Li deleted the auto branch September 1, 2023 03:02
@morningman morningman added not-merge/2.0 do not merge into 2.0 branch and removed merge_conflict labels Sep 10, 2023
Jibing-Li added a commit to Jibing-Li/incubator-doris that referenced this pull request Oct 13, 2023
@morningman morningman removed the not-merge/2.0 do not merge into 2.0 branch label Oct 16, 2023

Labels

approved Indicates a PR has been approved by one committer. dev/2.0.3-merged merge_conflict reviewed

5 participants