[Enhancement](ExternalTable)Optimize the performance of getCachedRowCount when reading ExternalTable #41659
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
90d6893 to c3535c9
run buildall
// ExternalTable.getRowCount(), but this is not very meaningful and time-consuming.
// The getCachedRowCount() function is only used when `show table` and querying `information_schema.tables`.
if (!checkInitialized()) {
    return -2;
Use -1
I will fix it.
    return false;
}

public boolean checkInitialized() {
This method is not needed; we can check objectCreated instead.
I will remove this method.
private boolean enableHmsEventsIncrementalSync = false;

//for "type" = "hms" , but is iceberg table.
HiveCatalog icebergHiveCatalog;
Suggested change:
- HiveCatalog icebergHiveCatalog;
+ private HiveCatalog icebergHiveCatalog;
I will fix it.
}

public Catalog getIcebergHiveCatalog() {
    if (icebergHiveCatalog == null) {
This method may be called concurrently
To prevent this, I will move the icebergHiveCatalog initialization to initLocalObjectsImpl().
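To illustrate the concern in this thread: an unsynchronized `if (field == null)` check in a getter can let two threads construct the object at the same time. The following is a minimal sketch of moving construction into a one-time, synchronized initialization step. `SketchCatalog`, `makeSureInitialized()`, and the `created` counter are hypothetical stand-ins for illustration and are not the Doris classes; a plain `Object` stands in for the real `HiveCatalog`.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a catalog that owns an expensive client object.
class SketchCatalog {
    // Counts constructions so that a duplicate creation would be visible.
    final AtomicInteger created = new AtomicInteger();

    private boolean initialized = false;
    private Object icebergHiveCatalog; // stand-in for the real HiveCatalog

    // One-time initialization guarded by the object's monitor;
    // plays the role that initLocalObjectsImpl() has in the discussion.
    private synchronized void makeSureInitialized() {
        if (!initialized) {
            created.incrementAndGet();
            icebergHiveCatalog = new Object();
            initialized = true;
        }
    }

    // The getter no longer performs its own unsynchronized null check.
    Object getIcebergHiveCatalog() {
        makeSureInitialized();
        return icebergHiveCatalog;
    }
}
```

Because every caller passes through the synchronized method before reading the field, all threads observe the same instance and the constructor body runs at most once.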
HiveCatalog hiveCatalog = new HiveCatalog();
hiveCatalog.setConf(conf);

if (props.containsKey(HMSExternalCatalog.BIND_BROKER_NAME)) {
Is this part missing?
I kept this part and removed the else branch, since the two looked like duplicates to me. The else branch just initialized the HiveCatalog using the URI property.
c3535c9 to 3d68a83
…ount when reading ExternalTable
3d68a83 to 7526b02
run buildall
morningman left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
HiveCatalog hiveCatalog = new org.apache.iceberg.hive.HiveCatalog();
hiveCatalog.setConf(externalCatalog.getConfiguration());

Map<String, String> catalogProperties = externalCatalog.getProperties();
String metastoreUris = catalogProperties.getOrDefault(HMSProperties.HIVE_METASTORE_URIS, "");
catalogProperties.put(CatalogProperties.URI, metastoreUris);

hiveCatalog.initialize(name, catalogProperties);
return hiveCatalog;
Is this the same as IcebergHMSExternalCatalog.initCatalog?
Yes, so the IcebergHMSExternalCatalog.initCatalog() method now uses this method.
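The property handling in the shared helper can be sketched without the Iceberg dependency. This is an illustrative sketch only: `SketchHiveCatalogProps` and `build` are hypothetical names, and the key strings `"hive.metastore.uris"` and `"uri"` are assumed values for `HMSProperties.HIVE_METASTORE_URIS` and `CatalogProperties.URI`. Unlike the snippet under review, the sketch copies the input map before adding the URI; whether `getProperties()` in the PR already returns a copy is not visible here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that prepares the properties passed to catalog initialization.
class SketchHiveCatalogProps {
    // Assumed key names; see the note above.
    static final String HIVE_METASTORE_URIS = "hive.metastore.uris";
    static final String ICEBERG_URI = "uri";

    // Returns a new map with the metastore URIs mirrored into the "uri" key.
    // The input map is left untouched; a missing URI becomes an empty string.
    static Map<String, String> build(Map<String, String> catalogProperties) {
        Map<String, String> props = new HashMap<>(catalogProperties);
        props.put(ICEBERG_URI, catalogProperties.getOrDefault(HIVE_METASTORE_URIS, ""));
        return props;
    }
}
```

Keeping this logic in one place is what lets both call sites share it, as agreed in the reply above.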
…ount when reading ExternalTable (apache#41659) ## Proposed changes Because ExternalTable will initialize the previously uninitialized table when `getCachedRowCount()`, which is unnecessary. So for the uninitialized table, we directly return -1. This will increase the speed of our query `information_schema.tables`.
…ount when reading ExternalTable (#41659) (#41959) bp #41659 ## Proposed changes Because ExternalTable will initialize the previously uninitialized table when `getCachedRowCount()`, which is unnecessary. So for the uninitialized table, we directly return -1. This will increase the speed of our query `information_schema.tables`.
…ount when reading ExternalTable (#41659) (#41962) bp #41659 ## Proposed changes Because ExternalTable will initialize the previously uninitialized table when `getCachedRowCount()`, which is unnecessary. So for the uninitialized table, we directly return -1. This will increase the speed of our query `information_schema.tables`.
Proposed changes
Calling `getCachedRowCount()` on an ExternalTable currently initializes the table if it has not been initialized yet, which is unnecessary. For an uninitialized table, we now return -1 directly. This speeds up queries on `information_schema.tables`.
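A minimal sketch of the short-circuit described above, assuming a heavily simplified table class. `SketchExternalTable`, `makeSureInitialized()`, the `initCalls` counter, and the placeholder row count of 42 are hypothetical and exist only for illustration; the `objectCreated` flag and the -1 return value follow the review discussion.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified stand-in for an external table; not the Doris class.
class SketchExternalTable {
    static final long UNKNOWN_ROW_COUNT = -1;

    // Counts how often the (expensive) initialization actually ran.
    final AtomicInteger initCalls = new AtomicInteger();

    private volatile boolean objectCreated = false;
    private long cachedRowCount = UNKNOWN_ROW_COUNT;

    // Stand-in for loading schema and metadata from a remote metastore.
    synchronized void makeSureInitialized() {
        if (!objectCreated) {
            initCalls.incrementAndGet();
            cachedRowCount = 42; // placeholder value, pretend it was fetched
            objectCreated = true;
        }
    }

    // Cheap path: never triggers initialization of an untouched table.
    long getCachedRowCount() {
        if (!objectCreated) {
            return UNKNOWN_ROW_COUNT;
        }
        return cachedRowCount;
    }
}
```

With this shape, listing many tables that were never queried costs one flag check per table instead of one remote initialization per table.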