[enhance](iceberg) Refactor Iceberg metadata cache structure and add table cache test cases #59716
base: master
run external
run buildall
FE UT Coverage Report: Increment line coverage
TPC-H: Total hot run time: 32189 ms
TPC-DS: Total hot run time: 173580 ms
FE Regression Coverage Report: Increment line coverage
1574f3a to 2391454
run buildall
TPC-H: Total hot run time: 32499 ms
ClickBench: Total hot run time: 28.32 s
FE UT Coverage Report: Increment line coverage
FE Regression Coverage Report: Increment line coverage
run buildall
TPC-H: Total hot run time: 32768 ms
ClickBench: Total hot run time: 28.25 s
FE UT Coverage Report: Increment line coverage
run buildall
TPC-H: Total hot run time: 33103 ms
ClickBench: Total hot run time: 28.19 s
FE UT Coverage Report: Increment line coverage
FE Regression Coverage Report: Increment line coverage
Pull request overview
This PR refactors the Iceberg metadata cache structure to improve code organization, reduce memory overhead, and add comprehensive test coverage for table cache behavior. The main improvements include consolidating three separate caches into two, implementing lazy loading for snapshot cache, and fixing a spelling error in a method name.
Changes:
- Introduced `IcebergTableCacheValue` to encapsulate table metadata with a lazy-loaded snapshot cache
- Removed the redundant `snapshotListCache` and `snapshotCache`, consolidating them into `IcebergTableCacheValue`
- Renamed `getLastedIcebergSnapshot` to `getLatestIcebergSnapshot` (spelling correction)
- Added a comprehensive test suite covering DML operations, schema changes, and partition evolution
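The consolidation listed above can be pictured with a minimal sketch. It assumes one cache entry per table, with the snapshot list read from that entry instead of from a second cache, and it assumes a refresh simply drops the entry. `ConsolidatedTableCache`, `TableHandle`, and the loader function are hypothetical names for illustration, and a plain `ConcurrentHashMap` stands in for whatever cache implementation the project actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: one cache entry per table, with snapshot data derived from it.
final class ConsolidatedTableCache {

    /** Stand-in for a loaded table: a name and its snapshot ids, oldest first. */
    static final class TableHandle {
        final String name;
        final List<Long> snapshotIds;

        TableHandle(String name, List<Long> snapshotIds) {
            this.name = name;
            this.snapshotIds = List.copyOf(snapshotIds); // freeze the state seen at load time
        }
    }

    private final Map<String, TableHandle> tableCache = new ConcurrentHashMap<>();
    private final Function<String, TableHandle> loader;

    ConsolidatedTableCache(Function<String, TableHandle> loader) {
        this.loader = loader;
    }

    /** Returns the cached table, loading it on the first request. */
    TableHandle getTable(String tableName) {
        return tableCache.computeIfAbsent(tableName, loader);
    }

    /** The snapshot list is read from the cached table rather than from a second cache. */
    List<Long> getSnapshotList(String tableName) {
        return getTable(tableName).snapshotIds;
    }

    /** Dropping the single entry discards the table and everything derived from it. */
    void invalidate(String tableName) {
        tableCache.remove(tableName);
    }

    public static void main(String[] args) {
        List<Long> remote = new ArrayList<>(List.of(1L));
        ConsolidatedTableCache cache =
                new ConsolidatedTableCache(name -> new TableHandle(name, remote));

        System.out.println(cache.getSnapshotList("db.t")); // [1]
        remote.add(2L);                                     // simulated external write
        System.out.println(cache.getSnapshotList("db.t")); // [1], still served from cache
        cache.invalidate("db.t");
        System.out.println(cache.getSnapshotList("db.t")); // [1, 2] after reload
    }
}
```

With a single entry per table there is only one thing to invalidate, so the table and its snapshot list cannot drift apart the way two independently expiring caches could.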
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `IcebergTableCacheValue.java` | New class implementing lazy-loaded snapshot cache with thread-safe double-checked locking |
| `IcebergMetadataCache.java` | Simplified cache structure from 3 caches to 2; removed snapshot-specific caches and cache statistics |
| `IcebergUtils.java` | Refactored method signatures to support passing Table instances; renamed methods for consistency |
| `IcebergExternalCatalog.java` | Removed deprecated `ICEBERG_SNAPSHOT_META_CACHE_TTL_SECOND` property |
| `IcebergExternalTable.java` | Updated to use new simplified cache API methods |
| `IcebergDlaTable.java` | Updated to use new cache API for Hive-based Iceberg tables |
| `HMSExternalTable.java` | Updated to use renamed snapshot cache methods |
| `test_iceberg_table_cache.groovy` | Comprehensive new test covering cache behavior with DML, DDL, and partition operations |
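As a rough illustration of the "lazy-loaded snapshot cache with thread-safe double-checked locking" mentioned in the table, the sketch below shows the general pattern in isolation. It is not the code from this PR: `LazySnapshotHolder`, its generic parameters, and the supplier-based loader are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical sketch of a cache value that loads its snapshot data on first access.
final class LazySnapshotHolder<T, S> {
    private final T table;
    private final Supplier<S> snapshotLoader;
    // volatile so a fully built value is visible to threads that skip the lock
    private volatile S snapshot;

    LazySnapshotHolder(T table, Supplier<S> snapshotLoader) {
        this.table = table;
        this.snapshotLoader = snapshotLoader;
    }

    T getTable() {
        return table;
    }

    S getSnapshot() {
        S local = snapshot;                 // first check, without the lock
        if (local == null) {
            synchronized (this) {
                local = snapshot;           // second check, under the lock
                if (local == null) {
                    local = snapshotLoader.get();
                    snapshot = local;       // publish once; later calls skip the lock
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        LazySnapshotHolder<String, String> holder =
                new LazySnapshotHolder<>("db.t", () -> "snapshot-" + loads.incrementAndGet());

        System.out.println(loads.get());          // 0: nothing loaded yet
        System.out.println(holder.getSnapshot()); // snapshot-1
        System.out.println(holder.getSnapshot()); // snapshot-1, served from the field
        System.out.println(loads.get());          // 1
    }
}
```

The point of the pattern is that callers who only need the table never pay for snapshot loading, while concurrent callers who do need it trigger the load at most once. One caveat of this particular sketch: a loader that returns `null` would be invoked again on every call.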
What problem does this PR solve?
Description
Changes
This PR refactors the Iceberg metadata cache structure to improve code organization and adds comprehensive test cases for table cache behavior.
Main Changes
1. Refactored IcebergMetadataCache
- Introduced `IcebergTableCacheValue` to encapsulate table-related metadata
- Removed the separate `snapshotListCache` and `snapshotCache`
- Consolidated snapshot data into `IcebergTableCacheValue` with lazy loading
- Reduced the cache structure to `tableCache` and `viewCache`
Before:
After:
2. Lazy Loading for Snapshot Cache
- Snapshot cache is loaded on first access through `IcebergTableCacheValue.getSnapshotCacheValue()`
3. Simplified Cache API
- `getIcebergTable()`: Returns the Table object directly from `IcebergTableCacheValue`
- `getSnapshotCache()`: Returns the snapshot cache value with lazy loading
- `getSnapshotList()`: Returns the snapshot list from the Table object
4. Test Cases
- Added `test_iceberg_table_cache` to verify cache behavior
Benefits
- A single `IcebergTableCacheValue` instead of multiple separate caches
Test Results
- `test_iceberg_table_cache.groovy`
- `REFRESH TABLE`
Related Files
Core Changes:
- `IcebergMetadataCache.java` - Refactored cache structure
- `IcebergTableCacheValue.java` - New class to encapsulate table metadata
- `IcebergExternalCatalog.java` - Updated cache-related configurations
Tests:
- `test_iceberg_table_cache.groovy` - Comprehensive cache behavior tests
- `Suite.groovy` - Updated `getSparkIcebergContainerName()` implementation
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)