[core] support decouple the delta files lifecycle #3178
Perhaps it's best for us to have an abstract mechanism to ensure that we don't keep both the changelog and delta files at the same time, so that we can better understand this set of things.
078b05a to fa73cb7
I extract this logic to In this PR, I also refactor the
9c06546 to 2ae1028
    // expire
    checkAnswer(
-     spark.sql("CALL paimon.sys.expire_snapshots(table => 'test.T', retain_max => 2)"),
+     spark.sql("CALL paimon.sys.expire_snapshots(table => 'test.T', retain_max => 2, retain_min => 1)"),
Before, if the user did not specify retain_min, the default value was 1 in ExpireSnapshotsImpl. Now the default falls back to CoreOptions.SNAPSHOT_RETAIN_MIN = 10, so if retain_max is 2 we have to specify retain_min => 1 explicitly. I think the current behavior is more consistent, but I'm not sure whether it breaks compatibility. Please also help check this. cc @JingsongLi
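For readers following the thread, here is a rough sketch of the fallback described in this comment. It is not the code from this PR: `RetainOptionsSketch`, `resolveRetainMin`, and `isConsistent` are hypothetical names, and the `retainMin <= retainMax` check is an assumption about why the test has to pass `retain_min => 1` once the default becomes 10.

```java
// Hypothetical sketch only; names and the consistency check are assumptions.
final class RetainOptionsSketch {
    // Mirrors the value quoted in the comment: CoreOptions.SNAPSHOT_RETAIN_MIN = 10.
    static final int TABLE_DEFAULT_RETAIN_MIN = 10;

    // Use the procedure argument when given, otherwise fall back to the table default.
    static int resolveRetainMin(Integer explicitRetainMin) {
        return explicitRetainMin != null ? explicitRetainMin : TABLE_DEFAULT_RETAIN_MIN;
    }

    // Assumed rule: the minimum number of retained snapshots must not exceed the maximum.
    static boolean isConsistent(int retainMin, int retainMax) {
        return retainMin <= retainMax;
    }
}
```

Under these assumptions, `retain_max => 2` with no `retain_min` resolves to a minimum of 10, which is inconsistent with a maximum of 2, while `retain_max => 2, retain_min => 1` is consistent.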
Already merged via three PRs.
Purpose
This PR adds support for decoupling the delta files lifecycle (#2899).
The basic idea behind this is that:
- Introduce `FileSource` in `DataFileMeta` to indicate whether a file was generated as an `APPEND` or `COMPACT` file
- `APPEND` files in data file
- `base` and `delta` manifest files for the `none` producer are also postponed for deletion

About why we need `FileSource` in `DataFileMeta`

For the `none` changelog producer, only `APPEND` commits are required for stream read. In a `COMPACT` commit, files that came from either a compact commit or an append commit can be marked as deleted. We should delete the files that came from the compact commit and keep the files that came from the append commit for further stream reads, so we need a flag to distinguish the file source (compact or append).

Linked issue: close #xxx
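To make the role of the file source flag concrete, here is a minimal sketch of the split described above. It is not the implementation in this PR: `FileSourceSketch`, `FileEntry`, `deleteNow`, and `keepForStreamRead` are hypothetical names, and only the `APPEND`/`COMPACT` distinction comes from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; the real DataFileMeta carries many more fields.
final class FileSourceSketch {
    // The two sources named in the PR description.
    enum FileSource { APPEND, COMPACT }

    // Simplified stand-in for a data file entry that a COMPACT commit marked as deleted.
    static final class FileEntry {
        final String name;
        final FileSource source;

        FileEntry(String name, FileSource source) {
            this.name = name;
            this.source = source;
        }
    }

    // Collect the names of entries whose source matches the wanted one.
    private static List<String> withSource(List<FileEntry> markedAsDeleted, FileSource wanted) {
        List<String> names = new ArrayList<>();
        for (FileEntry entry : markedAsDeleted) {
            if (entry.source == wanted) {
                names.add(entry.name);
            }
        }
        return names;
    }

    // Files produced by compaction are not needed for stream read, so they can be deleted.
    static List<String> deleteNow(List<FileEntry> markedAsDeleted) {
        return withSource(markedAsDeleted, FileSource.COMPACT);
    }

    // Files produced by append commits are kept so that stream read can still consume them.
    static List<String> keepForStreamRead(List<FileEntry> markedAsDeleted) {
        return withSource(markedAsDeleted, FileSource.APPEND);
    }
}
```

Without the flag, both kinds of entries would look the same in a `COMPACT` commit, and there would be no way to tell which deleted files a streaming reader might still need.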
Tests
API and Format
Introduce `FileSource` in `DataFileMeta`
Documentation