feat: introduce scan.tag-name option to specify scanning from tag for reading given tag #85
Open
SteNicholas wants to merge 2 commits into alibaba:main from SteNicholas:PAIMON-83
49 changes: 49 additions & 0 deletions
src/paimon/core/table/source/snapshot/static_from_tag_starting_scanner.h
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| /* | ||
| * Copyright 2026-present Alibaba Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #pragma once | ||
|
|
||
| #include <memory> | ||
| #include <string> | ||
|
|
||
| #include "paimon/core/table/source/snapshot/starting_scanner.h" | ||
| #include "paimon/core/utils/tag_manager.h" | ||
|
|
||
| namespace paimon { | ||
| /// `StartingScanner` for a batch read from the tag named by | ||
| /// `CoreOptions::GetScanTagName()`. | ||
| class StaticFromTagStartingScanner : public StartingScanner { | ||
| public: | ||
| StaticFromTagStartingScanner(const std::shared_ptr<SnapshotManager>& snapshot_manager, | ||
| const std::string& tag_name) | ||
| : StartingScanner(snapshot_manager), tag_name_(tag_name) {} | ||
|
|
||
| Result<std::shared_ptr<ScanResult>> Scan( | ||
| const std::shared_ptr<SnapshotReader>& snapshot_reader) override { | ||
| const TagManager tag_manager(snapshot_manager_->Fs(), snapshot_manager_->RootPath(), | ||
| snapshot_manager_->Branch()); | ||
| PAIMON_ASSIGN_OR_RAISE(const Tag tag, tag_manager.GetOrThrow(tag_name_)); | ||
| PAIMON_ASSIGN_OR_RAISE(const Snapshot snapshot, tag.TrimToSnapshot()); | ||
| PAIMON_ASSIGN_OR_RAISE( | ||
| std::shared_ptr<Plan> plan, | ||
| snapshot_reader->WithMode(ScanMode::ALL)->WithSnapshot(snapshot)->Read()); | ||
| return std::make_shared<StartingScanner::CurrentSnapshot>(plan); | ||
| } | ||
|
|
||
| private: | ||
| std::string tag_name_; | ||
| }; | ||
| } // namespace paimon | ||
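The `Scan()` method above has three steps: look up the tag by name, trim it to a plain snapshot, and plan a full read of that snapshot. The following is a minimal sketch of that flow, not the PR's implementation. `SnapshotSketch`, `TagSketch`, `TagStoreSketch`, and `PlanFromTag` are hypothetical stand-ins, and `std::optional` replaces the project's `Result<T>` so the block compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for Snapshot; the real class carries many more fields.
struct SnapshotSketch {
    int64_t id = 0;
    int64_t schema_id = 0;
};

// Hypothetical stand-in for Tag: a snapshot plus tag-only metadata.
struct TagSketch : SnapshotSketch {
    std::optional<int64_t> tag_create_time;

    // Drop the tag-only metadata and keep just the snapshot part.
    SnapshotSketch TrimToSnapshot() const {
        SnapshotSketch snapshot;
        snapshot.id = id;
        snapshot.schema_id = schema_id;
        return snapshot;
    }
};

// Hypothetical in-memory replacement for TagManager, keyed by tag name.
class TagStoreSketch {
 public:
    void Put(const std::string& name, const TagSketch& tag) { tags_[name] = tag; }

    std::optional<TagSketch> Get(const std::string& name) const {
        auto it = tags_.find(name);
        if (it == tags_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

 private:
    std::map<std::string, TagSketch> tags_;
};

// Follows the same order as Scan(): resolve the tag, trim it, then report which
// snapshot a full read would target. Returns nullopt when the tag does not exist.
std::optional<int64_t> PlanFromTag(const TagStoreSketch& store, const std::string& tag_name) {
    assert(!tag_name.empty());
    std::optional<TagSketch> tag = store.Get(tag_name);
    if (!tag.has_value()) {
        return std::nullopt;
    }
    SnapshotSketch snapshot = tag->TrimToSnapshot();
    return snapshot.id;
}
```

In this sketch a missing tag yields an empty optional. In the PR, the error is instead propagated through `PAIMON_ASSIGN_OR_RAISE`.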
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,115 @@ | ||
| /* | ||
| * Copyright 2026-present Alibaba Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #include "paimon/core/tag/tag.h" | ||
|
|
||
| #include <cassert> | ||
| #include <stdexcept> | ||
| #include <utility> | ||
|
|
||
| #include "paimon/common/utils/rapidjson_util.h" | ||
| #include "paimon/fs/file_system.h" | ||
| #include "paimon/result.h" | ||
| #include "paimon/status.h" | ||
| #include "rapidjson/allocators.h" | ||
| #include "rapidjson/document.h" | ||
| #include "rapidjson/rapidjson.h" | ||
|
|
||
| namespace paimon { | ||
|
|
||
| Tag::Tag(const std::optional<int32_t>& version, int64_t id, int64_t schema_id, | ||
| const std::string& base_manifest_list, | ||
| const std::optional<int64_t>& base_manifest_list_size, | ||
| const std::string& delta_manifest_list, | ||
| const std::optional<int64_t>& delta_manifest_list_size, | ||
| const std::optional<std::string>& changelog_manifest_list, | ||
| const std::optional<int64_t>& changelog_manifest_list_size, | ||
| const std::optional<std::string>& index_manifest, const std::string& commit_user, | ||
| int64_t commit_identifier, CommitKind commit_kind, int64_t time_millis, | ||
| const std::optional<std::map<int32_t, int64_t>>& log_offsets, | ||
| const std::optional<int64_t>& total_record_count, | ||
| const std::optional<int64_t>& delta_record_count, | ||
| const std::optional<int64_t>& changelog_record_count, | ||
| const std::optional<int64_t>& watermark, const std::optional<std::string>& statistics, | ||
| const std::optional<std::map<std::string, std::string>>& properties, | ||
| const std::optional<int64_t>& next_row_id, const std::optional<int64_t>& tag_create_time, | ||
| const std::optional<int64_t>& tag_time_retained) | ||
| : Snapshot(version, id, schema_id, base_manifest_list, base_manifest_list_size, | ||
| delta_manifest_list, delta_manifest_list_size, changelog_manifest_list, | ||
| changelog_manifest_list_size, index_manifest, commit_user, commit_identifier, | ||
| commit_kind, time_millis, log_offsets, total_record_count, delta_record_count, | ||
| changelog_record_count, watermark, statistics, properties, next_row_id), | ||
| tag_create_time_(tag_create_time), | ||
| tag_time_retained_(tag_time_retained) {} | ||
|
|
||
| bool Tag::operator==(const Tag& other) const { | ||
| if (this == &other) { | ||
| return true; | ||
| } | ||
| return Snapshot::operator==(other) && tag_create_time_ == other.tag_create_time_ && | ||
| tag_time_retained_ == other.tag_time_retained_; | ||
| } | ||
|
|
||
| bool Tag::TEST_Equal(const Tag& other) const { | ||
| if (this == &other) { | ||
| return true; | ||
| } | ||
|
|
||
| return Snapshot::TEST_Equal(other) && tag_create_time_ == other.tag_create_time_ && | ||
| tag_time_retained_ == other.tag_time_retained_; | ||
| } | ||
|
|
||
| Result<Snapshot> Tag::TrimToSnapshot() const { | ||
| return Snapshot(Version(), Id(), SchemaId(), BaseManifestList(), BaseManifestListSize(), | ||
| DeltaManifestList(), DeltaManifestListSize(), ChangelogManifestList(), | ||
| ChangelogManifestListSize(), IndexManifest(), CommitUser(), CommitIdentifier(), | ||
| GetCommitKind(), TimeMillis(), LogOffsets(), TotalRecordCount(), | ||
| DeltaRecordCount(), ChangelogRecordCount(), Watermark(), Statistics(), | ||
| Properties(), NextRowId()); | ||
| } | ||
|
|
||
| rapidjson::Value Tag::ToJson(rapidjson::Document::AllocatorType* allocator) const noexcept(false) { | ||
| rapidjson::Value obj(rapidjson::kObjectType); | ||
| obj = Snapshot::ToJson(allocator); | ||
| if (tag_create_time_ != std::nullopt) { | ||
| obj.AddMember(rapidjson::StringRef(FIELD_TAG_CREATE_TIME), | ||
| RapidJsonUtil::SerializeValue(tag_create_time_.value(), allocator).Move(), | ||
| *allocator); | ||
| } | ||
| if (tag_time_retained_ != std::nullopt) { | ||
| obj.AddMember(rapidjson::StringRef(FIELD_TAG_TIME_RETAINED), | ||
| RapidJsonUtil::SerializeValue(tag_time_retained_.value(), allocator).Move(), | ||
| *allocator); | ||
| } | ||
| return obj; | ||
| } | ||
|
|
||
| void Tag::FromJson(const rapidjson::Value& obj) noexcept(false) { | ||
| Snapshot::FromJson(obj); | ||
| tag_create_time_ = | ||
| RapidJsonUtil::DeserializeKeyValue<std::optional<int64_t>>(obj, FIELD_TAG_CREATE_TIME); | ||
| tag_time_retained_ = | ||
| RapidJsonUtil::DeserializeKeyValue<std::optional<int64_t>>(obj, FIELD_TAG_TIME_RETAINED); | ||
| } | ||
|
|
||
| Result<Tag> Tag::FromPath(const std::shared_ptr<FileSystem>& fs, const std::string& path) { | ||
| std::string json_str; | ||
| PAIMON_RETURN_NOT_OK(fs->ReadFile(path, &json_str)); | ||
| Tag tag; | ||
| PAIMON_RETURN_NOT_OK(RapidJsonUtil::FromJsonString(json_str, &tag)); | ||
| return tag; | ||
| } | ||
| } // namespace paimon |
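`Tag::ToJson` writes `tagCreateTime` and `tagTimeRetained` only when they hold a value, and `Tag::FromJson` reads them back as optionals. The sketch below illustrates that round trip without depending on rapidjson. `JsonObjectSketch`, `TagFieldsSketch`, and the helper functions are hypothetical names, and a `std::map` stands in for the JSON object.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a JSON object that holds only integer members.
using JsonObjectSketch = std::map<std::string, int64_t>;

// Only the two tag-specific fields; the snapshot fields are omitted here.
struct TagFieldsSketch {
    std::optional<int64_t> tag_create_time;
    std::optional<int64_t> tag_time_retained;
};

// Write a member only when the optional holds a value.
JsonObjectSketch ToJsonSketch(const TagFieldsSketch& tag) {
    JsonObjectSketch obj;
    if (tag.tag_create_time.has_value()) {
        obj["tagCreateTime"] = *tag.tag_create_time;
    }
    if (tag.tag_time_retained.has_value()) {
        obj["tagTimeRetained"] = *tag.tag_time_retained;
    }
    return obj;
}

// A missing key reads back as an empty optional.
std::optional<int64_t> ReadOptional(const JsonObjectSketch& obj, const std::string& key) {
    auto it = obj.find(key);
    if (it == obj.end()) {
        return std::nullopt;
    }
    return it->second;
}

TagFieldsSketch FromJsonSketch(const JsonObjectSketch& obj) {
    TagFieldsSketch tag;
    tag.tag_create_time = ReadOptional(obj, "tagCreateTime");
    tag.tag_time_retained = ReadOptional(obj, "tagTimeRetained");
    return tag;
}

// Usage: an unset field is not written and reads back as unset.
void CheckRoundTrip() {
    TagFieldsSketch in;
    in.tag_create_time = 42;
    TagFieldsSketch out = FromJsonSketch(ToJsonSketch(in));
    assert(out.tag_create_time == in.tag_create_time);
    assert(!out.tag_time_retained.has_value());
}
```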
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,86 @@ | ||
| /* | ||
| * Copyright 2026-present Alibaba Inc. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
|
|
||
| #pragma once | ||
|
|
||
| #include <cstdint> | ||
| #include <limits> | ||
| #include <map> | ||
| #include <memory> | ||
| #include <optional> | ||
| #include <string> | ||
|
|
||
| #include "paimon/common/utils/jsonizable.h" | ||
| #include "paimon/core/snapshot.h" | ||
| #include "paimon/result.h" | ||
| #include "rapidjson/allocators.h" | ||
| #include "rapidjson/document.h" | ||
| #include "rapidjson/rapidjson.h" | ||
|
|
||
| namespace paimon { | ||
| class FileSystem; | ||
|
|
||
| /// Snapshot with tagCreateTime and tagTimeRetained. | ||
| class Tag : public Snapshot { | ||
| public: | ||
| static constexpr char FIELD_TAG_CREATE_TIME[] = "tagCreateTime"; | ||
| static constexpr char FIELD_TAG_TIME_RETAINED[] = "tagTimeRetained"; | ||
|
|
||
| JSONIZABLE_FRIEND_AND_DEFAULT_CTOR(Tag); | ||
|
|
||
| Tag(const std::optional<int32_t>& version, int64_t id, int64_t schema_id, | ||
| const std::string& base_manifest_list, | ||
| const std::optional<int64_t>& base_manifest_list_size, | ||
| const std::string& delta_manifest_list, | ||
| const std::optional<int64_t>& delta_manifest_list_size, | ||
| const std::optional<std::string>& changelog_manifest_list, | ||
| const std::optional<int64_t>& changelog_manifest_list_size, | ||
| const std::optional<std::string>& index_manifest, const std::string& commit_user, | ||
| int64_t commit_identifier, CommitKind commit_kind, int64_t time_millis, | ||
| const std::optional<std::map<int32_t, int64_t>>& log_offsets, | ||
| const std::optional<int64_t>& total_record_count, | ||
| const std::optional<int64_t>& delta_record_count, | ||
| const std::optional<int64_t>& changelog_record_count, | ||
| const std::optional<int64_t>& watermark, const std::optional<std::string>& statistics, | ||
| const std::optional<std::map<std::string, std::string>>& properties, | ||
| const std::optional<int64_t>& next_row_id, const std::optional<int64_t>& tag_create_time, | ||
| const std::optional<int64_t>& tag_time_retained); | ||
|
|
||
| bool operator==(const Tag& other) const; | ||
| bool TEST_Equal(const Tag& other) const; | ||
|
|
||
| std::optional<int64_t> TagCreateTime() const { | ||
| return tag_create_time_; | ||
| } | ||
|
|
||
| std::optional<int64_t> TagTimeRetained() const { | ||
| return tag_time_retained_; | ||
| } | ||
|
|
||
| Result<Snapshot> TrimToSnapshot() const; | ||
|
|
||
| rapidjson::Value ToJson(rapidjson::Document::AllocatorType* allocator) const | ||
| noexcept(false) override; | ||
|
|
||
| void FromJson(const rapidjson::Value& obj) noexcept(false) override; | ||
|
|
||
| static Result<Tag> FromPath(const std::shared_ptr<FileSystem>& fs, const std::string& path); | ||
|
|
||
| private: | ||
| std::optional<int64_t> tag_create_time_; | ||
| std::optional<int64_t> tag_time_retained_; | ||
| }; | ||
| } // namespace paimon | ||