[Feature] Support Efficient Compaction for Storage Optimization and Query Performance #93

### Search before asking

- [x] I searched in the [issues](https://github.com/alibaba/paimon-cpp/issues) and found nothing similar.


### Motivation

Compaction is essential for maintaining high performance and storage efficiency in modern data systems. Key benefits include:
- **For Append Tables**: Reduces the number of small files by merging existing data files, improving scan performance and metadata scalability.
- **For Primary Key (PK) Tables**: Minimizes the number of segments that must be merged at read time (`merge-on-read`), which speeds up queries.
- **For PK+DV Tables**: Enables writing DV (`Deletion Vector`) files that mark outdated rows, so reads can filter those rows out instead of merging them.

Currently, paimon-cpp has no dedicated compaction mechanism, which limits our ability to optimize storage layout and query latency.

### Solution

The compaction framework should support the following capabilities:
1. **Support for both append tables and primary key (PK) tables**, with appropriate strategies for each;
2. **Execution via background tasks or manual triggers**, allowing flexibility in operation;
3. **Built-in basic compaction policies** aligned with Java Paimon;
4. **Generation of Deletion Vector (DV) files** during/after compaction to track stale rows;
5. **Design support for data-evolution scenarios**, including both vertical compaction (merging small files) and horizontal compaction (consolidating partial-column files);
6. **Output data format fully compatible with Java Paimon**.
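
To make the append-table case concrete, here is a minimal sketch of how a basic small-file policy could plan compaction tasks. Everything in it is an assumption for illustration: `FileEntry`, `PlanAppendCompaction`, `target_size`, and `min_files` are hypothetical names, not existing paimon-cpp APIs, and the greedy grouping is a simplification rather than the policy Java Paimon actually uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified file descriptor (not a paimon-cpp type).
struct FileEntry {
    std::string name;
    int64_t size_bytes;
};

// Greedily groups files smaller than target_size into compaction tasks.
// A group is closed as soon as its total size reaches target_size.
// The trailing, unfilled group is emitted only if it holds at least
// min_files files, so a lone small file is not rewritten for nothing.
std::vector<std::vector<FileEntry>> PlanAppendCompaction(
    const std::vector<FileEntry>& files, int64_t target_size,
    std::size_t min_files) {
    std::vector<std::vector<FileEntry>> tasks;
    std::vector<FileEntry> current;
    int64_t current_bytes = 0;

    for (const auto& file : files) {
        if (file.size_bytes >= target_size) {
            continue;  // already large enough, leave it untouched
        }
        current.push_back(file);
        current_bytes += file.size_bytes;
        if (current_bytes >= target_size) {
            tasks.push_back(std::move(current));
            current.clear();  // reset the moved-from vector to empty
            current_bytes = 0;
        }
    }
    if (current.size() >= min_files) {
        tasks.push_back(std::move(current));
    }
    return tasks;
}
```

Calling `PlanAppendCompaction(files, 100, 2)` on files of sizes 10, 200, 60, 50, and 5 would skip the 200-byte file, group the 10-, 60-, and 50-byte files into one task, and leave the final 5-byte file alone. The same planner could be driven by either a background task or a manual trigger, since it only decides which files to merge.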

### Anything else?

_No response_

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!
