@huyuanfeng2018
Contributor

Purpose

Linked issue: close #xxx

In Paimon's $branches system table, the previous implementation did not support pushing down filters on branch_name when a user queries specific branches.

This means that even if we only want one or a few specific branches, Paimon reads the information for all branches and leaves the filtering to the compute engine.

This change optimizes queries on the $branches table by pushing down filter conditions on the branch_name field, so that branches are filtered by name directly at the storage layer and information for branches the query does not need is never read.
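
To illustrate the idea, here is a minimal sketch of filtering by name before any per-branch information is read. It is not the code in this PR: BranchPushdownSketch, readBranches, and loadBranchInfo are hypothetical names rather than Paimon APIs, and the pushed-down predicate is assumed to have already been reduced to a set of wanted branch names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: none of these names are Paimon APIs.
class BranchPushdownSketch {

    // Counts metadata loads so the effect of the pushed-down filter is visible.
    static int loads = 0;

    // Stand-in for the expensive step of reading one branch's information.
    static String loadBranchInfo(String branchName) {
        loads++;
        return "info-for-" + branchName;
    }

    // wantedNames == null means no branch_name filter was pushed down,
    // so every branch is loaded and the engine filters afterwards.
    static Map<String, String> readBranches(List<String> allNames, Set<String> wantedNames) {
        Map<String, String> rows = new LinkedHashMap<>();
        for (String name : allNames) {
            if (wantedNames != null && !wantedNames.contains(name)) {
                continue; // skipped before any branch information is read
            }
            rows.put(name, loadBranchInfo(name));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> all = List.of("hour_00", "hour_01", "hour_02");
        // Equivalent in spirit to: WHERE branch_name = 'hour_01'
        Map<String, String> rows = readBranches(all, Set.of("hour_01"));
        System.out.println(rows.keySet() + " loaded with " + loads + " read(s)");
    }
}
```

With the filter applied, the number of reads follows the number of requested branches rather than the total number of branches, which is where the saving comes from when a table has many branches.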

Tests

org.apache.paimon.flink.BranchSqlITCase#testBranchesTableFilter, with more cases added to the method.

API and Format

Documentation

@huyuanfeng2018
Contributor Author

@JingsongLi Could you take some time to help me review this PR? Thank you so much~

@JingsongLi
Contributor

Thanks @huyuanfeng2018 for the contribution. Do your scenarios involve many branches? I rarely see cases with many branches, so I had not optimized this.

@huyuanfeng2018
Contributor Author

When using CDC synchronization, we create a Paimon branch every hour to split the hourly snapshots precisely (tags can't do that), so there will be many branches.

@JingsongLi
Contributor

Got it, I will take a look~

Contributor

@JingsongLi JingsongLi left a comment

+1

@JingsongLi JingsongLi merged commit 3199bcb into apache:master Sep 11, 2025
19 of 23 checks passed
@huyuanfeng2018 huyuanfeng2018 deleted the branch-push-filter branch September 11, 2025 12:30
jerry-024 added a commit to jerry-024/paimon that referenced this pull request Sep 15, 2025
* upstream/master: (23 commits)
  [flink] Use Paimon format table read for flink (apache#6246)
  [core] Ensure system tables use the correct identifier for loadTableToken in RESTTokenFileIO. (apache#6247)
  [spark] Enhance v1 write merge schema test coverage (apache#6249)
  [spark] Eliminate duplicate convertLiteral invocations (apache#6250)
  [python] Fix failing to read 1000cols (apache#6244)
  [python] Expose CatalogFactory and Schema directly (apache#6243)
  [doc] Modify Python API to JVM free (apache#6242)
  [python] Fix multiple write brefore once commit  (apache#6241)
  [core] Support push down branchesTable by branchName (apache#6231)
  [cdc] Fix PostgreSQL DECIMAL type conversion issue (apache#6239)
  [arrow] Optimize Arrow string write performance (apache#6240)
  [core] Fix checkpoint recovery failure for compacted changelog files (apache#6173)
  [core] RESTCatalog: add DLF OSS endpoint support and improve configuration merge (apache#6232)
  [core] fix RESTCatalog#listViews for system database (apache#6233)
  [core] Introduce 'ignore-update-before' to ignore UD only (apache#6235)
  [python] Fix DLF partition statistical error (apache#6237)
  [python] Add _VALUE_STATS_COLS param to fix parse wrong bytes (apache#6234)
  [ci] Rename to Python Check Code Style and Test
  [python] Rename binary row to generic row
  [hotfix] Remove methods in SchemaManager for SchemasTable
  ...
zhuyufeng0809 pushed a commit to zhuyufeng0809/flink-table-store that referenced this pull request Sep 18, 2025