Expand the functionality of Local Runtime Filter  #7891

@yibin87

Enhancement

The current Local Runtime Filter design can be found here. However, it gives no noticeable performance improvement on the TPCH 100 benchmark. After some quick experiments, we confirmed that expanding the current local runtime filter can achieve a 10%+ performance improvement:

  1. Expand the scope of the current local runtime filter to include broadcast hash joins that span multiple tasks. This can reduce some exchange cost.
    For example, in A Join (Agg_on_B), A is chosen as the build table and is broadcast to all TiFlash probe nodes. B first goes through a two-phase agg operator, and the agg's output is used as the probe side. This case doesn't match the current local runtime filter pattern, so no runtime filters are generated. By expanding the scope, we can push the filter down to the TableScan of B, reducing the exchange cost introduced by the two-phase agg.

  2. In TiFlash, push the runtime filter down to the storage layer using late materialization techniques.
    In TiFlash, current runtime filters take effect as RS operators in the storage layer. An RS operator uses the min-max index to implement the RS-IN filter operation. The RS-IN filter filters poorly in TPCH cases, even when the Real-IN filter is highly selective. For example, suppose one column "pack" contains 1024 integer values, from 1 to 1024, with no duplicates, and the pushed-down IN filter contains {3, 10, 1023}. With the RS-IN filter, the whole pack passes and no data is filtered; with the Real-IN filter, only 3 values pass.
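
The pack example in item 2 can be reproduced with a small sketch. This is illustrative Python, not TiFlash code: `rs_in_filter` and `real_in_filter` are hypothetical names, and the min-max check is a simplified assumption about how a pack-level index decides whether a pack may contain a match.

```python
from typing import Iterable, List, Sequence


def rs_in_filter(pack: Sequence[int], in_values: Iterable[int]) -> List[int]:
    """Coarse pack-level check (assumed model of a min-max index).

    The pack is kept whole if any IN value falls inside [min, max];
    otherwise the whole pack is skipped.
    """
    lo, hi = min(pack), max(pack)
    if any(lo <= v <= hi for v in in_values):
        return list(pack)  # every row of the pack is read
    return []


def real_in_filter(pack: Sequence[int], in_values: Iterable[int]) -> List[int]:
    """Row-level IN filter: keep only rows whose value is in the IN set."""
    wanted = set(in_values)
    return [v for v in pack if v in wanted]


# One pack holding 1..1024 with no duplicates, as in the example above.
pack = list(range(1, 1025))
in_values = [3, 10, 1023]

rs_rows = rs_in_filter(pack, in_values)
real_rows = real_in_filter(pack, in_values)

print(len(rs_rows), len(real_rows))  # 1024 3
```

Because the IN values fall inside the pack's [min, max] range, the coarse check cannot exclude the pack, while the row-level filter keeps only the three matching rows.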

TiDB & TiPB

  • Update Runtime Filter tipb protocol to add 'apply_late_materialization' flag.
  • Recognize broadcast hash join across fragments as Runtime Filter Sources, and set 'apply_late_materialization' flag when exchange cost can be reduced.

TiFlash

  • Change Runtime Filter Manager from MPPTask level to Query level
  • Introduce a new function used as a placeholder at compilation time; it does nothing when the Runtime Filter fails and applies the IN filter when the Runtime Filter succeeds
  • Build push down filters when the 'apply_late_materialization' flag is set in Runtime Filter.
  • Endless/FullStack tests
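
The placeholder behaviour in the TiFlash list can be sketched as below. This is a minimal illustration in Python under stated assumptions, not the proposed TiFlash implementation: the class `RuntimeFilterPlaceholder` and its methods are hypothetical names, and "does nothing" is modelled as passing every row through.

```python
from typing import FrozenSet, Iterable, List, Optional, Sequence


class RuntimeFilterPlaceholder:
    """Hypothetical stand-in for a filter function planned before the
    runtime filter's values are known."""

    def __init__(self) -> None:
        # None means the runtime filter has failed or is not available.
        self._in_set: Optional[FrozenSet[int]] = None

    def on_succeeded(self, build_values: Iterable[int]) -> None:
        """Runtime filter was built: switch to a real IN filter."""
        self._in_set = frozenset(build_values)

    def on_failed(self) -> None:
        """Runtime filter failed: fall back to a no-op."""
        self._in_set = None

    def apply(self, column: Sequence[int]) -> List[bool]:
        """Return a per-row mask; True means the row is kept."""
        if self._in_set is None:
            return [True] * len(column)  # no-op: keep every row
        return [v in self._in_set for v in column]


placeholder = RuntimeFilterPlaceholder()
column = [1, 2, 3, 4]

mask_before = placeholder.apply(column)   # filter not available: all rows kept
placeholder.on_succeeded([2, 4])
mask_after = placeholder.apply(column)    # IN filter applied
placeholder.on_failed()
mask_failed = placeholder.apply(column)   # back to no-op
```

Keeping the fallback as "keep every row" means a failed runtime filter never changes query results; it only loses the optimization.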
