Conversation

@alamb
Contributor

@alamb alamb commented Nov 21, 2025

(I am using this PR to test; I don't intend to merge it yet.)

Which issue does this PR close?

Rationale for this change

We have made nontrivial progress on filter representation in Parquet. Let's see where performance is now.

What changes are included in this PR?

Are these changes tested?

By CI tests

Are there any user-facing changes?

@github-actions github-actions bot added the documentation, optimizer, core, sqllogictest, common, proto, and datasource labels Nov 21, 2025
@alamb
Contributor Author

alamb commented Nov 21, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/pushdown_filters_test (8a06d63) to d65fb86 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Nov 21, 2025

🤖: Benchmark completed

Details

Comparing HEAD and alamb_pushdown_filters_test
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_pushdown_filters_test ┃         Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2704.76 ms │                  2693.83 ms │      no change │
│ QQuery 1     │  1211.14 ms │                  1278.46 ms │   1.06x slower │
│ QQuery 2     │  2313.54 ms │                  2408.79 ms │      no change │
│ QQuery 3     │  1159.95 ms │                  1156.61 ms │      no change │
│ QQuery 4     │  2336.83 ms │                  2351.37 ms │      no change │
│ QQuery 5     │ 28195.81 ms │                 27958.12 ms │      no change │
│ QQuery 6     │  4058.62 ms │                   108.14 ms │ +37.53x faster │
│ QQuery 7     │  3742.15 ms │                  3701.67 ms │      no change │
└──────────────┴─────────────┴─────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 45722.81ms │
│ Total Time (alamb_pushdown_filters_test)   │ 41656.99ms │
│ Average Time (HEAD)                        │  5715.35ms │
│ Average Time (alamb_pushdown_filters_test) │  5207.12ms │
│ Queries Faster                             │          1 │
│ Queries Slower                             │          1 │
│ Queries with No Change                     │          6 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_pushdown_filters_test ┃         Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.34 ms │                     2.61 ms │   1.12x slower │
│ QQuery 1     │    50.62 ms │                    51.87 ms │      no change │
│ QQuery 2     │   136.14 ms │                   136.47 ms │      no change │
│ QQuery 3     │   162.09 ms │                   160.21 ms │      no change │
│ QQuery 4     │  1098.61 ms │                  1095.76 ms │      no change │
│ QQuery 5     │  1477.67 ms │                  1491.19 ms │      no change │
│ QQuery 6     │     2.17 ms │                     2.16 ms │      no change │
│ QQuery 7     │    54.19 ms │                    68.48 ms │   1.26x slower │
│ QQuery 8     │  1444.55 ms │                  1422.06 ms │      no change │
│ QQuery 9     │  1890.35 ms │                  1852.51 ms │      no change │
│ QQuery 10    │   363.24 ms │                   484.80 ms │   1.33x slower │
│ QQuery 11    │   427.02 ms │                   532.87 ms │   1.25x slower │
│ QQuery 12    │  1340.34 ms │                  1545.03 ms │   1.15x slower │
│ QQuery 13    │  2077.33 ms │                  2280.93 ms │   1.10x slower │
│ QQuery 14    │  1274.21 ms │                  1448.68 ms │   1.14x slower │
│ QQuery 15    │  1277.86 ms │                  1261.35 ms │      no change │
│ QQuery 16    │  2677.07 ms │                  2704.20 ms │      no change │
│ QQuery 17    │  2662.55 ms │                  2692.03 ms │      no change │
│ QQuery 18    │  5448.25 ms │                  4951.08 ms │  +1.10x faster │
│ QQuery 19    │   130.47 ms │                   143.08 ms │   1.10x slower │
│ QQuery 20    │  2003.32 ms │                  1854.73 ms │  +1.08x faster │
│ QQuery 21    │  2355.36 ms │                  2330.38 ms │      no change │
│ QQuery 22    │  3991.87 ms │                  3944.03 ms │      no change │
│ QQuery 23    │ 16491.78 ms │                  1080.71 ms │ +15.26x faster │
│ QQuery 24    │   215.96 ms │                   253.74 ms │   1.17x slower │
│ QQuery 25    │   489.11 ms │                   617.52 ms │   1.26x slower │
│ QQuery 26    │   228.11 ms │                   324.39 ms │   1.42x slower │
│ QQuery 27    │  2868.16 ms │                  2929.86 ms │      no change │
│ QQuery 28    │ 23345.61 ms │                 23660.66 ms │      no change │
│ QQuery 29    │   975.88 ms │                   994.78 ms │      no change │
│ QQuery 30    │  1333.55 ms │                  1354.66 ms │      no change │
│ QQuery 31    │  1416.85 ms │                  1351.01 ms │      no change │
│ QQuery 32    │  5442.42 ms │                  4533.41 ms │  +1.20x faster │
│ QQuery 33    │  6011.79 ms │                  5754.30 ms │      no change │
│ QQuery 34    │  6139.89 ms │                  5858.52 ms │      no change │
│ QQuery 35    │  1950.34 ms │                  1865.06 ms │      no change │
│ QQuery 36    │   121.36 ms │                    25.69 ms │  +4.72x faster │
│ QQuery 37    │    51.21 ms │                    25.18 ms │  +2.03x faster │
│ QQuery 38    │   120.24 ms │                    25.57 ms │  +4.70x faster │
│ QQuery 39    │   194.03 ms │                    25.93 ms │  +7.48x faster │
│ QQuery 40    │    43.31 ms │                    26.00 ms │  +1.67x faster │
│ QQuery 41    │    40.36 ms │                    25.77 ms │  +1.57x faster │
│ QQuery 42    │    31.97 ms │                    25.25 ms │  +1.27x faster │
└──────────────┴─────────────┴─────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 99859.52ms │
│ Total Time (alamb_pushdown_filters_test)   │ 83214.54ms │
│ Average Time (HEAD)                        │  2322.31ms │
│ Average Time (alamb_pushdown_filters_test) │  1935.22ms │
│ Queries Faster                             │         11 │
│ Queries Slower                             │         11 │
│ Queries with No Change                     │         21 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ alamb_pushdown_filters_test ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 131.35 ms │                   139.34 ms │  1.06x slower │
│ QQuery 2     │  29.56 ms │                    28.82 ms │     no change │
│ QQuery 3     │  34.98 ms │                    39.69 ms │  1.13x slower │
│ QQuery 4     │  29.36 ms │                    29.42 ms │     no change │
│ QQuery 5     │  87.73 ms │                    88.77 ms │     no change │
│ QQuery 6     │  19.59 ms │                    19.82 ms │     no change │
│ QQuery 7     │ 217.18 ms │                   221.16 ms │     no change │
│ QQuery 8     │  32.12 ms │                    34.75 ms │  1.08x slower │
│ QQuery 9     │ 103.41 ms │                   107.77 ms │     no change │
│ QQuery 10    │  64.14 ms │                    64.51 ms │     no change │
│ QQuery 11    │  19.10 ms │                    17.62 ms │ +1.08x faster │
│ QQuery 12    │  52.10 ms │                    51.44 ms │     no change │
│ QQuery 13    │  47.55 ms │                    46.22 ms │     no change │
│ QQuery 14    │  13.94 ms │                    13.57 ms │     no change │
│ QQuery 15    │  25.01 ms │                    24.55 ms │     no change │
│ QQuery 16    │  25.81 ms │                    25.46 ms │     no change │
│ QQuery 17    │ 148.10 ms │                   148.29 ms │     no change │
│ QQuery 18    │ 282.55 ms │                   280.77 ms │     no change │
│ QQuery 19    │  49.21 ms │                    38.45 ms │ +1.28x faster │
│ QQuery 20    │  50.67 ms │                    49.56 ms │     no change │
│ QQuery 21    │ 338.10 ms │                   320.93 ms │ +1.05x faster │
│ QQuery 22    │  17.73 ms │                    17.73 ms │     no change │
└──────────────┴───────────┴─────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 1819.30ms │
│ Total Time (alamb_pushdown_filters_test)   │ 1808.63ms │
│ Average Time (HEAD)                        │   82.70ms │
│ Average Time (alamb_pushdown_filters_test) │   82.21ms │
│ Queries Faster                             │         3 │
│ Queries Slower                             │         3 │
│ Queries with No Change                     │        16 │
│ Queries with Failure                       │         0 │
└────────────────────────────────────────────┴───────────┘
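For readers checking the tables by hand: the Change column appears to be the ratio of the two timings, with differences inside roughly 5% reported as "no change". The sketch below reproduces that labeling in Python. The function name `classify_change` and the 5% noise threshold are assumptions inferred from the rows above, not taken from the benchmark script itself.

```python
def classify_change(head_ms: float, branch_ms: float, noise: float = 0.05) -> str:
    """Label a timing pair the way the Change column appears to (assumed rule)."""
    ratio = branch_ms / head_ms  # > 1 means the branch is slower than HEAD
    if (1 - noise) <= ratio <= (1 + noise):
        return "no change"
    if ratio < 1:
        return f"+{1 / ratio:.2f}x faster"
    return f"{ratio:.2f}x slower"


# Rows taken from the clickbench_extended table above.
print(classify_change(4058.62, 108.14))   # QQuery 6 -> +37.53x faster
print(classify_change(1211.14, 1278.46))  # QQuery 1 -> 1.06x slower
print(classify_change(2313.54, 2408.79))  # QQuery 2 -> no change
```

Under this assumed rule a 4% regression is reported as "no change", so small slowdowns can add up in the totals without showing in the per-query column.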

@alamb alamb changed the title from "TEST: enable pushdown_filters by default" to "TEST: enable pushdown_filters and reorder_filters by default" Nov 21, 2025
@alamb
Contributor Author

alamb commented Nov 21, 2025

I am also testing with just filter_pushdown on:

I am going to focus my efforts on profiling these queries, which seem to have slowed down the most:

│ QQuery 24    │   215.96 ms │                   253.74 ms │   1.17x slower │
│ QQuery 25    │   489.11 ms │                   617.52 ms │   1.26x slower │
│ QQuery 26    │   228.11 ms │                   324.39 ms │   1.42x slower │

Here is the query:

set datafusion.execution.parquet.binary_as_string = true;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;

Basically my next steps are to profile these queries and see what is slower (and if it is related to filter representation, I will go focus on apache/arrow-rs#8902)

@Dandandan
Contributor

I am also testing with just filter_pushdown on:

I am going to focus my efforts on profiling these queries, which seem to have slowed down the most:

│ QQuery 24    │   215.96 ms │                   253.74 ms │   1.17x slower │
│ QQuery 25    │   489.11 ms │                   617.52 ms │   1.26x slower │
│ QQuery 26    │   228.11 ms │                   324.39 ms │   1.42x slower │

Here is the query:

set datafusion.execution.parquet.binary_as_string = true;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;

Basically my next steps are to profile these queries and see what is slower (and if it is related to filter representation, I will go focus on apache/arrow-rs#8902)

Looks like we are very close!

FYI, there are a couple of queries that slowed down more than query 24:

│ QQuery 26    │   228.11 ms │                   324.39 ms │   1.42x slower │
│ QQuery 10    │   363.24 ms │                   484.80 ms │   1.33x slower │
│ QQuery 7     │    54.19 ms │                    68.48 ms │   1.26x slower │
│ QQuery 25    │   489.11 ms │                   617.52 ms │   1.26x slower │
│ QQuery 11    │   427.02 ms │                   532.87 ms │   1.25x slower │

@alamb
Contributor Author

alamb commented Nov 23, 2025

I did some more analysis:

The idea is to isolate why filter pushdown is slowing down clickbench q24

See more details here #18873

This is after upgrading to arrow 57.1.0

The only difference between the two binaries is whether filter pushdown is on by default:

-rwxr-xr-x@ 1 andrewlamb  staff  81331152 Nov 23 07:31 datafusion-cli-alamb_upgrade_arrow_57.1.0
-rwxr-xr-x@ 1 andrewlamb  staff  81331152 Nov 22 07:57 datafusion-cli-almab_pushdown_no_reorder

Using hits partitioned dataset

ln -s ~/Software/datafusion/benchmarks/data/hits_partitioned ./hits

Here is q24.sql

set datafusion.execution.parquet.binary_as_string = true;

-- turn on pushdown (is hard coded)
-- set datafusion.execution.parquet.pushdown_filters = true;

SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;
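A script like this can be generated instead of pasted by hand when the number of repetitions needs to change. This is a minimal sketch: the helper name `build_script` and the default of 10 repeats are illustrative assumptions, while the setting and the query text are copied from the script above.

```python
# Query text copied verbatim from q24.sql above.
QUERY = """SELECT "SearchPhrase" FROM hits WHERE "SearchPhrase" <> '' ORDER BY "SearchPhrase" LIMIT 10;"""


def build_script(repeats: int = 10) -> str:
    """Return the text of a benchmark script that runs QUERY `repeats` times."""
    lines = ["set datafusion.execution.parquet.binary_as_string = true;", ""]
    lines += [QUERY] * repeats
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the script next to the binaries being compared.
    with open("q24.sql", "w", encoding="utf-8") as f:
        f.write(build_script())
```

Repeating the same statement lets the first run absorb warm-up cost so that the later runs can be compared.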

You can see the pushdown is slightly slower

./datafusion-cli-almab_pushdown_no_reorder -f q24.sql  | grep Elapsed
Elapsed 0.000 seconds.
Elapsed 0.183 seconds.
Elapsed 0.154 seconds.
Elapsed 0.155 seconds.
Elapsed 0.153 seconds.
Elapsed 0.154 seconds.
Elapsed 0.154 seconds.
Elapsed 0.150 seconds.
Elapsed 0.154 seconds.
Elapsed 0.156 seconds.
Elapsed 0.152 seconds.
./datafusion-cli-alamb_upgrade_arrow_57.1.0  -f q24.sql  | grep Elapsed
Elapsed 0.002 seconds.
Elapsed 0.164 seconds.
Elapsed 0.137 seconds.
Elapsed 0.137 seconds.
Elapsed 0.133 seconds.
Elapsed 0.132 seconds.
Elapsed 0.135 seconds.
Elapsed 0.131 seconds.
Elapsed 0.137 seconds.
Elapsed 0.137 seconds.
Elapsed 0.133 seconds.
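To put a number on "slightly slower", the two runs above can be summarized with a median. The sketch below parses the `Elapsed` lines and compares the medians. It assumes the first `Elapsed` line of each run belongs to the `set` statement rather than the query and skips it; the helper name `warm_median` is illustrative.

```python
import re
import statistics


def warm_median(output: str) -> float:
    """Median of the 'Elapsed' timings, skipping the first line (assumed to be the SET statement)."""
    times = [float(t) for t in re.findall(r"Elapsed ([0-9.]+) seconds\.", output)]
    return statistics.median(times[1:])


# Timings copied from the two runs above.
PUSHDOWN = "\n".join(f"Elapsed {t} seconds." for t in [
    "0.000", "0.183", "0.154", "0.155", "0.153", "0.154",
    "0.154", "0.150", "0.154", "0.156", "0.152",
])
BASELINE = "\n".join(f"Elapsed {t} seconds." for t in [
    "0.002", "0.164", "0.137", "0.137", "0.133", "0.132",
    "0.135", "0.131", "0.137", "0.137", "0.133",
])

pushdown = warm_median(PUSHDOWN)
baseline = warm_median(BASELINE)
print(f"pushdown median: {pushdown:.3f}s")          # 0.154s
print(f"baseline median: {baseline:.3f}s")          # 0.136s
print(f"ratio: {pushdown / baseline:.2f}x slower")  # 1.13x slower
```

The median is used instead of the mean so that the slower first execution of the query (0.183 s and 0.164 s) does not skew the comparison.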

So let's profile what the pushdown one is doing

[Screenshot: profile of the pushdown run, 2025-11-23 7:40:41 AM]

So more than 5% of the time is being spent converting filters back and forth.

Thus, this gives me more motivation to keep working on

@alamb alamb force-pushed the alamb/pushdown_filters_test branch from 8a06d63 to 35f137c December 1, 2025 14:15
@github-actions github-actions bot removed the documentation, optimizer, core, sqllogictest, proto, and datasource labels Dec 1, 2025
@alamb
Contributor Author

alamb commented Dec 1, 2025

run benchmarks

@alamb
Contributor Author

alamb commented Dec 1, 2025

show benchmark queue

@alamb
Contributor Author

alamb commented Dec 1, 2025

🤖 Hi @alamb, you asked to view the benchmark queue (#18873 (comment)).

Job                  User   Benchmarks   Comment
18415_3596731381.sh  alamb  sql_planner  #18415 (comment)
18873_3597244013.sh  alamb  default      #18873 (comment)

@alamb
Contributor Author

alamb commented Dec 1, 2025

show benchmark queue

@alamb
Contributor Author

alamb commented Dec 1, 2025

🤖 Hi @alamb, you asked to view the benchmark queue (#18873 (comment)).

Job                  User   Benchmarks              Comment
18415_3596731381.sh  alamb  sql_planner             #18415 (comment)
18873_3597244013.sh  alamb  default                 #18873 (comment)
18938_3597335533.sh  alamb  default                 #18938 (comment)
19018_3597386014.sh  alamb  clickbench_partitioned  https://github.com/apache/datafusion/pull/19018#issuecomment-3597386014

@alamb
Contributor Author

alamb commented Dec 1, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/pushdown_filters_test (35f137c) to a060739 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Dec 1, 2025

🤖: Benchmark completed

Details

Comparing HEAD and alamb_pushdown_filters_test
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_pushdown_filters_test ┃         Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0     │  2711.80 ms │                  2685.92 ms │      no change │
│ QQuery 1     │  1313.17 ms │                  1318.43 ms │      no change │
│ QQuery 2     │  2438.74 ms │                  2504.56 ms │      no change │
│ QQuery 3     │  1135.41 ms │                  1165.75 ms │      no change │
│ QQuery 4     │  2281.73 ms │                  2347.15 ms │      no change │
│ QQuery 5     │ 28556.47 ms │                 28492.45 ms │      no change │
│ QQuery 6     │  4032.61 ms │                   112.83 ms │ +35.74x faster │
│ QQuery 7     │  3492.12 ms │                  3566.72 ms │      no change │
└──────────────┴─────────────┴─────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 45962.06ms │
│ Total Time (alamb_pushdown_filters_test)   │ 42193.82ms │
│ Average Time (HEAD)                        │  5745.26ms │
│ Average Time (alamb_pushdown_filters_test) │  5274.23ms │
│ Queries Faster                             │          1 │
│ Queries Slower                             │          0 │
│ Queries with No Change                     │          7 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query        ┃        HEAD ┃ alamb_pushdown_filters_test ┃         Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0     │     2.16 ms │                     2.20 ms │      no change │
│ QQuery 1     │    49.97 ms │                    52.92 ms │   1.06x slower │
│ QQuery 2     │   134.65 ms │                   133.54 ms │      no change │
│ QQuery 3     │   146.66 ms │                   159.03 ms │   1.08x slower │
│ QQuery 4     │   995.98 ms │                  1088.32 ms │   1.09x slower │
│ QQuery 5     │  1438.71 ms │                  1490.25 ms │      no change │
│ QQuery 6     │     2.08 ms │                     2.22 ms │   1.06x slower │
│ QQuery 7     │    55.41 ms │                    67.60 ms │   1.22x slower │
│ QQuery 8     │  1327.29 ms │                  1387.14 ms │      no change │
│ QQuery 9     │  1762.81 ms │                  1857.22 ms │   1.05x slower │
│ QQuery 10    │   367.70 ms │                   505.61 ms │   1.38x slower │
│ QQuery 11    │   414.08 ms │                   547.13 ms │   1.32x slower │
│ QQuery 12    │  1285.49 ms │                  1488.82 ms │   1.16x slower │
│ QQuery 13    │  2059.54 ms │                  2337.14 ms │   1.13x slower │
│ QQuery 14    │  1198.17 ms │                  1435.75 ms │   1.20x slower │
│ QQuery 15    │  1163.01 ms │                  1216.60 ms │      no change │
│ QQuery 16    │  2599.77 ms │                  2668.75 ms │      no change │
│ QQuery 17    │  2556.38 ms │                  2635.02 ms │      no change │
│ QQuery 18    │  4927.13 ms │                  5024.47 ms │      no change │
│ QQuery 19    │   120.47 ms │                   140.44 ms │   1.17x slower │
│ QQuery 20    │  1855.38 ms │                  1859.32 ms │      no change │
│ QQuery 21    │  2134.39 ms │                  2310.59 ms │   1.08x slower │
│ QQuery 22    │  3697.45 ms │                  3905.37 ms │   1.06x slower │
│ QQuery 23    │ 12291.16 ms │                  1095.79 ms │ +11.22x faster │
│ QQuery 24    │   202.08 ms │                   243.70 ms │   1.21x slower │
│ QQuery 25    │   457.55 ms │                   617.69 ms │   1.35x slower │
│ QQuery 26    │   206.06 ms │                   327.42 ms │   1.59x slower │
│ QQuery 27    │  2746.55 ms │                  2987.58 ms │   1.09x slower │
│ QQuery 28    │ 23784.27 ms │                 24396.96 ms │      no change │
│ QQuery 29    │   935.30 ms │                   946.44 ms │      no change │
│ QQuery 30    │  1273.32 ms │                  1343.55 ms │   1.06x slower │
│ QQuery 31    │  1307.27 ms │                  1345.85 ms │      no change │
│ QQuery 32    │  5117.86 ms │                  4836.50 ms │  +1.06x faster │
│ QQuery 33    │  5826.23 ms │                  5558.47 ms │      no change │
│ QQuery 34    │  6060.65 ms │                  5904.23 ms │      no change │
│ QQuery 35    │  1792.18 ms │                  1856.82 ms │      no change │
│ QQuery 36    │   114.89 ms │                    25.30 ms │  +4.54x faster │
│ QQuery 37    │    51.58 ms │                    25.11 ms │  +2.05x faster │
│ QQuery 38    │   113.82 ms │                    24.87 ms │  +4.58x faster │
│ QQuery 39    │   183.52 ms │                    25.36 ms │  +7.24x faster │
│ QQuery 40    │    41.08 ms │                    27.14 ms │  +1.51x faster │
│ QQuery 41    │    37.26 ms │                    24.82 ms │  +1.50x faster │
│ QQuery 42    │    29.96 ms │                    25.05 ms │  +1.20x faster │
└──────────────┴─────────────┴─────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 92867.24ms │
│ Total Time (alamb_pushdown_filters_test)   │ 83954.11ms │
│ Average Time (HEAD)                        │  2159.70ms │
│ Average Time (alamb_pushdown_filters_test) │  1952.42ms │
│ Queries Faster                             │          9 │
│ Queries Slower                             │         19 │
│ Queries with No Change                     │         15 │
│ Queries with Failure                       │          0 │
└────────────────────────────────────────────┴────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ alamb_pushdown_filters_test ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 128.16 ms │                   128.04 ms │     no change │
│ QQuery 2     │  26.43 ms │                    28.63 ms │  1.08x slower │
│ QQuery 3     │  37.85 ms │                    33.39 ms │ +1.13x faster │
│ QQuery 4     │  28.17 ms │                    29.02 ms │     no change │
│ QQuery 5     │  86.13 ms │                    85.98 ms │     no change │
│ QQuery 6     │  19.43 ms │                    22.58 ms │  1.16x slower │
│ QQuery 7     │ 215.17 ms │                   220.50 ms │     no change │
│ QQuery 8     │  29.10 ms │                    34.01 ms │  1.17x slower │
│ QQuery 9     │ 103.49 ms │                    98.95 ms │     no change │
│ QQuery 10    │  62.29 ms │                    64.16 ms │     no change │
│ QQuery 11    │  17.68 ms │                    17.84 ms │     no change │
│ QQuery 12    │  50.77 ms │                    51.88 ms │     no change │
│ QQuery 13    │  45.36 ms │                    53.22 ms │  1.17x slower │
│ QQuery 14    │  13.53 ms │                    16.47 ms │  1.22x slower │
│ QQuery 15    │  24.38 ms │                    29.49 ms │  1.21x slower │
│ QQuery 16    │  24.96 ms │                    30.18 ms │  1.21x slower │
│ QQuery 17    │ 146.92 ms │                   178.31 ms │  1.21x slower │
│ QQuery 18    │ 277.82 ms │                   320.22 ms │  1.15x slower │
│ QQuery 19    │  37.17 ms │                    46.23 ms │  1.24x slower │
│ QQuery 20    │  49.42 ms │                    56.88 ms │  1.15x slower │
│ QQuery 21    │ 316.37 ms │                   311.15 ms │     no change │
│ QQuery 22    │  17.44 ms │                    17.34 ms │     no change │
└──────────────┴───────────┴─────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 1758.04ms │
│ Total Time (alamb_pushdown_filters_test)   │ 1874.46ms │
│ Average Time (HEAD)                        │   79.91ms │
│ Average Time (alamb_pushdown_filters_test) │   85.20ms │
│ Queries Faster                             │         1 │
│ Queries Slower                             │        11 │
│ Queries with No Change                     │        10 │
│ Queries with Failure                       │         0 │
└────────────────────────────────────────────┴───────────┘

@rluvaton
Member

rluvaton commented Dec 1, 2025

@alamb I think that calls to the benchmark queue should not link to other PRs, as it makes it confusing that unrelated PRs just got connected (every time I see a link I open it to see what it's about, but now I just see it is the benchmark queue)

@alamb
Contributor Author

alamb commented Dec 1, 2025

@alamb I think that calls to the benchmark queue should not link to other PRs, as it makes it confusing that unrelated PRs just got connected (every time I see a link I open it to see what it's about, but now I just see it is the benchmark queue)

Done in alamb/datafusion-benchmarking@773d0db

@alamb
Contributor Author

alamb commented Dec 1, 2025

show benchmark queue

@alamb
Contributor Author

alamb commented Dec 1, 2025

run benchmark tpch

@alamb
Contributor Author

alamb commented Dec 1, 2025

🤖 ./gh_compare_branch.sh Benchmark Script Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/pushdown_filters_test (35f137c) to a060739 diff using: tpch
Results will be posted here when complete

@alamb
Contributor Author

alamb commented Dec 1, 2025

🤖 Hi @alamb, you asked to view the benchmark queue (#18873 (comment)).

Job                  User   Benchmarks  Comment
18873_3598546024.sh  alamb  tpch        https://github.com/apache/datafusion/pull/18873#issuecomment-3598546024

@alamb
Contributor Author

alamb commented Dec 1, 2025

🤖: Benchmark completed

Details

Comparing HEAD and alamb_pushdown_filters_test
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query        ┃      HEAD ┃ alamb_pushdown_filters_test ┃        Change ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1     │ 210.78 ms │                   222.44 ms │  1.06x slower │
│ QQuery 2     │  93.62 ms │                   219.88 ms │  2.35x slower │
│ QQuery 3     │ 124.73 ms │                   199.46 ms │  1.60x slower │
│ QQuery 4     │  94.48 ms │                   144.45 ms │  1.53x slower │
│ QQuery 5     │ 178.92 ms │                   286.78 ms │  1.60x slower │
│ QQuery 6     │  61.22 ms │                   159.71 ms │  2.61x slower │
│ QQuery 7     │ 236.81 ms │                   346.60 ms │  1.46x slower │
│ QQuery 8     │ 158.19 ms │                   376.02 ms │  2.38x slower │
│ QQuery 9     │ 229.25 ms │                   470.96 ms │  2.05x slower │
│ QQuery 10    │ 176.88 ms │                   274.03 ms │  1.55x slower │
│ QQuery 11    │  75.03 ms │                   176.84 ms │  2.36x slower │
│ QQuery 12    │ 113.26 ms │                   241.47 ms │  2.13x slower │
│ QQuery 13    │ 239.16 ms │                   219.31 ms │ +1.09x faster │
│ QQuery 14    │  85.43 ms │                   101.15 ms │  1.18x slower │
│ QQuery 15    │ 122.64 ms │                   172.58 ms │  1.41x slower │
│ QQuery 16    │  50.17 ms │                    93.57 ms │  1.86x slower │
│ QQuery 17    │ 270.05 ms │                   357.39 ms │  1.32x slower │
│ QQuery 18    │ 306.89 ms │                   431.66 ms │  1.41x slower │
│ QQuery 19    │ 137.35 ms │                   191.87 ms │  1.40x slower │
│ QQuery 20    │ 121.93 ms │                   178.97 ms │  1.47x slower │
│ QQuery 21    │ 258.32 ms │                   452.43 ms │  1.75x slower │
│ QQuery 22    │  57.71 ms │                    69.50 ms │  1.20x slower │
└──────────────┴───────────┴─────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                          ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                          │ 3402.84ms │
│ Total Time (alamb_pushdown_filters_test)   │ 5387.05ms │
│ Average Time (HEAD)                        │  154.67ms │
│ Average Time (alamb_pushdown_filters_test) │  244.87ms │
│ Queries Faster                             │         1 │
│ Queries Slower                             │        21 │
│ Queries with No Change                     │         0 │
│ Queries with Failure                       │         0 │
└────────────────────────────────────────────┴───────────┘

@rluvaton
Member

rluvaton commented Dec 1, 2025

@alamb I think that calls to the benchmark queue should not link to other PRs, as it makes it confusing that unrelated PRs just got connected (every time I see a link I open it to see what it's about, but now I just see it is the benchmark queue)

thanks

@rluvaton
Member

rluvaton commented Dec 1, 2025

You piqued my interest with why this is slow.

A couple of questions:

  1. what are the sizes of the left and right boolean buffers? maybe they are very large and each copy is expensive
  2. who produces the selection masks? can they reuse a mutable boolean buffer?

ideas:

  1. you could try to reuse the same buffer when combining selection masks and thus avoid copying every time
  2. keep track of some estimate of how many true values exist in the selection mask for each and_then
    for a large number of true values and a large right selection mask you should work in chunks rather than bits
  3. keep some kind of data structure that lets you track whether it is better to do
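The buffer-reuse idea can be sketched in plain Python. This is only an illustration and not arrow-rs code: it assumes `and_then` composition, where the second mask has one entry per row that the first mask kept, and the function name `and_then_in_place` is hypothetical.

```python
def and_then_in_place(first: list[bool], second: list[bool]) -> list[bool]:
    """Combine two selection masks, overwriting `first` instead of allocating a new mask.

    `second` applies only to the rows `first` selected, so its length must
    equal the number of True entries in `first`.
    """
    if sum(first) != len(second):
        raise ValueError("second mask must have one entry per selected row of first")
    j = 0  # position in `second`
    for i, keep in enumerate(first):
        if keep:
            first[i] = second[j]  # a row survives only if both masks select it
            j += 1
    return first


first = [True, False, True, True, False]
second = [False, True, True]  # one entry per True in `first`
combined = and_then_in_place(first, second)
print(combined)           # [False, False, True, True, False]
print(combined is first)  # True: the original buffer was reused
```

A real implementation would operate on packed bits and, as suggested above, could process whole words at a time when most bits are set.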

@alamb
Contributor Author

alamb commented Dec 3, 2025

You piqued my interest with why this is slow.

A couple of questions:

  1. what are the sizes of the left and right boolean buffers? maybe they are very large and each copy is expensive
  2. who produces the selection masks? can they reuse a mutable boolean buffer?

ideas:

  1. you could try to reuse the same buffer when combining selection masks and thus avoid copying every time
  2. keep track of some estimate of how many true values exist in the selection mask for each and_then
    for a large number of true values and a large right selection mask you should work in chunks rather than bits
  3. keep some kind of data structure that lets you track whether it is better to do

Thanks @rluvaton

I think the sizes are typically the batch size (8192 rows)

the masks come from https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/trait.ArrowPredicate.html (which DataFusion provides)

I think the reason it is currently slower is that the BooleanArrays are always converted back to RowSelections -- specifically https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.RowSelection.html#method.from_filters

For patterns with many small selections, this is much worse and takes a lot of time
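To illustrate why many small selections are costly, here is a conceptual sketch of turning a boolean mask into run-length select/skip entries. It is not the parquet crate's implementation; the function name `mask_to_runs` and the tuple representation are assumptions made for illustration, and the 8192-row size comes from the batch size mentioned above.

```python
from itertools import groupby


def mask_to_runs(mask: list[bool]) -> list[tuple[bool, int]]:
    """Collapse a boolean mask into (selected, run_length) pairs."""
    return [(selected, sum(1 for _ in group)) for selected, group in groupby(mask)]


# One batch where the first half is selected: two runs describe it.
dense = [True] * 4096 + [False] * 4096
# One batch where every other row is selected: one run per row.
sparse = [i % 2 == 0 for i in range(8192)]

print(mask_to_runs([True, True, False, False, False, True]))
# [(True, 2), (False, 3), (True, 1)]
print(len(mask_to_runs(dense)))   # 2
print(len(mask_to_runs(sparse)))  # 8192
```

With an alternating mask the run-length form has as many entries as the mask has rows, so the conversion does work proportional to the batch size without making the representation any more compact.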

This is basically what I am working on avoiding in apache/arrow-rs#8902

The ideas are good. I will try and incorporate them in apache/arrow-rs#8902

Labels

common Related to common crate

Development

Successfully merging this pull request may close these issues.

Enable parquet filter pushdown (filter_pushdown) by default

3 participants