From 6fe4ea488106f9f4878d83b0ef44c17353bd7e0e Mon Sep 17 00:00:00 2001 From: Andrew Lamb Date: Sun, 15 Mar 2026 07:14:41 -0400 Subject: [PATCH 1/3] [branch-53] Update Release Notes --- dev/changelog/53.0.0.md | 61 +++++++++++++++++++++++++++++++---------- 1 file changed, 46 insertions(+), 15 deletions(-) diff --git a/dev/changelog/53.0.0.md b/dev/changelog/53.0.0.md index 91306c7f49a6d..88c4eae15477a 100644 --- a/dev/changelog/53.0.0.md +++ b/dev/changelog/53.0.0.md @@ -19,7 +19,7 @@ under the License. # Apache DataFusion 53.0.0 Changelog -This release consists of 447 commits from 105 contributors. See credits at the end of this changelog for more information. +This release consists of 475 commits from 107 contributors. See credits at the end of this changelog for more information. See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgrading.html) for information on how to upgrade from previous versions. @@ -33,10 +33,11 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Add `ScalarValue::RunEndEncoded` variant [#19895](https://github.com/apache/datafusion/pull/19895) (Jefffrey) - minor: remove unused crypto functions & narrow public API [#20045](https://github.com/apache/datafusion/pull/20045) (Jefffrey) - Wrap immutable plan parts into Arc (make creating `ExecutionPlan`s less costly) [#19893](https://github.com/apache/datafusion/pull/19893) (askalt) -- feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations [#19930](https://github.com/apache/datafusion/pull/19930) (mkleen) +- feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations [#19930](https://github.com/apache/datafusion/pull/19930) (mkleen) - Remove the statistics() api in execution plan [#20319](https://github.com/apache/datafusion/pull/20319) (xudong963) - Remove recursive const check in `simplify_const_expr` 
[#20234](https://github.com/apache/datafusion/pull/20234) (AdamGS) - Cache `PlanProperties`, add fast-path for `with_new_children` [#19792](https://github.com/apache/datafusion/pull/19792) (askalt) +- [branch-53] feat: parse `JsonAccess` as a binary operator, add `Operator::Colon` [#20717](https://github.com/apache/datafusion/pull/20717) (Samyak2) **Performance related:** @@ -96,6 +97,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - perf: Use Arrow vectorized eq kernel for IN list with column references [#20528](https://github.com/apache/datafusion/pull/20528) (zhangxffff) - perf: Optimize `array_agg()` using `GroupsAccumulator` [#20504](https://github.com/apache/datafusion/pull/20504) (neilconway) - perf: Optimize `array_to_string()`, support more types [#20553](https://github.com/apache/datafusion/pull/20553) (neilconway) +- [branch-53] perf: sort replace free()->try_grow() pattern with try_resize() to reduce memory pool interactions [#20733](https://github.com/apache/datafusion/pull/20733) (mbutrovich) **Implemented enhancements:** @@ -191,7 +193,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - fix: Handle Utf8View and LargeUtf8 separators in concat_ws [#20361](https://github.com/apache/datafusion/pull/20361) (neilconway) - fix: HashJoin panic with dictionary-encoded columns in multi-key joins [#20441](https://github.com/apache/datafusion/pull/20441) (Tim-53) - fix: handle out of range errors in DATE_BIN instead of panicking [#20221](https://github.com/apache/datafusion/pull/20221) (mishop-15) -- fix: prevent duplicate alias collision with user-provided \_\_datafusion_extracted names [#20432](https://github.com/apache/datafusion/pull/20432) (adriangb) +- fix: prevent duplicate alias collision with user-provided __datafusion_extracted names [#20432](https://github.com/apache/datafusion/pull/20432) (adriangb) - fix: SortMergeJoin don't wait for all input before emitting 
[#20482](https://github.com/apache/datafusion/pull/20482) (rluvaton) - fix: `cardinality()` of an empty array should be zero [#20533](https://github.com/apache/datafusion/pull/20533) (neilconway) - fix: Unaccounted spill sort in row_hash [#20314](https://github.com/apache/datafusion/pull/20314) (EmilyMatt) @@ -230,7 +232,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Disallow positional struct casting when field names don’t overlap [#19955](https://github.com/apache/datafusion/pull/19955) (kosiew) - docs: fix docstring formatting [#20158](https://github.com/apache/datafusion/pull/20158) (Jefffrey) - Break upgrade guides into separate pages [#20183](https://github.com/apache/datafusion/pull/20183) (mishop-15) -- Better document the relationship between `FileFormat::projection` / `FileFormat::filter` and `FileScanConfig::Statistics` [#20188](https://github.com/apache/datafusion/pull/20188) (alamb) +- Better document the relationship between `FileFormat::projection` / `FileFormat::filter` and `FileScanConfig::Statistics` [#20188](https://github.com/apache/datafusion/pull/20188) (alamb) - Document the relationship between FileFormat::projection / FileFormat::filter and FileScanConfig::output_ordering [#20196](https://github.com/apache/datafusion/pull/20196) (alamb) - More documentation on `FileSource::table_schema` and `FileSource::projection` [#20242](https://github.com/apache/datafusion/pull/20242) (alamb) - chore(deps): bump setuptools from 80.10.2 to 82.0.0 in /docs [#20255](https://github.com/apache/datafusion/pull/20255) (dependabot[bot]) @@ -244,6 +246,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - add redirect for old upgrading.html URL to fix broken changelog links [#20582](https://github.com/apache/datafusion/pull/20582) (mishop-15) - Upgrade DataFusion to arrow-rs/parquet 58.0.0 / `object_store` 0.13.0 [#19728](https://github.com/apache/datafusion/pull/19728) (alamb) - Document 
guidance on how to evaluate breaking API changes [#20584](https://github.com/apache/datafusion/pull/20584) (alamb) +- [branch-53] chore: prepare 53 release [#20649](https://github.com/apache/datafusion/pull/20649) (comphead) **Other:** @@ -252,7 +255,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Update dependencies [#19667](https://github.com/apache/datafusion/pull/19667) (alamb) - Refactor PartitionedFile: add ordering field and new_from_meta constructor [#19596](https://github.com/apache/datafusion/pull/19596) (adriangb) - Remove coalesce batches rule and deprecate CoalesceBatchesExec [#19622](https://github.com/apache/datafusion/pull/19622) (feniljain) -- Perf: Optimize `substring_index` via single-byte fast path and direct indexing [#19590](https://github.com/apache/datafusion/pull/19590) (lyne7-sc) +- Perf: Optimize `substring_index` via single-byte fast path and direct indexing [#19590](https://github.com/apache/datafusion/pull/19590) (lyne7-sc) - refactor: Use `Signature::coercible` for isnan/iszero [#19604](https://github.com/apache/datafusion/pull/19604) (kumarUjjawal) - Parquet: Push down supported list predicates (array_has/any/all) during decoding [#19545](https://github.com/apache/datafusion/pull/19545) (kosiew) - Remove dependency on `rust_decimal`, remove ignore of `RUSTSEC-2026-0001` [#19666](https://github.com/apache/datafusion/pull/19666) (alamb) @@ -326,7 +329,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - chore(deps): bump chrono from 0.4.42 to 0.4.43 [#19897](https://github.com/apache/datafusion/pull/19897) (dependabot[bot]) - Improve error message when string functions receive Binary types [#19819](https://github.com/apache/datafusion/pull/19819) (lemorage) - Refactor ListArray hashing to consider only sliced values [#19500](https://github.com/apache/datafusion/pull/19500) (Jefffrey) -- feat(datafusion-spark): implement spark compatible `unhex` function 
[#19909](https://github.com/apache/datafusion/pull/19909) (lyne7-sc) +- feat(datafusion-spark): implement spark compatible `unhex` function [#19909](https://github.com/apache/datafusion/pull/19909) (lyne7-sc) - Support API for "pre-image" for pruning predicate evaluation [#19722](https://github.com/apache/datafusion/pull/19722) (sdf-jkl) - Support LargeUtf8 as partition column [#19942](https://github.com/apache/datafusion/pull/19942) (paleolimbot) - chore(deps): bump actions/checkout from 6.0.1 to 6.0.2 [#19953](https://github.com/apache/datafusion/pull/19953) (dependabot[bot]) @@ -348,7 +351,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - minor: Move metric `page_index_rows_pruned` to verbose level in `EXPLAIN ANALYZE` [#20026](https://github.com/apache/datafusion/pull/20026) (2010YOUY01) - Tweak `adapter serialization` example [#20035](https://github.com/apache/datafusion/pull/20035) (adriangb) - Simplify wait_complete function [#19937](https://github.com/apache/datafusion/pull/19937) (LiaCastaneda) -- [main] Update version to `52.1.0` (#19878) [#20028](https://github.com/apache/datafusion/pull/20028) (alamb) +- [main] Update version to `52.1.0` (#19878) [#20028](https://github.com/apache/datafusion/pull/20028) (alamb) - Fix/parquet opener page index policy [#19890](https://github.com/apache/datafusion/pull/19890) (aviralgarg05) - minor: add tests for coercible signature considering nulls/dicts/ree [#19459](https://github.com/apache/datafusion/pull/19459) (Jefffrey) - Enforce `clippy::allow_attributes` globally across workspace [#19576](https://github.com/apache/datafusion/pull/19576) (Jefffrey) @@ -399,7 +402,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Minor: verify plan output and unique field names [#20220](https://github.com/apache/datafusion/pull/20220) (alamb) - Add more tests to projection_pushdown.slt [#20236](https://github.com/apache/datafusion/pull/20236) (adriangb) - 
Add Expr::Alias passthrough to Expr::placement() [#20237](https://github.com/apache/datafusion/pull/20237) (adriangb) -- Make PushDownFilter and CommonSubexprEliminate aware of Expr::placement [#20239](https://github.com/apache/datafusion/pull/20239) (adriangb) +- Make PushDownFilter and CommonSubexprEliminate aware of Expr::placement [#20239](https://github.com/apache/datafusion/pull/20239) (adriangb) - Refactor example metadata parsing utilities(#20204) [#20233](https://github.com/apache/datafusion/pull/20233) (cj-zhukov) - add module structure and unit tests for expression pushdown logical optimizer [#20238](https://github.com/apache/datafusion/pull/20238) (adriangb) - repro and disable dyn filter for preserve file partitions [#20175](https://github.com/apache/datafusion/pull/20175) (gene-bordegaray) @@ -486,6 +489,31 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Add deterministic per-file timing summary to sqllogictest runner [#20569](https://github.com/apache/datafusion/pull/20569) (kosiew) - chore: Enable workspace lint for all workspace members [#20577](https://github.com/apache/datafusion/pull/20577) (neilconway) - Fix serde of window lead/lag defaults [#20608](https://github.com/apache/datafusion/pull/20608) (avantgardnerio) +- [branch-53] fix: make the `sql` feature truly optional (#20625) [#20680](https://github.com/apache/datafusion/pull/20680) (comphead) +- [53] fix: Fix bug in `array_has` scalar path with sliced arrays (#20677) [#20700](https://github.com/apache/datafusion/pull/20700) (neilconway) +- [branch-53] fix: Return `probe_side.len()` for RightMark/Anti count(*) queries (#… [#20726](https://github.com/apache/datafusion/pull/20726) (jonathanc-n) +- [branch-53] FFI_TableOptions are using default values only [#20722](https://github.com/apache/datafusion/pull/20722) (timsaucer) +- chore(deps): pin substrait to `0.62.2` [#20827](https://github.com/apache/datafusion/pull/20827) (milenkovicm) +- chore(deps): 
pin substrait version [#20848](https://github.com/apache/datafusion/pull/20848) (milenkovicm) +- [branch-53] Fix repartition from dropping data when spilling (#20672) [#20792](https://github.com/apache/datafusion/pull/20792) (hareshkh) +- [branch-53] fix: `HashJoin` panic with String dictionary keys (don't flatten keys) (#20505) [#20791](https://github.com/apache/datafusion/pull/20791) (hareshkh) +- [branch-53] cli: Fix datafusion-cli hint edge cases (#20609) [#20887](https://github.com/apache/datafusion/pull/20887) (alamb) +- [branch-53] perf: Optimize `to_char` to allocate less, fix NULL handling (#20635) [#20885](https://github.com/apache/datafusion/pull/20885) (alamb) +- [branch-53] fix: interval analysis error when have two filterexec that inner filter proves zero selectivity (#20743) [#20882](https://github.com/apache/datafusion/pull/20882) (alamb) +- [branch-53] correct parquet leaf index mapping when schema contains struct cols (#20698) [#20884](https://github.com/apache/datafusion/pull/20884) (alamb) +- [branch-53] ser/de fetch in FilterExec (#20738) [#20883](https://github.com/apache/datafusion/pull/20883) (alamb) +- [branch-53] fix: use try_shrink instead of shrink in try_resize (#20424) [#20890](https://github.com/apache/datafusion/pull/20890) (alamb) +- [branch-53] Reattach parquet metadata cache after deserializing in datafusion-proto (#20574) [#20891](https://github.com/apache/datafusion/pull/20891) (alamb) +- [branch-53] fix: do not recompute hash join exec properties if not required (#20900) [#20903](https://github.com/apache/datafusion/pull/20903) (alamb) +- [branch-53] fix(spark): handle divide-by-zero in Spark `mod`/`pmod` with ANSI mode support (#20461) [#20896](https://github.com/apache/datafusion/pull/20896) (alamb) +- [branch-53] fix: Provide more generic API for the capacity limit parsing (#20372) [#20893](https://github.com/apache/datafusion/pull/20893) (alamb) +- [branch-53] fix: sqllogictest cannot convert to Substrait (#19739) 
[#20897](https://github.com/apache/datafusion/pull/20897) (alamb) +- [branch-53] Fix DELETE/UPDATE filter extraction when predicates are pushed down into TableScan (#19884) [#20898](https://github.com/apache/datafusion/pull/20898) (alamb) +- [branch-53] fix: preserve None projection semantics across FFI boundary in ForeignTableProvider::scan (#20393) [#20895](https://github.com/apache/datafusion/pull/20895) (alamb) +- [branch-53] Fix FilterExec converting Absent column stats to Exact(NULL) (#20391) [#20892](https://github.com/apache/datafusion/pull/20892) (alamb) +- [branch-53] backport: Support Spark `array_contains` builtin function (#20685) [#20914](https://github.com/apache/datafusion/pull/20914) (comphead) +- [branch-53] Fix duplicate group keys after hash aggregation spill (#20724) (#20858) [#20918](https://github.com/apache/datafusion/pull/20918) (alamb) +- [branch-53] fix: SanityCheckPlan error with window functions and NVL filter (#20231) [#20932](https://github.com/apache/datafusion/pull/20932) (alamb) ## Credits @@ -493,9 +521,9 @@ Thank you to everyone who contributed to this release. Here is a breakdown of co ``` 73 dependabot[bot] - 35 Neil Conway + 43 Andrew Lamb + 36 Neil Conway 31 Kumar Ujjawal - 27 Andrew Lamb 26 Adrian Garcia Badaracco 21 Jeffrey Vo 13 cht42 @@ -503,26 +531,28 @@ Thank you to everyone who contributed to this release. Here is a breakdown of co 10 kosiew 10 lyne 8 Nuno Faria + 8 Oleks V 7 Sergey Zhukov 7 xudong.w 6 Daniël Heres 5 Adam Gutglick 5 Gabriel - 5 Oleks V + 5 Jonathan Chen 4 Andy Grove 4 Dmitrii Blaginin 4 Huaijin 4 Jack Kleeman - 4 Jonathan Chen + 4 Tim Saucer 4 Yongting You 4 notashes 4 theirix 3 Eren Avsarogullari + 3 Haresh Khanna 3 Kazantsev Maksim 3 Kosta Tarasov 3 Liang-Chi Hsieh 3 Lía Adriana - 3 Tim Saucer + 3 Marko Milenković 3 Yu-Chuan Hung 3 dario curreri 3 feniljain @@ -565,11 +595,10 @@ Thank you to everyone who contributed to this release. 
Here is a breakdown of co 1 Gene Bordegaray 1 Geoffrey Claude 1 Goksel Kabadayi - 1 Haresh Khanna 1 Heran Lin 1 Josh Elkind - 1 Marko Milenković 1 Mason + 1 Matt Butrovich 1 Mikhail Zabaluev 1 Mohit rao 1 Nathaniel J. Smith @@ -581,6 +610,7 @@ Thank you to everyone who contributed to this release. Here is a breakdown of co 1 Raz Luvaton 1 Rosai 1 Ruihang Xia + 1 Samyak Sarnayak 1 Sergio Esteves 1 Simon Vandel Sillesen 1 Siyuan Huang @@ -600,3 +630,4 @@ Thank you to everyone who contributed to this release. Here is a breakdown of co ``` Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release. + From c2250057b80775552fab138afcbf536d66125b1b Mon Sep 17 00:00:00 2001 From: Andrew Lamb Date: Sun, 15 Mar 2026 07:17:00 -0400 Subject: [PATCH 2/3] prettier --- dev/changelog/53.0.0.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/dev/changelog/53.0.0.md b/dev/changelog/53.0.0.md index 88c4eae15477a..e1416c9a6b23b 100644 --- a/dev/changelog/53.0.0.md +++ b/dev/changelog/53.0.0.md @@ -33,7 +33,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Add `ScalarValue::RunEndEncoded` variant [#19895](https://github.com/apache/datafusion/pull/19895) (Jefffrey) - minor: remove unused crypto functions & narrow public API [#20045](https://github.com/apache/datafusion/pull/20045) (Jefffrey) - Wrap immutable plan parts into Arc (make creating `ExecutionPlan`s less costly) [#19893](https://github.com/apache/datafusion/pull/19893) (askalt) -- feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations [#19930](https://github.com/apache/datafusion/pull/19930) (mkleen) +- feat: Support planning subqueries with OuterReferenceColumn belongs to non-adjacent outer relations [#19930](https://github.com/apache/datafusion/pull/19930) (mkleen) - Remove the statistics() api in execution plan 
[#20319](https://github.com/apache/datafusion/pull/20319) (xudong963) - Remove recursive const check in `simplify_const_expr` [#20234](https://github.com/apache/datafusion/pull/20234) (AdamGS) - Cache `PlanProperties`, add fast-path for `with_new_children` [#19792](https://github.com/apache/datafusion/pull/19792) (askalt) @@ -193,7 +193,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - fix: Handle Utf8View and LargeUtf8 separators in concat_ws [#20361](https://github.com/apache/datafusion/pull/20361) (neilconway) - fix: HashJoin panic with dictionary-encoded columns in multi-key joins [#20441](https://github.com/apache/datafusion/pull/20441) (Tim-53) - fix: handle out of range errors in DATE_BIN instead of panicking [#20221](https://github.com/apache/datafusion/pull/20221) (mishop-15) -- fix: prevent duplicate alias collision with user-provided __datafusion_extracted names [#20432](https://github.com/apache/datafusion/pull/20432) (adriangb) +- fix: prevent duplicate alias collision with user-provided \_\_datafusion_extracted names [#20432](https://github.com/apache/datafusion/pull/20432) (adriangb) - fix: SortMergeJoin don't wait for all input before emitting [#20482](https://github.com/apache/datafusion/pull/20482) (rluvaton) - fix: `cardinality()` of an empty array should be zero [#20533](https://github.com/apache/datafusion/pull/20533) (neilconway) - fix: Unaccounted spill sort in row_hash [#20314](https://github.com/apache/datafusion/pull/20314) (EmilyMatt) @@ -232,7 +232,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Disallow positional struct casting when field names don’t overlap [#19955](https://github.com/apache/datafusion/pull/19955) (kosiew) - docs: fix docstring formatting [#20158](https://github.com/apache/datafusion/pull/20158) (Jefffrey) - Break upgrade guides into separate pages [#20183](https://github.com/apache/datafusion/pull/20183) (mishop-15) -- Better document the 
relationship between `FileFormat::projection` / `FileFormat::filter` and `FileScanConfig::Statistics` [#20188](https://github.com/apache/datafusion/pull/20188) (alamb) +- Better document the relationship between `FileFormat::projection` / `FileFormat::filter` and `FileScanConfig::Statistics` [#20188](https://github.com/apache/datafusion/pull/20188) (alamb) - Document the relationship between FileFormat::projection / FileFormat::filter and FileScanConfig::output_ordering [#20196](https://github.com/apache/datafusion/pull/20196) (alamb) - More documentation on `FileSource::table_schema` and `FileSource::projection` [#20242](https://github.com/apache/datafusion/pull/20242) (alamb) - chore(deps): bump setuptools from 80.10.2 to 82.0.0 in /docs [#20255](https://github.com/apache/datafusion/pull/20255) (dependabot[bot]) @@ -250,12 +250,13 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi **Other:** +- [branch-53] chore: Add branch protection (comphead) - Add a protection to release candidate branch 52 [#19660](https://github.com/apache/datafusion/pull/19660) (xudong963) - Downgrade aws-smithy-runtime, update `rust_decimal`, ignore RUSTSEC-2026-0001 to get clean CI [#19657](https://github.com/apache/datafusion/pull/19657) (alamb) - Update dependencies [#19667](https://github.com/apache/datafusion/pull/19667) (alamb) - Refactor PartitionedFile: add ordering field and new_from_meta constructor [#19596](https://github.com/apache/datafusion/pull/19596) (adriangb) - Remove coalesce batches rule and deprecate CoalesceBatchesExec [#19622](https://github.com/apache/datafusion/pull/19622) (feniljain) -- Perf: Optimize `substring_index` via single-byte fast path and direct indexing [#19590](https://github.com/apache/datafusion/pull/19590) (lyne7-sc) +- Perf: Optimize `substring_index` via single-byte fast path and direct indexing [#19590](https://github.com/apache/datafusion/pull/19590) (lyne7-sc) - refactor: Use `Signature::coercible` for 
isnan/iszero [#19604](https://github.com/apache/datafusion/pull/19604) (kumarUjjawal) - Parquet: Push down supported list predicates (array_has/any/all) during decoding [#19545](https://github.com/apache/datafusion/pull/19545) (kosiew) - Remove dependency on `rust_decimal`, remove ignore of `RUSTSEC-2026-0001` [#19666](https://github.com/apache/datafusion/pull/19666) (alamb) @@ -329,7 +330,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - chore(deps): bump chrono from 0.4.42 to 0.4.43 [#19897](https://github.com/apache/datafusion/pull/19897) (dependabot[bot]) - Improve error message when string functions receive Binary types [#19819](https://github.com/apache/datafusion/pull/19819) (lemorage) - Refactor ListArray hashing to consider only sliced values [#19500](https://github.com/apache/datafusion/pull/19500) (Jefffrey) -- feat(datafusion-spark): implement spark compatible `unhex` function [#19909](https://github.com/apache/datafusion/pull/19909) (lyne7-sc) +- feat(datafusion-spark): implement spark compatible `unhex` function [#19909](https://github.com/apache/datafusion/pull/19909) (lyne7-sc) - Support API for "pre-image" for pruning predicate evaluation [#19722](https://github.com/apache/datafusion/pull/19722) (sdf-jkl) - Support LargeUtf8 as partition column [#19942](https://github.com/apache/datafusion/pull/19942) (paleolimbot) - chore(deps): bump actions/checkout from 6.0.1 to 6.0.2 [#19953](https://github.com/apache/datafusion/pull/19953) (dependabot[bot]) @@ -351,7 +352,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - minor: Move metric `page_index_rows_pruned` to verbose level in `EXPLAIN ANALYZE` [#20026](https://github.com/apache/datafusion/pull/20026) (2010YOUY01) - Tweak `adapter serialization` example [#20035](https://github.com/apache/datafusion/pull/20035) (adriangb) - Simplify wait_complete function [#19937](https://github.com/apache/datafusion/pull/19937) (LiaCastaneda) -- 
[main] Update version to `52.1.0` (#19878) [#20028](https://github.com/apache/datafusion/pull/20028) (alamb) +- [main] Update version to `52.1.0` (#19878) [#20028](https://github.com/apache/datafusion/pull/20028) (alamb) - Fix/parquet opener page index policy [#19890](https://github.com/apache/datafusion/pull/19890) (aviralgarg05) - minor: add tests for coercible signature considering nulls/dicts/ree [#19459](https://github.com/apache/datafusion/pull/19459) (Jefffrey) - Enforce `clippy::allow_attributes` globally across workspace [#19576](https://github.com/apache/datafusion/pull/19576) (Jefffrey) @@ -402,7 +403,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Minor: verify plan output and unique field names [#20220](https://github.com/apache/datafusion/pull/20220) (alamb) - Add more tests to projection_pushdown.slt [#20236](https://github.com/apache/datafusion/pull/20236) (adriangb) - Add Expr::Alias passthrough to Expr::placement() [#20237](https://github.com/apache/datafusion/pull/20237) (adriangb) -- Make PushDownFilter and CommonSubexprEliminate aware of Expr::placement [#20239](https://github.com/apache/datafusion/pull/20239) (adriangb) +- Make PushDownFilter and CommonSubexprEliminate aware of Expr::placement [#20239](https://github.com/apache/datafusion/pull/20239) (adriangb) - Refactor example metadata parsing utilities(#20204) [#20233](https://github.com/apache/datafusion/pull/20233) (cj-zhukov) - add module structure and unit tests for expression pushdown logical optimizer [#20238](https://github.com/apache/datafusion/pull/20238) (adriangb) - repro and disable dyn filter for preserve file partitions [#20175](https://github.com/apache/datafusion/pull/20175) (gene-bordegaray) @@ -491,7 +492,7 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Fix serde of window lead/lag defaults [#20608](https://github.com/apache/datafusion/pull/20608) (avantgardnerio) - [branch-53] fix: make the 
`sql` feature truly optional (#20625) [#20680](https://github.com/apache/datafusion/pull/20680) (comphead) - [53] fix: Fix bug in `array_has` scalar path with sliced arrays (#20677) [#20700](https://github.com/apache/datafusion/pull/20700) (neilconway) -- [branch-53] fix: Return `probe_side.len()` for RightMark/Anti count(*) queries (#… [#20726](https://github.com/apache/datafusion/pull/20726) (jonathanc-n) +- [branch-53] fix: Return `probe_side.len()` for RightMark/Anti count(\*) queries (#… [#20726](https://github.com/apache/datafusion/pull/20726) (jonathanc-n) - [branch-53] FFI_TableOptions are using default values only [#20722](https://github.com/apache/datafusion/pull/20722) (timsaucer) - chore(deps): pin substrait to `0.62.2` [#20827](https://github.com/apache/datafusion/pull/20827) (milenkovicm) - chore(deps): pin substrait version [#20848](https://github.com/apache/datafusion/pull/20848) (milenkovicm) @@ -630,4 +631,3 @@ Thank you to everyone who contributed to this release. Here is a breakdown of co ``` Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release. - From 37f5a5eacf932d0a9fe58944f81fb93d4033826a Mon Sep 17 00:00:00 2001 From: Andrew Lamb Date: Sun, 15 Mar 2026 07:36:05 -0400 Subject: [PATCH 3/3] Update attribution for backports --- dev/changelog/53.0.0.md | 99 ++++++++++++++++++++++------------------- 1 file changed, 53 insertions(+), 46 deletions(-) diff --git a/dev/changelog/53.0.0.md b/dev/changelog/53.0.0.md index e1416c9a6b23b..11820f3caad7f 100644 --- a/dev/changelog/53.0.0.md +++ b/dev/changelog/53.0.0.md @@ -19,7 +19,7 @@ under the License. # Apache DataFusion 53.0.0 Changelog -This release consists of 475 commits from 107 contributors. See credits at the end of this changelog for more information. +This release consists of 475 commits from 114 contributors. See credits at the end of this changelog for more information. 
See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgrading.html) for information on how to upgrade from previous versions. @@ -490,31 +490,31 @@ See the [upgrade guide](https://datafusion.apache.org/library-user-guide/upgradi - Add deterministic per-file timing summary to sqllogictest runner [#20569](https://github.com/apache/datafusion/pull/20569) (kosiew) - chore: Enable workspace lint for all workspace members [#20577](https://github.com/apache/datafusion/pull/20577) (neilconway) - Fix serde of window lead/lag defaults [#20608](https://github.com/apache/datafusion/pull/20608) (avantgardnerio) -- [branch-53] fix: make the `sql` feature truly optional (#20625) [#20680](https://github.com/apache/datafusion/pull/20680) (comphead) +- [branch-53] fix: make the `sql` feature truly optional (#20625) [#20680](https://github.com/apache/datafusion/pull/20680) (linhr) - [53] fix: Fix bug in `array_has` scalar path with sliced arrays (#20677) [#20700](https://github.com/apache/datafusion/pull/20700) (neilconway) - [branch-53] fix: Return `probe_side.len()` for RightMark/Anti count(\*) queries (#… [#20726](https://github.com/apache/datafusion/pull/20726) (jonathanc-n) - [branch-53] FFI_TableOptions are using default values only [#20722](https://github.com/apache/datafusion/pull/20722) (timsaucer) - chore(deps): pin substrait to `0.62.2` [#20827](https://github.com/apache/datafusion/pull/20827) (milenkovicm) - chore(deps): pin substrait version [#20848](https://github.com/apache/datafusion/pull/20848) (milenkovicm) -- [branch-53] Fix repartition from dropping data when spilling (#20672) [#20792](https://github.com/apache/datafusion/pull/20792) (hareshkh) -- [branch-53] fix: `HashJoin` panic with String dictionary keys (don't flatten keys) (#20505) [#20791](https://github.com/apache/datafusion/pull/20791) (hareshkh) -- [branch-53] cli: Fix datafusion-cli hint edge cases (#20609) [#20887](https://github.com/apache/datafusion/pull/20887) (alamb) -- 
[branch-53] perf: Optimize `to_char` to allocate less, fix NULL handling (#20635) [#20885](https://github.com/apache/datafusion/pull/20885) (alamb)
-- [branch-53] fix: interval analysis error when have two filterexec that inner filter proves zero selectivity (#20743) [#20882](https://github.com/apache/datafusion/pull/20882) (alamb)
-- [branch-53] correct parquet leaf index mapping when schema contains struct cols (#20698) [#20884](https://github.com/apache/datafusion/pull/20884) (alamb)
-- [branch-53] ser/de fetch in FilterExec (#20738) [#20883](https://github.com/apache/datafusion/pull/20883) (alamb)
-- [branch-53] fix: use try_shrink instead of shrink in try_resize (#20424) [#20890](https://github.com/apache/datafusion/pull/20890) (alamb)
-- [branch-53] Reattach parquet metadata cache after deserializing in datafusion-proto (#20574) [#20891](https://github.com/apache/datafusion/pull/20891) (alamb)
-- [branch-53] fix: do not recompute hash join exec properties if not required (#20900) [#20903](https://github.com/apache/datafusion/pull/20903) (alamb)
-- [branch-53] fix(spark): handle divide-by-zero in Spark `mod`/`pmod` with ANSI mode support (#20461) [#20896](https://github.com/apache/datafusion/pull/20896) (alamb)
-- [branch-53] fix: Provide more generic API for the capacity limit parsing (#20372) [#20893](https://github.com/apache/datafusion/pull/20893) (alamb)
-- [branch-53] fix: sqllogictest cannot convert to Substrait (#19739) [#20897](https://github.com/apache/datafusion/pull/20897) (alamb)
-- [branch-53] Fix DELETE/UPDATE filter extraction when predicates are pushed down into TableScan (#19884) [#20898](https://github.com/apache/datafusion/pull/20898) (alamb)
-- [branch-53] fix: preserve None projection semantics across FFI boundary in ForeignTableProvider::scan (#20393) [#20895](https://github.com/apache/datafusion/pull/20895) (alamb)
-- [branch-53] Fix FilterExec converting Absent column stats to Exact(NULL) (#20391) [#20892](https://github.com/apache/datafusion/pull/20892) (alamb)
+- [branch-53] Fix repartition from dropping data when spilling (#20672) [#20792](https://github.com/apache/datafusion/pull/20792) (xanderbailey)
+- [branch-53] fix: `HashJoin` panic with String dictionary keys (don't flatten keys) (#20505) [#20791](https://github.com/apache/datafusion/pull/20791) (alamb)
+- [branch-53] cli: Fix datafusion-cli hint edge cases (#20609) [#20887](https://github.com/apache/datafusion/pull/20887) (comphead)
+- [branch-53] perf: Optimize `to_char` to allocate less, fix NULL handling (#20635) [#20885](https://github.com/apache/datafusion/pull/20885) (neilconway)
+- [branch-53] fix: interval analysis error when have two filterexec that inner filter proves zero selectivity (#20743) [#20882](https://github.com/apache/datafusion/pull/20882) (haohuaijin)
+- [branch-53] correct parquet leaf index mapping when schema contains struct cols (#20698) [#20884](https://github.com/apache/datafusion/pull/20884) (friendlymatthew)
+- [branch-53] ser/de fetch in FilterExec (#20738) [#20883](https://github.com/apache/datafusion/pull/20883) (haohuaijin)
+- [branch-53] fix: use try_shrink instead of shrink in try_resize (#20424) [#20890](https://github.com/apache/datafusion/pull/20890) (ariel-miculas)
+- [branch-53] Reattach parquet metadata cache after deserializing in datafusion-proto (#20574) [#20891](https://github.com/apache/datafusion/pull/20891) (nathanb9)
+- [branch-53] fix: do not recompute hash join exec properties if not required (#20900) [#20903](https://github.com/apache/datafusion/pull/20903) (askalt)
+- [branch-53] fix(spark): handle divide-by-zero in Spark `mod`/`pmod` with ANSI mode support (#20461) [#20896](https://github.com/apache/datafusion/pull/20896) (davidlghellin)
+- [branch-53] fix: Provide more generic API for the capacity limit parsing (#20372) [#20893](https://github.com/apache/datafusion/pull/20893) (erenavsarogullari)
+- [branch-53] fix: sqllogictest cannot convert to Substrait (#19739) [#20897](https://github.com/apache/datafusion/pull/20897) (kumarUjjawal)
+- [branch-53] Fix DELETE/UPDATE filter extraction when predicates are pushed down into TableScan (#19884) [#20898](https://github.com/apache/datafusion/pull/20898) (kosiew)
+- [branch-53] fix: preserve None projection semantics across FFI boundary in ForeignTableProvider::scan (#20393) [#20895](https://github.com/apache/datafusion/pull/20895) (Kontinuation)
+- [branch-53] Fix FilterExec converting Absent column stats to Exact(NULL) (#20391) [#20892](https://github.com/apache/datafusion/pull/20892) (fwojciec)
 - [branch-53] backport: Support Spark `array_contains` builtin function (#20685) [#20914](https://github.com/apache/datafusion/pull/20914) (comphead)
-- [branch-53] Fix duplicate group keys after hash aggregation spill (#20724) (#20858) [#20918](https://github.com/apache/datafusion/pull/20918) (alamb)
-- [branch-53] fix: SanityCheckPlan error with window functions and NVL filter (#20231) [#20932](https://github.com/apache/datafusion/pull/20932) (alamb)
+- [branch-53] Fix duplicate group keys after hash aggregation spill (#20724) (#20858) [#20918](https://github.com/apache/datafusion/pull/20918) (gboucher90)
+- [branch-53] fix: SanityCheckPlan error with window functions and NVL filter (#20231) [#20932](https://github.com/apache/datafusion/pull/20932) (EeshanBembi)
 
 ## Credits
 
@@ -522,91 +522,106 @@ Thank you to everyone who contributed to this release. Here is a breakdown of co
 
 ```
  73 dependabot[bot]
- 43 Andrew Lamb
- 36 Neil Conway
- 31 Kumar Ujjawal
+ 37 Neil Conway
+ 32 Kumar Ujjawal
+ 28 Andrew Lamb
  26 Adrian Garcia Badaracco
  21 Jeffrey Vo
  13 cht42
- 10 Albert Skalt
- 10 kosiew
+ 11 Albert Skalt
+ 11 kosiew
  10 lyne
  8 Nuno Faria
  8 Oleks V
  7 Sergey Zhukov
  7 xudong.w
  6 Daniël Heres
+ 6 Huaijin
  5 Adam Gutglick
  5 Gabriel
  5 Jonathan Chen
  4 Andy Grove
  4 Dmitrii Blaginin
- 4 Huaijin
+ 4 Eren Avsarogullari
  4 Jack Kleeman
- 4 Tim Saucer
- 4 Yongting You
  4 notashes
  4 theirix
- 3 Eren Avsarogullari
- 3 Haresh Khanna
+ 4 Tim Saucer
+ 4 Yongting You
+ 3 dario curreri
+ 3 feniljain
  3 Kazantsev Maksim
  3 Kosta Tarasov
  3 Liang-Chi Hsieh
  3 Lía Adriana
  3 Marko Milenković
- 3 Yu-Chuan Hung
- 3 dario curreri
- 3 feniljain
  3 mishop-15
+ 3 Yu-Chuan Hung
  2 Acfboy
  2 Alan Tang
+ 2 David López
  2 Devanshu
  2 Frederic Branczyk
  2 Ganesh Patil
+ 2 Heran Lin
+ 2 jizezhang
  2 Miao
  2 Michael Kleen
+ 2 niebayes
  2 Pepijn Van Eeckhoudt
  2 Peter L
  2 Subham Singhal
  2 Tobias Schwarzinger
  2 UBarney
+ 2 Xander
  2 Yuvraj
  2 Zhang Xiaofeng
- 2 jizezhang
- 2 niebayes
  1 Andrea Bozzo
  1 Andrew Kane
  1 Anjali Choudhary
  1 Anna-Rose Lescure
+ 1 Ariel Miculas-Trif
  1 Aryan Anand
  1 Aviral Garg
  1 Bert Vermeiren
  1 Brent Gardner
  1 ChanTsune
- 1 David López
+ 1 comphead
+ 1 danielhumanmod
  1 Dewey Dunnington
+ 1 discord9
  1 Divyansh Pratap Singh
  1 Eesh Sagar Singh
+ 1 EeshanBembi
  1 Emil Ernerfeldt
  1 Emily Matheys
  1 Eric Chang
  1 Evangeli Silva
+ 1 Filip Wojciechowski
  1 Filippo
  1 Gabriel Ferraté
  1 Gene Bordegaray
  1 Geoffrey Claude
  1 Goksel Kabadayi
- 1 Heran Lin
+ 1 Guillaume Boucher
+ 1 Haresh Khanna
+ 1 hsiang-c
+ 1 iamthinh
  1 Josh Elkind
+ 1 karuppuchamysuresh
+ 1 Kristin Cowalcijk
  1 Mason
  1 Matt Butrovich
+ 1 Matthew Kim
  1 Mikhail Zabaluev
  1 Mohit rao
+ 1 nathan
  1 Nathaniel J. Smith
  1 Nick
  1 Oleg V. Kozlyuk
  1 Paul J. Davis
  1 Pierre Lacave
+ 1 pmallex
  1 Qi Zhu
  1 Raz Luvaton
  1 Rosai
@@ -618,16 +633,8 @@ Thank you to everyone who contributed to this release. Here is a breakdown of co
  1 Tim-53
  1 Tushar Das
  1 Vignesh
- 1 XL Liang
- 1 Xander
  1 Xiangpeng Hao
- 1 comphead
- 1 danielhumanmod
- 1 discord9
- 1 hsiang-c
- 1 iamthinh
- 1 karuppuchamysuresh
- 1 pmallex
+ 1 XL Liang
 ```
 
 Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.