From 5a84b3588158b3d60ea903ccc232e3283236d29c Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Tue, 24 Sep 2024 15:54:39 -0600 Subject: [PATCH] Generate changelog for 0.3.0 release --- dev/changelog/0.3.0.md | 115 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 115 insertions(+) create mode 100644 dev/changelog/0.3.0.md diff --git a/dev/changelog/0.3.0.md b/dev/changelog/0.3.0.md new file mode 100644 index 0000000000..039e1c7301 --- /dev/null +++ b/dev/changelog/0.3.0.md @@ -0,0 +1,115 @@ + + +# DataFusion Comet 0.3.0 Changelog + +This release consists of 57 commits from 12 contributors. See credits at the end of this changelog for more information. + +**Fixed bugs:** + +- fix: Support type coercion for ScalarUDFs [#865](https://github.com/apache/datafusion-comet/pull/865) (Kimahriman) +- fix: CometTakeOrderedAndProjectExec native scan node should use child operator's output [#896](https://github.com/apache/datafusion-comet/pull/896) (viirya) +- fix: Fix various memory leaks problems [#890](https://github.com/apache/datafusion-comet/pull/890) (Kontinuation) +- fix: Add output to Comet operators equal and hashCode [#902](https://github.com/apache/datafusion-comet/pull/902) (viirya) +- fix: Fallback to Spark when cannot resolve AttributeReference [#926](https://github.com/apache/datafusion-comet/pull/926) (viirya) +- fix: Fix memory bloat caused by holding too many unclosed `ArrowReaderIterator`s [#929](https://github.com/apache/datafusion-comet/pull/929) (Kontinuation) +- fix: Normalize NaN and zeros for floating number comparison [#953](https://github.com/apache/datafusion-comet/pull/953) (viirya) +- fix: window function range offset should be long instead of int [#733](https://github.com/apache/datafusion-comet/pull/733) (huaxingao) +- fix: CometScanExec on Spark 3.5.2 [#915](https://github.com/apache/datafusion-comet/pull/915) (Kimahriman) +- fix: div and rem by negative zero [#960](https://github.com/apache/datafusion-comet/pull/960) (kazuyukitanimura) + +**Performance related:** + +- perf: Optimize CometSparkToColumnar for columnar input [#892](https://github.com/apache/datafusion-comet/pull/892) (mbutrovich) +- perf: Fall back to Spark if query uses DPP with v1 data sources [#897](https://github.com/apache/datafusion-comet/pull/897) (andygrove) +- perf: Report accurate total time for scans [#916](https://github.com/apache/datafusion-comet/pull/916) (andygrove) +- perf: Add metric for time spent casting in native scan [#919](https://github.com/apache/datafusion-comet/pull/919) (andygrove) +- perf: Add criterion benchmark for aggregate expressions [#948](https://github.com/apache/datafusion-comet/pull/948) (andygrove) +- perf: Add metric for time spent in CometSparkToColumnarExec [#931](https://github.com/apache/datafusion-comet/pull/931) (mbutrovich) +- perf: Optimize decimal precision check in decimal aggregates (sum and avg) [#952](https://github.com/apache/datafusion-comet/pull/952) (andygrove) + +**Implemented enhancements:** + +- feat: Add config option to enable converting CSV to columnar [#871](https://github.com/apache/datafusion-comet/pull/871) (andygrove) +- feat: Implement basic version of string to float/double/decimal [#870](https://github.com/apache/datafusion-comet/pull/870) (andygrove) +- feat: Implement to_json for subset of types [#805](https://github.com/apache/datafusion-comet/pull/805) (andygrove) +- feat: Add ShuffleQueryStageExec to direct child node for CometBroadcastExchangeExec [#880](https://github.com/apache/datafusion-comet/pull/880) (viirya) +- feat: Support sort merge join with a join condition [#553](https://github.com/apache/datafusion-comet/pull/553) (viirya) +- feat: Array element extraction [#899](https://github.com/apache/datafusion-comet/pull/899) (Kimahriman) +- feat: date_add and date_sub functions [#910](https://github.com/apache/datafusion-comet/pull/910) (mbutrovich) +- feat: implement scripts for binary release build [#932](https://github.com/apache/datafusion-comet/pull/932) (parthchandra) +- feat: Publish artifacts to maven [#946](https://github.com/apache/datafusion-comet/pull/946) (parthchandra) + +**Documentation updates:** + +- doc: Documenting Helm chart for Comet Kube execution [#874](https://github.com/apache/datafusion-comet/pull/874) (comphead) +- doc: Update native code path in development [#921](https://github.com/apache/datafusion-comet/pull/921) (viirya) +- docs: Add more detailed architecture documentation [#922](https://github.com/apache/datafusion-comet/pull/922) (andygrove) + +**Other:** + +- chore: Update installation.md [#869](https://github.com/apache/datafusion-comet/pull/869) (haoxins) +- chore: Update versions to 0.3.0 / 0.3.0-SNAPSHOT [#868](https://github.com/apache/datafusion-comet/pull/868) (andygrove) +- chore: Add documentation on running benchmarks with Microk8s [#848](https://github.com/apache/datafusion-comet/pull/848) (andygrove) +- chore: Improve CometExchange metrics [#873](https://github.com/apache/datafusion-comet/pull/873) (viirya) +- chore: Add spilling metrics of SortMergeJoin [#878](https://github.com/apache/datafusion-comet/pull/878) (viirya) +- chore: change shuffle mode default from jvm to auto [#877](https://github.com/apache/datafusion-comet/pull/877) (andygrove) +- chore: Enable shuffle by default [#881](https://github.com/apache/datafusion-comet/pull/881) (andygrove) +- chore: print Comet native version to logs after Comet is initialized [#900](https://github.com/apache/datafusion-comet/pull/900) (SemyonSinchenko) +- chore: Revise batch pull approach to more follow C Data interface semantics [#893](https://github.com/apache/datafusion-comet/pull/893) (viirya) +- chore: Close dictionary provider when iterator is closed [#904](https://github.com/apache/datafusion-comet/pull/904) (andygrove) +- chore: Remove unused function [#906](https://github.com/apache/datafusion-comet/pull/906) (viirya) +- chore: Upgrade to latest DataFusion revision [#909](https://github.com/apache/datafusion-comet/pull/909) (andygrove) +- build: fix build [#917](https://github.com/apache/datafusion-comet/pull/917) (andygrove) +- chore: Revise array import to more follow C Data Interface semantics [#905](https://github.com/apache/datafusion-comet/pull/905) (viirya) +- chore: Address reviews [#920](https://github.com/apache/datafusion-comet/pull/920) (viirya) +- chore: Enable Comet shuffle for Spark core-1 test [#924](https://github.com/apache/datafusion-comet/pull/924) (viirya) +- build: Add maven-compiler-plugin for java cross-build [#911](https://github.com/apache/datafusion-comet/pull/911) (viirya) +- build: Disable upload-test-reports for macos-13 runner [#933](https://github.com/apache/datafusion-comet/pull/933) (viirya) +- minor: cast timestamp test #468 [#923](https://github.com/apache/datafusion-comet/pull/923) (himadripal) +- build: Set Java version arg for scala-maven-plugin [#934](https://github.com/apache/datafusion-comet/pull/934) (viirya) +- chore: Remove redundant RowToColumnar from CometShuffleExchangeExec for columnar shuffle [#944](https://github.com/apache/datafusion-comet/pull/944) (viirya) +- minor: rename CometMetricNode `add` to `set` and update documentation [#940](https://github.com/apache/datafusion-comet/pull/940) (andygrove) +- chore: Add config for enabling SMJ with join condition [#937](https://github.com/apache/datafusion-comet/pull/937) (andygrove) +- chore: Change maven group ID to `org.apache.datafusion` [#941](https://github.com/apache/datafusion-comet/pull/941) (andygrove) +- chore: Upgrade to DataFusion 42.0.0 [#945](https://github.com/apache/datafusion-comet/pull/945) (andygrove) +- build: Fix regression in jar packaging [#950](https://github.com/apache/datafusion-comet/pull/950) (andygrove) +- chore: Show reason for falling back to Spark when SMJ with join condition is not enabled [#956](https://github.com/apache/datafusion-comet/pull/956) (andygrove) +- chore: clarify tarball installation [#959](https://github.com/apache/datafusion-comet/pull/959) (comphead) + +## Credits + +Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor. + +``` + 22 Andy Grove + 18 Liang-Chi Hsieh + 3 Adam Binford + 3 Matt Butrovich + 2 Kristin Cowalcijk + 2 Oleks V + 2 Parth Chandra + 1 Himadri Pal + 1 Huaxin Gao + 1 KAZUYUKI TANIMURA + 1 Semyon + 1 Xin Hao +``` + +Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.