HIVE-26243: Add vectorized implementation of the 'ds_kll_sketch' UDAF #3317
...ore/metastore-server/src/main/java/org/apache/hadoop/hive/common/histogram/kll/KllUtils.java
...tore-server/src/main/java/org/apache/hadoop/hive/common/histogram/KllHistogramEstimator.java
...rver/src/main/java/org/apache/hadoop/hive/common/histogram/KllHistogramEstimatorFactory.java
ql/src/gen/vectorization/UDAFTemplates/VectorUDAFComputeKLL.txt
Before going into the individual discussions: the general answer to all the above comments boils down to "I am trying to keep consistency with what was done here for vectorizing the HyperLogLog function": https://github.com/apache/hive/pull/1824/files

I sense that you don't like how that PR was designed, but since the two are very close in spirit and their code is used side by side, I thought it was important to keep them consistent. If we rework the current PR, they won't match anymore unless we rework the HLL design and implementation too, and that has its own share of cons.

Assuming we go for the refactoring, most of the comments are too sketchy to give appropriate guidance on an alternative design/implementation, so I will need to ask you to elaborate on them. For instance, you seem to be suggesting removing all helper classes/methods. Since it does not seem feasible to inline all the code now sitting in the helper methods/classes directly in the vectorized implementation, I guess you want to place it somewhere else, but I can't really decide where based on your comment.

As for the couple of currently unused methods, I will need them in a PR depending on this one (https://issues.apache.org/jira/browse/HIVE-26221); I can remove them now and re-introduce them later, if preferable. Once again, they mimic the HLL methods in both naming and usage: since HLL and KLL methods will be used side by side in most places, this makes it easier to read what's happening (see LongColumnStatsAggregator.java#L104-L111, for instance).
PR-1824 has nothing to do with datasketches; I don't know how you followed its conventions, but you might end up in trouble, because DS also has an HLL implementation. Wouldn't that conflict with the existing one?

note: PR-1824 named the file I think that the file name

The current implementation doesn't really look forward: I think we have 20 sketch functions from datasketches already exposed inside Hive which could be vectorized, and I think they are behind the same API cover. So just vectorizing the KLL one without looking ahead, and taking "ideas" from the old HLL codepath, doesn't seem the best idea to me. There is no need to do everything in one patch, but this is pretty much just copy-pasting the existing HLL txt file with KLL substituted here and there. So we should do that 20 times?

HIVE-26221 is something which has changes but no real end-user accessible value, and as such I don't think it's ready.
Rebasing on master
Kudos, SonarCloud Quality Gate passed!
@deniskuzZ tests are green, ready to be merged when you have a moment, thanks!
… (Alessandro Solimando, reviewed by Denys Kuzmenko, Zoltan Haindrich) Closes apache#3317
… (Alessandro Solimando, reviewed by Denys Kuzmenko, Zoltan Haindrich) Closes apache#3317 (cherry picked from commit ad19ec3)








What changes were proposed in this pull request?
We add a vectorized implementation of the ds_kll_sketch UDAF.
Why are the changes needed?
When this UDAF is used, either alone or alongside other vectorizable functions, it will benefit from a speed-up.
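To illustrate where such a speed-up typically comes from, here is a minimal, self-contained sketch contrasting a row-at-a-time aggregation path with a batch-at-a-time one. It is not the code in this PR and does not use Hive or DataSketches classes: ToySketch (which only tracks count, min and max), DoubleBatch, aggregateRow and aggregateBatch are hypothetical names invented for this illustration.

```java
import java.util.Arrays;

public class VectorizedAggSketch {

    /** Hypothetical stand-in for a quantile sketch: it only tracks count, min and max. */
    static final class ToySketch {
        long n = 0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;

        void update(double v) {
            n++;
            if (v < min) min = v;
            if (v > max) max = v;
        }
    }

    /** Hypothetical, simplified column batch: primitive values plus per-row null flags. */
    static final class DoubleBatch {
        final double[] values;
        final boolean[] isNull;
        final boolean noNulls;
        final int size;

        DoubleBatch(double[] values, boolean[] isNull) {
            this.values = values;
            this.isNull = isNull;
            this.size = values.length;
            boolean anyNull = false;
            for (boolean b : isNull) anyNull |= b;
            this.noNulls = !anyNull;
        }
    }

    /** Row-at-a-time path: one call and one boxed-value null check per row. */
    static void aggregateRow(ToySketch sketch, Double value) {
        if (value != null) {
            sketch.update(value);
        }
    }

    /** Batch path: one call per batch, looping over a primitive array. */
    static void aggregateBatch(ToySketch sketch, DoubleBatch batch) {
        if (batch.noNulls) {
            // Fast path: no null checks needed inside the loop.
            for (int i = 0; i < batch.size; i++) {
                sketch.update(batch.values[i]);
            }
        } else {
            for (int i = 0; i < batch.size; i++) {
                if (!batch.isNull[i]) {
                    sketch.update(batch.values[i]);
                }
            }
        }
    }

    public static void main(String[] args) {
        ToySketch rowWise = new ToySketch();
        for (Double v : Arrays.asList(3.0, null, 2.0)) {
            aggregateRow(rowWise, v);
        }

        ToySketch batchWise = new ToySketch();
        // The value at a null position is ignored, whatever it holds.
        aggregateBatch(batchWise, new DoubleBatch(
            new double[] {3.0, 0.0, 2.0}, new boolean[] {false, true, false}));

        // Both paths should see the same non-null values.
        System.out.println(rowWise.n + " " + batchWise.n);
    }
}
```

Both paths feed the same values into the sketch; the batch path simply amortizes per-call overhead over many rows and avoids boxing, which is the general idea behind vectorized aggregation.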
Does this PR introduce any user-facing change?
No.
How was this patch tested?
mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile="compute_kll_sketch.q" -Dtest.output.overwrite -pl itests/qtest -Pitests and