support null data type in gandiva #10010
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format? or See also:
We can restore if we're going to have arm GHA runners again. Closes apache#10618 from kszucs/ARROW-13211 Authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com> Signed-off-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
…ost release script Closes apache#9322 from kszucs/python-post-release Authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com> Signed-off-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
Closes apache#10583 from ianmcook/ARROW-11675 Lead-authored-by: Antoine Pitrou <antoine@python.org> Co-authored-by: Ian Cook <ianmcook@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
… file Some Python versions have a bug where `signal.getsignal` creates a reference cycle holding execution frames alive (https://bugs.python.org/issue42248). This would cause excessive lifetimes of the PyArrow table returned by `read_csv`. Closes apache#10609 from pitrou/ARROW-13187-signal-refcycle Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
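The effect described in this commit message can be illustrated with a minimal sketch. It does not reproduce the `signal.getsignal` bug itself; `Table` and `FrameLike` are hypothetical stand-ins that only show how an object caught in a reference cycle outlives its last ordinary reference until the cyclic garbage collector runs.

```python
import gc
import weakref


class Table:
    """Hypothetical stand-in for a large result such as a PyArrow table."""


class FrameLike:
    """Hypothetical stand-in for an execution frame that is part of a cycle."""

    def __init__(self, table):
        self.table = table
        self.cycle = self  # self-reference: the refcount never drops to zero


def read_table():
    table = Table()
    FrameLike(table)           # the cycle is created and then abandoned
    return weakref.ref(table)  # the caller keeps no strong reference


gc.disable()  # keep the collector from running at an arbitrary moment
try:
    ref = read_table()
    alive_before_gc = ref() is not None  # still alive: held by the cycle
    gc.collect()                         # the cyclic GC breaks the cycle
    alive_after_gc = ref() is not None   # now released
finally:
    gc.enable()

print(alive_before_gc, alive_after_gc)
```

Without the cycle, the table would be freed by reference counting as soon as `read_table` returns.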
Closes apache#10586 from lidavidm/arrow-12716 Authored-by: David Li <li.davidm96@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
Closes apache#10615 from pachadotdev/arrow12967v3 Lead-authored-by: Mauricio Vargas <mavargas11@uc.cl> Co-authored-by: Pachá <mvargas@dcc.uchile.cl> Signed-off-by: Ian Cook <ianmcook@gmail.com>
Closes apache#10620 from pitrou/ARROW-13134 Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
Closes apache#10596 from pitrou/ARROW-13104-unsafe-cast Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
Closes apache#10530 from lidavidm/arrow-13072 Lead-authored-by: David Li <li.davidm96@gmail.com> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
Add a bytes_read() to the StreamingReader interface so the progress of the stream can be determined easily and accurately by a user. Closes apache#10509 from n3world/ARROW-12996-stream_progress Lead-authored-by: Nate Clark <nate@neworld.us> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
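As a rough illustration of how a `bytes_read()` accessor lets a caller track progress, here is a minimal Python sketch. `CountingReader`, `read_next`, and the block size are hypothetical names chosen for this example; they are not the Arrow `StreamingReader` API, and the real counter may be defined differently.

```python
import io


class CountingReader:
    """Hypothetical block reader that tracks how many bytes it has consumed."""

    def __init__(self, raw, block_size=6):
        self._raw = raw
        self._block_size = block_size
        self._bytes_read = 0

    def bytes_read(self):
        """Total number of bytes handed out so far."""
        return self._bytes_read

    def read_next(self):
        """Return the next block, or None once the stream is exhausted."""
        block = self._raw.read(self._block_size)
        self._bytes_read += len(block)
        return block or None


data = b"a,b\n1,2\n3,4\n5,6\n"
reader = CountingReader(io.BytesIO(data))

progress = []
while reader.read_next() is not None:
    # With the total size known, this becomes a percentage.
    progress.append(reader.bytes_read())

print(progress, len(data))
```

Because the counter is updated as blocks are handed out, the caller does not need to inspect the underlying stream position.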
…kernels This change adds a `Bitmap::VisitWordsAndWrite` method, that outputs the values of the visitor lambda function to a provided bitmap. Closes apache#10487 from nirandaperera/ARROW-13010 Authored-by: niranda perera <niranda.perera@gmail.com> Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
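The idea of visiting bitmaps word by word and writing the visitor's result into an output bitmap can be sketched in Python. This is only a conceptual model: the actual method is C++, and this sketch assumes byte-aligned bitmaps of equal length and ignores bit offsets. The function name and parameters below are hypothetical.

```python
def visit_words_and_write(inputs, out, visitor, word_bytes=8):
    """Apply `visitor` to corresponding words of `inputs`, storing results in `out`.

    inputs  -- list of bytes objects, each as long as `out`
    out     -- bytearray receiving the result bitmap
    visitor -- callable taking a list of ints (one word per input), returning an int
    """
    n = len(out)
    for start in range(0, n, word_bytes):
        stop = min(start + word_bytes, n)  # the last word may be shorter
        width = stop - start
        words = [int.from_bytes(b[start:stop], "little") for b in inputs]
        # Mask the result so operations such as ~ stay within the word width.
        result = visitor(words) & ((1 << (8 * width)) - 1)
        out[start:stop] = result.to_bytes(width, "little")


left = bytes([0b1100] * 10)
right = bytes([0b1010] * 10)

and_out = bytearray(10)
visit_words_and_write([left, right], and_out, lambda w: w[0] & w[1])

not_out = bytearray(10)
visit_words_and_write([left], not_out, lambda w: ~w[0])

print(bin(and_out[0]), bin(not_out[0]))
```

Processing whole words instead of single bits is what makes this pattern attractive for bitmap kernels.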
…rAsync WriteFooterAsync is private, so the example doesn't compile. This method was probably public in an earlier version of the library. WriteEndAsync seems to be the proper replacement. Closes apache#10399 from royalstream/patch-1 Authored-by: Steven Burns <royalstream@hotmail.com> Signed-off-by: Eric Erhardt <eric.erhardt@microsoft.com>
Adds sin/cos/tan and their inverses. Checked variants check for what would be domain errors (this does not apply to atan/atan2). Closes apache#10544 from lidavidm/arrow-13095 Authored-by: David Li <li.davidm96@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
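The difference between a checked and an unchecked variant can be sketched in plain Python. The two functions below are hypothetical illustrations and not the Arrow kernels. That the unchecked variant yields NaN for out-of-domain input is an assumption made for this example.

```python
import math


def asin_unchecked(values):
    """Out-of-domain inputs produce NaN (assumed behaviour for this sketch)."""
    return [math.nan if (v < -1.0 or v > 1.0) else math.asin(v) for v in values]


def asin_checked(values):
    """Out-of-domain inputs raise instead of silently producing NaN."""
    for v in values:
        if v < -1.0 or v > 1.0:
            raise ValueError(f"domain error: asin({v})")
    return [math.asin(v) for v in values]


unchecked = asin_unchecked([0.0, 2.0])

try:
    asin_checked([0.0, 2.0])
    checked_error = None
except ValueError as exc:
    checked_error = str(exc)

print(unchecked, checked_error)
```

Functions such as atan accept every real input, which is consistent with the note that the check does not apply to atan/atan2.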
…lize This is a draft of adding more utility methods to FunctionOptions. It's not fully implemented (it needs rebasing + serialization isn't implemented for most options, plus there are various TODOs scattered). But before I proceed further, I wanted to get some feedback.
Some concerns I have:
- I don't like adding protected methods to a struct, and it's inconsistent with how equality is implemented for other structs (via a visitor or otherwise centralized in a single location). However ARROW-8891 will require that we be able to define kernels - and presumably their options - in a separate shared library, so I don't think we can do much better than this.
- But for (de)serialization, we'll still need some way to dynamically register the mapping between a type_name and the actual struct, so maybe this is a moot point.
- I've exposed the fact that serialization uses StructScalars to support Expression - but maybe this is too much to commit to in the API?
Closes apache#10511 from lidavidm/arrow-13025
Authored-by: David Li <li.davidm96@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
…is wrong Closes apache#10561 from 0x0L/0x0L-patch-1 Authored-by: nullptr <3621629+0x0L@users.noreply.github.com> Signed-off-by: Eric Erhardt <eric.erhardt@microsoft.com>
Closes apache#10619 from bkietz/BindFunction-cython-utility Authored-by: Benjamin Kietzman <bengilgit@gmail.com> Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
So far this involved a lot of refactoring of Expressions to be compatible with ExecBatches. The next step is to add a ScanNode wrapping a ScannerBuilder Closes apache#10397 from bkietz/11930-Refactor-Dataset-scans-to Authored-by: Benjamin Kietzman <bengilgit@gmail.com> Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
Also ensure that the llvm-symbolizer path is correctly set, for useful tracebacks. Closes apache#10632 from pitrou/ARROW-13223-tsan-failures Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Antoine Pitrou <antoine@python.org>
… differently than other regions Added special case for us-east-1 in CreateBucket. Note: I'm not sure how to go about testing this. I don't think minio is going to have the same quirk. Closes apache#10637 from westonpace/bugfix/ARROW-13228--c-s3-createbucket-fails-because-aws-treats-us- Authored-by: Weston Pace <weston.pace@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
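A minimal sketch of such a special case, assuming the quirk is that a CreateBucket request for us-east-1 must omit the location constraint while other regions must name theirs. `create_bucket_body` is a hypothetical helper for illustration, not the function changed in this commit.

```python
def create_bucket_body(region):
    """Return the CreateBucket request body for `region`, or None if it must be omitted."""
    # Assumption: us-east-1 is the default region and takes no LocationConstraint.
    if not region or region == "us-east-1":
        return None
    return (
        '<CreateBucketConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
        f"<LocationConstraint>{region}</LocationConstraint>"
        "</CreateBucketConfiguration>"
    )


default_body = create_bucket_body("us-east-1")
other_body = create_bucket_body("eu-west-1")
print(default_body)
print(other_body)
```

This also suggests why an S3-compatible test server might not exercise the branch: the special case only matters if the server rejects the explicit constraint.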
Closes apache#10639 from lidavidm/arrow-13234 Authored-by: David Li <li.davidm96@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
Generate a signature for compute functions that better reflects the accepted arguments.
Example before:
```python
>>> pc.sum?
Signature: pc.sum(array, *, options=None, memory_pool=None, **kwargs)
Docstring:
Compute the sum of a numeric array.
[...]
```
Same example after:
```python
>>> ?pc.sum
Signature:
pc.sum(
    array,
    *,
    memory_pool=None,
    options=None,
    skip_nulls=True,
    min_count=1,
)
Docstring:
Compute the sum of a numeric array.
[...]
```
One caveat is that the individual options are not explicitly documented (yet):
```
Parameters
----------
array : Array-like
    Argument to compute function
memory_pool : pyarrow.MemoryPool, optional
    If not passed, will allocate memory from the default memory pool.
options : pyarrow.compute.ScalarAggregateOptions, optional
    Parameters altering compute function semantics
**kwargs : optional
    Parameters for ScalarAggregateOptions constructor. Either `options`
    or `**kwargs` can be passed, but not both at the same time.
```
Closes apache#10581 from pitrou/ARROW-10316-wrapped-compute-func
Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
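One way to produce a signature like the one in the commit message above is to build it from the options class with the standard `inspect` module. The sketch below is an assumption about the general technique, not the code from this commit. `ScalarAggregateOptions` here is a hypothetical pure-Python stand-in, and `wrap_with_options` and `_sum` are names invented for this example.

```python
import inspect


class ScalarAggregateOptions:
    """Hypothetical stand-in for an options class."""

    def __init__(self, skip_nulls=True, min_count=1):
        self.skip_nulls = skip_nulls
        self.min_count = min_count


def wrap_with_options(func, options_class):
    """Wrap `func` so its signature lists the option fields as keyword-only parameters."""
    kw_only = inspect.Parameter.KEYWORD_ONLY
    option_params = inspect.signature(options_class).parameters.values()
    params = [
        inspect.Parameter("array", inspect.Parameter.POSITIONAL_OR_KEYWORD),
        inspect.Parameter("options", kw_only, default=None),
    ]
    params += [p.replace(kind=kw_only) for p in option_params]

    def wrapper(array, *, options=None, **kwargs):
        # Either `options` or individual keywords may be given, not both.
        if options is not None and kwargs:
            raise TypeError("pass either `options` or keyword arguments, not both")
        if options is None:
            options = options_class(**kwargs)
        return func(array, options)

    wrapper.__name__ = func.__name__
    wrapper.__signature__ = inspect.Signature(params)  # what help() and `?` display
    return wrapper


def _sum(array, options):
    values = [v for v in array if v is not None]
    if not options.skip_nulls and len(values) != len(array):
        return None
    if len(values) < options.min_count:
        return None
    return sum(values)


sum_ = wrap_with_options(_sum, ScalarAggregateOptions)
print(inspect.signature(sum_))
print(sum_([1, None, 2]), sum_([1, None], skip_nulls=False))
```

Setting `__signature__` only changes what introspection tools report. The wrapper still accepts `**kwargs` and forwards them to the options constructor, so an unknown keyword fails there.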
Also fixes ArithmeticOptions being unbound. Closes apache#10640 from lidavidm/arrow-13235 Authored-by: David Li <li.davidm96@gmail.com> Signed-off-by: Antoine Pitrou <antoine@python.org>
…n instead of yml Closes apache#10572 from kszucs/ARROW-6513 Authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com> Signed-off-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
The JNI build gets stopped due to a build timeout. It seems the Docker cache isn't valid anymore, so the build must also build the Docker image, but it doesn't get the opportunity to push it at the end of the build. Closes apache#10631 from kszucs/jni-build-timeout Authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com> Signed-off-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
Closes apache#10641 from lidavidm/arrow-13236 Authored-by: David Li <li.davidm96@gmail.com> Signed-off-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
…heels With [configuration](https://github.com/ursacomputing/crossbow/blob/master/.github/workflows/cache_vcpkg.yml) on crossbow's main branch. Posting the results once the builds are finished. Closes apache#10635 from kszucs/gha-vcpkg-cache Authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com> Signed-off-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
Closes apache#10626 from kou/cpp-pc-libs-private Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Remove APIs that have been deprecated for long enough. Closes apache#10868 from pitrou/ARROW-13552-cpp-deprecated-apis Authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
This PR adds support for both scalar and group-by aggregation via dplyr::summarize(). Only the functions sum, any, and all are wired up.
Followup issues (both bugs and features):
* [C++] Aggregation nodes seem not to respect FunctionOptions, or else I'm not passing them in correctly (ARROW-13497)
* [C++] ScanNode takes filter but doesn't filter (ARROW-13498)
* [R] Aggregation on expression doesn't NSE correctly (ARROW-13499)
* [R] Bindings for mean, var, sd aggregation (ARROW-13528)
* [R] Bindings for count aggregation (ARROW-13501)
* [R] Bindings for min/max aggregation (ARROW-13502)
* [R] Handle summarize() with 0 arguments or no aggregate functions (ARROW-13543)
* [R] Support .groups argument to summarize() (ARROW-13550)
* [C++] MakeScalarAggregateNode and MakeGroupByNode have quite different function signatures, which makes working with the API confusing; GroupBy doesn't let you specify the names of the output columns (ARROW-13482)
* [C++] Grouped aggregation functions all have to be invoked with a `hash_` prefix to the name, which seems unnecessary because you can't call a non-hash-aggregation function in GroupBy and you can't call a hash_ function in ScalarAggregate (ARROW-13451)
Closes apache#10722 from nealrichardson/scalar-aggregate-node
Lead-authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
A test to see if we can (for now) build r-debug before using it Closes apache#10849 from jonkeane/ARROW-13507-r-lto Authored-by: Jonathan Keane <jkeane@gmail.com> Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
Closes apache#10851 from thisisnic/ARROW-13519_noisy_docs Lead-authored-by: Nic <thisisnic@gmail.com> Co-authored-by: Neal Richardson <neal.p.richardson@gmail.com> Co-authored-by: Nic Crane <thisisnic@gmail.com> Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
Various updates to dataset.Rmd including:
* separating out dense text chunks
* rephrasing based on suggestions by Grammarly to simplify phrasing
* rephrasing "we" to "you"
Closes apache#10765 from thisisnic/ARROW_13399_dataset_vignette
Lead-authored-by: Nic Crane <thisisnic@gmail.com>
Co-authored-by: Nic <thisisnic@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
Create a from_pydict function in the RecordBatch class. Create a unit test for from_pydict. Closes apache#10854 from kharoc/ARROW-13089 Authored-by: kharoc <kharoly.cs@gmail.com> Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
Also request the correct version of duckdb now that it's been released. Closes apache#10861 from jonkeane/ARROW-13538-gate-duckdb-tests Authored-by: Jonathan Keane <jkeane@gmail.com> Signed-off-by: Jonathan Keane <jkeane@gmail.com>
Closes apache#10873 from n3world/ARROW-13556_link_protobuf Authored-by: Nate Clark <nate@neworld.us> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Adds styling tasks to the Makefile (for 🦖 like me; I found that the styling-on-save from vscode was not reliable). Also makes codegen.R generate styled R code. Closes apache#10879 from nealrichardson/styler2 Lead-authored-by: Jonathan Keane <jkeane@gmail.com> Co-authored-by: Neal Richardson <neal.p.richardson@gmail.com> Signed-off-by: Jonathan Keane <jkeane@gmail.com>
It reached EOL. Closes apache#10881 from kou/linux-ubuntu-drop-20.10 Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
…ents Update shared_ptr<Scalar> and shared_ptr<Arrow> to Datum in CheckScalar* functions Closes apache#10878 from diegodfrf/ARROW-12953-Refactor-CheckScalar-to-take-Datum-argum Authored-by: Fernando Rodriguez <diegodfrf@gmail.com> Signed-off-by: David Li <li.davidm96@gmail.com>
Hi there, this PR has been open for quite a while. Could someone help check it? Thanks a lot.
Closed this one and created a new one: #10884
No description provided.