ARROW-17362: [R] Implement dplyr::across() inside summarise() #14042
paleolimbot
left a comment
Awesome! I did a local checkout too, and it worked like a charm:
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.
library(dplyr, warn.conflicts = FALSE)
arrow_table(ggplot2::mpg) |>
group_by(manufacturer) |>
summarise(across(c(cty, hwy), mean)) |>
collect()
#> # A tibble: 15 × 3
#> manufacturer cty hwy
#> <chr> <dbl> <dbl>
#> 1 audi 17.6 26.4
#> 2 chevrolet 15 21.9
#> 3 dodge 13.1 17.9
#> 4 ford 14 19.4
#> 5 honda 24.4 32.6
#> 6 hyundai 18.6 26.9
#> 7 jeep 13.5 17.6
#> 8 land rover 11.5 16.5
#> 9 lincoln 11.3 17
#> 10 mercury 13.2 18
#> 11 nissan 18.1 24.6
#> 12 pontiac 17 26.4
#> 13 subaru 19.3 25.6
#> 14 toyota 18.5 24.9
#> 15 volkswagen 20.9 29.2
ggplot2::mpg |>
group_by(manufacturer) |>
summarise(across(c(cty, hwy), mean)) |>
collect()
#> # A tibble: 15 × 3
#> manufacturer cty hwy
#> <chr> <dbl> <dbl>
#> 1 audi 17.6 26.4
#> 2 chevrolet 15 21.9
#> 3 dodge 13.1 17.9
#> 4 ford 14 19.4
#> 5 honda 24.4 32.6
#> 6 hyundai 18.6 26.9
#> 7 jeep 13.5 17.6
#> 8 land rover 11.5 16.5
#> 9 lincoln 11.3 17
#> 10 mercury 13.2 18
#> 11 nissan 18.1 24.6
#> 12 pontiac 17 26.4
#> 13 subaru 19.3 25.6
#> 14 toyota 18.5 24.9
#> 15 volkswagen 20.9 29.2

Created on 2022-09-08 by the reprex package (v2.0.1)
Benchmark runs are scheduled for baseline = d8571a4 and contender = 75d2bfd. 75d2bfd is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
…#14042) This PR implements `across()` within `summarise()`. It also adds the `.groups` argument explicitly to the `summarise()` function signature instead of being passed in via `...` (this was necessary to prevent test failures after the addition of `expand_across()` to `summarise()`). Authored-by: Nic Crane <thisisnic@gmail.com> Signed-off-by: Nic Crane <thisisnic@gmail.com>
This PR implements `across()` within `summarise()`. It also adds the `.groups` argument explicitly to the `summarise()` function signature instead of being passed in via `...` (this was necessary to prevent test failures after the addition of `expand_across()` to `summarise()`).
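As a rough illustration of what expanding `across()` means for the user, the sketch below runs in plain dplyr on an ordinary data frame. It does not show arrow's internal `expand_across()`. It compares an `across()` call with the hand-written, one-expression-per-column form it is conceptually equivalent to, and passes `.groups` as a named argument. The `mtcars` dataset and the names `with_across` and `expanded` are assumptions chosen for the example, not part of the PR.

```r
library(dplyr, warn.conflicts = FALSE)

# across() form: apply mean() to each selected column
with_across <- mtcars |>
  group_by(cyl) |>
  summarise(across(c(mpg, hp), mean), .groups = "drop")

# Hand-expanded form: one named expression per column
expanded <- mtcars |>
  group_by(cyl) |>
  summarise(mpg = mean(mpg), hp = mean(hp), .groups = "drop")

# Check whether the two forms agree
isTRUE(all.equal(with_across, expanded))
```

With a build of arrow that includes this change, the same `summarise(across(...))` call should also work on an `arrow_table()` followed by `collect()`, as in the reprex above.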