[R] Possible regression in dev arrow #43627

@thisisnic

Describe the bug, including details regarding any error messages, version, and platform.

I just ran the following with the dev version of arrow. It took 45 seconds on my first run and 85 seconds on my second run:

library(arrow)
library(dplyr)
library(tictoc)

nyc_taxi <- open_dataset("data/nyc-taxi/")

tic()
nyc_taxi |>
  group_by(year) |>
  summarise(
    all_trips = n(),
    shared_trips = sum(passenger_count > 1, na.rm = TRUE)
  ) |>
  mutate(pct_shared = shared_trips / all_trips * 1) |>
  collect()
toc()

With arrow 16.1.0, the same query took only 5 seconds on both runs.
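To make the comparison between versions easier to repeat, the timings could be collected with base R's `system.time()` alongside the installed arrow version. This is only a minimal sketch: `time_runs` is a hypothetical helper (not part of arrow or tictoc), and the `Sys.sleep()` workload is a stand-in so the snippet runs without the dataset.

```r
# Hypothetical helper (not part of arrow): call the zero-argument
# function `f` `n` times and return elapsed wall-clock seconds per run.
time_runs <- function(f, n = 2) {
  vapply(
    seq_len(n),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
}

# Report the installed arrow version, if arrow is available, so the
# timings can be tied to a specific build.
if (requireNamespace("arrow", quietly = TRUE)) {
  print(utils::packageVersion("arrow"))
}

# Stand-in workload (assumption); replace with the query from this
# report wrapped in function() { ... } to time the real thing.
timings <- time_runs(function() Sys.sleep(0.1), n = 2)
print(timings)
```

Running this once per installed arrow version gives one elapsed time per run, which also shows whether the second run is consistently slower than the first.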

Component(s)

R
