@fsaintjacques
Contributor

@fsaintjacques fsaintjacques commented Apr 8, 2020

Converter::Make did not handle an empty ArrayVector. The bug was exposed by ARROW-8216, which could return an empty selection vector because of a randomly generated fixture in test-dplyr.R.

@github-actions

github-actions bot commented Apr 8, 2020

Member

This should not error: it should return a 0-row data.frame:

filter(mtcars,FALSE)
##  [1] mpg  cyl  disp hp   drat wt   qsec vs   am   gear carb
## <0 rows> (or 0-length row.names)

I'll pull the branch and try to special-case that.

Member

@fsaintjacques I did this in f93f08c.

Member

To save us future pain, should we hunt for other places that assume vectors of length > 0? Grepping for [0] in r/src turns up a few places that look similarly suspect.
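As a rough illustration of the pattern such a grep is meant to catch, here is a minimal sketch. `Array`, `ArrayVector` and `TryGetFirstType` are hypothetical stand-ins written for this example, not the actual types or functions in r/src. The point is only that indexing `[0]` on an empty vector is undefined behavior, so the emptiness check has to come first.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an Arrow array; only the type name matters here.
struct Array {
  std::string type_name;
};

using ArrayVector = std::vector<std::shared_ptr<Array>>;

// Unsafe pattern (what the grep looks for):
//   const std::string& t = arrays[0]->type_name;  // UB when arrays is empty

// Guarded pattern: check for emptiness before touching the first element.
// Returns false and leaves *out untouched when there are no chunks.
bool TryGetFirstType(const ArrayVector& arrays, std::string* out) {
  if (arrays.empty()) {
    return false;  // the caller decides how to handle "no chunks"
  }
  *out = arrays.front()->type_name;
  return true;
}
```

With a guard like this, a call site can turn "no chunks" into an error or an empty result, where the unguarded version would read past the end of the vector.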

Contributor Author

I checked them all and found one bug. I'll add a test and the fix in the current PR.

Member

This should accept the type of arrays explicitly; a Converter should not fail when arrays is empty. Instead the result of conversion should also be empty.

Contributor Author

@fsaintjacques fsaintjacques Apr 8, 2020

I updated https://issues.apache.org/jira/browse/ARROW-7798 to include this. I tried a quick refactor along the lines you suggest, but it snowballs into rewriting almost every method and function in the file.

Contributor Author

@fsaintjacques fsaintjacques Apr 8, 2020

It also led me to find a bug in the handling of Dictionary ChunkedArray: https://issues.apache.org/jira/browse/ARROW-8374

@nealrichardson
Member

#6878 may help give more visibility into the unexpected failures

fsaintjacques and others added 3 commits April 8, 2020 15:29
The Converter::Make did not like receiving empty ArrayVector. The bug
was introduced in ARROW-8216 which could return an empty selection
vector due to a randomly generated fixture in test-dplyr.R
@wesm
Member

wesm commented Apr 8, 2020

Hmm, continuing to have arrow::r::Converter crash on no chunks isn't too great. It seems that in most cases the first element is being used to obtain type metadata. This could be fixed in most cases from what I can see by passing the type as a parameter to Converter::Make. Thoughts?
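To make the suggestion concrete, here is a minimal sketch of a factory that receives the type as a parameter and never reads the first chunk. Everything in it (`TypeId`, `Array`, this `Converter` class and its `Convert` method) is a simplified, hypothetical stand-in and does not reflect the real arrow::r::Converter interface. It only shows that, once the type is supplied by the caller, zero chunks can naturally produce a zero-length result.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-ins, not the real arrow::r classes.
enum class TypeId { kInt32, kDouble };

struct Array {
  TypeId type;
  std::vector<double> values;
};

using ArrayVector = std::vector<std::shared_ptr<Array>>;

class Converter {
 public:
  // The type is a parameter, so Make never needs to read arrays[0].
  static std::unique_ptr<Converter> Make(TypeId type, ArrayVector arrays) {
    for (const auto& a : arrays) {
      if (a->type != type) throw std::invalid_argument("chunk type mismatch");
    }
    return std::unique_ptr<Converter>(new Converter(type, std::move(arrays)));
  }

  // Concatenates all chunks; zero chunks yields a zero-length result.
  std::vector<double> Convert() const {
    std::vector<double> out;
    for (const auto& a : arrays_) {
      out.insert(out.end(), a->values.begin(), a->values.end());
    }
    return out;
  }

  TypeId type() const { return type_; }

 private:
  Converter(TypeId type, ArrayVector arrays)
      : type_(type), arrays_(std::move(arrays)) {}

  TypeId type_;
  ArrayVector arrays_;
};
```

In this sketch, `Converter::Make(TypeId::kDouble, ArrayVector{})` succeeds and `Convert()` returns an empty vector, which mirrors the expectation that an empty input should convert to an empty output.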

@nealrichardson
Member

At least now it errors instead of segfaults, and I changed some of the choreography to keep away from the error. Agreed that we can do better, and I think what you say has been added to the scope of https://issues.apache.org/jira/browse/ARROW-7798.

I'm going to merge this since we've at least fixed the issue and have a passing build.

@wesm
Member

wesm commented Apr 8, 2020

👍
