Conversation

@pitrou
Member

@pitrou pitrou commented Feb 28, 2018

No description provided.

Member Author

For some reason (macro expansion?) these #ifs wouldn't work correctly here, even though NPY_INT64 is defined to NPY_LONG.

Member Author

Hmm, actually, that must be because NPY_LONGLONG is not a macro...
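A minimal sketch of why an `#if` comparison can misbehave when one side is an enumerator rather than a macro. The names `DEMO_LONG`, `DEMO_LONGLONG` and `DEMO_INT64` are hypothetical stand-ins, not NumPy's definitions. Inside `#if`, any identifier left over after macro expansion is replaced by `0`, so the preprocessor never sees the enum values:

```cpp
// Hypothetical stand-ins for type numbers declared as enumerators, plus a
// macro alias for one of them. None of these are NumPy's real definitions.
enum DemoTypes { DEMO_LONG = 7, DEMO_LONGLONG = 9 };
#define DEMO_INT64 DEMO_LONG

// DEMO_INT64 expands to DEMO_LONG, which is not a macro, so the preprocessor
// replaces it with 0. DEMO_LONGLONG is replaced with 0 as well. The condition
// below is therefore 0 == 0 and always true, whatever the enum values are.
#if DEMO_INT64 == DEMO_LONGLONG
constexpr bool kPreprocessorSaysEqual = true;
#else
constexpr bool kPreprocessorSaysEqual = false;
#endif

// The compiler proper sees the real enumerator values and disagrees.
constexpr bool kCompilerSaysEqual = (DEMO_INT64 == DEMO_LONGLONG);

static_assert(kPreprocessorSaysEqual, "preprocessor compared 0 == 0");
static_assert(!kCompilerSaysEqual, "enumerators 7 and 9 differ");
```

If the comparison has to involve enumerators, doing it in ordinary C++ (for example with `constexpr` values or template specialization) sidesteps the issue.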

@pitrou pitrou force-pushed the ARROW-2135-nan-conversion-when-casting branch 2 times, most recently from 6cbf133 to d602be7 Compare February 28, 2018 19:13
@pitrou pitrou closed this Feb 28, 2018
@pitrou pitrou reopened this Feb 28, 2018
@pitrou pitrou force-pushed the ARROW-2135-nan-conversion-when-casting branch 3 times, most recently from 84766bc to cd37393 Compare February 28, 2018 20:01
Contributor

std::fill(null_bitmap_data_, null_bitmap_data_ + null_bytes, 0) is a bit more idiomatic.

Member Author

Hmm, perhaps. This is really a copy/paste of NumPyConverter::InitNullBitmap()...

Contributor

Possibly time for a subclass then?
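For illustration, a small sketch comparing the `std::fill` form suggested above with a `memset` call. It assumes the code under review zeroed the bitmap with `memset`, which the thread does not actually show. The function names are hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// C-style zeroing: works on raw bytes and needs an explicit size cast.
inline void ZeroWithMemset(uint8_t* data, int64_t nbytes) {
  std::memset(data, 0, static_cast<std::size_t>(nbytes));
}

// Iterator-style zeroing: expresses the range directly as [data, data + nbytes).
inline void ZeroWithFill(uint8_t* data, int64_t nbytes) {
  std::fill(data, data + nbytes, static_cast<uint8_t>(0));
}
```

Both leave the buffer in the same state; for byte buffers, compilers typically lower `std::fill` to the same code as `memset`.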

Contributor

Is there already a test for things like a = [1.0, 2.0, 3.1, np.nan] where a user passes in an integer type?

Member Author

You mean for the truncation behavior? Let me look.

Member Author

No, I don't think so. I'm not sure we specify the truncation mode anywhere either?

Contributor

It looks like it's a hard cast:

In [7]: pa.array([1, 2, 3.190, np.nan], type=pa.int64())
Out[6]:
<pyarrow.lib.Int64Array object at 0x7f537e42dd68>
[
  1,
  2,
  3,
  NA
]

Contributor

That's fine. Was just wondering.

Contributor

inline is redundant here: http://en.cppreference.com/w/cpp/language/inline.

A function defined entirely inside a class/struct/union definition, whether it's a member function or a non-member friend function, is implicitly an inline function.

Member Author

I see. This is really using the same convention as the rest of the file, though.
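A short sketch of the rule quoted above, using a hypothetical `DemoTraits` struct rather than the code under review. Both member functions are inline; the keyword on the second one changes nothing:

```cpp
#include <cstdint>

// Hypothetical traits struct, for illustration only.
struct DemoTraits {
  static constexpr int64_t kSentinel = -1;

  // Defined inside the class body: implicitly inline.
  static bool isnull(int64_t v) { return v == kSentinel; }

  // Same semantics; the explicit keyword is redundant here.
  static inline bool isnull_explicit(int64_t v) { return v == kSentinel; }
};
```

Keeping the keyword for consistency with the rest of a file, as argued above, is harmless.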

Contributor

Hm, so that's also called isnull. Shouldn't that mean v == Py_None?

Contributor

Probably needs a test as well since it isn't failing.

Member Author

Nice catch :-) I'm not sure how to test it. Defining isnull is necessary for compiling, but that path isn't taken at runtime as object arrays are handled separately.

Contributor

At some point we may want to have an STL-compatible view class that makes interacting with iterator constructs in the STL much easier. We have a lot of code that manually handles iteration using a size/count and a buffer.

Member Author

Which iterators are you thinking about? Do you mean the ndarray 1d iterator?

Contributor

That's one, though I added begin()/end() for that in #1651.
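As a rough sketch of the view idea raised in this thread: a non-owning wrapper over a pointer and a length that exposes `begin()`/`end()`, so standard algorithms and range-based `for` can replace hand-written index loops. `BufferView` and `CountNonZero` are hypothetical names, not classes from the Arrow codebase or from #1651:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical non-owning view over a contiguous buffer of T.
template <typename T>
class BufferView {
 public:
  BufferView(const T* data, int64_t length) : data_(data), length_(length) {}

  // Raw pointers satisfy the random-access iterator requirements.
  const T* begin() const { return data_; }
  const T* end() const { return data_ + length_; }
  int64_t size() const { return length_; }

 private:
  const T* data_;
  int64_t length_;
};

// Example consumer: counts non-zero elements with a standard algorithm
// instead of a manual loop over (pointer, length).
inline int64_t CountNonZero(const BufferView<int32_t>& view) {
  return static_cast<int64_t>(std::count_if(
      view.begin(), view.end(), [](int32_t v) { return v != 0; }));
}
```

The view does not own its memory, so the underlying buffer must outlive it.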

@pitrou pitrou force-pushed the ARROW-2135-nan-conversion-when-casting branch 3 times, most recently from 73916de to bb56637 Compare March 1, 2018 09:56
Member Author

By the way, I don't know what that is, but this is required to have the tests pass. Why do we always treat NaT as null but not floating-point NaN? @wesm

Contributor

AFAIU there's no way to interpret NaT other than as NULL (unless there's a standard that defines it as something other than "missing"). NaN is part of the IEEE floating-point specification (as I'm sure you know) and it has a different meaning than null.
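To make the distinction concrete, here is a small sketch under stated assumptions. `NullableDouble` is a hypothetical value-plus-validity pair that mimics, in miniature, how a validity bitmap keeps "null" separate from the stored value. The `nan_as_null` flag is likewise hypothetical and only illustrates the two possible policies. The NaT constant reflects NumPy's representation of NaT as the minimum 64-bit integer:

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical nullable slot: the value and its validity are stored separately.
struct NullableDouble {
  double value;
  bool valid;
};

// NumPy stores NaT as the smallest int64 value; it has no other meaning
// than "missing", so treating it as null loses nothing.
constexpr int64_t kNaT = std::numeric_limits<int64_t>::min();
inline bool IsNaT(int64_t v) { return v == kNaT; }

// NaN is a legitimate IEEE 754 value, so whether it becomes null is a policy
// choice. With nan_as_null == false the NaN is stored as a valid value.
inline NullableDouble FromDouble(double v, bool nan_as_null) {
  const bool is_null = nan_as_null && std::isnan(v);
  return NullableDouble{is_null ? 0.0 : v, !is_null};
}
```

Under the first policy a NaN round-trips as a valid floating-point value; under the second it is indistinguishable from a missing entry.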

@pitrou
Member Author

pitrou commented Mar 1, 2018

I addressed some review comments now.

@pitrou pitrou force-pushed the ARROW-2135-nan-conversion-when-casting branch from bb56637 to 375418f Compare March 1, 2018 12:23
@pitrou
Member Author

pitrou commented Mar 1, 2018

@pitrou pitrou force-pushed the ARROW-2135-nan-conversion-when-casting branch from 375418f to 0af573b Compare March 5, 2018 11:41
@pitrou pitrou force-pushed the ARROW-2135-nan-conversion-when-casting branch from 0af573b to 939428d Compare March 8, 2018 12:33
@pitrou
Member Author

pitrou commented Mar 8, 2018

Rebased.

@pitrou
Member Author

pitrou commented Mar 8, 2018

AppVeyor at https://ci.appveyor.com/project/pitrou/arrow/build/1.0.175

Member

@wesm wesm left a comment

+1, thanks for cleaning up the int/uint size issues here. Much cleaner now.

@wesm wesm closed this in 171340f Mar 12, 2018
@pitrou pitrou deleted the ARROW-2135-nan-conversion-when-casting branch March 12, 2018 19:04
@wesm
Member

wesm commented Mar 12, 2018

See ARROW-2298 for adding an option about NaN conversions.

3 participants