Skip to content

Conversation

@cpcloud
Copy link
Contributor

@cpcloud cpcloud commented Feb 14, 2018

No description provided.

@cpcloud
Copy link
Contributor Author

cpcloud commented Feb 15, 2018

Restarted the travis jobs, they seemed to be failing during apt-get

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually required? You are only using the decimal type below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be misunderstanding the use case for OwnedRefNoGIL.

My understanding is that if we're Py_XDECREFing something and we do not hold then GIL then we need to acquire it.

The GIL is released in Cython before the function that instantiates this class is called (and isn't subsequently acquired in that function). Assuming my understanding is correct then we need to acquire the GIL before calling Py_XDECREF on this.

Are you suggesting that because the order of destruction here is guaranteed to be the reverse of initialization order that we don't need to care that decimal_module_ holds the GIL because we'll never decref decimal_module_ before we decref decimal_type_?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that you don't care about keeping a reference to the decimal module if you only need to use the decimal type:

>>> D = __import__('decimal').Decimal
>>> sys.modules['decimal'] = None
>>> D(100)
Decimal('100')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, awesome. There are a few places that we have unnecessary refs to the decimal module. I'll open a JIRA to clean those up (and remove this one). Thanks for the review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work with Python Decimal nans, right? Try decimal.Decimal('nan').

Copy link
Member

@pitrou pitrou Feb 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the policy for adding tests here rather than in pyarrow/tests? It seems writing tests in pure Python is generally easier :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's funny you mention that. I agree, but I wanted to be able to step through C++ code in the CLion IDE so I wrote the test here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me add Decimal('nan') to this list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wesm @pitrou is there any reason we shouldn't make these static global variables, so we don't have to call these APIs all over the place and inside of loops?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it's fine to make it a static global (or a singleton, etc.), as long as we don't want to support subinterpreters perhaps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way the nested block above doesn't seem needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, okay I'll refactor and make those static

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm so it appears that string interning of module names doesn't play well with C++ globals. I'll leave these as they are for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, you probably can't use OwnedRef here since the destructor would trigger too late. But you could have a PyObject*. The only downside is that it would make the object eternal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the consts? I don't think it makes sense, and you're bound to do a lot of casts as soon as you call the Python C API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note PyObject is non-const pretty much by construction, as it has a reference count that can be mutated by any operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I was following convention in the file. I can adjust.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been removing all the instances of const PyObject* after having a compiler warning from an internal Py-API, so happy to see them all go

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last argument can simply be the empty string AFAIR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should call PyDecimal_Check before calling the method. Also you should check if the method raises. And it's better to use PyObject_IsTrue rather than compare against Py_True.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, not sure how I missed that :)

@cpcloud
Copy link
Contributor Author

cpcloud commented Feb 16, 2018

@pitrou any more comments here?

@cpcloud cpcloud force-pushed the ARROW-2157 branch 3 times, most recently from 40d9b52 to a910633 Compare February 18, 2018 01:58
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments below. Also it seems that the AppVeyor build has failed, though it may be unrelated?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This member doesn't seem used actually. Were you planning to use it with PyObject_IsInstance instead of the costlier call to internal::PyDecimal_Check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks. Will fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to add a comment motivating the algorithm here (why compare the absolute values but then memorize the original value)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, reconsidering this based on your comment this really should be just the max scale. Negative scale should contribute to precision only if it would increase precision. The goal here is to "cast the widest net", ie the max precision and max scale. Negative scale complicates things a tiny bit. I'll add some commentary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a C++ expert, so I'm curious why this is necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constructors are not inherited by default. I'm not actually using this, so it should cost nothing at runtime. If we ever wanted to construct one of these with a pointer as it's first argument we'd have to define it anyway. I can remove it if you'd like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem with me. I had forgotten about non-inheritance of constructors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can PandasObjectIsNull return true on a Decimal instance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I'll fix that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also check that the Arrow type was inferred correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep will do.

@wesm
Copy link
Member

wesm commented Feb 21, 2018

needs rebase

@cpcloud
Copy link
Contributor Author

cpcloud commented Feb 23, 2018

Closed in favor of #1651

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants