Skip to content

BUG: Cube equality boolean data#3250

Closed
cpelley wants to merge 1 commit intoSciTools:v2.2.xfrom
cpelley:CUBE_EQUALITY_BOOL_DATA
Closed

BUG: Cube equality boolean data#3250
cpelley wants to merge 1 commit intoSciTools:v2.2.xfrom
cpelley:CUBE_EQUALITY_BOOL_DATA

Conversation

@cpelley
Copy link

@cpelley cpelley commented Jan 11, 2019

Issue introduced in numpy 1.14 I think where subtracting two boolean arrays was deprecated.

  File "/path/to/iris/cube.py", line 3024, in __eq__
    result = np.all(np.abs(self.data - other.data) < 1e-8)
TypeError: numpy boolean subtract, the `-` operator, is deprecated, use the bitwise_xor, the `^` operator, or the logical_xor function instead.

This may be better placed as a more general numpy comparison operator in iris but I don't see there is anywhere suitable for such a thing to live currently.

@SciTools-assistant SciTools-assistant added the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Jan 11, 2019
@cpelley cpelley force-pushed the CUBE_EQUALITY_BOOL_DATA branch from bf55550 to c3aeda6 Compare January 11, 2019 12:58
@SciTools-assistant SciTools-assistant removed the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Jan 11, 2019
@cpelley cpelley force-pushed the CUBE_EQUALITY_BOOL_DATA branch from c3aeda6 to 5c346a1 Compare January 11, 2019 13:00
@rcomer
Copy link
Member

rcomer commented Jan 11, 2019

Why not just use numpy.allclose?

@rcomer
Copy link
Member

rcomer commented Jan 12, 2019

Why not just use numpy.allclose?

Or even dask.array.allclose.

lib/iris/cube.py Outdated
if result:
result = np.all(np.abs(self.data - other.data) < 1e-8)

result = da.array.allclose(self.data, other.data)
Copy link
Member

@rcomer rcomer Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

da is already dask.array and the function returns a dask array. So I think you need:

da.allclose(self.core_data(), other.core_data()).compute()

(edited following @pp-mo's comment)

lib/iris/cube.py Outdated
@@ -3021,8 +3021,7 @@ def __eq__(self, other):
# having checked everything else, check approximate data
# equality - loading the data if has not already been loaded.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're happy with using da.allclose, I guess this comment about loading the data can be removed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is using 'self.data', 'other.data' then of course it is still realising the cubes.

In fact I don't think we can use dask here, as the result needs to be a concrete boolean, whereas da.allclose returns a lazy scalar value, even when the args are concrete ...

>>> da.allclose(np.array([1,2]), np.array([1,1]))
dask.array<all-aggregate, shape=(), dtype=bool, chunksize=()>
>>> _.compute()
False
>>> 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is using 'self.data', 'other.data' then of course it is still realising the cubes.

Ah yes of course, you'd want to use core_data() or lazy_data()

Definitely in learning mode here...

Is there no advantage to using da.allclose with the compute call? I was thinking it would access the files in chunks and therefore avoid busting memory.

Aside: that thing you did with the underscore, I did not know was a thing 😮 .

Copy link
Member

@pp-mo pp-mo Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no advantage to using da.allclose with the compute call? I was thinking it would access the files in chunks and therefore avoid busting memory.

Actually that is a really good point 👍
It won't realise any more, which it used to, but I think we always view that as a good thing 😉

you'd want to use core_data()

I think it makes sense to use 'core_data' as it involves the minimum of extra work.

My key point was : the routine itself must still return a regular boolean, not a lazy value.
... but I see now you have just added the ".compute()" above ! 👍

the underscore

Yes, it's where the interpreter puts unassigned results (in addition to printing them).
Also very handy when you forgot to save the result of a calculation !

>>> long_and_complicated()
BigObject(...)
>>> keep_that = _

@cpelley cpelley force-pushed the CUBE_EQUALITY_BOOL_DATA branch 2 times, most recently from f8ab698 to cccd653 Compare January 15, 2019 10:23
@rcomer rcomer mentioned this pull request May 10, 2019
@bjlittle
Copy link
Member

@pp-mo Fancy taking this on?

@cpelley I think that you're going to have to re-base to fold in the latest Travis CI changes

@pp-mo
Copy link
Member

pp-mo commented Jul 10, 2019

Thanks @bjlittle
@cpelley I think this is good to go, if you can get a green tick on it.
🐴 🥕
🙏

@bjlittle bjlittle added this to the v2.3.0 milestone Jul 10, 2019
@cpelley cpelley force-pushed the CUBE_EQUALITY_BOOL_DATA branch 2 times, most recently from 891517c to 2135850 Compare July 17, 2019 08:34
@pp-mo pp-mo assigned stephenworsley and unassigned pp-mo Aug 20, 2019
@lbdreyer
Copy link
Member

@cpelley Sorry about this! The travis tests are failing but I believe the fix for those tests went in already. If you rebase we should be able to get a green tick.

Do you mind rebasing and then this can go in?

@pp-mo
Copy link
Member

pp-mo commented Sep 30, 2019

Hi @cpelley !
Jog : can you get this rebased in a day or so, and maybe add a whatsnew item (not essential: we can make something up) ?
Then it can go into Iris 2.3.

@cpelley cpelley force-pushed the CUBE_EQUALITY_BOOL_DATA branch from b53f4ba to 9e6a840 Compare October 1, 2019 11:44
@pp-mo
Copy link
Member

pp-mo commented Oct 2, 2019

#3431 replaced : closing.

@pp-mo pp-mo closed this Oct 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants