Skip to content

Conversation

@linxiaow
Copy link
Contributor

@linxiaow linxiaow commented Apr 4, 2020

I added "not self._mask.any()" to check if there is no missing value for both .any() and .all() when skipna is False. It looks fine for the example you provided.
any_all

@WillAyd
Copy link
Member

WillAyd commented Apr 4, 2020

Thanks for the PR - does this close a particular issue? Can you add tests?

@linxiaow
Copy link
Contributor Author

linxiaow commented Apr 4, 2020

@WillAyd I think yes, and I intend to close the issue here #33253. It is my first time in open source so I am not familiar with PR.

I am not familiar with testing, but I can have a try. Any suggestions before I get started

@jorisvandenbossche
Copy link
Member

Don't worry about Travis that is failing (that's a general issue, there is already another PR trying to fix that)

@jorisvandenbossche
Copy link
Member

The any/all tests are here:

@pytest.mark.parametrize(
"values, exp_any, exp_all, exp_any_noskip, exp_all_noskip",
[
([True, pd.NA], True, True, True, pd.NA),
([False, pd.NA], False, False, pd.NA, False),
([pd.NA], False, True, pd.NA, pd.NA),
([], False, True, False, True),
],
)
def test_any_all(values, exp_any, exp_all, exp_any_noskip, exp_all_noskip):
# the methods return numpy scalars
exp_any = pd.NA if exp_any is pd.NA else np.bool_(exp_any)
exp_all = pd.NA if exp_all is pd.NA else np.bool_(exp_all)
exp_any_noskip = pd.NA if exp_any_noskip is pd.NA else np.bool_(exp_any_noskip)
exp_all_noskip = pd.NA if exp_all_noskip is pd.NA else np.bool_(exp_all_noskip)
for con in [pd.array, pd.Series]:
a = con(values, dtype="boolean")
assert a.any() is exp_any
assert a.all() is exp_all
assert a.any(skipna=False) is exp_any_noskip
assert a.all(skipna=False) is exp_all_noskip
assert np.any(a.any()) is exp_any
assert np.all(a.all()) is exp_all

so you can add to that test some new cases for the all-True and all-False cases from the issue.

@jreback jreback added Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Apr 5, 2020
@jreback jreback added this to the 1.1 milestone Apr 5, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add a whatsnew note in bug fixes Missing section for 1.1

can you also add a test case in panda/tests/reductions/test_reductions.py

def test_all_any_params(self) is the relevant test

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@linxiaow thanks for the updates. I suggested some rewording on the added whatsnew/comment

@linxiaow linxiaow requested a review from jreback April 9, 2020 01:15
@jorisvandenbossche jorisvandenbossche merged commit 20474d5 into pandas-dev:master Apr 10, 2020
@jorisvandenbossche
Copy link
Member

Thanks @linxiaow !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: BooleanArray.any with all False values and skipna=False is buggy

4 participants