Skip to content

Conversation

@sinhrks
Copy link
Member

@sinhrks sinhrks commented Jul 23, 2016

.append / concat can handle ordered Categorical which has the identical categories. However, union_categorical raises TypeError. This PR allows union_categorical to handle the case.

on current mastser

s1 = pd.Series(pd.Categorical([1, 2, 3], ordered=True))
s2 = pd.Series(pd.Categorical([3, 2, 1], categories=[1, 2, 3], ordered=True))
s1.append(s2)

#0    1
#1    2
#2    3
#0    3
#1    2
#2    1
# dtype: category
# Categories (3, int64): [1 < 2 < 3]

pd.types.concat.union_categoricals([s1.values, s2.values])
# TypeError: Can only combine unordered Categoricals

CC: @JanSchulz, @chris-b1

@sinhrks sinhrks added Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode Categorical Categorical Data Type labels Jul 23, 2016
@sinhrks sinhrks added this to the 0.19.0 milestone Jul 23, 2016
@codecov-io
Copy link

codecov-io commented Jul 23, 2016

Current coverage is 85.33% (diff: 100%)

Merging #13763 into master will increase coverage by <.01%

@@             master     #13763   diff @@
==========================================
  Files           141        141          
  Lines         50703      50717    +14   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43267      43281    +14   
  Misses         7436       7436          
  Partials          0          0          

Powered by Codecov. Last update 1367945...9cadc4e

@sinhrks sinhrks force-pushed the union_categoricals_ordered branch from ead9cbb to 7540677 Compare July 23, 2016 13:15
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has -> have

@chris-b1
Copy link
Contributor

Looks good to me, other than a couple nit-picky doc things.

@sinhrks sinhrks force-pushed the union_categoricals_ordered branch from 7540677 to 0971621 Compare July 23, 2016 13:24
@chris-b1
Copy link
Contributor

I created - sinhrks#1 to also add a sort_categories argument, which is needed for #13406

@jankatins
Copy link
Contributor

In my thinking union_categorical is a "unstricter" (and "uncleaner") variant of the normal categorical handling: where the usual cat handling errors, this should "bend the rules" and succeed. Following this, I thinking union_categorical should handle whatever append can take...

`union_categoricals` only works with unordered categoricals
and will raise if any are ordered.
`union_categoricals` only works with:
- unordered categoricals
Copy link
Contributor

@jankatins jankatins Jul 24, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this should read:

In addition to the "easy" case of combining two categoricals of the same categories and order information (e.g. what you could also append for), union_categoricals only works with unordered categoricals and will raise if any are ordered.

I.e. make it clear that union_categoricals supports something more than append.

@sinhrks sinhrks force-pushed the union_categoricals_ordered branch from 0971621 to 9cadc4e Compare July 25, 2016 22:49
@jreback
Copy link
Contributor

jreback commented Jul 29, 2016

@chris-b1 , @JanSchulz any final comments?

@chris-b1
Copy link
Contributor

No, looks good to me.

On Jul 28, 2016 7:32 PM, "Jeff Reback" notifications@github.com wrote:

@chris-b1 https://github.com/chris-b1 , @JanSchulz
https://github.com/janschulz any final comments?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#13763 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AB1b_FOSzzp-Gnvwk0SvEXA7FwPCj4fkks5qaUolgaJpZM4JTSrP
.

@jankatins
Copy link
Contributor

LGTM

@jreback jreback closed this in 59f2557 Jul 29, 2016
@jreback
Copy link
Contributor

jreback commented Jul 29, 2016

ty sir!

@sinhrks sinhrks deleted the union_categoricals_ordered branch July 29, 2016 10:37
jreback pushed a commit that referenced this pull request Aug 3, 2016
 - needed for #13406, follow-up to #13763

Author: Chris <cbartak@gmail.com>
Author: sinhrks <sinhrks@gmail.com>

Closes #13846 from chris-b1/union_categoricals_ordered and squashes the following commits:

3a710f0 [Chris] lint fix
ff0bb5e [Chris] add follow-up PRs to whatsnew
ecb2ae9 [Chris] more tests; handle sorth with ordered
eea1777 [Chris] skip r-esort when possible on fastpath
c559662 [sinhrks] ENH: add sort_categories argument to union_categoricals
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Categorical Categorical Data Type Enhancement Reshaping Concat, Merge/Join, Stack/Unstack, Explode

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants