@@ -96,12 +96,14 @@ By passing a :class:`pandas.Categorical` object to a `Series` or assigning it to
     df["B"] = raw_cat
     df
 
-You can also specify differently ordered categories or make the resulting data ordered, by passing these arguments to ``astype()``:
+You can also specify differently ordered categories or make the resulting data
+ordered by passing a :class:`CategoricalDtype`:
 
 .. ipython:: python
 
     s = pd.Series(["a","b","c","a"])
-    s_cat = s.astype("category", categories=["b","c","d"], ordered=False)
+    cat_type = pd.CategoricalDtype(categories=["b", "c", "d"], ordered=False)
+    s_cat = s.astype(cat_type)
     s_cat
 
 Categorical data has a specific ``category`` :ref:`dtype <basics.dtypes>`:
@@ -140,6 +142,24 @@ constructor to save the factorize step during normal constructor mode:
     splitter = np.random.choice([0,1], 5, p=[0.5,0.5])
     s = pd.Series(pd.Categorical.from_codes(splitter, categories=["train", "test"]))
 
+
+CategoricalDtype
+----------------
+
+A categorical's type is fully described by 1.) its categories (an iterable with
+unique values and no missing values), and 2.) its orderedness (a boolean).
+This information can be stored in a :class:`~pandas.CategoricalDtype`.
+The ``categories`` argument is optional, which implies that the actual categories
+should be inferred from whatever is present in the data.
+
+A :class:`~pandas.CategoricalDtype` can be used in any place pandas expects a
+`dtype`. For example :func:`pandas.read_csv`, :func:`pandas.DataFrame.astype`,
+the Series constructor, etc.
+
+As a convenience, you can use the string `'category'` in place of a
+:class:`pandas.CategoricalDtype` when you want the default behavior of
+the categories being unordered, and equal to the set of values present in the array.
+
 Description
 -----------
 
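
For readers of this diff, the behavior described in the new ``CategoricalDtype`` section can be sketched roughly as follows. This is an illustration and not part of the patch. It assumes a pandas version in which `CategoricalDtype` can be imported from `pandas.api.types` (the diff itself spells it `pd.CategoricalDtype`), and the variable names `s_default` and `inferred` are made up for the example.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype  # assumed import path

s = pd.Series(["a", "b", "c", "a"])

# Explicit dtype: the categories and the orderedness are fixed up front.
cat_type = CategoricalDtype(categories=["b", "c", "d"], ordered=False)
s_cat = s.astype(cat_type)

# "a" is not one of the declared categories, so both "a" entries become NaN
# (code -1), while "d" is a valid category that simply never occurs.
print(s_cat.cat.codes.tolist())
print(list(s_cat.cat.categories))

# The string 'category' asks for the default: unordered, with the categories
# inferred from the values present in the data.
s_default = s.astype("category")
inferred = CategoricalDtype(categories=["a", "b", "c"], ordered=False)
print(s_default.dtype == inferred)

# The same dtype object can also be handed to the Series constructor.
s_ctor = pd.Series(["b", "c"], dtype=cat_type)
print(list(s_ctor.cat.categories))
```

Passing one dtype object around keeps the categories and the ordering flag together, which is the motivation the section gives for replacing the separate ``categories`` and ``ordered`` arguments to ``astype()``.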