C() operation in formulas should inherit cat_missing_method #504

@MatthiasSchmidtblaicherQC

Description

When C() is used in a formula to explicitly convert a column to a categorical, the cat_missing_method argument passed to from_formula is ignored and the default "fail" is applied instead:

# %%
import pandas as pd
import tabmat as tm
# %% returns '4.1.5'
tm.__version__
# %% whether "x" is already a categorical does not matter
df = pd.DataFrame({"x": ["a", "b", pd.NA]})  # same behavior with pd.DataFrame({"x": pd.Categorical(["a", "b", pd.NA])})
# %% result as expected
tm.from_formula(formula="x", data=df, cat_missing_method="zero")
# %% result as expected
tm.from_formula(formula="x", data=df, cat_missing_method="convert")
# %% raises ValueError: Categorical data can't have missing values if cat_missing_method='fail'.
tm.from_formula(formula="C(x)", data=df, cat_missing_method="zero")
# %% raises ValueError: Categorical data can't have missing values if cat_missing_method='fail'.
tm.from_formula(formula="C(x)", data=df, cat_missing_method="convert")
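
Until C() honors cat_missing_method, one possible workaround is to remove the missing values from the column before the formula is evaluated, which roughly mimics the "convert" behavior by hand. The following is a minimal sketch using only pandas; the helper name fill_missing_category and the label "(MISSING)" are assumptions chosen for illustration and are not part of tabmat. Whether the resulting matrix matches what cat_missing_method="convert" produces has not been verified here.

```python
import pandas as pd


def fill_missing_category(series, label="(MISSING)"):
    """Return a categorical copy of `series` with missing values replaced by `label`.

    Hypothetical helper for illustration; `label` is an arbitrary placeholder.
    """
    cat = pd.Categorical(series)
    # The placeholder must be a known category before it can be used as a fill value.
    if label not in cat.categories:
        cat = cat.add_categories([label])
    return pd.Series(cat, index=series.index, name=series.name).fillna(label)


df = pd.DataFrame({"x": ["a", "b", pd.NA]})
df_filled = df.assign(x=fill_missing_category(df["x"]))

# The column no longer contains missing values, so the "fail" default
# should have nothing to object to when C(x) is used, e.g.:
# tm.from_formula(formula="C(x)", data=df_filled)
```

This does not cover the "zero" method, where rows with missing values are encoded as all zeros; for that case, dropping the C() wrapper and passing the column as a pandas categorical (the variant reported above as working) may be the simpler route.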
