Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
dfo = pd.DataFrame({"A": ["abc", "def"], "B": ["ghi", "jkl"]}, dtype="object")
dfs = pd.DataFrame({"A": ["abc", "def"], "B": ["ghi", "jkl"]}, dtype="string")
dfo.loc[0, :] = {"A": "newA", "B": "newB"}
dfs.loc[0, :] = {"A": "newA", "B": "newB"}
print(dfo)
# Output:
#       A     B
# 0  newA  newB
# 1   def   jkl
print(dfs)
# Output:
#      A    B
# 0    A    B
# 1  def  jkl
Problem description
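How the right-hand-side dict (RHS_dict) is interpreted depends on the dtype of the DataFrame being assigned to:
- Assigning to an object DataFrame: RHS_dict is interpreted as pd.Series(RHS_dict), so the dict's values are written.
- Assigning to a string DataFrame: RHS_dict is interpreted as list(RHS_dict), so the dict's keys are written.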
This is inconsistent and unexpected.
In my script I started with an object DataFrame and later changed it to a string DataFrame. This change led to an incorrect result because of the effect described above.
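To make the two interpretations named above concrete, here is a minimal sketch of what each one yields for the same dict. The variable names `rhs_dict`, `as_series`, and `as_list` are only illustrative and do not come from pandas itself.

```python
import pandas as pd

# The same right-hand side used in the code sample.
rhs_dict = {"A": "newA", "B": "newB"}

# Interpretation 1: a Series, indexed by the dict's keys and holding its values.
as_series = pd.Series(rhs_dict)

# Interpretation 2: a plain list, which for a dict contains only the keys.
as_list = list(rhs_dict)

print(as_series.tolist())  # the dict's values
print(as_list)             # the dict's keys
```

With the first interpretation the row receives "newA" and "newB"; with the second it receives the column labels "A" and "B", which matches the output shown for the string DataFrame.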
Expected Output
The interpretation of the RHS_dict should NOT depend on the dtype of the DataFrame being assigned to.
Personally I'd prefer pd.Series(RHS_dict), but if wiser men than me decide that list(RHS_dict) is correct, I won't object, as long as the behaviour is consistent across dtypes.
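Until the behaviour is consistent, one possible workaround is to avoid passing a dict to `.loc` at all and write each value as a scalar instead. The sketch below assumes that scalar assignment via `df.loc[row, col]` behaves the same for both dtypes; the helper name `assign_row` is hypothetical and not part of pandas.

```python
import pandas as pd


def assign_row(df, row_label, values):
    """Write a dict of {column: value} into one row, one scalar at a time.

    Hypothetical helper: it sidesteps the dict interpretation entirely,
    so the result does not depend on how .loc treats a dict.
    """
    for col, val in values.items():
        df.loc[row_label, col] = val
    return df


dfo = pd.DataFrame({"A": ["abc", "def"], "B": ["ghi", "jkl"]}, dtype="object")
dfs = pd.DataFrame({"A": ["abc", "def"], "B": ["ghi", "jkl"]}, dtype="string")

assign_row(dfo, 0, {"A": "newA", "B": "newB"})
assign_row(dfs, 0, {"A": "newA", "B": "newB"})

print(dfo)
print(dfs)
```

This trades a single vectorised assignment for a short loop, which is fine for a handful of columns but is not meant as a replacement for a fix in pandas itself.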
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2cb9652
python : 3.9.4.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.18362
machine : AMD64
...
pandas : 1.2.4
numpy : 1.20.2
pytz : 2021.1
dateutil : 2.8.1
...