-
-
Notifications
You must be signed in to change notification settings - Fork 19.4k
Description
Code Sample
If we try to define a dataframe using a dictionary containing a set, we get:
pd.DataFrame({'a':{1,2,3}})
a
0 {1, 2, 3}
1 {1, 2, 3}
2 {1, 2, 3}Problem description
The set is being replicated n times, n being the length of the actual set.
While defining a column with a set directly might not make a lot of sense given that they are by definition unordered collections, the behaviour in any case seems clearly unexpected.
Expected Output
In the case of a list, in order to obtain a single row containing a list, we would have to define a nested list, such as pd.DataFrame({'a':[[1,2,3]]}).
So similarly, with sets I would expect the same behaviour by defining the row with pd.DataFrame({'a':[{1,2,3}]}).
In the case of a single set, even if the order is not guaranteed to be preserved, I'd see more reasonable the same output that we would obtain with:
pd.DataFrame({'a':[1,2,3]})
a
0 1
1 2
2 3So:
pd.DataFrame({'a':{1,2,3}})
a
0 1
1 2
2 3Where:
pd.__version__
# '1.0.0'