This looks really messy: it is not difficult to find inconsistent behavior across all three methods. It seems to me that the dataframe implementation blurs the distinction between agg, transform and apply too liberally.
In: df
Out:
   x  y  z
0  1  1  4
1  1  2  5
2  2  3  6
3  2  1  4
4  3  2  5
5  3  3  6
In: gb = df.groupby('x')
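The issue does not show how df was built. A minimal sketch that reproduces the frame printed above, assuming plain integer columns and a default RangeIndex (the construction itself is an assumption, only the printed values come from the report):

```python
import pandas as pd

# Assumed construction: the values match the frame printed in the report,
# but the original code that created it is not shown.
df = pd.DataFrame(
    {
        "x": [1, 1, 2, 2, 3, 3],
        "y": [1, 2, 3, 1, 2, 3],
        "z": [4, 5, 6, 4, 5, 6],
    }
)

# Group on column x, as in the report.
gb = df.groupby("x")

print(df)
```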
dataframe.apply accepts lists and dicts while groupby.apply doesn't:
In: df.apply(['sum', 'mean'])
Out:
         x     y     z
sum   12.0  12.0  30.0
mean   2.0   2.0   5.0
In: gb.apply(['sum', 'mean'])
TypeError
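For comparison, groupby.agg does accept a list of function names, so the list form is available on the grouped object, just not through apply. A small sketch, assuming the same frame as in the report (its construction here is an assumption):

```python
import pandas as pd

# Assumed construction of the frame from the report.
df = pd.DataFrame(
    {
        "x": [1, 1, 2, 2, 3, 3],
        "y": [1, 2, 3, 1, 2, 3],
        "z": [4, 5, 6, 4, 5, 6],
    }
)
gb = df.groupby("x")

# agg with a list yields one row per group and
# MultiIndex columns of (column, function) pairs.
summary = gb.agg(["sum", "mean"])
print(summary)
```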
dataframe.transform disallows aggregations while groupby.transform broadcasts them to the original shape:
In: df.transform('sum')
ValueError: transforms cannot produce aggregated results
In: gb.transform('sum')
Out:
   y   z
0  3   9
1  3   9
2  4  10
3  4  10
4  5  11
5  5  11
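To make the broadcasting explicit: the groupby.transform('sum') result above should match what you get by aggregating per group and then looking up each row's group. A sketch under the assumption that df is built as below (the construction is not shown in the report):

```python
import pandas as pd

# Assumed construction of the frame from the report.
df = pd.DataFrame(
    {
        "x": [1, 1, 2, 2, 3, 3],
        "y": [1, 2, 3, 1, 2, 3],
        "z": [4, 5, 6, 4, 5, 6],
    }
)

# What the report calls: aggregate, broadcast back to the original shape.
transformed = df.groupby("x")[["y", "z"]].transform("sum")

# The same thing done by hand: one row of sums per group ...
per_group = df.groupby("x")[["y", "z"]].sum()
# ... then select the row for each original record's group label
# and restore the original row index.
broadcast = per_group.loc[df["x"]]
broadcast.index = df.index

print(transformed)
print(broadcast)
```

Both frames have six rows, one per original record, which is exactly the step dataframe.transform refuses to take.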
dataframe.agg allows non-aggregations while groupby.agg doesn't:
In: df.agg(lambda x: x)
Out:
   x  y  z
0  1  1  4
1  1  2  5
2  2  3  6
3  2  1  4
4  3  2  5
5  3  3  6
In: gb.agg(lambda x: x)
ValueError: cannot copy sequence with size 3 to array axis with dimension 2
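The exact exceptions and messages may differ between pandas versions, so a small harness that runs all six calls and records only whether each one succeeds can help when re-checking this. This is a sketch: probe and checks are hypothetical helper names, and the frame construction is assumed.

```python
import pandas as pd

# Assumed construction of the frame from the report.
df = pd.DataFrame(
    {
        "x": [1, 1, 2, 2, 3, 3],
        "y": [1, 2, 3, 1, 2, 3],
        "z": [4, 5, 6, 4, 5, 6],
    }
)
gb = df.groupby("x")


def probe(call):
    """Run call() and return 'ok', or the name of the exception it raised."""
    try:
        call()
    except Exception as exc:  # broad on purpose: we only record the type
        return type(exc).__name__
    return "ok"


# The six calls from the report, paired DataFrame vs. GroupBy.
checks = {
    "df.apply(list)": lambda: df.apply(["sum", "mean"]),
    "gb.apply(list)": lambda: gb.apply(["sum", "mean"]),
    "df.transform('sum')": lambda: df.transform("sum"),
    "gb.transform('sum')": lambda: gb.transform("sum"),
    "df.agg(identity)": lambda: df.agg(lambda s: s),
    "gb.agg(identity)": lambda: gb.agg(lambda s: s),
}

results = {name: probe(call) for name, call in checks.items()}
for name, outcome in results.items():
    print(f"{name:22} -> {outcome}")
```

No expected output is listed because the outcomes depend on the installed pandas version.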