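I need to express what percentage of a total each row in my data represents. The catch is that the percentage must be computed within the parent group of a groupby call, not against the grand total. My DataFrame currently looks like this: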
category  Segment    Pageviews
Sitting   Age 25-34  2268
          Age 35-44  2942
          Age 45-53  2209
          Age 55+    3317
Standing  Age 25-34  2193
          Age 35-44  1664
          Age 45-53  1874
          Age 55+    1647
Kneeling  Age 25-34  680
          Age 35-44  494
          Age 45-53  876
          Age 55+    1489
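What I want is, for each of Sitting, Standing, and Kneeling, the percentage of that category's pageviews that each age segment accounts for.
That is: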
category  Segment    Pageviews  Percentage
Sitting   Age 25-34  2268       21%
          Age 35-44  2942       27%
          Age 45-53  2209       20%
          Age 55+    3317       31%
Standing  Age 25-34  2193       ...
          Age 35-44  1664       ...
          Age 45-53  1874       ...
          Age 55+    1647
Kneeling  Age 25-34  680
          Age 35-44  494
          Age 45-53  876
          Age 55+    1489
Best answer
You can use:
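>>> # transform returns a result aligned to df's original index, so it can be assigned directly
>>> df['Percentage'] = df.groupby('category')['Pageviews']\
...     .transform(lambda g: 100*g / g.sum())
>>> df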
    category  Segment   Pageviews  Percentage
0   Sitting   Age25-34  2268       21.125186
1   Sitting   Age35-44  2942       27.403130
2   Sitting   Age45-53  2209       20.575633
3   Sitting   Age55+    3317       30.896051
4   Standing  Age25-34  2193       29.723502
5   Standing  Age35-44  1664       22.553538
6   Standing  Age45-53  1874       25.399837
7   Standing  Age55+    1647       22.323123
8   Kneeling  Age25-34  680        19.214467
9   Kneeling  Age35-44  494        13.958745
10  Kneeling  Age45-53  876        24.752755
11  Kneeling  Age55+    1489       42.074032
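For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes the data is a flat DataFrame with one row per category/segment pair, and it swaps the lambda for the built-in "sum" aggregation passed to transform, which should give the same numbers. The names category_total and Percentage_label are made up for this illustration and are not part of the question or the answer.

```python
import pandas as pd

# Rebuild the question's data as a flat DataFrame (assumed layout:
# one row per category/segment pair).
df = pd.DataFrame({
    "category": ["Sitting"] * 4 + ["Standing"] * 4 + ["Kneeling"] * 4,
    "Segment": ["Age 25-34", "Age 35-44", "Age 45-53", "Age 55+"] * 3,
    "Pageviews": [2268, 2942, 2209, 3317,
                  2193, 1664, 1874, 1647,
                  680, 494, 876, 1489],
})

# transform("sum") gives every row the total of its own category,
# aligned to df's index, so plain column arithmetic works.
category_total = df.groupby("category")["Pageviews"].transform("sum")
df["Percentage"] = 100 * df["Pageviews"] / category_total

# Hypothetical display column: rounded whole-number percent strings.
df["Percentage_label"] = df["Percentage"].round().astype(int).astype(str) + "%"

print(df)
```

Within each category the Percentage values sum to 100, which is a quick way to check that the grouping key is the one you intended.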