pandas: getting rows from each group after a groupby
Problem description
Suppose I have the following dataset:
uid iid val
1 1 2
1 2 3
1 3 4
1 4 4.5
1 5 5.5
2 1 3
2 2 3
2 3 4
3 4 4.5
3 5 5.5
From this data, I want to first group by uid, then take the last 20% of the rows from each uid.
That is, since uid=1 has 5 rows, I want to obtain the last 1 row (20% of 5) from uid=1.
Here is what I would like to do:
df.groupby('uid').tail([20% of each uid])
Can anyone help me?
Recommended answer
You can apply a custom function to the groupby object. Inside the function, calculate how many rows should be taken and take the group's tail with that number of rows. Note that int truncates toward 0, so any group with fewer than 5 rows will not contribute any rows to the result. In the dataset above, uid=2 (3 rows) and uid=3 (2 rows) are therefore dropped entirely.
df.groupby('uid').apply(lambda x: x.tail(int(0.2*x.shape[0])))
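To check the behaviour on the question's data, here is a minimal sketch of a vectorized alternative to the apply-based answer. It uses transform("size") and cumcount instead of a per-group Python function. The helper name last_fraction and its parameters are hypothetical, introduced only for illustration. It keeps the same truncation rule as int in the answer above.

```python
import pandas as pd

# The dataset from the question.
df = pd.DataFrame({
    "uid": [1, 1, 1, 1, 1, 2, 2, 2, 3, 3],
    "iid": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "val": [2, 3, 4, 4.5, 5.5, 3, 3, 4, 4.5, 5.5],
})


def last_fraction(frame, key, frac):
    """Keep the last `frac` of rows in each `key` group (hypothetical helper)."""
    grouped = frame.groupby(key)
    # Number of rows in the group that each row belongs to.
    size = grouped[key].transform("size")
    # Position counted from the end of the group: 0 for the group's last row.
    pos_from_end = grouped.cumcount(ascending=False)
    # Truncate toward zero, matching int() in the apply-based answer.
    n_keep = (size * frac).astype(int)
    return frame[pos_from_end < n_keep]


result = last_fraction(df, "uid", 0.2)
print(result)
```

With this data only the last row of uid=1 is kept, because the groups for uid=2 and uid=3 have fewer than 5 rows. If small groups should still contribute a row, one option is to round up instead of truncating, but that changes the rule stated in the answer.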