Problem description
I have a large DataFrame that looks similar to this:
ID_Code Status1 Status2
0 A Done Not
1 A Done Done
2 B Not Not
3 B Not Done
4 C Not Not
5 C Not Not
6 C Done Done
For each set of duplicate ID codes, I want to calculate the percentage of entries that are Not-Not, i.e. [# of Not-Not / # of total entries] * 100.
I'm struggling to do this with groupby and can't seem to get the syntax right.
Recommended answer
I may have misunderstood the question, but you appear to be referring to rows where the values of Status1 and Status2 are both Not, correct? If that's the case, you can do something like:
df.groupby('ID_Code').apply(lambda x: (x[['Status1', 'Status2']] == 'Not').all(axis=1).sum() / len(x) * 100)
ID_Code
A 0.000000
B 50.000000
C 66.666667
dtype: float64
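For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and computes the same percentages without apply, using the fact that the mean of a boolean mask is the fraction of True values. The variable names not_not and pct_not_not are only illustrative, and this is an alternative formulation of the same idea rather than the only way to do it.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID_Code": ["A", "A", "B", "B", "C", "C", "C"],
    "Status1": ["Done", "Done", "Not", "Not", "Not", "Not", "Done"],
    "Status2": ["Not", "Done", "Not", "Done", "Not", "Not", "Done"],
})

# True for rows where both status columns equal "Not".
not_not = (df[["Status1", "Status2"]] == "Not").all(axis=1)

# The mean of a boolean mask is the fraction of True values,
# so the group-wise mean times 100 is the percentage per ID_Code.
pct_not_not = not_not.groupby(df["ID_Code"]).mean() * 100

print(pct_not_not)
```

Computing the mask once on the whole frame and then grouping it avoids calling a Python lambda for every group, which may matter on a large DataFrame.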