How to group by and take both the unique count and the count of a specific value on the same column as aggregates in Python pandas?
Question
My question is related to my previous question, but it is different, so I am asking a new one.
In that question, see the answer by @jezrael.
import pandas as pd

df = pd.DataFrame({'col1':[1,1,1],
                   'col2':[4,4,6],
                   'col3':[7,7,9],
                   'col4':[3,3,5]})
print (df)
   col1  col2  col3  col4
0     1     4     7     3
1     1     4     7     3
2     1     6     9     5
df1 = df.groupby(['col1','col2']).agg({'col3':'size','col4':'nunique'})
df1['result_col'] = df1['col3'].div(df1['col4'])
print (df1)
           col4  col3  result_col
col1 col2
1    4        1     2         2.0
     6        1     1         1.0
Now I want to count a specific value of col4. Say I also want the number of rows where col4 == 3, in the same query.
df.groupby(['col1','col2']).agg({'col3':'size','col4':'nunique'}) ... + count(col4=='3')
How can I do this in the same query? I tried the following, but it does not give me a solution.
df.groupby(['col1','col2']).agg({'col3':'size','col4':'nunique','col4':'x: lambda x[x == 7].count()'})
Recommended answer
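Do some preprocessing by adding col4 == 3 as a column ahead of time, then use aggregate. The attempt above repeats the 'col4' key, and a Python dict keeps only the last value for a repeated key, so giving the match count its own column avoids the clash.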
df.assign(result_col=df.col4.eq(3).astype(int)).groupby(
['col1', 'col2']
).agg(dict(col3='size', col4='nunique', result_col='sum'))
           col3  result_col  col4
col1 col2
1    4        2           2     1
     6        1           0     1
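As a possible alternative, here is a minimal sketch using named aggregation, assuming pandas 0.25 or later. Each keyword names an output column and takes a (source column, aggregation) pair, so col4 can be used as a source twice without a helper column. The name `out` and the lambda are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 1],
                   'col2': [4, 4, 6],
                   'col3': [7, 7, 9],
                   'col4': [3, 3, 5]})

# Named aggregation: output_name=(source_column, aggregation).
# 'col4' is used as a source twice, once per output column.
out = df.groupby(['col1', 'col2']).agg(
    col3=('col3', 'size'),                         # rows per group
    col4=('col4', 'nunique'),                      # distinct col4 values per group
    result_col=('col4', lambda s: s.eq(3).sum()),  # rows where col4 == 3
)
print(out)
```

The columns come out in the order the keywords are written, which also removes the need for a separate step to reorder them.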
Old answers
g = df.groupby(['col1', 'col2'])
g.agg({'col3':'size','col4': 'nunique'}).assign(
result_col=g.col4.apply(lambda x: x.eq(3).sum()))
           col3  col4  result_col
col1 col2
1    4        2     1           2
     6        1     1           0
Slightly rearranged
g = df.groupby(['col1', 'col2'])
final_df = g.agg({'col3':'size','col4': 'nunique'})
final_df.insert(1, 'result_col', g.col4.apply(lambda x: x.eq(3).sum()))
final_df
           col3  result_col  col4
col1 col2
1    4        2           2     1
     6        1           0     1