I have a DataFrame like this:
df = pd.DataFrame({
    'User': ['101', '101', '102', '102', '102'],
    'Product': ['x', 'x', 'x', 'z', 'z'],
    'Country': ['India,Brazil', 'India', 'India,Brazil,Japan', 'India,Brazil', 'Brazil']
})
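I want a per-user count of each country and product combination, as shown below.
That is: first split the countries apart, then combine each one with the product, and count the results.
Desired output:
Best answer
Here is one approach that combines other answers (it mostly just shows the power of searching :D)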
import pandas as pd
df = pd.DataFrame({
    'User': ['101', '101', '102', '102', '102'],
    'Product': ['x', 'x', 'x', 'z', 'z'],
    'Country': ['India,Brazil', 'India', 'India,Brazil,Japan', 'India,Brazil', 'Brazil']
})
# Making use of: https://stackoverflow.com/a/37592047/7386332
# Split Country on commas and stack the pieces so each country gets its own row
j = (df.Country.str.split(',', expand=True).stack()
       .reset_index(drop=True, level=1)
       .rename('Country'))
df = df.drop('Country', axis=1).join(j)
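# Reformat to get desired Country_Product
# (axis must be passed by keyword; the positional form fails on pandas 2.0+)
df = (df.drop(['Country', 'Product'], axis=1)
        .assign(Country_Product=['_'.join(i) for i in zip(df['Country'], df['Product'])]))
# Count the rows for each (User, Country_Product) pair
df2 = df.groupby(['User', 'Country_Product'])['User'].count().rename('Count').reset_index()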
print(df2)
Returns:
  User Country_Product  Count
0  101        Brazil_x      1
1  101         India_x      2
2  102        Brazil_x      1
3  102        Brazil_z      2
4  102         India_x      1
5  102         India_z      1
6  102         Japan_x      1
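As an alternative sketch (not part of the original answer), the split-then-stack step can probably be replaced with `DataFrame.explode`, assuming pandas 0.25 or later is available. The names `exploded` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'User': ['101', '101', '102', '102', '102'],
    'Product': ['x', 'x', 'x', 'z', 'z'],
    'Country': ['India,Brazil', 'India', 'India,Brazil,Japan', 'India,Brazil', 'Brazil'],
})

# Turn each Country string into a list, then give every country its own row.
# Assumption: pandas >= 0.25, where DataFrame.explode was introduced.
exploded = (df.assign(Country=df['Country'].str.split(','))
              .explode('Country')
              .reset_index(drop=True))

# Build the Country_Product label and count rows per (User, Country_Product).
exploded['Country_Product'] = exploded['Country'] + '_' + exploded['Product']
result = (exploded.groupby(['User', 'Country_Product'])
                  .size()
                  .reset_index(name='Count'))
print(result)
```

Using `size()` with `reset_index(name='Count')` names the count column in one step, which avoids the separate `rename` call.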
Regarding "python - grouping a pandas DataFrame per user by conditional frequency", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51225989/