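I have a DataFrame containing each project's category, currency, number of backers, goal, and so on, and I want to create a new column that holds the average success rate of the project's category.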
   state        category main_category currency  backers country  \
0      0          Poetry    Publishing      GBP        0      GB
1      0  Narrative Film  Film & Video      USD       15      US
2      0  Narrative Film  Film & Video      USD        3      US
3      0           Music         Music      USD        1      US
4      1     Restaurants          Food      USD      224      US

   usd_goal_real  duration  year       hour
0        1533.95        59  2015    morning
1       30000.00        60  2017    morning
2       45000.00        45  2013    morning
3        5000.00        30  2012    morning
4       50000.00        35  2016  afternoon
I have the average success rates as a Series:
Dance 65.435209
Theater 63.796134
Comics 59.141527
Music 52.660558
Art 44.889045
Games 43.890467
Film & Video 41.790649
Design 41.594386
Publishing 34.701650
Photography 34.110847
Fashion 28.283186
Technology 23.785582
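Now I want to add a new column in which each row gets the success rate of its category, i.e. whenever a row is in Technology, the new column should contain 23.78 for that row.
df[category_success_rate] =
I want the output column to be the success percentage that matches the category in the main_category column.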
Best answer
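I think you need GroupBy.transform with a boolean mask, either df['state'].eq(1) or (df['state'] == 1). The mean of a boolean Series is the fraction of True values, so multiplying by 100 gives a percentage, and transform broadcasts each group's mean back to every row of that group: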
df['category_success_rate'] = (df['state'].eq(1)
.groupby(df['main_category']).transform('mean') * 100)
Alternative:
df['category_success_rate'] = ((df['state'] == 1)
.groupby(df['main_category']).transform('mean') * 100)
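To check the behaviour on a small scale, here is a minimal sketch on made-up data. The six rows and the rate_via_map column name are hypothetical and only serve the illustration; state and main_category follow the question. It also shows that, if the per-category rates already exist as a Series indexed by category (as in the question), Series.map should give the same column.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "state": [0, 1, 1, 0, 1, 0],
    "main_category": ["Music", "Music", "Food", "Food", "Food", "Publishing"],
})

# Group the boolean mask by category and broadcast each group's mean
# back to the rows of that group, scaled to a percentage.
df["category_success_rate"] = (
    df["state"].eq(1).groupby(df["main_category"]).transform("mean") * 100
)

# Alternative when the rates are already available as a Series indexed
# by category: look each row's category up with map.
rates = df["state"].eq(1).groupby(df["main_category"]).mean() * 100
df["rate_via_map"] = df["main_category"].map(rates)

print(df)
```

The transform version needs no precomputed Series, while map is convenient when the rates were computed elsewhere; categories missing from the Series would come out as NaN with map.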
Regarding python - adding a new column to a DataFrame based on averages, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53738831/