I'm clearly doing something wrong here, and my brain is melting.
I have this data:
queryset = [
{'source_id': '1', 'gender_id': 'female', 'total': 12928604, 'percentage': {'neutral': [8284384, 64.08], 'positive': [3146438, 24.34], 'negative': [1497782, 11.59]}},
{'source_id': '1', 'gender_id': 'male', 'total': 15238856, 'percentage': {'neutral': [10042152, 65.9], 'positive': [2476421, 16.25], 'negative': [2720283, 17.85]}},
{'source_id': '1', 'gender_id': 'null', 'total': 6, 'percentage': {'neutral': [5, 83.33], 'positive': [1, 16.67], 'negative': [0, 0.0]}},
{'source_id': '2', 'gender_id': 'female', 'total': 23546499, 'percentage': {'neutral': [15140308, 64.3], 'positive': [5372964, 22.82], 'negative': [3033227, 12.88]}},
{'source_id': '2', 'gender_id': 'male', 'total': 15349754, 'percentage': {'neutral': [10137025, 66.04], 'positive': [2413350, 15.72], 'negative': [2799379, 18.24]}},
{'source_id': '2', 'gender_id': 'null', 'total': 3422, 'percentage': {'neutral': [2464, 72.0], 'positive': [437, 12.77], 'negative': [521, 15.23]}},
{'source_id': '3', 'gender_id': 'female', 'total': 29417761, 'percentage': {'neutral': [18944384, 64.4], 'positive': [7181996, 24.41], 'negative': [3291381, 11.19]}},
{'source_id': '3', 'gender_id': 'male', 'total': 27200788, 'percentage': {'neutral': [17827887, 65.54], 'positive': [4179990, 15.37], 'negative': [5192911, 19.09]}},
{'source_id': '3', 'gender_id': 'null', 'total': 32909, 'percentage': {'neutral': [22682, 68.92], 'positive': [4005, 12.17], 'negative': [6222, 18.91]}}
]
The output I want is:
[{'source_id': '1', 'total': 28167466, 'percentage': {'neutral': [18326541, 65.06], 'positive': [5622860, 19.96], 'negative': [4218065, 14.97]}}, ...]
Here 'total' is the sum of the male, female and null totals for source id 1, each percentage is that sentiment's summed count divided by that total, and the same should be done for every source.
This is what I tried, but it doesn't work. I have 3 if statements like this one, one for each of the 3 IDs:
for i in queryset:
    if i['source_id'] == '1':
        output['percentage'] = {
            'neutral': [sum(i['percentage']['neutral'][0] for i in queryset if i['source_id'] == '1'),
                        round(output['negative'] / output['2_total'] * 100, 2)],
            'positive': [sum(i['percentage']['positive'][0] for i in queryset if i['source_id'] == '2'),
                         round(output['positive'] / output['2_total'] * 100, 2)],
            'negative': [sum(i['percentage']['negative'][0] for i in queryset if i['source_id'] == '2'),
                         round(output['negative'] / output['2_total'] * 100, 2)]}
Best answer
OK, if I understand correctly, this is what you want:
unique_ids = set(item.get('source_id') for item in queryset)  # unique source ids
output = []
for id_ in unique_ids:
    # only grab items that match the current source id
    to_agg = [item for item in queryset if item.get('source_id') == id_]
    # sum the total field for this source id
    total = sum(item.get('total') for item in to_agg)
    # aggregate the counts for neutral/positive/negative
    percents = [item.get('percentage') for item in to_agg]
    negatives = sum(item.get('negative')[0] for item in percents)
    positives = sum(item.get('positive')[0] for item in percents)
    neutrals = sum(item.get('neutral')[0] for item in percents)
    # construct the final dictionary, using the same keys as the input
    d = {'source_id': id_,
         'total': total,
         'percentage': {'neutral': [neutrals, round(neutrals / total * 100, 2)],
                        'positive': [positives, round(positives / total * 100, 2)],
                        'negative': [negatives, round(negatives / total * 100, 2)]}}
    output.append(d)
# sets are unordered, so sort by source id; sorted() returns a new list,
# which an interactive session displays as:
sorted(output, key=lambda x: x.get('source_id'))
[{'percentage': {'negative': [4218065, 14.97],
                 'neutral': [18326541, 65.06],
                 'positive': [5622860, 19.96]},
  'source_id': '1',
  'total': 28167466},
 {'percentage': {'negative': [5833127, 15.0],
                 'neutral': [25279797, 64.99],
                 'positive': [7786751, 20.02]},
  'source_id': '2',
  'total': 38899675},
 {'percentage': {'negative': [8490514, 14.99],
                 'neutral': [36794953, 64.95],
                 'positive': [11365991, 20.06]},
  'source_id': '3',
  'total': 56651458}]
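Edit: keep in mind that I haven't optimized this answer. It rescans the whole queryset once per source id, so if your queryset is large it may not be as fast as you need.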
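If speed does become a concern, one possible direction is to accumulate everything in a single pass instead of filtering the list once per source id. The following is only a rough sketch, not part of the original answer: the function name `aggregate_by_source` and the `sample` rows are made up for illustration, and it assumes every row has the same `source_id` / `total` / `percentage` shape as the data in the question.

```python
from collections import defaultdict


def aggregate_by_source(rows):
    """Sum totals and sentiment counts per source_id in one pass over rows."""
    totals = defaultdict(int)
    counts = defaultdict(lambda: {'neutral': 0, 'positive': 0, 'negative': 0})

    for row in rows:
        sid = row['source_id']
        totals[sid] += row['total']
        # each value is [count, percentage]; only the count is summed
        for label, (count, _pct) in row['percentage'].items():
            counts[sid][label] += count

    result = []
    # source ids are strings here, so this is a string sort ('10' < '2')
    for sid in sorted(totals):
        total = totals[sid]
        result.append({
            'source_id': sid,
            'total': total,
            'percentage': {
                # guard against a zero total to avoid ZeroDivisionError
                label: [count, round(count / total * 100, 2) if total else 0.0]
                for label, count in counts[sid].items()
            },
        })
    return result


# the three source_id '1' rows from the question
sample = [
    {'source_id': '1', 'gender_id': 'female', 'total': 12928604,
     'percentage': {'neutral': [8284384, 64.08], 'positive': [3146438, 24.34], 'negative': [1497782, 11.59]}},
    {'source_id': '1', 'gender_id': 'male', 'total': 15238856,
     'percentage': {'neutral': [10042152, 65.9], 'positive': [2476421, 16.25], 'negative': [2720283, 17.85]}},
    {'source_id': '1', 'gender_id': 'null', 'total': 6,
     'percentage': {'neutral': [5, 83.33], 'positive': [1, 16.67], 'negative': [0, 0.0]}},
]
print(aggregate_by_source(sample))
```

This touches each row once, so the work grows with the number of rows rather than with rows times distinct source ids. Whether that matters in practice depends on how large the queryset really is.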
Regarding "python - Calculating total values and percentages in dictionaries with Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/55201023/