I have arrays similar to the following:
a=[["tennis","tennis","golf","federer","cricket"],
["federer","nadal","woods","sausage","federer"],
["sausage","lion","prawn","prawn","sausage"]]
I also have a matrix of weights like this:
w=[[1,3,3,4,5],
[2,3,2,3,4],
[1,2,1,1,1]]
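What I want to do is, for each row, sum the weights by the labels in matrix a and take the top 3 labels from that row. In the end I would like something like this: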
res=[["cricket","tennis","federer"],
["federer","sausage","nadal"],
["lion","sausage","prawn"]]
In my actual dataset, ties are extremely unlikely and not really a problem. If an entire row is a single label, such as:
["federer","federer","federer","federer","federer"]
then ideally I would like this returned as
["federer", "", ""].
Any guidance would be greatly appreciated.
Best answer
For numpy arrays, see piRSquared's answer.
Here is a pure Python approach:
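for i in range(len(a)):
    if a[i].count(a[i][0]) == len(a[i]):
        # The whole row is one label: keep it and pad with empty strings.
        res = [a[i][0], "", ""]
    else:
        # Sum the weights per label, then keep the 3 labels with the largest totals.
        totals = {}
        for label, weight in zip(a[i], w[i]):
            totals[label] = totals.get(label, 0) + weight
        res = sorted(totals, key=totals.get, reverse=True)[:3]
    print(res)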
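If you would rather collect everything into one nested list like the `res` in the question, here is a minimal sketch of the same idea using `collections.Counter`. The helper name `top_labels` and its parameter `n` are my own, not from the question, and the padding assumes that any row with fewer than `n` distinct labels should be filled with empty strings (which also covers the all-"federer" row without a special case). Tied totals are not given any special handling, since the question says ties do not matter.

```python
from collections import Counter

def top_labels(labels, weights, n=3):
    """Return the n labels with the largest summed weight, padded with ""."""
    totals = Counter()
    for label, weight in zip(labels, weights):
        totals[label] += weight  # accumulate the weight for each label
    # most_common(n) yields (label, total) pairs, largest total first
    best = [label for label, _ in totals.most_common(n)]
    # pad with empty strings when the row has fewer than n distinct labels
    return best + [""] * (n - len(best))

a = [["tennis", "tennis", "golf", "federer", "cricket"],
     ["federer", "nadal", "woods", "sausage", "federer"],
     ["sausage", "lion", "prawn", "prawn", "sausage"]]
w = [[1, 3, 3, 4, 5],
     [2, 3, 2, 3, 4],
     [1, 2, 1, 1, 1]]

res = [top_labels(row, weights) for row, weights in zip(a, w)]
print(res)
```

The first label of each row is the one with the largest total ("cricket" for the first row, "federer" for the second); labels with equal totals may come out in a different order than in the question's expected output.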