Currently, I have data like this:
Item Properties
A C001
A C002
A C003
B C001
B C003
C C001
I want to group these items like so:
A C001, C002, C003
B C001, C003
C C001
Then I want to match the items by property similarity, that is, by the number of properties each pair of items shares:
A B 2
A C 1
B C 1
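How can I reshape this DataFrame with pandas? I did try the groupby method, but it shows the number of properties rather than the array of property names.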
Best answer
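import pandas as pd
# Self-join on the shared column: each row pairs two items that have a property in common.
selfjoin = pd.merge(df, df, on='Properties')
# Keep each unordered pair once and drop self-matches such as A-A.
selfjoin = selfjoin[selfjoin['Item_x'] < selfjoin['Item_y']]
similarity = selfjoin.groupby(['Item_x', 'Item_y'], as_index=False).size()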
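For anyone who wants to try this end to end, here is a minimal sketch. It assumes the columns are named `Item` and `Properties` as in the question's header row, and it rebuilds the sample data by hand. The `grouped` step is one possible way to get the property names per item instead of a count (the first part of the question). The names `grouped`, `pairs`, and `shared` are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question; column names are assumed
# to match its header row ('Item', 'Properties').
df = pd.DataFrame({
    'Item': ['A', 'A', 'A', 'B', 'B', 'C'],
    'Properties': ['C001', 'C002', 'C003', 'C001', 'C003', 'C001'],
})

# Part 1: collect the property names per item instead of counting them.
grouped = df.groupby('Item')['Properties'].agg(', '.join).reset_index()

# Part 2: self-join on the property so each row pairs two items
# that share it, then keep every unordered pair exactly once.
pairs = pd.merge(df, df, on='Properties')
pairs = pairs[pairs['Item_x'] < pairs['Item_y']]

# Count the shared properties for each pair of items.
similarity = (
    pairs.groupby(['Item_x', 'Item_y'])
    .size()
    .reset_index(name='shared')
)

print(grouped)
print(similarity)
```

If you would rather have real Python lists than a joined string in the first result, `.agg(list)` should work in place of `.agg(', '.join)`.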