I have a Pandas DataFrame like this:

id     Apple   Apricot   Banana    Climentine   Orange    Pear    Pineapple
01       1        1         0          0          0         0         0
02       0        0         1          1          1         1         0
03       0        0         0          0          1         0         1


How can I generate a new DataFrame like this?

id     fruits
01     Apple, Apricot
02     Banana, Climentine, Orange, Pear
03     Orange, Pineapple

Best answer

Use melt, filter the rows whose value is 1 with query, then group by id and join the values:

import pandas as pd

df = pd.DataFrame({
    'id': ['01','02','03'],
    'Apple': [1,0,0],
    'Apricot': [1,0,0],
    'Banana': [0,1,0],
    'Climentine': [0,1,0],
    'Orange': [0,1,1],
    'Pear': [0,1,0],
    'Pineapple': [0,0,1]
})

# Store the output under a new name so the original df stays available.
result = (df.melt('id', var_name='fruits').query('value == 1')
            .groupby('id')['fruits']
            .apply(', '.join)
            .reset_index())

print(result)

#   id                            fruits
#0  01                    Apple, Apricot
#1  02  Banana, Climentine, Orange, Pear
#2  03                 Orange, Pineapple
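
To see why the melt approach works, it can help to look at each intermediate step on its own. The sketch below is only an illustration: it uses a smaller stand-in frame with two fruit columns, and the names long_df, kept and joined are made up for this example.

```python
import pandas as pd

# Small stand-in frame (two fruit columns only), not the full data above.
df = pd.DataFrame({
    'id': ['01', '02'],
    'Apple': [1, 0],
    'Banana': [1, 1],
})

# melt: one row per (id, fruit) pair, with the 0/1 flag in a 'value' column.
long_df = df.melt('id', var_name='fruits')
print(long_df)

# query: keep only the pairs whose flag is set.
kept = long_df.query('value == 1')
print(kept)

# groupby + join: collapse the remaining fruits of each id into one string.
joined = kept.groupby('id')['fruits'].apply(', '.join).reset_index()
print(joined)
```

Because melt stacks the columns in their original order, the fruits in each joined string appear in column order.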


For better performance, use dot for matrix multiplication:

# Run this on the original 0/1 DataFrame (rebuild it first if df was overwritten above).
df = df.set_index('id')
df = df.dot(df.columns + ', ').str.rstrip(', ').reset_index(name='fruits')
print(df)

#   id                            fruits
#0  01                    Apple, Apricot
#1  02  Banana, Climentine, Orange, Pear
#2  03                 Orange, Pineapple
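
The dot trick relies on Python string repetition: multiplying a label by 1 keeps it and multiplying by 0 yields an empty string, so the "sum" over a row concatenates only the flagged labels. The sketch below spells that out by hand on a small stand-in frame and compares it with dot; the names flags, manual and via_dot are hypothetical and used only here.

```python
import pandas as pd

# String repetition: a flag of 1 keeps the label, a flag of 0 drops it.
assert 1 * 'Apple, ' == 'Apple, '
assert 0 * 'Apple, ' == ''

# Small stand-in frame with id already set as the index, not the full data above.
flags = pd.DataFrame(
    {'Apple': [1, 0], 'Banana': [1, 1]},
    index=pd.Index(['01', '02'], name='id'),
)

# Hand-written version of the row-wise product: flag * label, concatenated in column order.
manual = {
    row_id: ''.join(int(flag) * (col + ', ') for col, flag in row.items())
    for row_id, row in flags.iterrows()
}

# The same computation expressed as a matrix product with the column labels.
via_dot = flags.dot(flags.columns + ', ')
print(via_dot)

# Each string still ends with a trailing ', ', which rstrip removes.
cleaned = via_dot.str.rstrip(', ')
print(cleaned)
```

Note that rstrip(', ') strips any trailing commas and spaces, so this assumes the column labels themselves do not end in those characters.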

About python - merging dummy-value columns into one column (pd.get_dummies reverse): we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50559480/

10-12 18:29