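I have a pandas DataFrame in which email addresses are duplicated. I want to find all the duplicates and merge them so that each email has a list of its account IDs attached. I also want to keep the third column in the merged result.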
AccountID Email Quality_3
1 [email protected] High
2 [email protected]
3 [email protected]
4 [email protected] Medium
5 [email protected]
6 [email protected]
7 [email protected]
8 [email protected]
Desired output:
AccountID Email Quality_3
1, 3, 5, 7 [email protected] High
2, 6 [email protected]
4, 8 [email protected] Medium
I have been looking at left and right joins, but I can't seem to figure it out.
Best answer
Try this:
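# Select the columns with a list (double brackets); the tuple form ['AccountID','Quality_3'] fails on recent pandas.
df_new = (df.astype(str).groupby('Email')[['AccountID', 'Quality_3']]
          .agg({'AccountID': lambda x: ','.join(x), 'Quality_3': 'first'}).reset_index())
print(df_new)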
Email AccountID Quality_3
0 [email protected] 1,3,5,7 High
1 [email protected] 4,8 Medium
2 [email protected] 2,6 None
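Because `astype(str)` is applied to the whole frame, the missing Quality_3 value comes back as the literal string `None` in the output above. A minimal self-contained sketch of a variant that converts only AccountID to strings, so missing qualities stay missing, might look like the following. The email addresses are placeholders (the real ones are not shown in the question), and the `", "` separator is chosen to match the desired output in the question.

```python
import pandas as pd

# Placeholder addresses (hypothetical): the question's real emails are not shown.
df = pd.DataFrame({
    "AccountID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Email": ["a@example.com", "b@example.com", "a@example.com", "c@example.com",
              "a@example.com", "b@example.com", "a@example.com", "c@example.com"],
    "Quality_3": ["High", None, None, "Medium", None, None, None, None],
})

df_new = (
    # Convert only AccountID to str so that missing Quality_3 values stay missing.
    df.assign(AccountID=df["AccountID"].astype(str))
      .groupby("Email", as_index=False)
      # Join the IDs per email; 'first' takes the first non-missing quality.
      .agg({"AccountID": ", ".join, "Quality_3": "first"})
)
print(df_new)
```

Since `first` skips missing values, it picks up a quality label even when the labelled row is not the first one in its group. This would not be true after `astype(str)`, where every cell is a non-missing string.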