I have a DataFrame, for example:

col-a   col-b
1       None
1       Failed
1       Passed
2       None
2       Passed
3       Inconclusive
3       Passed

and a hierarchy of terms:
Failed > Inconclusive > Passed > None

How can I get something like this:
1       Failed
2       Passed
3       Inconclusive

Thanks!

Best answer

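You can create a dictionary that ranks each term and pass it to Series.map to build a helper column, then sort by both columns with DataFrame.sort_values and keep the first row of each group with DataFrame.drop_duplicates: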

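# Lower number = higher priority in the hierarchy
d = {'Failed':0,'Inconclusive':1, 'Passed':2, None: 3}
df['new'] = df['col-b'].map(d)
# drop(columns=...) replaces the positional axis argument, which pandas 2.0 no longer accepts
df = df.sort_values(['col-a', 'new']).drop_duplicates('col-a').drop(columns='new')
print(df)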
   col-a         col-b
1      1        Failed
4      2        Passed
5      3  Inconclusive
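
The snippet above assumes df already exists. Below is a minimal self-contained sketch of the same sort-and-deduplicate idea. It assumes the None entries in the question are real missing values rather than the string 'None'. The helper name top_by_hierarchy and the temporary column _rank are made up for illustration. Instead of putting None in the dictionary, anything the dictionary does not cover is given the lowest priority with fillna.

```python
import pandas as pd

# Sample data from the question; None is assumed to be a real missing value.
df = pd.DataFrame({
    "col-a": [1, 1, 1, 2, 2, 3, 3],
    "col-b": [None, "Failed", "Passed", None, "Passed", "Inconclusive", "Passed"],
})

# Lower rank = higher priority (Failed > Inconclusive > Passed > None).
rank = {"Failed": 0, "Inconclusive": 1, "Passed": 2}


def top_by_hierarchy(frame, key="col-a", value="col-b"):
    """Return the highest-priority row of each `key` group (hypothetical helper)."""
    # Values not in `rank` (including missing ones) map to NaN,
    # so give them the lowest priority explicitly.
    order = frame[value].map(rank).fillna(len(rank))
    return (
        frame.assign(_rank=order)          # temporary helper column
        .sort_values([key, "_rank"])       # best rank first within each group
        .drop_duplicates(key)              # keep the first row per group
        .drop(columns="_rank")
    )


result = top_by_hierarchy(df)
print(result)
```

Because the helper works on a copy created by assign, the original df is left unchanged.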

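Another idea uses idxmin on the grouped ranks (SeriesGroupBy.idxmin, since the mapped column is a Series):
d = {'Failed':0,'Inconclusive':1, 'Passed':2, None: 3}
# idxmin returns the index label of the best-ranked row in each col-a group
df = df.loc[df['col-b'].map(d).groupby(df['col-a']).idxmin()]
print(df)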
   col-a         col-b
1      1        Failed
4      2        Passed
5      3  Inconclusive
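
The idxmin variant can also be written as a standalone sketch. It makes the same assumption that None is a real missing value. The names order and best_rows are illustrative only. Filling unmapped values with the worst rank keeps NaN out of the ranks, so a group containing only missing values would still yield a row.

```python
import pandas as pd

# Sample data from the question; None is assumed to be a real missing value.
df = pd.DataFrame({
    "col-a": [1, 1, 1, 2, 2, 3, 3],
    "col-b": [None, "Failed", "Passed", None, "Passed", "Inconclusive", "Passed"],
})

# Lower rank = higher priority (Failed > Inconclusive > Passed > None).
rank = {"Failed": 0, "Inconclusive": 1, "Passed": 2}

# Missing or unknown statuses get the worst rank.
order = df["col-b"].map(rank).fillna(len(rank))

# For each col-a group, find the index label of the smallest rank,
# then select those rows from the original frame.
best_rows = df.loc[order.groupby(df["col-a"]).idxmin()]
print(best_rows)
```

This version needs no helper column and no sort. It selects rows by their original index labels, so it relies on those labels being unique.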

This question, about grouping by a column and selecting by hierarchy in Python, is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/57802242/
