I have a DataFrame like this:
col-a col-b
1 None
1 Failed
1 Passed
2 None
2 Passed
3 Inconclusive
3 Passed
and a hierarchy of terms:
Failed > Inconclusive > Passed > None
How can I get one row per col-a value, keeping the highest-ranked term, like this:
1 Failed
2 Passed
3 Inconclusive
Thanks!
Best answer
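You can create a dictionary that ranks each term, build a helper column from it with Series.map, sort by both columns with DataFrame.sort_values, and then keep the first row of each group with DataFrame.drop_duplicates: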
d = {'Failed':0,'Inconclusive':1, 'Passed':2, None: 3}
df['new'] = df['col-b'].map(d)
df = df.sort_values(['col-a', 'new']).drop_duplicates('col-a').drop(columns='new')
print (df)
col-a col-b
1 1 Failed
4 2 Passed
5 3 Inconclusive
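For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same sort-then-deduplicate idea. It rebuilds the sample data from the question and assumes the "None" entries are plain strings; the names priority and _rank are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data from the question; "None" is assumed to be a plain string here.
df = pd.DataFrame({
    "col-a": [1, 1, 1, 2, 2, 3, 3],
    "col-b": ["None", "Failed", "Passed", "None", "Passed", "Inconclusive", "Passed"],
})

# Lower number = higher position in the hierarchy
# Failed > Inconclusive > Passed > None.
priority = {"Failed": 0, "Inconclusive": 1, "Passed": 2, "None": 3}

result = (
    df.assign(_rank=df["col-b"].map(priority))  # temporary helper column
      .sort_values(["col-a", "_rank"])          # best-ranked row first within each col-a
      .drop_duplicates("col-a")                 # keep only that first row per col-a
      .drop(columns="_rank")                    # remove the helper column again
)
print(result)
```

Using assign in a chain leaves the original df untouched, which may be convenient if the full data is still needed afterwards.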
Another idea, using idxmin on the grouped ranks (SeriesGroupBy.idxmin):
d = {'Failed':0,'Inconclusive':1, 'Passed':2, None: 3}
df = df.loc[df['col-b'].map(d).groupby(df['col-a']).idxmin()]
print (df)
col-a col-b
1 1 Failed
4 2 Passed
5 3 Inconclusive
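If the "None" entries are real missing values rather than strings, one hedged variant of the idxmin approach is to leave them out of the dictionary and fill the resulting NaN ranks with the lowest priority. This is a sketch under that assumption; priority, ranks and best are illustrative names.

```python
import pandas as pd

# Sample data from the question, this time assuming real missing values (None).
df = pd.DataFrame({
    "col-a": [1, 1, 1, 2, 2, 3, 3],
    "col-b": [None, "Failed", "Passed", None, "Passed", "Inconclusive", "Passed"],
})

# Lower number = higher position in the hierarchy.
priority = {"Failed": 0, "Inconclusive": 1, "Passed": 2}

# Values missing from the dictionary map to NaN; give them the lowest priority.
ranks = df["col-b"].map(priority).fillna(len(priority))

# For each col-a group, idxmin returns the index label of the best-ranked row.
best = df.loc[ranks.groupby(df["col-a"]).idxmin()]
print(best)
```

Unlike the sorting approach, this does not reorder or modify df; it only looks up one row label per group.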
Source: a similar question on Stack Overflow, "python - group by a column and select by hierarchy": https://stackoverflow.com/questions/57802242/