I have a DataFrame that looks like this:
BacksGas_Flow_sccm ContextID StepID Time_Elapsed iso_forest anomaly_score alarm
96.875 7296124 19 39.798 -1 -0.22435033280902072 3
96.875 7296125 19 39.993 -1 -0.22435033280902072 3
96.875 7296406 19 39.829 -1 -0.22435033280902072 3
96.875 7296405 19 39.243 -1 -0.22435033280902072 3
96.6796875 7317148 19 38.801 -1 -0.22435033280902072 3
96.6796875 7317149 19 38.801 -1 -0.22435033280902072 3
96.58203125 7293851 19 40.226 -1 -0.22435033280902072 3
96.58203125 7293852 19 40.031000000000006 -1 -0.22435033280902072 3
96.38671875 7293732 19 39.945 -1 -0.22435033280902072 3
96.38671875 7293731 19 39.945 -1 -0.22435033280902072 3
95.80078125 7297416 19 39.666000000000004 -1 -0.22435033280902072 3
95.80078125 7297415 19 39.541000000000004 -1 -0.22435033280902072 3
18.5546875 7321507 19 38.107 -1 -0.25368125176672074 -3
18.5546875 7322950 19 37.734 -1 -0.25368125176672074 -3
18.45703125 7320222 19 37.906000000000006 -1 -0.25368125176672074 -3
18.45703125 7323150 19 37.755 -1 -0.25368125176672074 -3
18.45703125 7323151 19 38.02 -1 -0.25368125176672074 -3
18.45703125 7320221 19 38.069 -1 -0.25368125176672074 -3
18.359375 7291023 19 37.718 -1 -0.25420996401901275 -3
18.359375 7291024 19 37.933 -1 -0.25420996401901275 -3
18.26171875 7316192 19 38.741 -1 -0.25420996401901275 -3
18.26171875 7312681 19 38.084 -1 -0.25420996401901275 -3
18.26171875 7312682 19 37.830000000000005 -1 -0.25420996401901275 -3
18.26171875 7316191 19 37.679 -1 -0.25420996401901275 -3
18.1640625 7291050 19 38.299 -1 -0.25420996401901275 -3
18.1640625 7311617 19 38.031000000000006 -1 -0.25420996401901275 -3
18.1640625 7324929 19 38.119 -1 -0.25420996401901275 -3
18.1640625 7291049 19 37.841 -1 -0.25420996401901275 -3
18.1640625 7311618 19 38.031000000000006 -1 -0.25420996401901275 -3
18.1640625 7324930 19 38.119 -1 -0.25420996401901275 -3
18.06640625 7306076 19 38.098 -1 -0.25420996401901275 -3
18.06640625 7317385 19 37.967000000000006 -1 -0.25420996401901275 -3
18.06640625 7316312 19 38.169000000000004 -1 -0.25420996401901275 -3
18.06640625 7306077 19 38.098 -1 -0.25420996401901275 -3
18.06640625 7317386 19 37.967000000000006 -1 -0.25420996401901275 -3
18.06640625 7316311 19 38.169000000000004 -1 -0.25420996401901275 -3
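I want to get all rows whose BacksGas_Flow_sccm value is among the 3 highest or the 3 lowest values in that column. In the df above, the 3 highest values in BacksGas_Flow_sccm are 96.875, 96.6796875, and 96.58203125, and the 3 lowest values are 18.06640625, 18.1640625, and 18.26171875.
Expected output: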
BacksGas_Flow_sccm ContextID StepID Time_Elapsed iso_forest anomaly_score alarm
96.875 7296124 19 39.798 -1 -0.22435033280902072 3
96.875 7296125 19 39.993 -1 -0.22435033280902072 3
96.875 7296406 19 39.829 -1 -0.22435033280902072 3
96.875 7296405 19 39.243 -1 -0.22435033280902072 3
96.6796875 7317148 19 38.801 -1 -0.22435033280902072 3
96.6796875 7317149 19 38.801 -1 -0.22435033280902072 3
96.58203125 7293851 19 40.226 -1 -0.22435033280902072 3
96.58203125 7293852 19 40.031000000000006 -1 -0.22435033280902072 3
18.26171875 7316192 19 38.741 -1 -0.25420996401901275 -3
18.26171875 7312681 19 38.084 -1 -0.25420996401901275 -3
18.26171875 7312682 19 37.830000000000005 -1 -0.25420996401901275 -3
18.26171875 7316191 19 37.679 -1 -0.25420996401901275 -3
18.1640625 7291050 19 38.299 -1 -0.25420996401901275 -3
18.1640625 7311617 19 38.031000000000006 -1 -0.25420996401901275 -3
18.1640625 7324929 19 38.119 -1 -0.25420996401901275 -3
18.1640625 7291049 19 37.841 -1 -0.25420996401901275 -3
18.1640625 7311618 19 38.031000000000006 -1 -0.25420996401901275 -3
18.1640625 7324930 19 38.119 -1 -0.25420996401901275 -3
18.06640625 7306076 19 38.098 -1 -0.25420996401901275 -3
18.06640625 7317385 19 37.967000000000006 -1 -0.25420996401901275 -3
18.06640625 7316312 19 38.169000000000004 -1 -0.25420996401901275 -3
18.06640625 7306077 19 38.098 -1 -0.25420996401901275 -3
18.06640625 7317386 19 37.967000000000006 -1 -0.25420996401901275 -3
18.06640625 7316311 19 38.169000000000004 -1 -0.25420996401901275 -3
I tried using nlargest and nsmallest, but they gave me the wrong output.
How can I achieve this?
Thanks in advance.
Best answer
You can do this by combining drop_duplicates() with nlargest and nsmallest:
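# Unique values of the column, so each distinct value is counted once
s = df.BacksGas_Flow_sccm.drop_duplicates()
# Keep every row whose value is among the 3 largest or 3 smallest unique values
(df[df.BacksGas_Flow_sccm.isin(pd.concat([s.nlargest(3), s.nsmallest(3)]))]
 .reset_index(drop=True))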
BacksGas_Flow_sccm ContextID StepID Time_Elapsed iso_forest anomaly_score alarm
0 96.875000 7296124 19 39.798 -1 -0.22435 3
1 96.875000 7296125 19 39.993 -1 -0.22435 3
2 96.875000 7296406 19 39.829 -1 -0.22435 3
3 96.875000 7296405 19 39.243 -1 -0.22435 3
4 96.679688 7317148 19 38.801 -1 -0.22435 3
5 96.679688 7317149 19 38.801 -1 -0.22435 3
6 96.582031 7293851 19 40.226 -1 -0.22435 3
7 96.582031 7293852 19 40.031 -1 -0.22435 3
8 18.261719 7316192 19 38.741 -1 -0.25421 -3
9 18.261719 7312681 19 38.084 -1 -0.25421 -3
10 18.261719 7312682 19 37.830 -1 -0.25421 -3
....
....
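For anyone who wants to try this without the original data, here is a minimal, self-contained sketch of the same idea. The toy DataFrame, its ContextID values, and the variable names (col, top_rows_only, unique_vals, wanted, result) are made up for illustration and are not from the question. It also shows one likely reason a plain nlargest call looks "wrong": called on the frame, it returns exactly three rows, so rows that share a top value can be cut off.

```python
import pandas as pd

# Hypothetical toy data: values repeat, so "top 3 values" covers more than 3 rows.
df = pd.DataFrame({
    "BacksGas_Flow_sccm": [96.875, 96.875, 96.6796875, 96.58203125, 96.38671875,
                           18.359375, 18.26171875, 18.1640625, 18.1640625, 18.06640625],
    "ContextID": list(range(10)),
})

col = "BacksGas_Flow_sccm"

# nlargest on the frame returns exactly three rows, here covering only two distinct values.
top_rows_only = df.nlargest(3, col)

# Deduplicate first so nlargest/nsmallest each pick three distinct values.
unique_vals = df[col].drop_duplicates()
wanted = pd.concat([unique_vals.nlargest(3), unique_vals.nsmallest(3)])

# Keep every row whose value is one of those six values.
result = df[df[col].isin(wanted)].reset_index(drop=True)
print(result)
```

In this toy frame the rows holding 96.38671875 and 18.359375 are dropped, and every row that shares one of the six selected values is kept, including the duplicates.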
Regarding python - How to get all rows belonging to the top three and bottom three values of a column?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57010527/