我是熊猫的新手,并且一直坚持计算。这是我的样本DF。我感兴趣的是获取每个组的最近收盘价行价。 (按日期,时间,option_type分组)

        name      date   time      open      high       low     close       option_type  strike_price  open_opn  high_opn  low_opn  close_opn
0       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      CE              11800       104.95    109.45   104.50     107.25
1       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      CE              11700       160.00    164.25   159.90     161.60
2       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      CE              11600       223.20    229.10   223.20     226.30
0       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      PE              11800       166.05    166.95   163.55     165.95
1       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      PE              11600       88.80     89.45    86.45      89.00
2       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      PE              11700       122.35    122.65   119.40     122.30


这是我的预期输出。

        name      date   time      open      high       low     close       option_type  strike_price  open_opn  high_opn  low_opn  close_opn

1       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11700.15      CE              11700       160.00    164.25   159.90     161.60
2       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11700.15      PE              11700       122.35    122.65   119.40     122.30


如果收盘价为11760,则预期o / p为。

0       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      CE              11800       104.95    109.45   104.50     107.25
0       NIFTY  20180903  09:16  11736.05  11736.10  11699.35  11720.15      PE              11800       166.05    166.95   163.55     165.95


请帮我。高度赞赏。

最佳答案

首先将subabs一起使用,然后通过sort_valuesdrop_duplicates将每个组的最小值获取:

df['diff'] = df['close'].sub(df['strike_price']).abs()

df = df.sort_values('diff').drop_duplicates(['date', 'time', 'option_type'])
print (df)
    name      date   time      open     high       low     close option_type  \
1  NIFTY  20180903  09:16  11736.05  11736.1  11699.35  11720.15          CE
2  NIFTY  20180903  09:16  11736.05  11736.1  11699.35  11720.15          PE

   strike_price  open_opn  high_opn  low_opn  close_opn   diff
1         11700    160.00    164.25    159.9      161.6  20.15
2         11700    122.35    122.65    119.4      122.3  20.15


DataFrameGroupBy.idxminloc的另一种解决方案:

df = df.reset_index(drop=True)
df['diff'] = df['close'].sub(df['strike_price']).abs()

df = df.loc[df.groupby(['date', 'time', 'option_type'])['diff'].idxmin()]
print (df)

    name      date   time      open     high       low     close option_type  \
1  NIFTY  20180903  09:16  11736.05  11736.1  11699.35  11720.15          CE
5  NIFTY  20180903  09:16  11736.05  11736.1  11699.35  11720.15          PE

   strike_price  open_opn  high_opn  low_opn  close_opn   diff
1         11700    160.00    164.25    159.9      161.6  20.15
5         11700    122.35    122.65    119.4      122.3  20.15

10-06 08:32