我有一个DataFrame
,其中某些商品的价格范围:
price_low price_high item
0 10 20 a
1 1 7 b
2 10 12 c
3 20 25 d
4 4 8 e
5 5 30 f
6 16 26 g
如何确定给定价格范围(例如$ 8-$ 16)的哪些商品重叠?
预期产量:
price_low price_high item
0 10 20 a
2 10 12 c
4 4 8 e
5 5 30 f
6 16 26 g
最佳答案
从熊猫0.24.0开始,您可以使用IntervalArray.overlaps
方法,或者使用IntervalIndex.overlaps
方法:
# construct the IntervalArray
price_ivs = pd.arrays.IntervalArray.from_arrays(df['price_low'], df['price_high'], closed='both')
# define desired price Interval and use the overlaps method to restrict df
my_price = pd.Interval(8, 16, closed='both')
df = df[price_ivs.overlaps(my_price)]
为了进行交互式演示,请首先构造示例数据:
In [1]: import pandas as pd; pd.__version__
Out[1]: '0.24.0rc1'
In [2]: df = pd.DataFrame({
...: 'price_low': [10, 1, 10, 20, 4, 5, 16],
...: 'price_high': [20, 7, 12, 25, 8, 30, 26],
...: 'item': list('abcdefg')
...: })
In [3]: df
Out[3]:
price_low price_high item
0 10 20 a
1 1 7 b
2 10 12 c
3 20 25 d
4 4 8 e
5 5 30 f
6 16 26 g
从
IntervalArray
构造一个DataFrame
:In [4]: price_ivs = pd.arrays.IntervalArray.from_arrays(
...: df['price_low'], df['price_high'], closed='both')
In [5]: price_ivs
Out[5]:
IntervalArray([[10, 20], [1, 7], [10, 12], [20, 25], [4, 8], [5, 30], [16, 26]],
closed='both',
dtype='interval[int64]')
定义所需价格
Interval
并使用overlaps
方法获取布尔索引器:In [6]: my_price = pd.Interval(8, 16, closed='both')
In [7]: idxr = price_ivs.overlaps(my_price)
In [8]: idxr
Out[8]: array([ True, False, True, False, True, True, True])
In [9]: df[idxr]
Out[9]:
price_low price_high item
0 10 20 a
2 10 12 c
4 4 8 e
5 5 30 f
6 16 26 g
如果您已经有一列价格作为间隔(或间隔的
Series
),则可以使用array
属性访问基础的IntervalArray
并使用与上述相同的方法:In [10]: df = pd.DataFrame({'price_ivs': price_ivs, 'item': list('abcdefg')})
In [11]: df
Out[11]:
price_ivs item
0 [10, 20] a
1 [1, 7] b
2 [10, 12] c
3 [20, 25] d
4 [4, 8] e
5 [5, 30] f
6 [16, 26] g
In [12]: idxr = df['price_ivs'].array.overlaps(my_price)
In [13]: idxr
Out[13]: array([ True, False, True, False, True, True, True])
In [14]: df[idxr]
Out[14]:
price_ivs item
0 [10, 20] a
2 [10, 12] c
4 [4, 8] e
5 [5, 30] f
6 [16, 26] g