问题描述
我有一个排序列表 l
(大约 20,000 个元素),并且想在 l
中找到超过给定值 t_min 的第一个元素.目前,我的代码如下.
I have a sorted list l
(of around 20,000 elements), and would like to find the first element in l
that exceeds a given value t_min. Currently, my code is as follows.
def find_index(l):
first=next((t for t in l if t>t_min), None)
if first==None:
return None
else:
return l.index(first)
为了对代码进行基准测试,我使用 cProfile
运行测试循环,并通过将时间与控制循环进行比较来去除随机生成列表所需的时间:
To benchmark the code, I used cProfile
to run a testing loop, and stripped out the time required to randomly generate lists by comparing the time to a control loop:
import numpy
import cProfile
def test_loop(n):
for _ in range(n):
test_l=sorted(numpy.random.random_sample(20000))
find_index(test_l, 0.5)
def control_loop(n):
for _ in range(n):
test_l=sorted(numpy.random.random_sample(20000))
# cProfile.run('test_loop(1000)') takes 10.810 seconds
# cProfile.run('control_loop(1000)') takes 9.650 seconds
find_index
的每个函数调用大约需要 1.16 毫秒.鉴于我们知道列表已排序,有没有办法改进代码以使其更高效?
Each function call for find_index
takes about 1.16 ms. Is there a way to improve the code to make it more efficient, given that we know the list is sorted?
推荐答案
标准库 bisect
模块对此很有用,文档 包含这个用例的示例.
The standard library bisect
module is useful for this, and the docs contain an example of exactly this use case.
def find_gt(a, x):
'Find leftmost value greater than x'
i = bisect_right(a, x)
if i != len(a):
return a[i]
raise ValueError
这篇关于高效查找排序列表中元素的索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!