Problem description
I am using the Queue library in Python and I want to keep queue entries unique.
So I want to check that an item isn't already in the queue before adding it, essentially a function like this that works with the Queue library:
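    import Queue  # Python 2 module name; in Python 3 it is queue
    queue = Queue.Queue()
    def in_queue(u):
        # Desired behavior only: a plain Queue does not support "in"
        return u in queue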
Or, should I be using a different library/method to achieve this?
The standard Queue class can't be iterated or otherwise inspected.
However, it was built to be extended.
First, if you look at the source (which is linked from the docs), there are hook methods _init, _qsize, _put and _get that you can override to change the implementation. Look at the subclasses below the main class, and you can see how they do this.
So, one easy thing to do is replace the deque implementation with a set:
    class SetQueue(Queue.Queue):
        def _init(self, maxsize):
            self.queue = set()
        def _put(self, item):
            self.queue.add(item)
        def _get(self):
            return self.queue.pop()
(I didn't implement _qsize because the default, return len(self.queue), is fine.)
Now you don't have to check: just add the item to the queue, and it'll be ignored if it's already there.
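As a quick check of that behavior, here is a minimal sketch written for Python 3, where the module is named queue rather than Queue. The item values are made up for illustration, and items must be hashable because they are stored in a set.

```python
import queue  # Python 3 name of the Queue module


class SetQueue(queue.Queue):
    """Queue whose storage is a set, so duplicate items collapse."""

    def _init(self, maxsize):
        self.queue = set()

    def _put(self, item):
        self.queue.add(item)  # adding an item that is already present is a no-op

    def _get(self):
        return self.queue.pop()  # removes an arbitrary item; sets are unordered


q = SetQueue()
for item in ["a", "b", "a", "a"]:  # illustrative values only
    q.put(item)

size = q.qsize()  # counts distinct items, since duplicates were ignored
drained = sorted(q.get() for _ in range(size))
```

The results are sorted here only because a set gives no guarantee about the order in which items come back out.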
Of course this has the downside that the queue is no longer ordered. But you can solve that by using an OrderedSet (similar to the OrderedDict in collections). There's a recipe that's linked from the collections docs. Once you have that:
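    class OrderedSetQueue(Queue.Queue):
        def _init(self, maxsize):
            self.queue = OrderedSet()
        def _put(self, item):
            self.queue.add(item)
        def _get(self):
            # Assumes the recipe's pop(last=True) signature, where a bare
            # pop() returns the newest item; last=False gives FIFO order.
            return self.queue.pop(last=False)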
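If you'd rather not pull in the recipe, one possible substitute (not what the answer above uses) is a plain dict, whose keys keep insertion order in Python 3.7+. The class name DictOrderedQueue is hypothetical, and this is only a sketch under that assumption:

```python
import queue  # Python 3 name of the Queue module


class DictOrderedQueue(queue.Queue):  # hypothetical name, for illustration
    """FIFO queue that ignores items already waiting in it."""

    def _init(self, maxsize):
        self.queue = {}  # keys act as an insertion-ordered set (Python 3.7+)

    def _put(self, item):
        # setdefault leaves an existing key, and its position, untouched
        self.queue.setdefault(item, None)

    def _get(self):
        item = next(iter(self.queue))  # oldest key still in the dict
        del self.queue[item]
        return item


q = DictOrderedQueue()
for item in ["b", "a", "b", "c"]:  # illustrative values only
    q.put(item)

order = [q.get() for _ in range(q.qsize())]
```

The second "b" is dropped and the remaining items come out in the order they were first added.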
If you actually want to be able to check values within a queue, you can add a method for that:
    class CheckableQueue(Queue.Queue):  # or OrderedSetQueue
        def __contains__(self, item):
            with self.mutex:
                return item in self.queue
However, this invites race conditions in your code. For example, if you do this:
    if x not in my_queue:
        my_queue.put(x)
It's always possible that x was not in the queue when you checked, but was in the queue by the time you called put. In fact, the only use of this method that wouldn't be unsafe is some kind of optimistic checking (if the value isn't in the queue now, do some expensive work, then try to add it, accepting that the work is wasted if the value has been added in the meantime), which is the same reason Queue.full() exists.
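That optimistic pattern might look roughly like the following in Python 3. The names CheckableSetQueue, enqueue_after_work and expensive_work are hypothetical; the sketch assumes a set-backed queue so that a duplicate added in the meantime is simply ignored by put.

```python
import queue  # Python 3 name of the Queue module


class CheckableSetQueue(queue.Queue):  # hypothetical name
    """Set-backed queue that also supports the "in" operator."""

    def _init(self, maxsize):
        self.queue = set()

    def _put(self, item):
        self.queue.add(item)

    def _get(self):
        return self.queue.pop()

    def __contains__(self, item):
        with self.mutex:
            return item in self.queue


def enqueue_after_work(q, key, expensive_work):
    """Skip the expensive step when key already appears to be queued."""
    if key in q:             # only a snapshot; it may be stale a moment later
        return False
    expensive_work(key)      # may turn out to be wasted if another thread wins
    q.put(key)               # the set storage ignores the item if it is present
    return True


work_log = []
q = CheckableSetQueue()
first = enqueue_after_work(q, "job-1", work_log.append)
second = enqueue_after_work(q, "job-1", work_log.append)
```

The check only saves work in the common case. Correctness still comes from the set ignoring duplicates, not from the check.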
The only way to make this safe is to put both operations together under a lock:
    with my_queue.mutex:
        if x not in my_queue:
            my_queue.put(x)
But at this point, you're defeating the purpose of using Queue in the first place. (You're also depending on Queue.mutex being a recursively-enterable mutex, because __contains__ and put each acquire it again. In CPython it is a plain threading.Lock, so the snippet above would deadlock as written.) Better to add the operation as a method of your Queue subclass.
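One way such a method could look in Python 3 is sketched below. The names UniqueQueue and put_if_absent are hypothetical. The sketch assumes an unbounded queue (it never waits on maxsize) and leans on attributes that CPython's Queue uses internally (mutex, not_empty, unfinished_tasks, _put), so treat it as an illustration rather than a supported API.

```python
import queue  # Python 3 name of the Queue module


class UniqueQueue(queue.Queue):  # hypothetical name
    def put_if_absent(self, item):
        """Add item unless it is already queued; return True if it was added.

        Assumption: the queue is unbounded, so there is no wait for free space.
        """
        with self.mutex:  # a single acquisition covers both check and insert
            if item in self.queue:       # linear scan of the underlying deque
                return False
            self._put(item)
            self.unfinished_tasks += 1   # mirrors put(), so join() still works
            self.not_empty.notify()      # wake one thread blocked in get()
            return True


q = UniqueQueue()
added = [q.put_if_absent(x) for x in ["a", "b", "a"]]  # illustrative values
items = [q.get() for _ in range(q.qsize())]
```

Because the lock is taken once, inside the method, nothing depends on it being re-entrant.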
And if you always want to check first and add only if it's not there, OrderedSetQueue is a better way to do that.