Check if an element is already in a Queue

Problem description

I am using the Queue library in Python and I want to keep queue entries unique.

So I want to check that an item isn't already in the queue before adding it, essentially with a function like this that works with the Queue library:

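import queue as Queue  # Python 3 name; on Python 2, "import Queue"

queue = Queue.Queue()
def in_queue(u):
    return u in queue  # what I'd like to write; a plain Queue doesn't support "in"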

Or, should I be using a different library/method to achieve this?

Solution

The standard Queue class can't be iterated over or otherwise inspected.
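
As a quick check, here is a minimal sketch on Python 3 (where the module is named queue) of what happens when the membership test from the question is tried on a plain Queue:

```python
import queue

q = queue.Queue()
q.put("a")

try:
    "a" in q           # Queue defines no __contains__, __iter__ or __getitem__
    supported = True
except TypeError:
    supported = False  # the "in" operator is rejected outright

print(supported)       # False
```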

However, it was built to be extended.

First, if you look at the source (which is linked from the docs), there are hook methods _init, _qsize, _put and _get that you can override to change the implementation. Look at the subclasses below the main class, and you can see how they do this.

So, one easy thing to do is replace the deque implementation with a set:

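import queue as Queue  # Python 3 name; on Python 2, "import Queue"

class SetQueue(Queue.Queue):
    def _init(self, maxsize):
        self.queue = set()          # replaces the default deque
    def _put(self, item):
        self.queue.add(item)        # adding a duplicate is a no-op
    def _get(self):
        return self.queue.pop()     # removes an arbitrary element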

(I didn't implement _qsize because the default return len(self.queue) is fine.)

Now you don't have to check: just put the item on the queue, and it will be ignored if it's already there.
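
To see the effect, here is a small self-contained sketch for Python 3 (module name queue). It also prints unfinished_tasks, the counter that CPython's queue implementation uses for join(), because that counter is an easy thing to trip over with this approach:

```python
import queue

class SetQueue(queue.Queue):
    def _init(self, maxsize):
        self.queue = set()
    def _put(self, item):
        self.queue.add(item)
    def _get(self):
        return self.queue.pop()

q = SetQueue()
for item in ["a", "b", "a", "a"]:
    q.put(item)

print(q.qsize())            # 2: the repeated "a" entries were ignored
print(q.unfinished_tasks)   # 4: put() still counted every call
```

In CPython, put() increments unfinished_tasks after calling _put(), whether or not the set actually grew. So if your consumers rely on task_done() and join(), ignored duplicates would leave join() waiting. If you only use put() and get(), this does not matter.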

Of course, this has the downside that the queue is no longer ordered. But you can solve that by using an OrderedSet (similar to the OrderedDict in collections). There's a recipe that's linked from the collections docs. Once you have that:

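class OrderedSetQueue(Queue.Queue):
    def _init(self, maxsize):
        self.queue = OrderedSet()
    def _put(self, item):
        self.queue.add(item)
    def _get(self):
        # Assuming the linked recipe, whose pop() defaults to last=True (newest item):
        # pop(last=False) returns the oldest item, which gives FIFO order.
        return self.queue.pop(last=False)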
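
If you would rather not copy the OrderedSet recipe, one possible alternative is to use the keys of a collections.OrderedDict as the ordered set. This is a sketch of that swapped-in technique for Python 3, not the recipe itself, and it assumes the queued items are hashable:

```python
import queue
from collections import OrderedDict

class OrderedSetQueue(queue.Queue):
    def _init(self, maxsize):
        self.queue = OrderedDict()            # keys act as an insertion-ordered set
    def _put(self, item):
        self.queue.setdefault(item, None)     # a duplicate keeps its original position
    def _get(self):
        item, _ = self.queue.popitem(last=False)  # oldest key first (FIFO)
        return item

q = OrderedSetQueue()
for item in ["b", "a", "b", "c"]:
    q.put(item)

print([q.get() for _ in range(q.qsize())])    # ['b', 'a', 'c']
```

The default _qsize still works here because len() of the OrderedDict is the number of distinct queued items.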


If you actually want to be able to check values within a queue, you can add a method for that:

class CheckableQueue(Queue.Queue): # or OrderedSetQueue
    def __contains__(self, item):
        with self.mutex:
            return item in self.queue
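
A short usage sketch for Python 3 (module name queue; the item names are made up for illustration). Keep in mind that each result is only a snapshot of the queue at the moment of the check:

```python
import queue

class CheckableQueue(queue.Queue):
    def __contains__(self, item):
        with self.mutex:              # hold the queue's lock while looking
            return item in self.queue

q = CheckableQueue()
q.put("job-1")

print("job-1" in q)   # True
print("job-2" in q)   # False
q.get()
print("job-1" in q)   # False, the item has been consumed
```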

However, this invites race conditions in your code. For example, if you do this:

if x not in my_queue:
    my_queue.put(x)

It's always possible that x was not in the queue when you checked, but that another thread added it before you called put. In fact, the only safe use of this method is some kind of optimistic check: if the value isn't in the queue now, do some expensive work, then try to add it, accepting that the work is wasted if the value has been added in the meantime. That is the same reason Queue.full() exists.

The only way to make this safe is to put both operations together under a lock:

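with my_queue.mutex:
    if x not in my_queue:    # __contains__ acquires my_queue.mutex again
        my_queue.put(x)      # and so does put()

But at this point, you're defeating the purpose of using Queue in the first place. This snippet also depends on Queue.mutex being a recursively-enterable mutex, because __contains__ and put both acquire it again. In CPython it is a plain, non-reentrant threading.Lock, so the code above would deadlock as written. Better to add the operation as a method of your Queue subclass.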
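
One way such a subclass method could look is sketched below for Python 3. The class name UniqueQueue and the method name put_if_absent are hypothetical, and the method relies on CPython implementation details of queue.Queue (the mutex, not_empty and unfinished_tasks attributes). To stay short it never blocks and ignores maxsize, so treat it as a starting point for unbounded queues only:

```python
import queue

class UniqueQueue(queue.Queue):
    def put_if_absent(self, item):
        """Add item unless it is already queued; return True if it was added."""
        with self.mutex:                 # one lock acquisition covers check and insert
            if item in self.queue:       # self.queue is the underlying deque
                return False
            self._put(item)
            self.unfinished_tasks += 1   # same bookkeeping that put() performs
            self.not_empty.notify()      # wake one consumer blocked in get()
            return True

q = UniqueQueue()
print(q.put_if_absent("x"))   # True
print(q.put_if_absent("x"))   # False, already queued
print(q.qsize())              # 1
```

Because a skipped duplicate does not touch unfinished_tasks, task_done() and join() keep working as usual.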

And if you always want to check first and add only if it's not there, OrderedSetQueue is a better way to do that.

