问题描述
我想找到一种使用 postgres 和 PDO (php) 来实现作业队列的好方法.
基本上我有一个 events
表,其中记录了应用程序的事件以及某种形式的调度处理器(例如 proc
),它会定期负责在时间并执行某些例程来响应它(并且取决于它本身的事件的性质).
显然,一旦 proc
的实例开始处理事件,我需要将该行标记为 ongoing
,如下所示:
更新事件 SET status = "ongoing" WHERE id = 3;-->>查询 1 <<
很好!proc
现在可以根据事件的类型及其有效负载来执行其业务,并且没有其他线程将处理 id = 3 的事件,因为它现在 ongoing
.p>
当 proc
处理完事件 3 时,它会将其标记为已解决",这样,以后再没有其他线程会处理事件 3.我们开始吧:
UPDATE 事件 SET status = "resolved" WHERE id = 3;-->>查询 2 <<
现在我担心这必须在事务中完成,所以我会:
开始;-- 问题 1-- 需要很长时间的各种其他查询-- 问题 2犯罪;
据我所知,在事务内部,QUERY 1 操作的更改仅在提交整个事务时对其他线程可见.这意味着当 proc
(实例 1)正在执行耗时的工作(QUERY 1 和 QUERY 2 之间的代码)时,它的其他一些实例可能会读取 events
表并认为没有人在处理事件 3 并继续处理它.显然,这会搞砸整个事情并破坏队列的状态.
所以我的问题是:如何保留 proc
的事务风格,同时改变事件 3 的状态(来自 free
到 ongoing
) 立即 从事务外部可见?
正如它所呈现的,这是不可能的.PostgreSQL 没有脏读和QUERY1
是没有意义的,因为它的效果会在可见之前被 QUERY2
覆盖.
但是即使它被提交并且立即可见(如果独立提交),这无论如何也不会令人满意.在高并发环境中,队列中一行的 SELECT 与其处于 ongoing
状态的 UPDATE 之间的时间足以让另一个工作人员也选择它并产生您想要避免的混乱.
我认为可以通过将 QUERY1
替换为 对队列 ID 的建议锁定.
伪代码:
开始;选择 pg_try_advisory_xact_lock(3) INTO 结果;如果结果=真 那么-- 获得了本条目的独家处理权-- 现在重新检查状态,锁定已被占用SELECT status INTO var_status FROM events WHERE id=3;IF var_status='需要完成' THEN——做作业……——工作完成UPDATE 事件 SET status = 'resolved' WHERE id = 3;万一;别的-- 无事可做,另一个工人正在处理万一;犯罪;
这种锁在事务结束时自动释放.与 SELECT 后跟 UPDATE 不同,锁是保证自动授予或拒绝的.
I would like to find out a good way to go about implementing a jobs queue using postgres and PDO (php).
Basically I have an events
table where the app's events are logged and some form of scheduled processor (say proc
) that will regularly take care of retrieving an event at a time and execute certain routines in response to it (and depending on the nature of the event it self).
Clearly, as soon as an instance of proc
starts working on an event, I need to mark the row as ongoing
, like that:
UPDATE events SET status = "ongoing" WHERE id = 3; -- >> QUERY 1 <<
Fine! proc
can now do its business according to the type of event and its payload and no other thread will deal with the event of id = 3 as it is now ongoing
.
When proc
is done with event 3 it marks it as 'resolved' so that, again, no other thread will, in the future, take care of event 3. Here we go:
UPDATE events SET status = "resolved" WHERE id = 3; -- >> QUERY 2 <<
Now my concern is that this must be done inside a transaction, so I would have:
BEGIN;
-- QUERY 1
-- VARIOUS OTHER QUERIES TAKING A LOT OF TIME
-- QUERY 2
COMMIT;
As far as I know, when inside a transaction, the change operated by QUERY 1 is only visible to other threads when the whole transaction is committed. That implies that while proc
(instance 1) is doing the time consuming work (the code between QUERY 1 and QUERY 2) some other instance of it might read the events
table and think that no one is taking care of event 3 and move on doing stuff with it. Clearly that would mess up the whole thing and corrupt the state of the queue.
So my question is: how do I preserve the transactional style of proc
and, at the same time, make the change of state of event 3 (from free
to ongoing
) immediately visible from outside the transaction?
As it is presented, this is not possible. PostgreSQL doesn't have dirty reads, and QUERY1
is pointless since its effect will be overrided by QUERY2
before ever being visible.
But even if it was committed and visible immediately (if committed independantly), this wouldn't be satisfying anyway. In a high concurrency environment, the time between the SELECT of a row in the queue and its UPDATE with the ongoing
state is enough for another worker to SELECT it too and create the confusion you want to avoid.
I think a close alternative to your design that should work can be achieved by replacing your QUERY1
with an advisory lock on the queue ID.
Pseudo-code:
BEGIN;
SELECT pg_try_advisory_xact_lock(3) INTO result;
IF result=true THEN
-- grabbed the exclusive right to process this entry
-- recheck the status now that the lock is taken
SELECT status INTO var_status FROM events WHERE id=3;
IF var_status='needs-to-be-done' THEN
-- do the work...
-- work is done
UPDATE events SET status = 'resolved' WHERE id = 3;
END IF;
ELSE
-- nothing to do, another worker is on it
END IF;
COMMIT;
This kind of lock is automatically released at the end of the transaction.Contrary to the SELECT followed by UPDATE, the lock is guaranteed to be granted or denied atomically.
这篇关于在 php 和 postgres 中排队的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!