I would like to find a good way to implement a job queue using PostgreSQL and PDO (PHP).

Basically I have an events table where the app's events are logged, and some form of scheduled processor (say proc) that regularly retrieves one event at a time and executes certain routines in response to it (depending on the nature of the event itself).

Clearly, as soon as an instance of proc starts working on an event, I need to mark the row as ongoing, like this:

UPDATE events SET status = 'ongoing' WHERE id = 3; -- >> QUERY 1 <<

Fine! proc can now do its business according to the type of event and its payload and no other thread will deal with the event of id = 3 as it is now ongoing.

When proc is done with event 3, it marks it as 'resolved' so that, again, no other thread will take care of event 3 in the future. Here we go:

UPDATE events SET status = 'resolved' WHERE id = 3; -- >> QUERY 2 <<


Now my concern is that this must be done inside a transaction, so I would have:

BEGIN;
-- QUERY 1
-- (time-consuming work)
-- QUERY 2
COMMIT;

As far as I know, when inside a transaction, the change made by QUERY 1 only becomes visible to other threads once the whole transaction is committed. That implies that while proc (instance 1) is doing the time-consuming work (the code between QUERY 1 and QUERY 2), some other instance of it might read the events table, conclude that no one is taking care of event 3, and start working on it too. Clearly that would mess up the whole thing and corrupt the state of the queue.

So my question is: how do I preserve the transactional style of proc and, at the same time, make the change of state of event 3 (from free to ongoing) immediately visible from outside the transaction?


As presented, this is not possible. PostgreSQL does not allow dirty reads, and QUERY 1 is pointless since its effect will be overwritten by QUERY 2 before ever becoming visible.
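To make the visibility point concrete, here is a rough timeline of two psql sessions. It is only a sketch, assuming the events table from the question and the default READ COMMITTED isolation level; the session labels are just for illustration.

```sql
-- Session A (the worker)
BEGIN;
UPDATE events SET status = 'ongoing' WHERE id = 3;   -- QUERY 1
-- (time-consuming work happens here; nothing is committed yet)

-- Session B, meanwhile
SELECT status FROM events WHERE id = 3;
-- still returns the status from before session A's UPDATE

-- Session A
UPDATE events SET status = 'resolved' WHERE id = 3;  -- QUERY 2
COMMIT;

-- Session B
SELECT status FROM events WHERE id = 3;
-- now returns 'resolved'; 'ongoing' was never visible outside session A
```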


But even if QUERY 1 were committed independently and therefore visible immediately, this still wouldn't be satisfactory. In a high-concurrency environment, the time between the SELECT of a row in the queue and its UPDATE to the ongoing state is enough for another worker to SELECT it too, creating exactly the confusion you want to avoid.
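The race can be pictured as the following interleaving of two workers, each statement committing on its own. This is an illustrative sketch only: the 'needs-to-be-done' status value and the LIMIT 1 pick-up query are assumptions about how a worker might look for the next event.

```sql
-- Worker A
SELECT id FROM events WHERE status = 'needs-to-be-done' ORDER BY id LIMIT 1;
-- returns 3

-- Worker B, before A has updated anything
SELECT id FROM events WHERE status = 'needs-to-be-done' ORDER BY id LIMIT 1;
-- also returns 3

-- Worker A
UPDATE events SET status = 'ongoing' WHERE id = 3;

-- Worker B
UPDATE events SET status = 'ongoing' WHERE id = 3;
-- succeeds as well: both workers now believe they own event 3
```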

I think a close alternative to your design that should work is to replace your QUERY 1 with an advisory lock on the queue ID.


-- PL/pgSQL-style pseudocode, run inside a transaction
SELECT pg_try_advisory_xact_lock(3) INTO result;
IF result THEN
  -- grabbed the exclusive right to process this entry
  -- recheck the status now that the lock is taken
  SELECT status INTO var_status FROM events WHERE id = 3;
  IF var_status = 'needs-to-be-done' THEN
    -- do the work...
    -- work is done
    UPDATE events SET status = 'resolved' WHERE id = 3;
  END IF;
ELSE
  -- nothing to do, another worker is on it
  NULL;
END IF;


This kind of lock is automatically released at the end of the transaction. Contrary to the SELECT followed by UPDATE, the lock is guaranteed to be granted or denied atomically.
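The grant-or-deny behaviour can be observed from two psql sessions. The timeline below is a sketch under the same assumptions as the code above (event id 3 used directly as the lock key); note that each session needs an open transaction, otherwise a transaction-level lock would be released as soon as the statement finishes.

```sql
-- Session A
BEGIN;
SELECT pg_try_advisory_xact_lock(3);  -- returns true: A holds key 3

-- Session B, while A's transaction is still open
BEGIN;
SELECT pg_try_advisory_xact_lock(3);  -- returns false immediately, without waiting
COMMIT;

-- Session A
UPDATE events SET status = 'resolved' WHERE id = 3;
COMMIT;                               -- the advisory lock is released here

-- Session B, later
BEGIN;
SELECT pg_try_advisory_xact_lock(3);  -- returns true this time
SELECT status FROM events WHERE id = 3;
-- returns 'resolved', so the recheck tells B there is nothing left to do
COMMIT;
```

One thing to keep in mind is that advisory lock keys are shared across the whole database, so if other code also takes advisory locks, the two-argument form pg_try_advisory_xact_lock(key1, key2) may be a way to keep the queue's keys separate.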

