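I'm trying to figure out how to monitor travis-ci's resque workers with god in such a way that stopping the resque watch through god doesn't leave a stale worker process behind.
In what follows I'm talking about the worker process itself, not the forked job child processes (i.e. the queue is empty the whole time).
When I start the resque worker manually like this: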

$ QUEUE=builds rake resque:work

I get a single process:
$ ps x | grep resque
 7041 s001  S+     0:05.04 resque-1.13.0: Waiting for builds

As soon as I stop the worker task, that process is gone.
But when I start the same thing through god (exact configuration is here, basically the same as the resque/god example) like this:
$ RAILS_ENV=development god -c config/resque.god -D
I [2011-03-27 22:49:15]  INFO: Loading config/resque.god
I [2011-03-27 22:49:15]  INFO: Syslog enabled.
I [2011-03-27 22:49:15]  INFO: Using pid file directory: /Volumes/Users/sven/.god/pids
I [2011-03-27 22:49:15]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2011-03-27 22:49:15]  INFO: resque-0 move 'unmonitored' to 'init'
I [2011-03-27 22:49:15]  INFO: resque-0 moved 'unmonitored' to 'init'
I [2011-03-27 22:49:15]  INFO: resque-0 [trigger] process is not running (ProcessRunning)
I [2011-03-27 22:49:15]  INFO: resque-0 move 'init' to 'start'
I [2011-03-27 22:49:15]  INFO: resque-0 start: cd /Volumes/Users/sven/Development/projects/travis && rake resque:work
I [2011-03-27 22:49:15]  INFO: resque-0 moved 'init' to 'start'
I [2011-03-27 22:49:15]  INFO: resque-0 [trigger] process is running (ProcessRunning)
I [2011-03-27 22:49:15]  INFO: resque-0 move 'start' to 'up'
I [2011-03-27 22:49:15]  INFO: resque-0 moved 'start' to 'up'
I [2011-03-27 22:49:15]  INFO: resque-0 [ok] memory within bounds [784kb] (MemoryUsage)
I [2011-03-27 22:49:15]  INFO: resque-0 [ok] process is running (ProcessRunning)
I [2011-03-27 22:49:45]  INFO: resque-0 [ok] memory within bounds [784kb, 784kb] (MemoryUsage)
I [2011-03-27 22:49:45]  INFO: resque-0 [ok] process is running (ProcessRunning)

then I get an extra process:
$ ps x | grep resque
 7187   ??  Ss     0:00.02 sh -c cd /Volumes/Users/sven/Development/projects/travis && rake resque:work
 7188   ??  S      0:05.11 resque-1.13.0: Waiting for builds
 7183 s001  S+     0:01.18 /Volumes/Users/sven/.rvm/rubies/ruby-1.8.7-p302/bin/ruby /Volumes/Users/sven/.rvm/gems/ruby-1.8.7-p302/bin/god -c config/resque.god -D

god seems to record only the PID of the first one (the `sh -c` wrapper, not the worker):
$ cat ~/.god/pids/resque-0.pid
7187
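
To check which process a pid file actually points at, something like the following could help. It is only a sketch: the helper names `read_pid`, `process_alive?` and `command_for` are made up for illustration and are not part of god or resque, and `command_for` assumes a `ps` that accepts `-o command= -p PID`.

```ruby
# Read the PID stored in a pid file.
# Returns nil if the file is missing or does not contain an integer.
def read_pid(path)
  Integer(File.read(path).strip)
rescue Errno::ENOENT, ArgumentError
  nil
end

# True if a process with this PID exists.
# Signal 0 performs the existence/permission check without sending anything.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false
rescue Errno::EPERM
  true # the process exists but belongs to another user
end

# Command line of the process as reported by ps (empty string if it is gone).
def command_for(pid)
  `ps -o command= -p #{Integer(pid)}`.strip
end
```

Calling `command_for(read_pid(File.expand_path('~/.god/pids/resque-0.pid')))` in the situation above would be expected to show the `sh -c cd ... && rake resque:work` wrapper rather than the `resque-1.13.0: Waiting for builds` worker.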

When I then stop the watch through god:
$ god stop resque
Sending 'stop' command

The following watches were affected:
  resque-0

god produces this log output:
I [2011-03-27 22:51:22]  INFO: resque-0 stop: default lambda killer
I [2011-03-27 22:51:22]  INFO: resque-0 sent SIGTERM
I [2011-03-27 22:51:23]  INFO: resque-0 process stopped
I [2011-03-27 22:51:23]  INFO: resque-0 move 'up' to 'unmonitored'
I [2011-03-27 22:51:23]  INFO: resque-0 moved 'up' to 'unmonitored'

But it doesn't actually terminate both processes, which leaves the actual worker process alive:
$ ps x | grep resque
 6864   ??  S      0:05.15 resque-1.13.0: Waiting for builds
 6858 s001  S+     0:01.36 /Volumes/Users/sven/.rvm/rubies/ruby-1.8.7-p302/bin/ruby /Volumes/Users/sven/.rvm/gems/ruby-1.8.7-p302/bin/god -c config/resque.god -D

Best answer

You need to tell resque to write a pid file, and tell god to use that pid file:

w.env = {'PIDFILE' => '/path/to/resque.pid'}
w.pid_file = '/path/to/resque.pid'

env tells resque to write the pid file, and pid_file tells god to use it.
Also, as Sven Fuchs pointed out, setting just the proper env should be enough:
w.env = { 'PIDFILE' => "/home/travis/.god/pids/#{w.name}.pid" }

where /home/travis/.god/pids is the default pids directory.
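
Put together, a watch using both settings might look roughly like the sketch below. The helpers `resque_pid_file` and `resque_env`, the `PID_DIR` value, and the `builds` default are assumptions for illustration; only `w.env`, `w.pid_file` and resque's `PIDFILE` variable come from the answer above. The `God.watch` block is guarded so the file can also be loaded outside god, and a real config would still need its own `w.dir` and start/restart conditions.

```ruby
# Hypothetical directory; the answer uses '/path/to/resque.pid' as a placeholder.
PID_DIR = '/path/to'

# Path of the pid file shared by resque (writer) and god (reader).
def resque_pid_file(watch_name, pid_dir = PID_DIR)
  File.join(pid_dir, "#{watch_name}.pid")
end

# Environment for the worker: PIDFILE makes the resque:work task
# write the worker's own PID to that file.
def resque_env(watch_name, queue = 'builds')
  { 'QUEUE' => queue, 'PIDFILE' => resque_pid_file(watch_name) }
end

# Only evaluated when loaded by god (e.g. god -c config/resque.god).
if defined?(God)
  God.watch do |w|
    w.name     = 'resque-0'
    w.group    = 'resque'
    w.start    = 'rake resque:work'
    w.env      = resque_env(w.name)       # resque writes the pid file
    w.pid_file = resque_pid_file(w.name)  # god reads the same pid file
  end
end
```

With this, the pid file should contain the worker's PID instead of the shell wrapper's, so `god stop resque` signals the worker itself.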
