Problem description

We recently started having trouble with chef-client dying in the middle of a run, after it spends far longer than usual stuck on parts of the run-list that normally finish quickly. I've been on my home wifi and my colleague has been on the work wifi, which has been having some connectivity problems of its own.

If the ssh connection to a machine gets interrupted while chef-client is running, does that crash the run in seemingly inexplicable ways? I am using PuTTY to connect from Win7 and my colleague is using the Apple Terminal app.

All the machines we've been running this on are Ubuntu 12.04 (in EC2) and have plenty of disk space left over - they're only utilizing ~1GB with ~5GB free.

Here is the output from /var/log/chef/client.log (the path is set with the log_location directive in /etc/chef/client.rb).

[2014-01-08T00:27:07+00:00] WARN: Nodejs user is nodejs
[2014-01-08T00:27:07+00:00] WARN: Cloning resource attributes for group[nodejs] from prior resource (CHEF-3694)
[2014-01-08T00:27:07+00:00] WARN: Previous group[nodejs]: /var/chef/cache/cookbooks/nodejs/recipes/default.rb:26:in `from_file'
[2014-01-08T00:27:07+00:00] WARN: Current  group[nodejs]: /var/chef/cache/cookbooks/spicoli-app/recipes/default.rb:38:in `from_file'
[2014-01-08T00:27:07+00:00] WARN: Cloning resource attributes for user[nodejs] from prior resource (CHEF-3694)
[2014-01-08T00:27:07+00:00] WARN: Previous user[nodejs]: /var/chef/cache/cookbooks/nodejs/recipes/default.rb:34:in `from_file'
[2014-01-08T00:27:07+00:00] WARN: Current  user[nodejs]: /var/chef/cache/cookbooks/spicoli-app/recipes/default.rb:46:in `from_file'
[2014-01-08T00:27:30+00:00] WARN: Environment is _default
[2014-01-08T00:27:30+00:00] WARN: Nodejs user is nodejs
[2014-01-08T02:04:54+00:00] ERROR: Running exception handlers
[2014-01-08T02:04:54+00:00] ERROR: Exception handlers complete
[2014-01-08T02:04:54+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2014-01-08T02:04:55+00:00] ERROR: Input/output error - <STDOUT>
[2014-01-08T02:04:57+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

And the error stacktrace just has this:

Generated at 2014-01-08 02:04:54 +0000
Errno::EIO: Input/output error - <STDOUT>
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:91:in `write'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:91:in `puts'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:91:in `puts'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:61:in `display_section'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:44:in `block (2 levels) in display'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:43:in `each'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:43:in `block in display'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:42:in `each'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/error_descriptor.rb:42:in `display'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:130:in `display_error'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/base.rb:161:in `resource_failed'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/formatters/doc.rb:159:in `resource_failed'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/event_dispatch/dispatcher.rb:29:in `block in resource_failed'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/event_dispatch/dispatcher.rb:29:in `each'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/event_dispatch/dispatcher.rb:29:in `resource_failed'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource.rb:637:in `rescue in run_action'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource.rb:643:in `run_action'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:49:in `run_action'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:81:in `block (2 levels) in converge'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:81:in `each'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:81:in `block in converge'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection.rb:98:in `block in execute_each_resource'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:116:in `call'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:116:in `call_iterator_block'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:85:in `step'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:104:in `iterate'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection/stepable_iterator.rb:55:in `each_with_index'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/resource_collection.rb:96:in `execute_each_resource'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/runner.rb:80:in `converge'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:433:in `converge'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:500:in `do_run'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:199:in `block in run'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:193:in `fork'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/client.rb:193:in `run'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application.rb:208:in `run_chef_client'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application/client.rb:312:in `block in run_application'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application/client.rb:304:in `loop'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application/client.rb:304:in `run_application'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/lib/chef/application.rb:66:in `run'
/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/chef-11.8.0/bin/chef-client:26:in `<top (required)>'
/usr/bin/chef-client:23:in `load'
/usr/bin/chef-client:23:in `<main>'

That is a really generic error! But it does seem to indicate that output to STDOUT was interrupted, which would make sense if the client disconnected.

Edit: As requested, here are the contents of the client.rb file (names obfuscated, naturally.)

$ cat /etc/chef/client.rb
log_level        :auto
log_location     "/var/log/chef/client.log"
chef_server_url  "https://api.opscode.com/organizations/myapp"
validation_client_name "my-validator"
node_name "my-app-node"

Edit 2: Attempt using sudo su -s /bin/bash root -c "screen chef-client"

Screen terminated while I was at lunch and recorded a timeout on the ShellOut command for npm install. This was after chef-client was sitting stuck on this operation for over an hour.

[2014-01-09T16:39:07+00:00] WARN: Environment is _default
[2014-01-09T16:39:07+00:00] WARN: Nodejs user is nodejs
[2014-01-09T18:16:28+00:00] ERROR: Running exception handlers
[2014-01-09T18:16:28+00:00] ERROR: Exception handlers complete
[2014-01-09T18:16:28+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2014-01-09T18:16:31+00:00] ERROR: execute[npm-install-app] (spicoli-app::default line 110) had an error: Mixlib::ShellOut::CommandTimeout: command timed out:
---- Begin output of npm --registry http://my.npm.repo.amazonaws.com:5984/registry/_design/app/_rewrite install --cache /home/nodejs/.npm --tmp /home/nodejs/tmp

--- snip: install messages from npm ---

[2014-01-09T18:16:33+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

This is a totally different error from before. The stacktrace.out file also explicitly mentions ShellOut, so it is entirely different as well. Most oddly, when I run the same npm command from the command line, it finishes in under a minute.

So I'm not sure there is a way to further diagnose the previous failure, but I would welcome other suggestions. For input on this new failure, I asked a separate followup question.

Solution

Well, the stacktrace seems to imply that something like that is happening. The message says "Errno::EIO: Input/output error - <STDOUT>" which is consistent with what I'd expect to see if STDOUT was going over an SSH channel that had been closed.

I suggest 2 things:

  • Run chef-client with all console output redirected to a file; e.g. add > /tmp/log 2>&1 to the end of the command. (The redirection needs to happen on the remote machine.)

  • Add -l debug to the command to increase the level of logging, as covered in Opscode's technical FAQ. This could reveal clues that are currently being hidden.
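
The two suggestions above can be combined into one invocation. What follows is only a rough sketch, to be run in a shell on the node itself rather than on your workstation. The LOG variable and the fallback branch for a machine without chef-client are my own additions for illustration; /tmp/log is the path from the first bullet.

```shell
# Sketch only: run this on the remote node so that chef-client's output
# no longer depends on the SSH terminal staying connected.
LOG="${LOG:-/tmp/log}"   # LOG is an illustrative variable, not a Chef setting

if command -v chef-client >/dev/null 2>&1; then
    # -l debug raises the log level; "> file 2>&1" sends both stdout and
    # stderr to the file. Drop sudo if you are already root.
    sudo chef-client -l debug > "$LOG" 2>&1
else
    # Fallback so the sketch still does something on a machine without Chef.
    echo "chef-client not found in PATH" > "$LOG"
fi
```

While the run is in progress you can follow it from a second session with tail -f /tmp/log. If that session drops, the run keeps writing to the file and not to your terminal.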


Looking at your second update, this has the hallmarks of some kind of firewall or network related problem.
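
One rough way to test the network theory is to check, from the node itself, whether a TCP connection to the registry's host and port can be opened at all. The check_port helper below is hypothetical (it is not part of Chef or npm), and it assumes bash with /dev/tcp support and the coreutils timeout command are available on the node.

```shell
# Hypothetical helper: try to open a TCP connection and give up after 5 seconds.
# Prints "open" on success, "blocked-or-unreachable" otherwise.
check_port() {
    host=$1
    port=$2
    # /dev/tcp/HOST/PORT is a bash feature, so the probe runs in an explicit bash.
    if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "open"
    else
        echo "blocked-or-unreachable"
    fi
}
```

For the log above, the call would be check_port my.npm.repo.amazonaws.com 5984, using the (obfuscated) host and port from the npm --registry URL. Running it both as your own user and as the user chef-client runs under may show whether the two environments see the network differently.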

07-20 21:49