How can I reduce the time it takes to detect a node failure on Kubernetes?

Question

I have a Kubernetes cluster with 2 slave nodes and 1 master node. When a node goes down, it takes approximately 5 minutes for Kubernetes to notice the failure. I am using dynamic provisioning for volumes, and this delay is too long for me. How can I reduce the failure detection time? I found a post about it: https://fatalfailure.wordpress.com/2016/06/10/improving-kubernetes-reliability-quicker-detection-of-a-node-down/

At the bottom of the post, it says the detection time can be reduced by changing these parameters:

kubelet: node-status-update-frequency=4s (down from 10s)
controller-manager: node-monitor-period=2s (down from 5s)
controller-manager: node-monitor-grace-period=16s (down from 40s)
controller-manager: pod-eviction-timeout=30s (down from 5m)

I can change the node-status-update-frequency parameter on the kubelet, but I don't have any controller-manager program or command on the CLI. How can I change these parameters? Any other suggestions for reducing the failure detection time would be appreciated.

Recommended answer

You can change or add these parameters in the controller-manager systemd unit file and then restart the daemon. The controller-manager man pages list the available flags.

If you deploy controller-manager as a microservice (pod), check the manifest file for that pod and change the parameters in the container's command section.
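As a rough illustration of the pod-based case, the fragment below sketches what the command section of a controller-manager manifest might look like with the three flags from the question added. The file location, pod name, namespace, and image tag are assumptions made for the example and are not taken from the original answer; in a real cluster, edit the manifest that already exists and keep every flag it already contains.

```yaml
# Hypothetical controller-manager pod manifest (sketch only).
# Assumption: on kubeadm-style clusters this file is often found at
# /etc/kubernetes/manifests/kube-controller-manager.yaml.
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager      # assumed name
  namespace: kube-system             # assumed namespace
spec:
  containers:
  - name: kube-controller-manager
    image: example.invalid/kube-controller-manager:PLACEHOLDER  # placeholder, keep your existing image
    command:
    - kube-controller-manager
    # ... keep all flags that are already present in your manifest ...
    - --node-monitor-period=2s          # how often the node controller checks node status
    - --node-monitor-grace-period=16s   # how long a node may be silent before it is marked unhealthy
    - --pod-eviction-timeout=30s        # how long to wait before evicting pods from the failed node
```

If the manifest is a static pod file, the kubelet normally restarts the pod on its own once the file is saved. Flag support varies between Kubernetes versions, and --pod-eviction-timeout in particular may be ignored or unavailable in newer releases, so it is worth checking the flag reference for the version you run.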


08-13 12:55