Problem description
I deployed a new app to GKE, and the GKE dashboard now shows thousands of errors from gke-metrics-agent.
It is consuming a lot of resources.
I checked the logs and saw that all of the errors are related to Prometheus, but I couldn't find a way to troubleshoot them.
Cluster version: 1.18.12-gke.1206
What are these errors, and how can I fix them?
Recommended answer
It looks like some GKE 1.18.12-gke.X versions have a bug where gke-metrics-agent produces a lot of Warning messages.
There is already a Public Issue Tracker ticket for this bug. You can follow updates on the issue there, and you can use (+1) on the ticket to indicate that you are affected by it.
The workaround for this issue is to upgrade to a newer version: 1.18.14-gke.1200 or later.
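
To check whether a given cluster is still on an affected build, the version strings can be compared numerically. The following is a minimal sketch, not an official tool: the helper names `parse_gke_version` and `needs_upgrade` are made up for illustration, and it assumes versions follow the `MAJOR.MINOR.PATCH-gke.BUILD` pattern seen in this question. It also assumes that anything ordered at or after 1.18.14-gke.1200 contains the fix, which the answer only states for the 1.18 line.

```python
import re
from typing import Tuple

# Minimum version named in the answer as the workaround target.
FIXED_VERSION = "1.18.14-gke.1200"


def parse_gke_version(version: str) -> Tuple[int, ...]:
    """Split 'MAJOR.MINOR.PATCH-gke.BUILD' into a tuple of integers.

    Hypothetical helper: assumes the version format shown in the question.
    """
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)-gke\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized GKE version: {version!r}")
    return tuple(int(part) for part in match.groups())


def needs_upgrade(current: str, fixed: str = FIXED_VERSION) -> bool:
    """Return True if `current` sorts strictly before the fixed version."""
    # Tuples compare element by element, so 1206 vs 1200 is compared as
    # numbers rather than as strings.
    return parse_gke_version(current) < parse_gke_version(fixed)


if __name__ == "__main__":
    # The cluster version from the question.
    print(needs_upgrade("1.18.12-gke.1206"))
```

If this reports that an upgrade is needed, the upgrade itself is done through the usual GKE upgrade path (console or `gcloud`), first for the control plane and then for the node pools.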