Trouble configuring a Horizontal Pod Autoscaler with external metrics

Problem description

I'm attempting to configure a Horizontal Pod Autoscaler to scale a deployment based on the duty cycle of attached GPUs.

I'm using GKE, and my Kubernetes master version is 1.10.7-gke.6.

I'm working from the tutorial at https://cloud.google.com/kubernetes-engine/docs/tutorials/external-metrics-autoscaling. In particular, I ran the following command to set up custom metrics:

kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml

This appears to have worked, or at least I can access a list of metrics at /apis/custom.metrics.k8s.io/v1beta1.

Here is my YAML:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: images-srv-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: container.googleapis.com|container|accelerator|duty_cycle
      targetAverageValue: 50
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: images-srv-deployment
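
For reference, here is a rough sketch of how an External metric with targetAverageValue is typically turned into a replica count: the returned values are summed, divided by the target, rounded up, and clamped to minReplicas/maxReplicas. The function name, the 10% tolerance default, and the sample numbers are assumptions for illustration. The real controller applies further rules (for example, limits on how fast it scales), so treat this as an approximation rather than the controller's actual code.

```python
import math


def desired_replicas(metric_values, target_average_value, current_replicas,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for an External metric.

    metric_values: values returned by the external metrics API (hypothetical
    samples here). An empty list is treated as an error, which mirrors the
    "no metrics returned from external metrics API" event.
    """
    if not metric_values:
        raise ValueError("no metrics returned from external metrics API")

    total = sum(metric_values)
    # Ratio of observed usage to the usage the current replicas should carry.
    usage_ratio = total / (target_average_value * current_replicas)

    if abs(1.0 - usage_ratio) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(total / target_average_value)

    # Clamp to the bounds declared in the HPA spec.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Two replicas, summed duty cycle of 180 against a target of 50 per pod.
    print(desired_replicas([180.0], 50, current_replicas=2))
```

With the spec above (target 50, between 1 and 10 replicas), a summed duty cycle of 180 across two replicas would ask for 4 replicas under these assumptions.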

I believe that the metricName exists because it's listed in /apis/custom.metrics.k8s.io/v1beta1, and because it's described on https://cloud.google.com/monitoring/api/metrics_gcp.

This is the error I get when describing the HPA:

  Type     Reason                        Age               From                       Message
  ----     ------                        ----              ----                       -------
  Warning  FailedGetExternalMetric       18s (x3 over 1m)  horizontal-pod-autoscaler  unable to get external metric prod/container.googleapis.com|container|accelerator|duty_cycle/nil: no metrics returned from external metrics API
  Warning  FailedComputeMetricsReplicas  18s (x3 over 1m)  horizontal-pod-autoscaler  failed to get container.googleapis.com|container|accelerator|duty_cycle external metric: unable to get external metric prod/container.googleapis.com|container|accelerator|duty_cycle/nil: no metrics returned from external metrics API

I don't really know how to go about debugging this. Does anyone know what might be wrong, or what I could do next?

Recommended answer

This problem went away on its own once I placed the system under load. It's working fine now with the same configuration.

I'm not sure why. My best guess is that Stackdriver wasn't reporting a duty cycle value until it went above 1%.
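
One way to check a guess like this is to query the external metrics API directly and see whether the returned list is empty. An HPA metric of type External is served from /apis/external.metrics.k8s.io/v1beta1, not from the custom.metrics.k8s.io path mentioned in the question. The sketch below only builds the request path and parses a response. The helper names are hypothetical, the namespace prod is taken from the error message, and the sample response is made up for illustration.

```python
import json


def external_metric_path(namespace, metric_name):
    """Build the external metrics API path for one metric in one namespace."""
    return (
        "/apis/external.metrics.k8s.io/v1beta1"
        f"/namespaces/{namespace}/{metric_name}"
    )


def parse_metric_values(response_text):
    """Return (value, timestamp) pairs from an ExternalMetricValueList.

    An empty result means the adapter answered but had no data points,
    which is the situation behind "no metrics returned".
    """
    items = json.loads(response_text).get("items") or []
    return [(item.get("value"), item.get("timestamp")) for item in items]


if __name__ == "__main__":
    path = external_metric_path(
        "prod", "container.googleapis.com|container|accelerator|duty_cycle"
    )
    print(path)

    # Hypothetical response from an idle system: the list is empty.
    idle = '{"kind": "ExternalMetricValueList", "items": []}'
    print(parse_metric_values(idle))
```

The printed path can be passed to kubectl get --raw (quoted, because the metric name contains | characters), and the JSON output fed to parse_metric_values. An empty list while the GPUs are idle and a non-empty one under load would be consistent with the guess above.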

