Kubernetes autoscaler: NotTriggerScaleUp "pod didn't trigger scale-up (it wouldn't fit if a new node is added)"
Question
I'd like to run one job per node, with only one pod on a node at a time.
- I've scheduled a bunch of jobs
- I now have a whole bunch of pending pods
- I'd like these pending pods to trigger a node scale-up event (which does NOT happen)
This is very similar to an earlier question of mine: Kubernetes reports "pod didn't trigger scale-up (it wouldn't fit if a new node is added)" even though it would?
In this case, however, the pod really should fit on a new node.
How can I diagnose why Kubernetes has decided that a node scale-up is not possible?
My job YAML:
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job-${job_id}
  labels:
    job-in-progress: job-in-progress-yes
spec:
  template:
    metadata:
      name: example-job-${job_id}
    spec:
      # this bit ensures a job/container does not get scheduled along side another - 'anti' affinity
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: kubernetes.io/hostname
              labelSelector:
                matchExpressions:
                  - key: job-in-progress
                    operator: NotIn
                    values:
                      - job-in-progress-yes
      containers:
        - name: buster-slim
          image: debian:buster-slim
          command: ["bash"]
          args: ["-c", "sleep 60; echo ${echo_param}"]
      restartPolicy: Never
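The manifest above contains `${job_id}` and `${echo_param}` placeholders, but the question does not say how they get filled in. As a minimal sketch, assuming the placeholders are simply substituted as text before the manifest is submitted, Python's `string.Template` can render one manifest per job. The trimmed template and the `render_job` helper are illustrative, not part of the original setup.

```python
from string import Template

# Trimmed copy of the manifest above (affinity omitted for brevity).
# Only the placeholder-bearing parts matter for this sketch.
JOB_TEMPLATE = Template("""\
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job-${job_id}
spec:
  template:
    spec:
      containers:
        - name: buster-slim
          image: debian:buster-slim
          command: ["bash"]
          args: ["-c", "sleep 60; echo ${echo_param}"]
      restartPolicy: Never
""")


def render_job(job_id: int, echo_param: str) -> str:
    """Fill in both placeholders; substitute() raises KeyError if one is missing."""
    return JOB_TEMPLATE.substitute(job_id=job_id, echo_param=echo_param)


if __name__ == "__main__":
    # Join several rendered manifests into one multi-document YAML stream.
    manifests = [render_job(i, f"job-{i}") for i in range(1, 4)]
    print("---\n".join(manifests))
```

The printed stream could then be piped to `kubectl apply -f -`, which is one way to end up with "a bunch of jobs" from a single template.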
Autoscaler logs:
I0920 19:27:58.190751 1 static_autoscaler.go:128] Starting main loop
I0920 19:27:58.261972 1 auto_scaling_groups.go:320] Regenerating instance to ASG map for ASGs: []
I0920 19:27:58.262003 1 aws_manager.go:152] Refreshed ASG list, next refresh after 2019-09-20 19:28:08.261998185 +0000 UTC m=+302.102284346
I0920 19:27:58.262092 1 static_autoscaler.go:261] Filtering out schedulables
I0920 19:27:58.264212 1 static_autoscaler.go:271] No schedulable pods
I0920 19:27:58.264246 1 scale_up.go:262] Pod default/example-job-21-npv6p is unschedulable
I0920 19:27:58.264252 1 scale_up.go:262] Pod default/example-job-28-zg4f8 is unschedulable
I0920 19:27:58.264258 1 scale_up.go:262] Pod default/example-job-24-fx9rd is unschedulable
I0920 19:27:58.264263 1 scale_up.go:262] Pod default/example-job-6-7mvqs is unschedulable
I0920 19:27:58.264268 1 scale_up.go:262] Pod default/example-job-20-splpq is unschedulable
I0920 19:27:58.264273 1 scale_up.go:262] Pod default/example-job-25-g5mdg is unschedulable
I0920 19:27:58.264279 1 scale_up.go:262] Pod default/example-job-16-wtnw4 is unschedulable
I0920 19:27:58.264284 1 scale_up.go:262] Pod default/example-job-7-g89cp is unschedulable
I0920 19:27:58.264289 1 scale_up.go:262] Pod default/example-job-8-mglhh is unschedulable
I0920 19:27:58.264323 1 scale_up.go:304] Upcoming 0 nodes
I0920 19:27:58.264370 1 scale_up.go:420] No expansion options
I0920 19:27:58.264511 1 static_autoscaler.go:333] Calculating unneeded nodes
I0920 19:27:58.264533 1 utils.go:474] Skipping ip-10-0-1-118.us-west-2.compute.internal - no node group config
I0920 19:27:58.264542 1 utils.go:474] Skipping ip-10-0-0-65.us-west-2.compute.internal - no node group config
I0920 19:27:58.265063 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-25-g5mdg", UID:"d2e0e48c-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7256", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
I0920 19:27:58.265090 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-8-mglhh", UID:"c7d3ce78-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7267", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
I0920 19:27:58.265101 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-6-7mvqs", UID:"c6a5b0e4-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7273", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
I0920 19:27:58.265110 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-20-splpq", UID:"cfeb9521-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7259", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
I0920 19:27:58.265363 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-21-npv6p", UID:"d084c067-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7275", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
I0920 19:27:58.265384 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-16-wtnw4", UID:"ccbe48e0-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7265", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
I0920 19:27:58.265490 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-28-zg4f8", UID:"d4afc868-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7269", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
I0920 19:27:58.265515 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-24-fx9rd", UID:"d24975e5-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7271", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
I0920 19:27:58.265685 1 static_autoscaler.go:360] Scale down status: unneededOnly=true lastScaleUpTime=2019-09-20 19:23:23.822104264 +0000 UTC m=+17.662390361 lastScaleDownDeleteTime=2019-09-20 19:23:23.822105556 +0000 UTC m=+17.662391653 lastScaleDownFailTime=2019-09-20 19:23:23.822106849 +0000 UTC m=+17.662392943 scaleDownForbidden=false isDeleteInProgress=false
I0920 19:27:58.265910 1 factory.go:33] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"example-job-7-g89cp", UID:"c73cfaea-dbd9-11e9-a9e2-024e7db9d360", APIVersion:"v1", ResourceVersion:"7263", FieldPath:""}): type: 'Normal' reason: 'NotTriggerScaleUp' pod didn't trigger scale-up (it wouldn't fit if a new node is added):
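Three messages in this log stand out: `Regenerating instance to ASG map for ASGs: []`, `Skipping ... - no node group config`, and `No expansion options`. A plausible reading (my interpretation, not something the log states outright) is that the autoscaler had no Auto Scaling group registered, so there was no node group it could expand. The sketch below scans a log dump for those three messages. The `summarize` helper and the result keys are hypothetical names chosen for illustration.

```python
import re

# Message fragments copied from the log above. What they imply about the
# autoscaler's configuration is an interpretation, not documented behaviour.
EMPTY_ASG_LIST = "Regenerating instance to ASG map for ASGs: []"
NO_EXPANSION = "No expansion options"
NO_GROUP_RE = re.compile(r"Skipping (\S+) - no node group config")

# A few lines taken verbatim from the log above, used as sample input.
SAMPLE = """\
I0920 19:27:58.261972 1 auto_scaling_groups.go:320] Regenerating instance to ASG map for ASGs: []
I0920 19:27:58.264370 1 scale_up.go:420] No expansion options
I0920 19:27:58.264533 1 utils.go:474] Skipping ip-10-0-1-118.us-west-2.compute.internal - no node group config
I0920 19:27:58.264542 1 utils.go:474] Skipping ip-10-0-0-65.us-west-2.compute.internal - no node group config
"""


def summarize(log_text: str) -> dict:
    """Report which of the three suspicious messages appear in the log."""
    return {
        "empty_asg_list": EMPTY_ASG_LIST in log_text,
        "no_expansion_options": NO_EXPANSION in log_text,
        "nodes_without_group": sorted(set(NO_GROUP_RE.findall(log_text))),
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

If all three show up together, the autoscaler's node group flags are a reasonable first place to look before revisiting the pod spec.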
Recommended answer
I had the wrong parameters defined on the autoscaler.
I had to fix the node-group-auto-discovery and nodes parameters:
- ./cluster-autoscaler
- --cloud-provider=aws
- --namespace=default
- --scan-interval=25s
- --scale-down-unneeded-time=30s
- --nodes=1:20:terraform-eks-demo20190922161659090500000007--terraform-eks-demo20190922161700651000000008
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/example-job-runner
- --logtostderr=true
- --stderrthreshold=info
- --v=4
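The `--nodes` value follows the shape `<min>:<max>:<node group name>`, and the `--node-group-auto-discovery` value used here follows `asg:tag=<tag>,<tag>`. As a rough sketch, the helpers below check only that shape before the flags are deployed. This is not the autoscaler's own parser, the function names are hypothetical, and the checks cannot tell whether the named ASG exists or actually carries those tags.

```python
from typing import List, NamedTuple


class NodeGroupSpec(NamedTuple):
    min_size: int
    max_size: int
    name: str


def parse_nodes_flag(value: str) -> NodeGroupSpec:
    """Parse a --nodes value of the form "<min>:<max>:<name>"."""
    parts = value.split(":", 2)
    if len(parts) != 3 or not parts[2]:
        raise ValueError(f"expected <min>:<max>:<name>, got {value!r}")
    min_size, max_size = int(parts[0]), int(parts[1])  # int() raises ValueError on junk
    if min_size < 0 or max_size < min_size:
        raise ValueError(f"invalid size range {min_size}..{max_size}")
    return NodeGroupSpec(min_size, max_size, parts[2])


def parse_asg_discovery_flag(value: str) -> List[str]:
    """Return the tag entries from an "asg:tag=<tag>,<tag>" discovery spec."""
    prefix = "asg:tag="
    if not value.startswith(prefix):
        raise ValueError(f"expected a value starting with {prefix!r}, got {value!r}")
    tags = [tag for tag in value[len(prefix):].split(",") if tag]
    if not tags:
        raise ValueError("no tags listed after 'asg:tag='")
    return tags


if __name__ == "__main__":
    # Values copied from the answer's command line.
    print(parse_nodes_flag(
        "1:20:terraform-eks-demo20190922161659090500000007"
        "--terraform-eks-demo20190922161700651000000008"
    ))
    print(parse_asg_discovery_flag(
        "asg:tag=k8s.io/cluster-autoscaler/enabled,"
        "k8s.io/cluster-autoscaler/example-job-runner"
    ))
```

A shape check like this only catches typos. Whether the group name and tags match the real Auto Scaling group still has to be verified against AWS itself.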