Problem Description
Let me describe my scenario:
When I create a deployment on Kubernetes with 1 attached volume, everything works perfectly. When I create the same deployment, but with a second volume attached (total: 2 volumes), the pod gets stuck on "Pending" with errors:
pod has unbound PersistentVolumeClaims (repeated 2 times)
0/2 nodes are available: 2 node(s) had no available volume zone.
Already checked that the volumes are created in the correct availability zones.
I have a cluster set up using Amazon EKS, with 2 nodes. I have the following default storage class:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
And I have a mongodb deployment which needs two volumes, one mounted on the /data/db folder and the other mounted in some random directory I need. Here is a minimal yaml used to create the three components (I commented out some lines on purpose):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: my-project
  creationTimestamp: null
  labels:
    io.kompose.service: my-project-db-claim0
  name: my-project-db-claim0
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: my-project
  creationTimestamp: null
  labels:
    io.kompose.service: my-project-db-claim1
  name: my-project-db-claim1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: my-project
  name: my-project-db
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        name: my-db
    spec:
      containers:
        - name: my-project-db-container
          image: mongo
          imagePullPolicy: Always
          resources: {}
          volumeMounts:
            - mountPath: /my_dir
              name: my-project-db-claim0
            # - mountPath: /data/db
            #   name: my-project-db-claim1
          ports:
            - containerPort: 27017
      restartPolicy: Always
      volumes:
        - name: my-project-db-claim0
          persistentVolumeClaim:
            claimName: my-project-db-claim0
        # - name: my-project-db-claim1
        #   persistentVolumeClaim:
        #     claimName: my-project-db-claim1
That yaml works perfectly. The output for the volumes is:
$ kubectl describe pv
Name: pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6
Labels: failure-domain.beta.kubernetes.io/region=us-east-1
failure-domain.beta.kubernetes.io/zone=us-east-1c
Annotations: kubernetes.io/createdby: aws-ebs-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers: [kubernetes.io/pv-protection]
StorageClass: gp2
Status: Bound
Claim: my-project/my-project-db-claim0
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 5Gi
Node Affinity: <none>
Message:
Source:
Type: AWSElasticBlockStore (a Persistent Disk resource in AWS)
VolumeID: aws://us-east-1c/vol-xxxxx
FSType: ext4
Partition: 0
ReadOnly: false
Events: <none>
Name: pvc-308d8979-039e-11e9-b78d-0a68bcb24bc6
Labels: failure-domain.beta.kubernetes.io/region=us-east-1
failure-domain.beta.kubernetes.io/zone=us-east-1b
Annotations: kubernetes.io/createdby: aws-ebs-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers: [kubernetes.io/pv-protection]
StorageClass: gp2
Status: Bound
Claim: my-project/my-project-db-claim1
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 10Gi
Node Affinity: <none>
Message:
Source:
Type: AWSElasticBlockStore (a Persistent Disk resource in AWS)
VolumeID: aws://us-east-1b/vol-xxxxx
FSType: ext4
Partition: 0
ReadOnly: false
Events: <none>
And the pod output:
$ kubectl describe pods
Name: my-project-db-7d48567b48-slncd
Namespace: my-project
Priority: 0
PriorityClassName: <none>
Node: ip-192-168-212-194.ec2.internal/192.168.212.194
Start Time: Wed, 19 Dec 2018 15:55:58 +0100
Labels: name=my-db
pod-template-hash=3804123604
Annotations: <none>
Status: Running
IP: 192.168.216.33
Controlled By: ReplicaSet/my-project-db-7d48567b48
Containers:
my-project-db-container:
Container ID: docker://cf8222f15e395b02805c628b6addde2d77de2245aed9406a48c7c6f4dccefd4e
Image: mongo
Image ID: docker-pullable://mongo@sha256:0823cc2000223420f88b20d5e19e6bc252fa328c30d8261070e4645b02183c6a
Port: 27017/TCP
Host Port: 0/TCP
State: Running
Started: Wed, 19 Dec 2018 15:56:42 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/my_dir from my-project-db-claim0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
my-project-db-claim0:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim0
ReadOnly: false
default-token-pf9ks:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-pf9ks
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 7m22s (x5 over 7m23s) default-scheduler pod has unbound PersistentVolumeClaims (repeated 2 times)
Normal Scheduled 7m21s default-scheduler Successfully assigned my-project/my-project-db-7d48567b48-slncd to ip-192-168-212-194.ec2.internal
Normal SuccessfulMountVolume 7m21s kubelet, ip-192-168-212-194.ec2.internal MountVolume.SetUp succeeded for volume "default-token-pf9ks"
Warning FailedAttachVolume 7m13s (x5 over 7m21s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6" : "Error attaching EBS volume "vol-01a863d0aa7c7e342"" to instance "i-0a7dafbbdfeabc50b" since volume is in "creating" state
Normal SuccessfulAttachVolume 7m1s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
Normal SuccessfulMountVolume 6m48s kubelet, ip-192-168-212-194.ec2.internal MountVolume.SetUp succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
Normal Pulling 6m48s kubelet, ip-192-168-212-194.ec2.internal pulling image "mongo"
Normal Pulled 6m39s kubelet, ip-192-168-212-194.ec2.internal Successfully pulled image "mongo"
Normal Created 6m38s kubelet, ip-192-168-212-194.ec2.internal Created container
Normal Started 6m37s kubelet, ip-192-168-212-194.ec2.internal Started container
Everything is created without any problems. But if I uncomment the lines in the yaml so two volumes are attached to the db deployment, the pv output is the same as earlier, but the pod gets stuck on pending with the following output:
$ kubectl describe pods
Name: my-project-db-b8b8d8bcb-l64d7
Namespace: my-project
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: name=my-db
pod-template-hash=646484676
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/my-project-db-b8b8d8bcb
Containers:
my-project-db-container:
Image: mongo
Port: 27017/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/data/db from my-project-db-claim1 (rw)
/my_dir from my-project-db-claim0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
my-project-db-claim0:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim0
ReadOnly: false
my-project-db-claim1:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim1
ReadOnly: false
default-token-pf9ks:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-pf9ks
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 60s (x5 over 60s) default-scheduler pod has unbound PersistentVolumeClaims (repeated 2 times)
Warning FailedScheduling 2s (x16 over 59s) default-scheduler 0/2 nodes are available: 2 node(s) had no available volume zone.
I have already read these two issues:
PersistentVolume on EBS can be created in availability zones with no nodes (Closed)
But I already checked that the volumes are created in the same zones as the cluster node instances. In fact, EKS creates two EBS volumes by default in the us-east-1b and us-east-1c zones, and those volumes work. The volumes created by the posted yaml are in those zones too.
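For reference, one quick way to cross-check the zones (a sketch assuming the legacy failure-domain.beta.kubernetes.io/zone label that appears in the describe output above) is to list that label for both the PVs and the nodes:

# Zone of each dynamically provisioned PV
$ kubectl get pv -L failure-domain.beta.kubernetes.io/zone

# Zone of each worker node
$ kubectl get nodes -L failure-domain.beta.kubernetes.io/zone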
Recommended Answer
See this post: https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/
The gist is that you want to update your storageclass to include:
volumeBindingMode: WaitForFirstConsumer
This causes the PV to not be created until the pod is scheduled. It fixed a similar problem for me.
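EBS volumes are zone-scoped, and in the describe output above the two claims were provisioned in different zones (us-east-1c and us-east-1b), so no single node can attach both. Delaying binding lets the scheduler pick a node first, and the volumes are then provisioned in that node's zone. As an illustration, here is a sketch of the gp2 class from the question with that field added (an example, not a verified manifest for this exact cluster):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
# Delay provisioning until a pod using the claim is scheduled,
# so every volume lands in the zone of the chosen node.
volumeBindingMode: WaitForFirstConsumer
mountOptions:
  - debug

Note that StorageClass fields generally cannot be changed in place, so you would typically delete and recreate the class, then recreate the PVCs so they are provisioned only after the pod has been scheduled.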