

All of a sudden, I cannot deploy some images that I could deploy before. I get the following pod status:

[root@webdev2 origin]# oc get pods
NAME                      READY     STATUS             RESTARTS   AGE
arix-3-yjq9w              0/1       ImagePullBackOff   0          10m
docker-registry-2-vqstm   1/1       Running            0          2d
router-1-kvjxq            1/1       Running            0          2d

The application just won't start. The pod is not trying to run the container. On the Events page, I see a "Back-off pulling image" message. I have verified that I can pull the image with that tag using docker pull.

I have also checked the log of the last container. It was closed for some reason. I think the pod should at least try to restart it.


I have run out of ideas for debugging this issue. What else can I check?


You can use the 'describe pod' command.

For OpenShift use:

oc describe pod <pod-id>

For plain Kubernetes:

kubectl describe pod <pod-id>

Examine the events in the output. In my case it shows Back-off pulling image unreachableserver/nginx:1.14.22222

In this case the image unreachableserver/nginx:1.14.22222 cannot be pulled because no registry or repository named unreachableserver exists, and the tag 1.14.22222 does not exist for nginx.
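To see why the error message talks about a missing repository rather than an unreachable host, it can help to split the image reference into its parts. The sketch below follows the Docker daemon's default rule as I understand it: the first path component is treated as a registry host only if it contains a "." or ":" or is "localhost", otherwise Docker Hub (docker.io) is implied. The function name parse_image is made up for this example, digests (@sha256:...) are not handled, and other container runtimes may be configured to resolve short names differently.

```shell
# Sketch: split an image reference using Docker's default naming rule.
# parse_image is a hypothetical helper, not part of docker, kubectl or oc.
# Assumption: the reference carries no digest (@sha256:...).
parse_image() {
  ref=$1
  first=${ref%%/*}
  case "$ref" in
    */*)
      # A first component with "." or ":" (or "localhost") is a registry host.
      case "$first" in
        *.*|*:*|localhost) registry=$first; rest=${ref#*/} ;;
        *) registry=docker.io; rest=$ref ;;
      esac ;;
    # No slash at all: an official Docker Hub image under "library/".
    *) registry=docker.io; rest=library/$ref ;;
  esac
  # The tag is whatever follows ":" in the last path component.
  last=${rest##*/}
  case "$last" in
    *:*) tag=${last##*:}; repo=${rest%:*} ;;
    *) tag=latest; repo=$rest ;;
  esac
  echo "registry=$registry repository=$repo tag=$tag"
}

parse_image unreachableserver/nginx:1.14.22222
```

For the image above this reports docker.io as the registry and unreachableserver/nginx as the repository, which matches the "pull access denied for unreachableserver/nginx" wording in the kubelet event.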

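Note: If you do not see any events of interest and the pod has been in the 'ImagePullBackOff' status for a while (seems like more than 60 minutes), you need to delete the pod and look at the events from the new pod.

For OpenShift use: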

oc delete pod <pod-id>
oc get pods
oc describe pod <new-pod-id>

For plain Kubernetes:

kubectl delete pod <pod-id>
kubectl get pods
kubectl describe pod <new-pod-id>


  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  32s                default-scheduler  Successfully assigned rk/nginx-deployment-6c879b5f64-2xrmt to aks-agentpool-x
  Normal   Pulling    17s (x2 over 30s)  kubelet            Pulling image "unreachableserver/nginx:1.14.22222"
  Warning  Failed     16s (x2 over 29s)  kubelet            Failed to pull image "unreachableserver/nginx:1.14.22222": rpc error: code = Unknown desc = Error response from daemon: pull access denied for unreachableserver/nginx, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
  Warning  Failed     16s (x2 over 29s)  kubelet            Error: ErrImagePull
  Normal   BackOff    5s (x2 over 28s)   kubelet            Back-off pulling image "unreachableserver/nginx:1.14.22222"
  Warning  Failed     5s (x2 over 28s)   kubelet            Error: ImagePullBackOff
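When the describe output is long, the pull-related rows can be picked out with a small filter. This is only a sketch: pull_failures is a name invented here, and the patterns are taken from the messages in the sample events shown in this answer, so other runtimes or versions may word them differently.

```shell
# Sketch: keep the Events table header and the rows that report a pull problem.
# pull_failures is a hypothetical helper; it reads 'describe pod' output on stdin.
pull_failures() {
  grep -E 'Type +Reason|ErrImagePull|ImagePullBackOff|Failed to pull image|Back-off pulling image'
}

# Usage (needs access to the cluster):
#   kubectl describe pod <pod-id> | pull_failures
#   oc describe pod <pod-id> | pull_failures
# The events of one pod can also be listed directly:
#   kubectl get events --field-selector involvedObject.name=<pod-id>
```

Rows such as Scheduled and Pulling are dropped, so what remains is the sequence of failed pull attempts and back-offs.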


Additional debugging steps:

  1. Try to pull the Docker image and tag manually on your computer.
  2. Identify the node by running 'kubectl get pods -o wide' (or 'oc get pods -o wide').
  3. SSH into the node that cannot pull the Docker image (if you can).
  4. Check that the node can resolve the DNS name of the Docker registry by pinging it.
  5. Try to pull the Docker image manually on the node.
  6. If you are using a private registry, check that your secret exists and is correct. The secret must also be in the same namespace as the pod. Thanks swenzel.
  7. Some registries have firewalls that restrict access by IP address. The firewall may block the pull.
  8. Some CI systems create deployments with temporary Docker secrets, so the secret expires after a few days (you are asking for production failures...).
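Steps 2, 4 and 6 can be wrapped in small helpers, roughly as sketched below. The function names are invented for this example, each CLI helper takes "kubectl" or "oc" as its first argument, and the pod is looked up in the current namespace. For step 4 the sketch uses getent instead of ping, on the assumption that getent is available on the node; it tests name resolution only and does not show whether a firewall blocks the registry.

```shell
# Sketch of steps 2, 4 and 6. All function names are hypothetical.

# Step 2: print the node a pod was scheduled on.
# $1 = CLI ("kubectl" or "oc"), $2 = pod name (current namespace).
pod_node() {
  "$1" get pod "$2" -o jsonpath='{.spec.nodeName}'
}

# Step 4: run this on the node itself; succeeds if the registry host resolves.
# $1 = registry host name. Assumes getent exists on the node.
registry_resolves() {
  getent hosts "$1" >/dev/null
}

# Step 6: succeeds if the pull secret exists in the given namespace.
# $1 = CLI, $2 = secret name, $3 = namespace.
secret_exists() {
  "$1" get secret "$2" -n "$3" >/dev/null 2>&1
}

# Usage (placeholders in angle brackets are yours to fill in):
#   pod_node kubectl <pod-id>
#   registry_resolves <registry-host> || echo "DNS lookup failed"
#   secret_exists oc <secret-name> <namespace> || echo "secret missing"
```

A secret that exists can still hold expired or wrong credentials, so a manual pull on the node (step 5) remains the most direct check.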


08-03 19:03