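After a cluster upgrade, one of the three masters cannot rejoin the cluster. I have an HA cluster running in us-east-1a, us-east-1b, and us-east-1c, and my master running in us-east-1a cannot join the cluster.

I tried scaling the master-us-east-1a instance group down to zero nodes and then back up to one node, but the EC2 machine came up with the same problem and again could not join the cluster. It seems to start from a backup or something similar.

I tried connecting to the master to restart services, possibly protokube or docker, but that did not solve the problem either.

While connected to the master over ssh, I noticed that the flannel service is not running on this machine. I tried running it manually through docker, without success. It seems flannel is the network service that should be running, and it is not.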

  • Can I reset the master in us-east-1a and recreate it from scratch?
  • Any ideas on how to get the flannel service running on this master?

Thanks in advance.

    Attachments
    > kubectl get nodes
    NAME                             STATUS     ROLES    AGE   VERSION
    ip-xxx-xxx-xxx-xxx.ec2.internal  Ready      node     33d   v1.11.9
    ip-xxx-xxx-xxx-xxx.ec2.internal  Ready      master   33d   v1.11.9
    ip-xxx-xxx-xxx-xxx.ec2.internal  Ready      node     33d   v1.11.9
    ip-xxx-xxx-xxx-xxx.ec2.internal  Ready      master   33d   v1.11.9
    ip-xxx-xxx-xxx-xxx.ec2.internal  Ready      node     33d   v1.11.9
    

    --
    > sudo systemctl status kubelet
    
    Jan 10 21:00:55 ip-xxx-xxx-xxx-xxx kubelet[2502]: I0110 21:00:55.026553    2502 kubelet_node_status.go:441] Recording NodeHasSufficientPID event message for node ip-xxx-xxx-xxx-xxx.ec2.internal
    Jan 10 21:00:55 ip-xxx-xxx-xxx-xxx kubelet[2502]: I0110 21:00:55.027005    2502 kubelet_node_status.go:79] Attempting to register node ip-xxx-xxx-xxx-xxx.ec2.internal
    Jan 10 21:00:55 ip-xxx-xxx-xxx-xxx kubelet[2502]: E0110 21:00:55.027764    2502 kubelet_node_status.go:103] Unable to register node "ip-xxx-xxx-xxx-xxx.ec2.internal" with API server: Post https://127.0.0.1/api/v1/nodes: dial tcp 127.0.0.1:443: connect: connection refused
    

    --
    > sudo docker logs k8s_kube-apiserver_kube-apiserver-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_134d55c1b1c3bf3583911989a14353da_16
    
    F0110 20:59:35.581865       1 storage_decorator.go:57] Unable to create storage backend: config (&{etcd3 /registry [http://127.0.0.1:4001]    true false 1000 0xc42013c480 <nil> 5m0s 1m0s}), err (dial tcp 127.0.0.1:4001: connect: connection refused)
    

    --
    > sudo docker version
    
    Client:
     Version:      17.03.2-ce
     API version:  1.27
     Go version:   go1.7.5
     Git commit:   f5ec1e2
     Built:        Tue Jun 27 02:31:19 2017
     OS/Arch:      linux/amd64
    
    Server:
     Version:      17.03.2-ce
     API version:  1.27 (minimum version 1.12)
     Go version:   go1.7.5
     Git commit:   f5ec1e2
     Built:        Tue Jun 27 02:31:19 2017
     OS/Arch:      linux/amd64
     Experimental: false
    

    --
    > kubectl version
    
    Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.9", GitCommit:"16236ce91790d4c75b79f6ce96841db1c843e7d2", GitTreeState:"clean", BuildDate:"2019-03-25T06:40:24Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}
    The connection to the server 127.0.0.1 was refused - did you specify the right host or port?
    

    --
    > sudo docker images
    
    REPOSITORY                           TAG                 IMAGE ID            CREATED             SIZE
    protokube                            1.15.0              6b00e7216827        7 weeks ago         288 MB
    k8s.gcr.io/kube-proxy                v1.11.9             e18fcce798b8        9 months ago        98.1 MB
    k8s.gcr.io/kube-controller-manager   v1.11.9             634ccbd18a0f        9 months ago        155 MB
    k8s.gcr.io/kube-apiserver            v1.11.9             ef9a84756d40        9 months ago        187 MB
    k8s.gcr.io/kube-scheduler            v1.11.9             e00d30bd3a71        9 months ago        56.9 MB
    k8s.gcr.io/pause-amd64               3.0                 99e59f495ffa        3 years ago         747 kB
    kopeio/etcd-manager                  3.0.20190930        7937b67f722f        50 years ago        656 MB
    

    --
    > sudo docker ps
    
    CONTAINER ID        IMAGE                                                                                                        COMMAND                  CREATED             STATUS              PORTS               NAMES
    b4eb0ec9e6a2        k8s.gcr.io/kube-scheduler@sha256:372ab1014701f60b67a65d94f94d30d19335294d98746edcdfcb8808ed5aee3c            "/bin/sh -c 'mkfif..."   15 hours ago        Up 15 hours                             k8s_kube-scheduler_kube-scheduler-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_105cd5bac4edf48f265f31eb756b971a_0
    8f827dc0eade        kopeio/etcd-manager@sha256:cb0ed7c56dadbc0f4cd515906d72b30094229d6e0a9fcb7aa44e23680bf9a3a8                  "/bin/sh -c 'mkfif..."   15 hours ago        Up 15 hours                             k8s_etcd-manager_etcd-manager-main-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_a6a467f6b78a7c7bc15ec1f64799516d_0
    5bebb169b8b3        k8s.gcr.io/kube-controller-manager@sha256:aa9b9dac085a65c47746fa8739cf70e9d7e9a356a836ad2ef073da0d7b136db2   "/bin/sh -c 'mkfif..."   15 hours ago        Up 15 hours                             k8s_kube-controller-manager_kube-controller-manager-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_564bccf38cd14aa0f647593e69b159ab_0
    4467d550824e        k8s.gcr.io/kube-proxy@sha256:a63c81fe4d3e9575cc0a29c4866a2975b01a07c0f473ab2cf1e88ebf78739f80                "/bin/sh -c 'mkfif..."   15 hours ago        Up 15 hours                             k8s_kube-proxy_kube-proxy-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_22cd6fe287e6f4bae556504b3245f385_0
    0a5c23006e18        kopeio/etcd-manager@sha256:cb0ed7c56dadbc0f4cd515906d72b30094229d6e0a9fcb7aa44e23680bf9a3a8                  "/bin/sh -c 'mkfif..."   15 hours ago        Up 15 hours                             k8s_etcd-manager_etcd-manager-events-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_9f2a8de168741a0263161532f42e97b4_0
    3efa9ae55618        k8s.gcr.io/pause-amd64:3.0                                                                                   "/pause"                 15 hours ago        Up 15 hours                             k8s_POD_kube-proxy-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_22cd6fe287e6f4bae556504b3245f385_0
    4e451bc007ac        k8s.gcr.io/pause-amd64:3.0                                                                                   "/pause"                 15 hours ago        Up 15 hours                             k8s_POD_kube-scheduler-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_105cd5bac4edf48f265f31eb756b971a_0
    7c5c301e034a        k8s.gcr.io/pause-amd64:3.0                                                                                   "/pause"                 15 hours ago        Up 15 hours                             k8s_POD_kube-apiserver-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_134d55c1b1c3bf3583911989a14353da_0
    d88f075fa61f        k8s.gcr.io/pause-amd64:3.0                                                                                   "/pause"                 15 hours ago        Up 15 hours                             k8s_POD_etcd-manager-main-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_a6a467f6b78a7c7bc15ec1f64799516d_0
    69e8844e9c14        k8s.gcr.io/pause-amd64:3.0                                                                                   "/pause"                 15 hours ago        Up 15 hours                             k8s_POD_kube-controller-manager-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_564bccf38cd14aa0f647593e69b159ab_0
    05e67c2e8f98        k8s.gcr.io/pause-amd64:3.0                                                                                   "/pause"                 15 hours ago        Up 15 hours                             k8s_POD_etcd-manager-events-ip-xxx-xxx-xxx-xxx.ec2.internal_kube-system_9f2a8de168741a0263161532f42e97b4_0
    eee0a4d563c0        protokube:1.15.0                                                                                             "/usr/bin/protokub..."   15 hours ago        Up 15 hours                             hungry_shirley
    

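    Best answer

    Kubelet is trying to register the us-east-1a master node with the API server endpoint https://127.0.0.1:443. I believe this should instead be the API server endpoint of either of the other two masters. Kubelet uses the kubelet.conf file to talk to the API server when registering the node. Change the server field in the kubelet.conf file located in /etc/kubernetes to point to one of the following: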

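  • The Elastic IP or public IP of the master node in us-east-1b or us-east-1c, e.g. https://xx.xx.xx.xx:6443
  • The private IP of the master node in us-east-1b or us-east-1c, e.g. https://xx.xx.xx.xx:6443
  • The FQDN, if you have a load balancer in front of the master nodes running the Kubernetes API server.

    After changing kubelet.conf, restart kubelet.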
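The edit described above could be sketched as follows. This is a minimal sketch, not a definitive procedure: `set_kubeconfig_server` is a hypothetical helper name, the path /etc/kubernetes/kubelet.conf is taken from the answer and may differ on your installation, and the endpoint is a placeholder. It assumes GNU sed (for `-i`) and an endpoint that contains no `|` or `&` characters.

```shell
#!/bin/sh
# Sketch: point a kubeconfig's `server:` field at a reachable API server.
# set_kubeconfig_server is a hypothetical helper, not part of any Kubernetes tool.
set_kubeconfig_server() {
    conf="$1"      # path to the kubeconfig, e.g. /etc/kubernetes/kubelet.conf
    endpoint="$2"  # new API server URL, e.g. https://xx.xx.xx.xx:6443
    # Keep a backup next to the original before editing in place.
    cp "$conf" "$conf.bak" || return 1
    # Replace only the value after `server:`, preserving the indentation.
    sed -i "s|^\([[:space:]]*server:[[:space:]]*\).*|\1$endpoint|" "$conf"
}

# Usage on the affected master (run as root; values are placeholders):
#   set_kubeconfig_server /etc/kubernetes/kubelet.conf https://xx.xx.xx.xx:6443
#   systemctl restart kubelet
#   systemctl status kubelet
```

The backup copy (`kubelet.conf.bak`) makes it easy to revert if kubelet still cannot register after the restart.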

    Edit: since you are using etcd-manager, you can try the troubleshooting steps for the "Kubernetes service unavailable / flannel" issue here
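Before working through those steps, it may help to confirm which local endpoints are refusing connections. In the logs from the question, kubelet is refused on 127.0.0.1:443 (the API server) and the API server is refused on 127.0.0.1:4001 (etcd), which suggests etcd is the first thing to check. The sketch below extracts those endpoints from log text; `refused_endpoints` is a hypothetical helper name, and the container names in the usage comments are placeholders.

```shell
#!/bin/sh
# Sketch: list the endpoints that refused connections, from log text on stdin.
# refused_endpoints is a hypothetical helper name.
refused_endpoints() {
    grep -o 'dial tcp [0-9.]*:[0-9]*: connect: connection refused' \
        | sed 's/^dial tcp \([0-9.]*:[0-9]*\):.*/\1/' \
        | sort -u
}

# Usage on the affected master (container names are placeholders):
#   sudo journalctl -u kubelet --no-pager | refused_endpoints
#   sudo docker logs <kube-apiserver-container> 2>&1 | refused_endpoints
#   sudo docker logs <etcd-manager-main-container> 2>&1 | tail -n 50
```

If port 4001 shows up, the etcd-manager container logs are the next place to look, since the API server cannot start without its storage backend.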

    Regarding amazon-web-services - master node cannot connect to the cluster, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59687345/
