A 4-machine cluster: two CentOS hosts and two Kylin V10 hosts.
Problem 1: errors while creating the cluster and joining nodes to it
ERRO[18:48:06 CST] Failed to add master to cluster: Failed to exec command: sudo env PATH=$PATH:/sbin:/usr/sbin /bin/sh -c “/usr/local/bin/kubeadm join --config=/etc/kubernetes/kubeadm-config.yaml”
Please, check the contents of the $HOME/.kube/config file.
ERRO[22:15:27 CST] Failed to add master to cluster: Failed to exec command: sudo env PATH=$PATH:/sbin:/usr/sbin /bin/sh -c “/usr/local/bin/kubeadm join --config=/etc/kubernetes/kubeadm-config.yaml”
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster…
[preflight] FYI: You can look at this config file with ‘kubectl -n kube-system get cm kubeadm-config -o yaml’
error execution phase preflight: unable to fetch the kubeadm-config ConfigMap: failed to get config map: Get “https://lb.kubesphere.local:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s”: x509: certificate has expired or is not yet valid: current time 2023-12-04T14:18:43+08:00 is before 2023-12-04T14:14:50Z
To see the stack trace of this error execute with --v=5 or higher: Process exited with status 1 node=192.168.0.173
ERRO[22:15:27 CST] Failed to add master to cluster: Failed to exec command: sudo env PATH=$PATH:/sbin:/usr/sbin /bin/sh -c “/usr/local/bin/kubeadm join --config=/etc/kubernetes/kubeadm-config.yaml”
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster…
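My first guess was a permissions problem: the installer was not running as the default root user, and the sudo user's home directory had no .docker folder.
After switching from the regular user to root, the error persisted and still no .docker folder was created.
Then, while browsing the official site, I happened to find the real cause: time synchronization. The x509 message says as much: the node's current time (2023-12-04T14:18:43+08:00, i.e. 06:18:43 UTC) is earlier than the start of the certificate's validity (14:14:50 UTC), a gap of almost eight hours. After synchronizing the clocks the problem was resolved.
I won't go through the synchronization in detail here; most of the machines ship with chronyd, so that is what I used.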
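As a rough sketch of the check and the fix: the first function below computes the gap between the two timestamps quoted in the x509 error, and the second runs the usual chrony steps. The function names are made up for illustration, GNU date is assumed for the timestamp parsing, and the sync function assumes a root shell on a host where chronyd is installed and already has reachable time sources configured.

```shell
#!/bin/sh
# Sketch only: assumes GNU date (coreutils) and, for the sync step, a root
# shell on a host where chronyd is installed.

# Print the number of seconds between the "current time" and the
# certificate's "not before" time. Both arguments are ISO 8601 timestamps
# copied from the x509 error message.
clock_skew_seconds() {
    now_epoch=$(date -d "$1" +%s) || return 1
    not_before_epoch=$(date -d "$2" +%s) || return 1
    echo $(( not_before_epoch - now_epoch ))
}

# Start chronyd, step the clock immediately, then show the sync status.
# Defined here but not called; run it by hand on each node.
sync_with_chrony() {
    systemctl enable --now chronyd || return 1
    chronyc makestep || return 1
    chronyc tracking
}

# Values from the log above: prints 28567 (about 7 h 56 min).
clock_skew_seconds "2023-12-04T14:18:43+08:00" "2023-12-04T14:14:50Z"
```

Running `sync_with_chrony` on every node and then comparing `date` across the machines should confirm that the clocks agree before `kubeadm join` is retried.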
Problem 2: flannel failed to start. Only one pod came up; the other three failed with status CrashLoopBackOff
[root@bu170 ~]# kubectl logs -f kube-flannel-ds-kqp9h -n kube-system
I1204 06:33:02.940977 1 main.go:518] Determining IP address of default interface
E1204 06:33:02.941113 1 main.go:204] Failed to find any valid interface to use: failed to get default interface: Unable to find default route
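After the installation finished, neither coredns nor flannel had come up.
coredns complained that cni0 did not exist, which points straight at flannel, because flannel creates the flannel.1 and cni0 interfaces when it starts.
Since flannel could not find a default route, I checked whether the network interface configuration on each machine had a gateway set, and found that only one machine's interface was configured with a gateway.
So I added a gateway to all the others.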
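The check and the fix can be sketched roughly as follows. The function names are hypothetical, and the connection name and gateway address are placeholders rather than values from this cluster. `set_ipv4_gateway` assumes the interface is managed by NetworkManager; on a host that uses the legacy network service, the gateway would go into the interface's configuration file instead.

```shell
#!/bin/sh
# Sketch only: the connection name and gateway address passed to
# set_ipv4_gateway are placeholders, not values from this cluster.

# Read `ip route show default` output on stdin and print the interface
# of each default route; exit non-zero when there is none.
default_route_dev() {
    awk '$1 == "default" {
             for (i = 2; i < NF; i++)
                 if ($i == "dev") { print $(i + 1); found = 1 }
         }
         END { exit !found }'
}

# Set a static IPv4 gateway on a NetworkManager connection and re-activate it.
# $1 = connection name, $2 = gateway address.
set_ipv4_gateway() {
    nmcli connection modify "$1" ipv4.gateway "$2" || return 1
    nmcli connection up "$1"
}

# Usage on a node (run as root):
#   ip route show default | default_route_dev || echo "no default route"
#   set_ipv4_gateway <connection-name> <gateway-ip>
```

A node where the first command prints "no default route" is one where flannel would fail with the "Unable to find default route" error shown in the log.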
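On CentOS 8, reload the network configuration: nmcli c reload
On Kylin V10, restart the network: systemctl restart network
After the network had been restarted on every machine,
I was dismayed to find that the pod on the machine that originally had a gateway no longer came up, while the three that had been failing were now fine.
On that machine, delete the flannel.1 and cni0 interfaces: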
ip link del flannel.1
ip link del cni0
Restart the network once more and everything is OK.
At this point all pods are up.
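The cleanup step can be wrapped in a small helper so that it is safe to run on a node where one of the interfaces is already gone. This is a minimal sketch with a made-up function name; it assumes a root shell on the affected node.

```shell
#!/bin/sh
# Sketch only: run as root on the affected node; restart the network
# afterwards with whichever command applies to that system.

# Delete the given links if they exist; with no arguments, default to the
# two interfaces named in this write-up.
delete_stale_links() {
    [ "$#" -gt 0 ] || set -- flannel.1 cni0
    for dev in "$@"; do
        if ip link show "$dev" >/dev/null 2>&1; then
            ip link del "$dev" && echo "deleted $dev"
        else
            echo "skipped $dev (not present)"
        fi
    done
}

# Usage:
#   delete_stale_links
#   systemctl restart network          # or: nmcli c reload
#   kubectl get pods -n kube-system -o wide
```

The final `kubectl get pods` call is only a suggested way to confirm that the flannel and coredns pods have left CrashLoopBackOff.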