• Upgrade a k8s single-master cluster to a multi-master cluster


    1 Original k8s environment details

    hostname          ip             role    k8s version  cri     csi  cni     docker version
    www.kana001.com   192.168.1.100  master  v1.18.20     docker  nfs  calico  17.03.2-ce
    www.kana002.com   192.168.1.101  worker  v1.18.20     docker  nfs  calico  17.03.2-ce
    www.kana003.com   192.168.1.102  worker  v1.18.20     docker  nfs  calico  17.03.2-ce

    Prepare to add two new hosts as master nodes, plus a VIP for the load balancer:

    hostname           ip             role
    www.cona001.com    192.172.2.10   master
    www.cona002.com    192.172.2.11   master
    apiserver-lb.com   192.168.1.250  VIP

    2 Install docker, kubeadm, kubectl, kubelet

    2.1 Install docker
    Copy the Docker-related rpm packages to the host:

    [shutang@www.cona001.com docker]$ pwd
    /x/home/shutang/k8s/docker
    [shutang@www.cona001.com docker]$ ls
    containerd.io-1.2.6-3.3.el7.x86_64.rpm  container-selinux-2.119.2-1.911c772.el7_8.noarch.rpm  docker-ce-19.03.10-3.el7.x86_64.rpm  docker-ce-cli-19.03.10-3.el7.x86_64.rpm  docker-compose
    [shutang@www.cona001.com docker]$
     
    [shutang@www.cona001.com docker]$ sudo yum -y install ./containerd.io-1.2.6-3.3.el7.x86_64.rpm
    [shutang@www.cona001.com docker]$ sudo yum -y install ./container-selinux-2.119.2-1.911c772.el7_8.noarch.rpm
    [shutang@www.cona001.com docker]$ sudo yum -y install ./docker-ce-cli-19.03.10-3.el7.x86_64.rpm
    [shutang@www.cona001.com docker]$ sudo yum -y install ./docker-ce-19.03.10-3.el7.x86_64.rpm
    

    Create a daemon.json file in the /etc/docker directory with the following configuration:

    {
        "exec-opts": ["native.cgroupdriver=systemd"],
        "data-root": "/x/home/docker",
        "storage-driver": "overlay2"
    }
    
    

    Reload the systemd configuration, start the Docker service, and enable it to start on boot:

    [shutang@www.cona001.com docker]$ sudo systemctl daemon-reload
    [shutang@www.cona001.com docker]$ sudo systemctl start docker && sudo systemctl enable docker
    
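    To confirm that Docker picked up the daemon.json settings, a quick check of the cgroup driver and data root is worthwhile (an added verification step, not part of the original listing; both fields are standard docker info output):

    # should print "systemd" and "/x/home/docker" respectively
    docker info --format '{{.CgroupDriver}}'
    docker info --format '{{.DockerRootDir}}'
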

    2.2 Install kubeadm, kubectl, kubelet

    [shutang@www.cona001.com rpm]$ pwd
    /x/home/shutang/k8s/rpm
    [shutang@www.cona001.com rpm]$ ls
    cri-tools-1.19.0-0.x86_64.rpm  kubeadm-1.18.20-0.x86_64.rpm  kubectl-1.18.20-0.x86_64.rpm  kubelet-1.18.20-0.x86_64.rpm  kubernetes-cni-0.8.7-0.x86_64.rpm
    [shutang@www.cona001.com rpm]$ sudo yum install -y ./cri-tools-1.19.0-0.x86_64.rpm
    [shutang@www.cona001.com rpm]$ sudo yum install -y ./kubectl-1.18.20-0.x86_64.rpm
    [shutang@www.cona001.com rpm]$ sudo yum install -y ./kubernetes-cni-0.8.7-0.x86_64.rpm ./kubelet-1.18.20-0.x86_64.rpm
    [shutang@www.cona001.com rpm]$ sudo yum install -y ./kubeadm-1.18.20-0.x86_64.rpm
    
    
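    After the packages are installed, the kubelet should be enabled so it starts on boot; it will stay in a restart loop until kubeadm join writes its configuration, which is expected at this stage (an added step the original listing does not show):

    sudo systemctl enable kubelet
    kubeadm version -o short    # should print v1.18.20, matching the cluster version
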

    3 Update APIServer SAN information

    Since we are converting the single-master cluster to a multi-master cluster, we have to deploy a load balancer in front of the APIServer. The certificates generated when the single-master cluster was installed do not include the load-balancer address, so we need to update the certificates and add the new addresses to the SAN list.

    3.1 Update certs
    First we need to generate a kubeadm.yaml file. If the cluster was originally initialized with kubeadm init --config kubeadm.yaml, you don't need to generate it again. The kubeadm.yaml file is produced from the kubeadm-config ConfigMap in the kube-system namespace:

    [22:43]:[shutang@www.cona001.com:~]$ kubectl -n kube-system get configmap kubeadm-config -o jsonpath='{.data.ClusterConfiguration}' > kubeadm.yaml
    
    apiServer:
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: 192.168.1.100:6443 
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.18.20
    networking:
      dnsDomain: cluster.local
      podSubnet: 172.26.0.0/16
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
    

    After modification:

    apiServer:
      certSANs:
      - apiserver-lb.com
      - www.kana001.com
      - www.kana002.com
      - www.kana003.com
      - www.cona001.com
      - www.cona002.com
      - 192.168.1.100
      - 192.168.1.101
      - 192.168.1.102
      - 192.172.2.10
      - 192.172.2.11
      - 192.168.1.250
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta2
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: apiserver-lb.com:16443  # changed to the load-balancer address
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: k8s.gcr.io
    kind: ClusterConfiguration
    kubernetesVersion: v1.18.20
    networking:
      dnsDomain: cluster.local
      podSubnet: 172.26.0.0/16
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
    
    

    The generated kubeadm.yaml file does not list any extra SANs (Subject Alternative Names), so we need to add the new addresses by adding a certSANs list under the apiServer field. If you used a kubeadm configuration file when bootstrapping the cluster, it may already contain a certSANs list; if not, we have to add it. Above we added the VIP domain name together with the domain names and IP addresses of all the hosts; apiserver-lb.com is the domain name that maps to the virtual VIP address 192.168.1.250.
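
    Before regenerating anything, it is prudent to keep a copy of the current certificates so the change can be rolled back; this backup is an added precaution, not part of the original procedure:

    sudo cp -a /etc/kubernetes/pki /etc/kubernetes/pki.bak.$(date +%Y%m%d)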

    After updating the kubeadm configuration file we can renew the certificates. First move the existing APIServer certificate and key out of the way, because if kubeadm detects that they already exist in the expected location it will not create new ones.

    [22:49]:[shutang@www.kana001.com:~]$ mv /etc/kubernetes/pki/apiserver.{crt,key} .
     
    # generate the new certificate
    [22:49]:[shutang@www.kana001.com:~]$ kubeadm init phase certs apiserver --config kubeadm.yaml
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS:www.kana001.com, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:apiserver-lb, DNS:apiserver-lb.com, DNS:www.kana001.com, DNS:www.kana002.com, DNS:www.kana003.com, DNS:www.cona001.com, DNS:www.cona002.com, IP Address:10.96.0.1, IP Address:192.168.1.100, IP Address:192.168.1.100, IP Address:192.168.1.101, IP Address:192.168.1.102, IP Address:192.172.2.10, IP Address:192.172.2.11, IP Address:192.168.1.250
    

    The command output shows the DNS names and IP addresses the APIServer certificate is signed for. Compare them carefully against your intended SAN list; if anything is missing, add it to certSANs above and regenerate the certificate.

    The command uses the kubeadm configuration file specified above to generate a new certificate and key for the APIServer. Because the configuration file contains the certSANs list, kubeadm automatically adds those SANs when creating the new certificate.

    The last step is to restart the APIServer so that it picks up the new certificate. The simplest way is to restart the APIServer container directly:

    [22:49]:[www.kana001.com:~]$ docker restart `docker ps | grep kube-apiserver | grep -v pause | awk '{print $1}'`
    
    
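    Once the container has restarted, it is worth confirming that the APIServer is serving again before going any further (an added sanity check; /healthz is the standard health endpoint):

    kubectl get --raw /healthz    # expect "ok"
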

    3.2 Verify SANs
    To verify that the certificate has been updated, we can edit the APIServer address in a kubeconfig file, replace it with one of the newly added IP addresses or host names, and then run kubectl against the cluster to confirm it still works.

    We can also use the openssl command to check whether the generated certificate includes the newly added SAN entries:

    [22:59]:[shutang@www.kana001.com:pki]$ openssl x509 -in /etc/kubernetes/pki/apiserver.crt -text
    Certificate:
        Data:
            Version: 3 (0x2)
            Serial Number: 5360349041943399932 (0x4a63c90da5244dfc)
        Signature Algorithm: sha256WithRSAEncryption
            Issuer: CN=kubernetes
            Validity
                Not Before: Mar 22 00:21:26 2022 GMT
                Not After : Jun  1 14:48:13 2032 GMT
            Subject: CN=kube-apiserver
            Subject Public Key Info:
                Public Key Algorithm: rsaEncryption
                    Public-Key: (2048 bit)
            .......
            X509v3 extensions:
                X509v3 Key Usage: critical
                    Digital Signature, Key Encipherment
                X509v3 Extended Key Usage:
                    TLS Web Server Authentication
                X509v3 Subject Alternative Name:
                    DNS:www.kana001.com, DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster.local, DNS:apiserver-lb, DNS:apiserver-lb.com, DNS:www.kana001.com, DNS:www.kana002.com, DNS:www.kana003.com, DNS:www.cona001.com, DNS:www.cona002.com, IP Address:10.96.0.1, IP Address:192.168.1.100, IP Address:192.168.1.100, IP Address:192.168.1.101, IP Address:192.168.1.102, IP Address:192.172.2.10, IP Address:192.172.2.11, IP Address:192.168.1.250
        Signature Algorithm: sha256WithRSAEncryption
          .....
    
    
    

    If everything above went smoothly, the last step is to store this cluster configuration back into the kubeadm-config ConfigMap. This is very important: when kubeadm is used on the cluster later, the data will not be lost, and operations such as upgrades will keep signing with the certSANs entries.

    [22:59]:[shutang@www.kana001.com:pki]$ kubeadm init phase upload-config kubeadm --config kubeadm.yaml
    [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
     
    # if the command above fails, you can edit the ConfigMap directly and add the required content
    [22:59]:[shutang@www.kana001.com:pki]$ kubectl -n kube-system edit configmap kubeadm-config
    
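    To double-check that the upload succeeded, the ConfigMap can be inspected for the new certSANs entries (a convenience check added here):

    kubectl -n kube-system get configmap kubeadm-config \
      -o jsonpath='{.data.ClusterConfiguration}' | grep -A 12 certSANs
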

    4 Update configuration

    The certificates are updated and the load balancer is deployed; next we change every component configuration that still uses the old address to the load-balancer address.

    [23:05]:[shutang@www.kana001.com:pki]$ vi /etc/kubernetes/kubelet.conf
    ...
        server: https://apiserver-lb.com:16443
      name: kubernetes
    ...
     
    [22:59]:[shutang@www.kana001.com:pki]$ systemctl restart kubelet
    
    [23:06]:[shutang@www.kana001.com:pki]$ vi /etc/kubernetes/controller-manager.conf
    ...
        server: https://apiserver-lb.com:16443
      name: kubernetes
    ...
     
    [23:06]:[shutang@www.kana001.com:pki]$ kubectl delete pod -n kube-system kube-controller-manager-xxx kube-controller-manager-xxx
    
    
    [23:07]:[shutang@www.kana001.com:pki]$ vi /etc/kubernetes/scheduler.conf
    ...
        server: https://apiserver-lb.com:16443
      name: kubernetes
    ...
     
    [23:07]:[shutang@www.kana001.com:pki]$ kubectl delete pod -n kube-system kube-scheduler-xxx kube-scheduler-xxx
    
    
    kubectl edit configmap kube-proxy -n kube-system
    ...
      kubeconfig.conf: |-
        apiVersion: v1
        kind: Config
        clusters:
        - cluster:
            certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://apiserver-lb.com:16443
          name: default
        contexts:
        - context:
            cluster: default
            namespace: default
            user: default
          name: default
    ...
    

    Restart kube-proxy:

    kubectl rollout restart daemonset kube-proxy -n kube-system
    

    The addresses in the kubeconfig files also need to be changed, for example ~/.kube/config and /etc/kubernetes/admin.conf:

    ...
        server: https://apiserver-lb.com:16443
      name: kubernetes
    ...
    
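    Since all of these files need the same substitution, a single sed pass can handle them in one go. This is a convenience sketch, assuming the old endpoint is the original https://192.168.1.100:6443 from the initial ClusterConfiguration; adjust it if your files point elsewhere:

    sudo sed -i 's#https://192.168.1.100:6443#https://apiserver-lb.com:16443#g' \
      /etc/kubernetes/kubelet.conf /etc/kubernetes/controller-manager.conf \
      /etc/kubernetes/scheduler.conf /etc/kubernetes/admin.conf ~/.kube/config

    After the substitution, restart the kubelet and recreate the controller-manager and scheduler pods as shown above.
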

    5 Add new master control-plane nodes

    5.1 Create a new token

    [07:26]:[shutang@www.kana001.com:~]$ kubeadm token create --print-join-command
    W0606 08:12:12.274689    1221 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
    kubeadm join apiserver-lb.com:16443 --token viaiuf.y92wu913wlh1hio5     --discovery-token-ca-cert-hash sha256:9febef3dbbc870485b2fec5ff2880bf6c91bd0724176eb421d097e7e1d341d31
    
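    The new token is valid for 24 hours by default; it can be listed to confirm it exists (a quick check, not in the original):

    kubeadm token list
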

    5.2 Generate a new certificate key

    [08:13]:[root@www.kana001.com:~]# kubeadm init phase upload-certs --upload-certs
    W0606 08:13:36.701272    3344 version.go:103] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: dial tcp 34.107.204.206:443: connect: connection refused
    W0606 08:13:36.701386    3344 version.go:104] falling back to the local client version: v1.18.20
    W0606 08:13:36.701624    3344 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
    [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
    [upload-certs] Using certificate key:
    8ae4c991d696554939da87115338a870a8fc2e0bf4821d6d2cd466724c288314
    

    5.3 Run the kubeadm join command on both www.cona001.com and www.cona002.com

    [shutang@www.cona001.com node]$ sudo  kubeadm join apiserver-lb.com:16443 --token viaiuf.y92wu913wlh1hio5     --discovery-token-ca-cert-hash sha256:9febef3dbbc870485b2fec5ff2880bf6c91bd0724176eb421d097e7e1d341d31 --control-plane --certificate-key 8ae4c991d696554939da87115338a870a8fc2e0bf4821d6d2cd466724c288314
    [sudo] password for shutang:
    [preflight] Running pre-flight checks
    [preflight] Reading configuration from the cluster...
    [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
    [preflight] Running pre-flight checks before initializing the new control plane instance
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [www.cona001.com kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local apiserver-lb.com apiserver-lb.com www.kana001.com www.kana002.com www.kana001.com www.cona001.com www.cona002.com] and IPs [10.96.0.1 192.172.2.10 192.172.2.11 192.168.1.100 192.168.1.101 192.168.1.102 192.168.1.100 192.168.1.250]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "etcd/server" certificate and key
    [certs] etcd/server serving cert is signed for DNS names [www.cona001.com localhost] and IPs [10.47.207.181 127.0.0.1 ::1]
    [certs] Generating "etcd/peer" certificate and key
    [certs] etcd/peer serving cert is signed for DNS names [www.cona001.com localhost] and IPs [10.47.207.181 127.0.0.1 ::1]
    [certs] Generating "etcd/healthcheck-client" certificate and key
    [certs] Generating "apiserver-etcd-client" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
    [certs] Using the existing "sa" key
    [kubeconfig] Generating kubeconfig files
    [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
    [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
    [kubeconfig] Writing "admin.conf" kubeconfig file
    [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
    [kubeconfig] Writing "controller-manager.conf" kubeconfig file
    [endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
    [kubeconfig] Writing "scheduler.conf" kubeconfig file
    [control-plane] Using manifest folder "/etc/kubernetes/manifests"
    [control-plane] Creating static Pod manifest for "kube-apiserver"
    W0606 22:17:08.597842  175406 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
    [control-plane] Creating static Pod manifest for "kube-controller-manager"
    W0606 22:17:08.603725  175406 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
    [control-plane] Creating static Pod manifest for "kube-scheduler"
    W0606 22:17:08.604635  175406 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
    [check-etcd] Checking that the etcd cluster is healthy
    [kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Starting the kubelet
    [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
    [etcd] Announced new etcd member joining to the existing etcd cluster
    [etcd] Creating static Pod manifest for "etcd"
    [etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
    {"level":"warn","ts":"2022-06-06T22:17:23.185-0700","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://10.47.207.181:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
    [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
    [mark-control-plane] Marking the node www.cona001.com as control-plane by adding the label "node-role.kubernetes.io/master=''"
    [mark-control-plane] Marking the node www.cona001.com as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
     
    This node has joined the cluster and a new control plane instance was created:
     
    * Certificate signing request was sent to apiserver and approval was received.
    * The Kubelet was informed of the new secure connection details.
    * Control plane (master) label and taint were applied to the new node.
    * The Kubernetes control plane instances scaled up.
    * A new etcd member was added to the local/stacked etcd cluster.
     
    To start administering your cluster from this node, you need to run the following as a regular user:
     
        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config
     
    Run 'kubectl get nodes' to see this node join the cluster.
     
    [shutang@www.cona001.com node]$ mkdir -p $HOME/.kube
    [shutang@www.cona001.com node]$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    [shutang@www.cona001.com node]$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    

    5.4 Check the cluster

    [23:25]:[shutang@www.kana001.com:~]$ kubectl get nodes
    NAME              STATUS   ROLES    AGE   VERSION
    www.cona001.com   Ready    master   15h   v1.18.20
    www.cona002.com   Ready    master   64m   v1.18.20
    www.kana001.com   Ready    master   77d   v1.18.20
    www.kana002.com   Ready    <none>   77d   v1.18.20
    www.kana003.com   Ready    <none>   60d   v1.18.20
    [23:25]:[shutang@www.kana001.com:~]$ kubectl get nodes -o wide
    
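    Besides the node list, it is worth confirming that the stacked etcd cluster now has three members. A sketch of that check, assuming the etcd pod name follows the usual etcd-<hostname> pattern and using the certificate paths kubeadm writes by default:

    kubectl -n kube-system exec etcd-www.kana001.com -- etcdctl \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/server.crt \
      --key=/etc/kubernetes/pki/etcd/server.key \
      member list
    # expect one started member per control-plane node:
    # www.kana001.com, www.cona001.com, www.cona002.com
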

    6 Install keepalived and haproxy

    6.1 Install and deploy keepalived and haproxy on the three machines www.kana001.com, www.cona001.com, and www.cona002.com

    www.kana001.com - keepalived

    [23:28]:[shutang@www.kana001.com:keepalived]$ pwd
    /etc/keepalived
    [23:25]:[shutang@www.kana001.com:~]$ sudo yum install -y keepalived haproxy
    [23:28]:[shutang@www.kana001.com:keepalived]$ cat keepalived.conf
    ! Configuration File for keepalived
    global_defs {
       router_id www.kana001.com
    }
     
    # define the health-check script
    vrrp_script check_apiserver {
        script "/etc/keepalived/check_apiserver.sh"
        interval 2
        weight -5
        fall 3
        rise 2
    }
     
    vrrp_instance VI_1 {
        state MASTER                       # key setting: MASTER on this node
        interface eno49
        virtual_router_id 50
        priority 100                       # key setting: highest priority holds the VIP
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            192.168.1.250                  # virtual IP (VIP)
        }
     
        # invoke the health-check script
        track_script {
            check_apiserver
        }
    }
     
     
    [23:28]:[shutang@www.kana001.com:keepalived]$ cat check_apiserver.sh
    #!/bin/bash
     
    function check_apiserver(){
      for ((i=0;i<5;i++))
      do
        apiserver_job_id=$(pgrep kube-apiserver)   # command substitution: capture the kube-apiserver PID
        if [[ ! -z ${apiserver_job_id} ]];then
          return
        else
          sleep 2
        fi
      done
      apiserver_job_id=0
    }
     
    # 1->running    0->stopped
    check_apiserver
    if [[ $apiserver_job_id -eq 0 ]];then
      /usr/bin/systemctl stop keepalived
      exit 1
    else
      exit 0
    fi
     
    # start the keepalived service
    [21:58]:[root@www.kana001.com:keepalived]# systemctl start keepalived
     
    # check the keepalived service status
    [21:58]:[root@www.kana001.com:keepalived]# systemctl status keepalived
    ● keepalived.service - LVS and VRRP High Availability Monitor
       Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
       Active: active (running) since Mon 2022-06-06 23:58:55 MST; 10s ago
      Process: 11658 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
     Main PID: 11659 (keepalived)
        Tasks: 3
       Memory: 1.4M
       CGroup: /system.slice/keepalived.service
               ├─11659 /usr/sbin/keepalived -D
               ├─11660 /usr/sbin/keepalived -D
               └─11661 /usr/sbin/keepalived -D
     
    Jun 06 23:58:57 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    Jun 06 23:58:57 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    Jun 06 23:58:57 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    Jun 06 23:58:57 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    Jun 06 23:59:02 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    Jun 06 23:59:02 www.kana001.com Keepalived_vrrp[11661]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eno49 for 192.168.1.250
    Jun 06 23:59:02 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    Jun 06 23:59:02 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    Jun 06 23:59:02 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    Jun 06 23:59:02 www.kana001.com Keepalived_vrrp[11661]: Sending gratuitous ARP on eno49 for 192.168.1.250
    [23:59]:[root@www.kana001.com:keepalived]#
    
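    The haproxy configuration itself is not shown above, but the 16443 port used throughout implies that each load-balancer node runs a TCP frontend on 16443 that forwards to the three APIServers on 6443. A minimal sketch of /etc/haproxy/haproxy.cfg under that assumption (section names and timeouts are illustrative):

    defaults
        mode            tcp
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend kube-apiserver
        bind *:16443
        default_backend kube-apiserver

    backend kube-apiserver
        balance roundrobin
        option  tcp-check
        server www.kana001.com 192.168.1.100:6443 check
        server www.cona001.com 192.172.2.10:6443  check
        server www.cona002.com 192.172.2.11:6443  check

    After writing the configuration, enable and start the service, and check which node currently holds the VIP:

    sudo systemctl enable haproxy && sudo systemctl restart haproxy
    ip addr show eno49 | grep 192.168.1.250    # the VIP should appear on the current MASTER node
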

    6.2 Deploying keepalived and haproxy on www.cona001.com and www.cona002.com follows the same pattern and is omitted here.

    7 NOTES

    7.1 In practice, the process of adding a new master got stuck partway through

    7.2 When managing svc, pod, and namespace resources with kubectl on www.kana001.com, creating and deleting pods did not work; pods stayed in Pending

    To resolve the two problems above, first check that the apiserver address in the kubelet configuration on every node is apiserver-lb.com:16443:

    [23:47]:[root@www.kana002.com:kubernetes]# cat kubelet.conf
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1ETXlNakF3TWpFeU5sb1hEVE15TURNeE9UQXdNakV5Tmxvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBS3RkClRJYTBFSkRNMnlIVFpuY3JlUW04VytCWnprZTNJY0wrdnJVMTRPWTNWKzZTT212eHhYRVkyRkd4dDdlNW1ibkUKQXhOckN1ODJhQTZzNXFsenZrVUw4K3ovZU82U3paZHdQVXNZRGZUdnlxVFhaTGxKcysxNGE3THAwZzYrelEzNQpKeHpISVNES2hEYiswaE1xMkFKamNrMDNNN3VqTzh0Mk9IczRoSm0wMFVXYnNsa2NlNGNSd1lEeFhXQXVWT3VtClZwTzNtY05UM2tYdjc3eEhIVHJSbWQ3NSszc2UyWldxTVFONFRkSDVpOWw4bnZQRFFsdGJJNjZ2TjlaS3lwZjAKd0xRaFc2cjY4R3MxdFYwcWI4VkhmaDZNQUxnQVRqQWRNRUM3Skt4WVM3Rld5eGxaMkpVeWdGNkxDRXREQ3d5NQpsVkR5VzF3NndjejRyU0tDRzZFQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFEUGg3WU8yd1ZsTEJTdVVjME9lTEtqN1ZyaVkKajNjVk81MnRRSlpNNzZzQUk4bTNlbVBrNS9vUGVDQ2NoTGpBMzk3Nmk4M2lYMHlNb1JSbVB5ZnA5bmRldnNnZgpNL2M0NXMzcG1QajNSMlQ1SHRSbzhWR3ZQekZFOHVCeVpPUUVFTWNrSUc3QVlpelJLQkNORE1Hb05wdEFiTmhmCkZvSmtGbGNUdWtTWlZNOHFNMHdCSk9reG02TEJ0YVJObzc2a0ZORGoyTWJOMXBKc2pjY0k5T0g2SHBtenVVSXcKMjRUNmxBSFBYczFFNnFZWW1uWEJUYjRzcmNuNnNkT0VzaDQvTmpldjNCRWdHYUZLN3ZFUGVoNUdtaGZYRTdnRApMeGpWWVNxSDhMRTVjM0cwb01tOStqendaK2tVbW5SWlgzUDh1dENGTC9KeGNvOXNNNzJLWWhxM0g3ST0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
        server: https://apiserver-lb.com:16443
      name: default-cluster
    contexts:
    - context:
        cluster: default-cluster
        namespace: default
        user: default-auth
      name: default-context
    current-context: default-context
    kind: Config
    preferences: {}
    users:
    - name: default-auth
      user:
        client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
        client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
    
    

    Then restart the kubelet service on www.kana001.com, www.kana002.com, and www.kana003.com:

    systemctl restart kubelet
    
    
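    To confirm the endpoint was actually changed on each node, a quick loop over the hosts can help (an added convenience, assuming root ssh access to the hostnames from the tables above):

    for h in www.kana001.com www.kana002.com www.kana003.com; do
      ssh "root@$h" "grep 'server:' /etc/kubernetes/kubelet.conf"
    done
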

    7.3 While adding the new masters, the calico-node-xxx Pods kept crashing with CrashLoopBackOff

    [08:28]:[root@www.kana001.com:~]# kubectl edit daemonset calico-node -n kube-system
    Edit cancelled, no changes made.
    [08:31]:[root@www.kana001.com:~]# kubectl logs calico-node-m2bgm -n kube-system
    2022-06-06 15:27:25.896 [INFO][8] startup.go 256: Early log level set to info
    2022-06-06 15:27:25.896 [INFO][8] startup.go 272: Using NODENAME environment for node name
    2022-06-06 15:27:25.896 [INFO][8] startup.go 284: Determined node name: ccg22dtapaz0858
    2022-06-06 15:27:25.902 [INFO][8] k8s.go 228: Using Calico IPAM
    2022-06-06 15:27:25.902 [INFO][8] startup.go 316: Checking datastore connection
    2022-06-06 15:27:25.934 [INFO][8] startup.go 340: Datastore connection verified
    2022-06-06 15:27:25.934 [INFO][8] startup.go 95: Datastore is ready
    2022-06-06 15:27:25.991 [INFO][8] startup.go 382: Initialize BGP data
    2022-06-06 15:27:25.993 [WARNING][8] startup.go 594: Unable to auto-detect an IPv4 address using interface regexes [eno49]: no valid host interfaces found
    2022-06-06 15:27:25.993 [WARNING][8] startup.go 404: Couldn't autodetect an IPv4 address. If auto-detecting, choose a different autodetection method. Otherwise provide an explicit address.
    2022-06-06 15:27:25.994 [INFO][8] startup.go 210: Clearing out-of-date IPv4 address from this node IP=""
    2022-06-06 15:27:26.049 [WARNING][8] startup.go 1057: Terminating
    Calico node failed to start
    
    

    Fix: edit the calico-node DaemonSet and adjust the IP autodetection environment variables:

            - name: CLUSTER_TYPE
              value: k8s,bgp
            - name: IP_AUTODETECTION_METHOD
              value: interface=eno49|eth0    # the default NIC name on www.kana001.com differs from the new machines www.cona001.com/www.cona002.com, so the new NIC name must be listed as well; wildcards can be used for matching
            - name: IP
              value: autodetect
    
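    After saving the DaemonSet edit, the calico-node pods are recreated with the new setting. A quick way to confirm they reach Running and picked up an address (k8s-app=calico-node is the label used by Calico's standard manifests):

    kubectl -n kube-system get pods -l k8s-app=calico-node -o wide
    # calico-node normally logs the autodetected IPv4 address on startup
    kubectl -n kube-system logs -l k8s-app=calico-node --tail=30 | grep -i "autodetect"
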
  • Original article: https://blog.csdn.net/weixin_48505120/article/details/127800711