Deploying a Highly Available Kubernetes Cluster in Production


    1 K8s High-Availability Cluster Architecture

    Below is the cluster architecture diagram from the official Kubernetes documentation (not reproduced here).

    2 Deploying the K8s Cluster

    2.1 Cluster Planning

    hostname          | ip            | components         | cluster role | flannel version | kubectl version | kubeadm version | keepalived | OS        | docker version | docker-root-data | cgroup driver
    www.datang001.com | 10.176.10.20  | kube-apiserver     | master01     | v1.18.20        | v1.18.20        | v1.18.20        |            | Centos7.9 | 20.10.2        | /var/lib/docker  | systemd
    www.datang002.com | 10.176.10.21  | kube-apiserver     | master02     | v1.18.20        | v1.18.20        | v1.18.20        |            | Centos7.9 | 20.10.2        | /var/lib/docker  | systemd
    www.datang003.com | 10.176.10.22  | kube-apiserver     | master03     | v1.18.20        | v1.18.20        | v1.18.20        |            | Centos7.9 | 20.10.2        | /var/lib/docker  | systemd
    www.datang004.com | 10.176.10.23  | kubelet/kube-proxy | node01       | v1.18.20        | v1.18.20        | v1.18.20        |            | Centos7.9 | 20.10.2        | /var/lib/docker  | systemd
    www.datang005.com | 10.176.10.24  | kubelet/kube-proxy | node02       | v1.18.20        | v1.18.20        | v1.18.20        |            | Centos7.9 | 20.10.2        | /var/lib/docker  | systemd
    www.datang006.com | 10.176.10.25  | kubelet/kube-proxy | node03       | v1.18.20        | v1.18.20        | v1.18.20        |            | Centos7.9 | 20.10.2        | /var/lib/docker  | systemd
    apiserver-lb.com  | 10.176.10.250 | VIP                |              |                 |                 |                 |            |           |                |                  |

    2.2 High-Availability Architecture

    This deployment uses kubeadm to build a highly available k8s cluster. High availability of a k8s cluster really means high availability of its core components. This deployment adopts the active-standby mode; the architecture is shown below (diagram not reproduced here).

    Description of the high-availability architecture in active-standby mode:

    core components    | high availability mode | high availability implementation
    apiserver          | active-standby         | keepalived + haproxy
    controller-manager | active-standby         | leader election
    scheduler          | active-standby         | leader election
    etcd               | cluster                | kubeadm
    • apiserver: made highly available with keepalived; when the active node fails, keepalived moves the VIP to a healthy node.
    • controller-manager: k8s elects a leader (controlled by --leader-elect, default true); only one controller-manager is active in the cluster at any time.
    • scheduler: k8s elects a leader (controlled by --leader-elect, default true); only one scheduler is active in the cluster at any time.
    • etcd: the cluster is created automatically by kubeadm to achieve high availability. Deploy an odd number of nodes; a 3-node cluster tolerates at most one machine failure.

    2.3 Deployment Steps

    2.3.1 Add IP-to-hostname mappings to /etc/hosts on every machine

    Run the following command on every control-plane and worker node to append the mappings to /etc/hosts:

    cat >> /etc/hosts <<EOF
     
    10.176.10.20 www.datang001.com
    10.176.10.21 www.datang002.com
    10.176.10.22 www.datang003.com
     
    10.176.10.23 www.datang004.com
    10.176.10.24 www.datang005.com
    10.176.10.25 www.datang006.com
     
    10.176.10.250 apiserver-lb.com
    EOF
    
    2.3.2 Disable the swap partition, stop firewalld, and disable SELinux

    Temporarily disable the swap partition (this does not survive a reboot); run on every machine:

    [root@www.datang001.com ~]# swapoff -a
    [root@www.datang001.com ~]# free -m
                  total        used        free      shared  buff/cache   available
    Mem:         128770        4073      120315         330        4381      123763
    Swap:             0           0           0
    

    Permanently disable the swap partition by commenting out the swap line in /etc/fstab on every machine:

    [root@www.datang001.com ~]# cat /etc/fstab
     
    #
    # /etc/fstab
    # Created by anaconda on Thu Oct  5 04:55:59 2017
    #
    # Accessible filesystems, by reference, are maintained under '/dev/disk'
    # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
    #
    ...
    ...
    #/dev/mapper/rootvg-swap swap                    swap    defaults        0 0
    ...
    
    

    Disable SELinux.
    First check the SELinux status:

    [root@www.datang001.com ~]# sestatus -v
    SELinux status:                 disabled
    

    Temporarily disable it:

    setenforce 0
    

    Permanently disable it:

    sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
    
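    Stopping and disabling firewalld (called out in the heading above) is typically done with the following on every machine:

    systemctl stop firewalld
    systemctl disable firewalld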
    2.3.3 Upgrade the kernel and enable the required kernel modules

    For cluster stability, and to keep business containers from exhausting node memory later on, the Linux kernel on production servers must be upgraded to 4.19 or later. CentOS 7 ships with kernel 3.10.x, which in practice can leak memory; the root cause is a memory-leak bug in the cgroup kmem accounting feature (see the detailed analysis in the article on k8s memory leaks caused by old kernels). So be sure to upgrade the kernel on every machine before deploying the k8s cluster. [Do not upgrade the kernel of a production environment that is already in use: workloads are already running on Kubernetes there, and the upgrade will cause business containers to be rescheduled and, in severe cases, to stop running properly.]
    Kernel upgrade procedure:

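    A minimal sketch of one common approach on CentOS 7, using the ELRepo repository to install a long-term-support kernel; the repository URL, package name (kernel-lt) and resulting kernel version are assumptions to verify for your own environment:

    # Import the ELRepo signing key and install the ELRepo repository (CentOS 7)
    rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
    yum -y install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm

    # Install the long-term-support kernel (kernel-lt is 5.4.x at the time of writing, i.e. newer than 4.19)
    yum -y --enablerepo=elrepo-kernel install kernel-lt

    # Make the new kernel the default boot entry (BIOS systems; adjust for EFI), then reboot and verify
    grub2-set-default 0
    grub2-mkconfig -o /boot/grub2/grub.cfg
    reboot
    uname -r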

    Forward IPv4 and let iptables see bridged traffic

    Verify that the br_netfilter module is loaded by running lsmod | grep br_netfilter.

    To load the module explicitly, run sudo modprobe br_netfilter.

    For iptables on the Linux nodes to see bridged traffic correctly, confirm that net.bridge.bridge-nf-call-iptables is set to 1 in the sysctl configuration. For example:

    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF
    
    sudo modprobe overlay
    sudo modprobe br_netfilter
    
    # Set the required sysctl parameters; they persist across reboots
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables  = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward                 = 1
    EOF
    
    # Apply the sysctl parameters without rebooting
    sudo sysctl --system
    
    
    2.3.4 Set up the Kubernetes yum repository

    In China the Aliyun mirror is normally used; if the servers have unrestricted internet access, the default Google repository can be used directly.
    Without unrestricted access, use the Aliyun mirror:

    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
    EOF
    

    With unrestricted access, use the Google repository:

    cat <<EOF > /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
    enabled=1
    gpgcheck=1
    repo_gpgcheck=1
    gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
            https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
    EOF
    
    • [ ]: the repository id in brackets; it must be unique and identifies the repository
    • name: the repository name, free-form
    • baseurl: the repository URL
    • enabled: whether the repository is enabled; 1 (the default) means enabled
    • gpgcheck: whether to verify the signature of packages from this repository; 1 means verify
    • repo_gpgcheck: whether to verify the repository metadata (the package lists); 1 means verify
    • gpgkey=URL: location of the public key used for signature verification; required when gpgcheck is 1, unnecessary when gpgcheck is 0
    2.3.5 Install docker, kubeadm, kubelet and kubectl online

    First refresh the yum cache:

    yum clean all
    yum update
    yum -y makecache
    

    Install docker first. Note that the docker data directory should sit on as large a disk as possible, because many images will be pulled later; in addition, the Cgroup Driver should be set to systemd and the Storage Driver to overlay2.
    /etc/docker/daemon.json:

    {
        "exec-opts": ["native.cgroupdriver=systemd"],
        "data-root": "/data01/docker",
        "storage-driver": "overlay2"
    }
    

    Install a specific docker version:

    yum list docker-ce --showduplicates | sort -r
    yum install -y docker-ce-19.03.9-3.el7
    systemctl start docker && systemctl enable docker
    

    Install specific versions of kubeadm, kubelet and kubectl:

    yum -y install kubeadm-1.18.20 kubelet-1.18.20 kubectl-1.18.20
    
    2.3.6 Install docker, kubeadm, kubelet and kubectl offline, and pre-pull the images

    Sometimes the machines sit on an internal network without internet access, in which case the rpm packages have to be prepared in advance. The packages depend on each other, so they must be installed in a particular order: kubectl depends on cri-tools; kubernetes-cni and kubelet depend on each other; and kubeadm depends on kubectl, kubelet and cri-tools.

    [root@www.datang001.com rpm]# pwd
    /home/shutang/k8s/rpm
    [root@www.datang001.com rpm]# ls
    cri-tools-1.19.0-0.x86_64.rpm  kubeadm-1.18.20-0.x86_64.rpm  kubectl-1.18.20-0.x86_64.rpm  kubelet-1.18.20-0.x86_64.rpm  kubernetes-cni-0.8.7-0.x86_64.rpm
    [root@www.datang001.com rpm]# yum -y install ./cri-tools-1.19.0-0.x86_64.rpm
    [root@www.datang001.com rpm]# yum -y install ./kubectl-1.18.20-0.x86_64.rpm
    [root@www.datang001.com rpm]# yum -y install ./kubernetes-cni-0.8.7-0.x86_64.rpm ./kubelet-1.18.20-0.x86_64.rpm
    [root@www.datang001.com rpm]# yum -y install ./kubeadm-1.18.20-0.x86_64.rpm
    
    [root@www.datang001.com rpm]# whereis kubeadm
    kubeadm: /usr/bin/kubeadm
    [root@www.datang001.com rpm]# whereis kubelet
    kubelet: /usr/bin/kubelet
    [root@www.datang001.com rpm]# whereis kubectl
    kubectl: /usr/bin/kubectl
    

    List the images that need to be downloaded in advance:

    [root@phx11-gliws-u23 ~]# kubeadm config images list --kubernetes-version v1.18.20
    W1112 20:10:37.628119   20654 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
    k8s.gcr.io/kube-apiserver:v1.18.20
    k8s.gcr.io/kube-controller-manager:v1.18.20
    k8s.gcr.io/kube-scheduler:v1.18.20
    k8s.gcr.io/kube-proxy:v1.18.20
    k8s.gcr.io/pause:3.2
    k8s.gcr.io/etcd:3.4.3-0
    k8s.gcr.io/coredns:1.6.7
    
    # Note: master nodes need all of the images above; worker nodes only need k8s.gcr.io/kube-proxy:v1.18.20, k8s.gcr.io/pause:3.2 and k8s.gcr.io/coredns:1.6.7
    

    If k8s.gcr.io is unreachable, use the images from the Aliyun mirror registry. The Aliyun registry is sometimes not fully in sync with k8s.gcr.io; if the newer version you need is missing there, try the DaoCloud registry, the Tsinghua mirror, or another domestic mirror registry.

    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.18.20
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.18.20
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.18.20
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.20
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.7
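    Since the kubeadm configuration used later keeps imageRepository: k8s.gcr.io, images pulled from the Aliyun mirror generally have to be re-tagged to their k8s.gcr.io names before kubeadm can use them; a minimal sketch (the image list simply mirrors the versions above):

    for img in kube-apiserver:v1.18.20 kube-controller-manager:v1.18.20 kube-scheduler:v1.18.20 \
               kube-proxy:v1.18.20 pause:3.2 etcd:3.4.3-0 coredns:1.6.7; do
        docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$img k8s.gcr.io/$img
    done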
    
    2.3.7 Configure kubelet

    The default pause image comes from the k8s.gcr.io registry, which may be unreachable from China, so configure kubelet to use the Aliyun pause image instead.

    DOCKER_CGROUPS=$(docker info 2>/dev/null | grep -i 'cgroup driver' | awk '{print $NF}')
    cat > /etc/sysconfig/kubelet <<EOF
    KUBELET_EXTRA_ARGS="--cgroup-driver=$DOCKER_CGROUPS --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause-amd64:3.2"
    EOF
    

    Enable kubelet to start on boot:

    systemctl daemon-reload
    systemctl enable --now kubelet
    
    2.3.8 Cluster initialization

    https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/
    https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-config/

    The kubeadm-config.yaml on the master01 node is as follows:

    apiVersion: kubeadm.k8s.io/v1beta2
    apiServer:
      certSANs:
      - apiserver-lb.com
      - www.datang001.com
      - www.datang002.com
      - www.datang003.com
      - www.datang004.com
      - www.datang005.com
      - www.datang006.com
      - 10.172.10.20
      - 10.172.10.21
      - 10.172.10.22
      - 10.172.10.23
      - 10.172.10.24
      - 10.172.10.25
      - 10.172.10.250
      extraArgs:
        authorization-mode: Node,RBAC
      timeoutForControlPlane: 4m0s
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controlPlaneEndpoint: apiserver-lb.com:16443  # change this to the load balancer address
    controllerManager: {}
    dns:
      type: CoreDNS
    etcd:
      local:
        dataDir: /var/lib/etcd
    imageRepository: k8s.gcr.io
    #imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
    #imageRepository: daocloud.io/daocloud
    kind: ClusterConfiguration
    kubernetesVersion: v1.18.20
    networking:
      dnsDomain: cluster.local
      podSubnet: 172.26.0.0/16
      serviceSubnet: 10.96.0.0/12
    scheduler: {}
    

    Later, when the Kubernetes version is upgraded and the cluster is initialized again, some APIs in the kubeadm-config file may have changed; in that case generate a new kubeadm-config file from the old one:

    kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml
    

    Pre-pull the images on all nodes to save time during initialization:

    kubeadm config images pull --config kubeadm-config.yaml
    

    Initialize the master01 node. Initialization generates the certificates and configuration files under /etc/kubernetes. The --upload-certs flag uploads the certificates generated on master01 so that they are synced automatically to any node that later joins as a control-plane member:

    [root@www.datang001.com k8s]# sudo kubeadm init --config kubeadm.yaml --upload-certs
    W0514 23:06:11.417640   20494 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
    [init] Using Kubernetes version: v1.18.20
    [preflight] Running pre-flight checks
        [WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.2. Latest validated version: 19.03
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Starting the kubelet
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "ca" certificate and key
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [www.datang001.com kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local apiserver-lb.com] and IPs [10.96.0.1 10.222.175.201]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "front-proxy-ca" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] Generating "etcd/ca" certificate and key
    [certs] Generating "etcd/server" certificate and key
    [certs] etcd/server serving cert is signed for DNS names [phx11-gliws-u23 localhost] and IPs [10.172.10.20 127.0.0.1 ::1]
    [certs] Generating "etcd/peer" certificate and key
    [certs] etcd/peer serving cert is signed for DNS names [www.datang001.com localhost] and IPs [10.172.10.20 127.0.0.1 ::1]
    [certs] Generating "etcd/healthcheck-client" certificate and key
    [certs] Generating "apiserver-etcd-client" certificate and key
    [certs] Generating "sa" key and public key
    [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
    [kubeconfig] Writing "admin.conf" kubeconfig file
    [kubeconfig] Writing "kubelet.conf" kubeconfig file
    [kubeconfig] Writing "controller-manager.conf" kubeconfig file
    [kubeconfig] Writing "scheduler.conf" kubeconfig file
    [control-plane] Using manifest folder "/etc/kubernetes/manifests"
    [control-plane] Creating static Pod manifest for "kube-apiserver"
    [control-plane] Creating static Pod manifest for "kube-controller-manager"
    W0514 23:06:16.003004   20494 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
    [control-plane] Creating static Pod manifest for "kube-scheduler"
    W0514 23:06:16.004606   20494 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
    [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
    [apiclient] All control plane components are healthy after 20.502817 seconds
    [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
    [kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
    [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
    [upload-certs] Using certificate key:
    **************************************************
    [mark-control-plane] Marking the node www.datang001.com as control-plane by adding the label "node-role.kubernetes.io/master=''"
    [mark-control-plane] Marking the node www.datang001.com as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
    [bootstrap-token] Using token: ixhv5g.n37m33eybijtb13q
    [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
    [bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
    [bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
    [bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
    [bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
    [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
    [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy
     
    Your Kubernetes control-plane has initialized successfully!
     
    To start using your cluster, you need to run the following as a regular user:
     
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
     
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
     
    You can now join any number of the control-plane node running the following command on each as root:
     
      kubeadm join apiserver-lb.com:6443 --token ixhv5g.n37m33eybijtb13q \
        --discovery-token-ca-cert-hash sha256:fc9a9ff3fc5ae118a5a9616cb742a26deacc235ec79beb85018b52280d887d5e \
        --control-plane --certificate-key *************************************************
     
    Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
    As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
    "kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
     
    Then you can join any number of worker nodes by running the following on each as root:
     
    kubeadm join apiserver-lb.com:6443 --token ixhv5g.n37m33eybijtb13q \
        --discovery-token-ca-cert-hash sha256:fc9a9ff3fc5ae118a5a9616cb742a26deacc235ec79beb85018b52280d887d5e
    [root@www.datang001.com k8s]# mkdir -p $HOME/.kube
    [root@www.datang001.com k8s]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    [root@www.datang001.com k8s]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
    

    Alternatively, initialize without a configuration file:

    kubeadm init --control-plane-endpoint "LOAD_BALANCER_DNS:LOAD_BALANCER_PORT" --image-repository daocloud.io/daocloud --upload-certs
    

    If initialization fails, reset and then initialize again:

    kubeadm reset
    

    Handling expired tokens: https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-token/#cmd-token-create
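    For quick reference, a fresh join command can be generated on a control-plane node once the original bootstrap token has expired (standard kubeadm commands):

    # Create a new token and print the full worker join command
    kubeadm token create --print-join-command

    # For an additional control-plane node, also upload a fresh certificate key
    kubeadm init phase upload-certs --upload-certs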

    2.3.9 Join the remaining master nodes and worker nodes to the cluster

    Join a master node to the cluster:

    kubeadm join apiserver-lb.com:6443 --token ixhv5g.n37m33eybijtb13q \
        --discovery-token-ca-cert-hash sha256:fc9a9ff3fc5ae118a5a9616cb742a26deacc235ec79beb85018b52280d887d5e \
        --control-plane --certificate-key *************************************************
     
    

    Join a worker node to the cluster:

    kubeadm join apiserver-lb.com:6443 --token ixhv5g.n37m33eybijtb13q \
        --discovery-token-ca-cert-hash sha256:fc9a9ff3fc5ae118a5a9616cb742a26deacc235ec79beb85018b52280d887d5e
    
    2.3.10 Install the keepalived and haproxy packages on the master01 node
    yum -y install keepalived haproxy
    

    The keepalived.conf configuration file and the check_apiserver.sh script on the master01 machine:

    # keepalived.conf
    ! Configuration File for keepalived
    global_defs {
       router_id www.datang001.com
    }
    
    # define the health-check script
    vrrp_script check_apiserver {
        script "/etc/keepalived/check_apiserver.sh" 
        interval 2                                  
        weight -5                                  
        fall 3                                   
        rise 2                               
    }
    
    vrrp_instance VI_1 {
        state MASTER 
        interface eth0
        virtual_router_id 50
        priority 100
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            10.176.10.250
        }
    
        # invoke the health-check script
        track_script {
            check_apiserver
        }
    }
    
    
    # health-check script check_apiserver.sh
    #!/bin/bash
    
    function check_apiserver(){
      for ((i=0;i<5;i++))
      do
        apiserver_job_id=$(pgrep kube-apiserver)
        if [[ ! -z ${apiserver_job_id} ]];then
          return
        else
          sleep 2
        fi
      done
      apiserver_job_id=0
    }
    
    # apiserver_job_id: kube-apiserver PID when running, 0 when stopped
    check_apiserver
    if [[ $apiserver_job_id -eq 0 ]];then
      /usr/bin/systemctl stop keepalived
      exit 1
    else
      exit 0
    fi
    

    Start keepalived:

    systemctl enable --now keepalived.service
    

    The haproxy configuration file haproxy.cfg on master01:

    global
        log /dev/log  local0 warning
        chroot      /var/lib/haproxy
        pidfile     /var/run/haproxy.pid
        maxconn     4000
        user        haproxy
        group       haproxy
        daemon
    
       stats socket /var/lib/haproxy/stats
    
    defaults
      mode http
      log global
      option  httplog
      option  dontlognull
            timeout connect 5000
            timeout client 50000
            timeout server 50000
    
    listen status_page
        bind 0.0.0.0:1080
        stats enable
        stats uri /haproxy-status
        stats auth    admin:nihaoma
        stats realm "Welcome to the haproxy load balancer status page"
        stats hide-version
        stats admin if TRUE
        stats refresh 5s
    
    frontend kube-apiserver
      bind *:16443
      mode tcp
      option tcplog
      default_backend kube-apiserver
    
    backend kube-apiserver
        mode tcp
        option tcplog
        option tcp-check
        balance roundrobin
        default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
        server www.datang001.com           10.172.10.20:6443  check # Replace the IP address with your own.
        server www.datang002.com           10.172.10.21:6443 check 
        server www.datang003.com           10.172.10.22:6443  check 
    

    Start haproxy:

    systemctl enable --now haproxy
    

    The keepalived.conf configuration file and the check_apiserver.sh script on the master02 machine:

    ! Configuration File for keepalived
    global_defs {
       router_id www.datang002.com
    }
    
    # define the health-check script
    vrrp_script check_apiserver {
        script "/etc/keepalived/check_apiserver.sh"
        interval 2
        weight -5
        fall 3
        rise 2
    }
    
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 50
        priority 99
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            10.172.10.250
        }
    
        # invoke the health-check script
        #track_script {
        #    check_apiserver
        #}
    }
    

    The haproxy configuration file haproxy.cfg on master02:

    global
        log /dev/log  local0 warning
        chroot      /var/lib/haproxy
        pidfile     /var/run/haproxy.pid
        maxconn     4000
        user        haproxy
        group       haproxy
        daemon
    
       stats socket /var/lib/haproxy/stats
    
    defaults
      mode http
      log global
      option  httplog
      option  dontlognull
            timeout connect 5000
            timeout client 50000
            timeout server 50000
    
    listen status_page
        bind 0.0.0.0:1080
        stats enable
        stats uri /haproxy-status
        stats auth    admin:nihaoma
        stats realm "Welcome to the haproxy load balancer status page"
        stats hide-version
        stats admin if TRUE
        stats refresh 5s
    
    frontend kube-apiserver
      bind *:16443
      mode tcp
      option tcplog
      default_backend kube-apiserver
    
    backend kube-apiserver
        mode tcp
        option tcplog
        option tcp-check
        balance roundrobin
        default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
        server www.datang001.com           10.172.10.20:6443  check # Replace the IP address with your own.
        server www.datang002.com           10.172.10.21:6443 check 
        server www.datang003.com           10.172.10.22:6443  check 
    

    The keepalived.conf configuration file and check_apiserver.sh script on the master03 machine, and the haproxy.cfg for master03, follow the same pattern as on master02 (see the sketch below).
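    A sketch of keepalived.conf for master03, assuming it simply mirrors master02 with only router_id and priority changed (check_apiserver.sh and haproxy.cfg are identical to the ones shown above for master01/master02):

    ! Configuration File for keepalived
    global_defs {
       router_id www.datang003.com
    }

    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 50
        priority 98              # lower than master01 (100) and master02 (99)
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        virtual_ipaddress {
            10.176.10.250        # the VIP from the cluster planning table
        }
    }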

    2.3.11 Install the flannel network
    [root@www.datang001.com network]$ wget https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
    [root@www.datang001.com network]$ kubectl apply -f kube-flannel.yml
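    Note (worth verifying against the manifest you downloaded): kube-flannel.yml defaults to "Network": "10.244.0.0/16" in its net-conf.json, while this cluster was initialized with podSubnet: 172.26.0.0/16, so the manifest most likely needs to be adjusted to match before it is applied, for example:

    # Align flannel's pod network with the podSubnet used at kubeadm init time
    sed -i 's#10.244.0.0/16#172.26.0.0/16#g' kube-flannel.yml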
    
    2.3.12 Check the cluster status
    [shutang@www.datang001.com network]# kubectl get nodes -o wide
    NAME              STATUS   ROLES    AGE    VERSION    INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
    www.datang001.com   Ready    <none>   160m   v1.18.20   10.172.10.20    <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
    www.datang002.com   Ready    <none>   160m   v1.18.20   10.172.10.21   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
    www.datang003.com   Ready    <none>   161m   v1.18.20   10.172.10.22   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
    www.datang004.com   Ready    master   162m   v1.18.20   10.172.10.23   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
    www.datang005.com   Ready    master   163m   v1.18.20   10.172.10.24   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.6
    www.datang006.com   Ready    master   166m   v1.18.20   10.172.10.25   <none>        CentOS Linux 7 (Core)   3.10.0-1160.62.1.el7.x86_64   docker://20.10.2
    
    
    
    # Running kubectl get cs may in practice show the first two components as Unhealthy; see the note below this output for one way to handle it
    [root@www.datang001.com ~]# kubectl get cs
    NAME                 STATUS    MESSAGE             ERROR
    controller-manager   Healthy   ok
    scheduler            Healthy   ok
    etcd-0               Healthy   {"health":"true"}
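    On kubeadm clusters of this era, the Unhealthy status for scheduler and controller-manager is commonly caused by the --port=0 flag in their static pod manifests, which disables the insecure health port that kubectl get cs probes. If that flag is present, one workaround (a sketch, to apply on each master node) is:

    # Comment out "- --port=0" in both manifests; the kubelet recreates the static pods automatically
    vi /etc/kubernetes/manifests/kube-scheduler.yaml
    vi /etc/kubernetes/manifests/kube-controller-manager.yaml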
    
    2.3.13 Change the default NodePort port range of k8s

    In a Kubernetes cluster the default NodePort range is 30000-32767. In some cases, for example because of company network policy restrictions, you may need to change it.

    Modify kube-apiserver.yaml

    When a k8s cluster is installed with kubeadm, each control-plane node has the file /etc/kubernetes/manifests/kube-apiserver.yaml. Edit this file and add --service-node-port-range=1-65535 (substitute the port range you actually want), as follows:

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 10.172.10.20:6443
      creationTimestamp: null
      labels:
        component: kube-apiserver
        tier: control-plane
      name: kube-apiserver
      namespace: kube-system
    spec:
      containers:
      - command:
        - kube-apiserver
        - --advertise-address=10.172.10.20
        - --allow-privileged=true
        - --authorization-mode=Node,RBAC
        - --client-ca-file=/etc/kubernetes/pki/ca.crt
        - --enable-admission-plugins=NodeRestriction
        - --enable-bootstrap-token-auth=true
        - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
        - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
        - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
        - --etcd-servers=https://127.0.0.1:2379
        - --insecure-port=0
        - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
        - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
        - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
        - --requestheader-allowed-names=front-proxy-client
        - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
        - --requestheader-extra-headers-prefix=X-Remote-Extra-
        - --requestheader-group-headers=X-Remote-Group
        - --requestheader-username-headers=X-Remote-User
        - --secure-port=6443
        - --service-account-key-file=/etc/kubernetes/pki/sa.pub
        - --service-cluster-ip-range=10.96.0.0/12
        - --service-node-port-range=1-65535                        # add this line
        - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
        - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
        image: k8s.gcr.io/kube-apiserver:v1.18.20
        imagePullPolicy: IfNotPresent
        ......
    

    Restart the apiserver:

    # get apiserver pod name
    export apiserver_pods=$(kubectl get pods --selector=component=kube-apiserver -n kube-system --output=jsonpath={.items..metadata.name})
    # delete apiserver pod
    kubectl delete pod $apiserver_pods -n kube-system
    

    Check that the apiserver is healthy:

    kubectl describe pod $apiserver_pods -n kube-system
     
    # Check whether there is the line we added above in the parameters of the startup command, and if so, verify that it is correct
    
    2.3.14 Extend the cluster certificate validity
    2.3.14.1 First determine the Kubernetes version used to install the cluster and the Go version used to build that release
    [root@www.datang001.com ~]# kubeadm version
    kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.20", GitCommit:"1f3e19b7beb1cc0110255668c4238ed63dadb7ad", GitTreeState:"clean", BuildDate:"2021-06-16T12:56:41Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
    
    2.3.14.2 Download the Kubernetes v1.18.20 source package and install the go1.13.15 toolchain
    [root@www.datang001.com update-cert]# wget https://github.com/kubernetes/kubernetes/archive/refs/tags/v1.18.20.tar.gz && wget https://golang.google.cn/dl/go1.13.15.linux-amd64.tar.gz
    [root@www.datang001.com update-cert]# tar -zxf v1.18.20.tar.gz && tar -zxf go1.13.15.linux-amd64.tar.gz -C /usr/local/
     
    [root@www.datang001.com update-cert]# cat > /etc/profile.d/go.sh <<'EOF'
    export PATH=$PATH:/usr/local/go/bin
    EOF
    [root@www.datang001.com update-cert]# source /etc/profile.d/go.sh
    [root@www.datang001.com update-cert]# go version
    go version go1.13.15 linux/amd64
    [root@www.datang001.com update-cert]#
    
    2.3.14.3 With k8s installed by kubeadm, all certificates are placed in /etc/kubernetes/pki; checking their expiry shows that the CA certificates are valid for 10 years, while the other component certificates are valid for one year by default
    [root@www.datang001.com ~]# kubeadm alpha certs check-expiration
    [check-expiration] Reading configuration from the cluster...
    [check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
     
    CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
    admin.conf                 May 15, 2023 06:06 UTC   364d                                    no
    apiserver                  May 15, 2023 06:06 UTC   364d            ca                      no
    apiserver-etcd-client      May 15, 2023 06:06 UTC   364d            etcd-ca                 no
    apiserver-kubelet-client   May 15, 2023 06:06 UTC   364d            ca                      no
    controller-manager.conf    May 15, 2023 06:06 UTC   364d                                    no
    etcd-healthcheck-client    May 15, 2023 06:06 UTC   364d            etcd-ca                 no
    etcd-peer                  May 15, 2023 06:06 UTC   364d            etcd-ca                 no
    etcd-server                May 15, 2023 06:06 UTC   364d            etcd-ca                 no
    front-proxy-client         May 15, 2023 06:06 UTC   364d            front-proxy-ca          no
    scheduler.conf             May 15, 2023 06:06 UTC   364d                                    no
     
    CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
    ca                      May 12, 2032 06:06 UTC   9y              no
    etcd-ca                 May 12, 2032 06:06 UTC   9y              no
    front-proxy-ca          May 12, 2032 06:06 UTC   9y              no
    
    
    2.3.14.4 Modify the certificate-validity constant in the kubeadm source

    On www.datang001.com, modify the certificate validity and rebuild kubeadm:
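    The edit itself is a one-line change to CertificateValidity in cmd/kubeadm/app/constants/constants.go of the unpacked source tree; a sketch (verify the exact line against your checkout):

    cd /home/shutang/k8s/update-cert/kubernetes-1.18.20
    # Change the default 1-year validity to 10 years before rebuilding kubeadm
    sed -i 's/CertificateValidity = time.Hour \* 24 \* 365$/CertificateValidity = time.Hour * 24 * 365 * 10/' \
        cmd/kubeadm/app/constants/constants.go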

    [root@www.datang001.com update-cert]# pwd
    /home/shutang/k8s/update-cert
    [root@www.datang001.com update-cert]# cd kubernetes-1.18.20
    [root@www.datang001.com kubernetes-1.18.20]# cd cmd/kubeadm/app/constants/
    [root@www.datang001.com constants]# cat constants |grep 10
    cat: constants: No such file or directory
    [root@www.datang001.com constants]# cat constants.go |grep 10
        CertificateValidity = time.Hour * 24 * 365 * 10
     
    [root@www.datang001.com kubernetes-1.18.20]# make WHAT=cmd/kubeadm
    +++ [0515 08:40:42] Building go targets for linux/amd64:
        ./vendor/k8s.io/code-generator/cmd/deepcopy-gen
    +++ [0515 08:40:52] Building go targets for linux/amd64:
        ./vendor/k8s.io/code-generator/cmd/defaulter-gen
    +++ [0515 08:40:59] Building go targets for linux/amd64:
        ./vendor/k8s.io/code-generator/cmd/conversion-gen
    +++ [0515 08:41:11] Building go targets for linux/amd64:
        ./vendor/k8s.io/kube-openapi/cmd/openapi-gen
    +++ [0515 08:41:22] Building go targets for linux/amd64:
        ./vendor/github.com/go-bindata/go-bindata/go-bindata
    warning: ignoring symlink /home/shutang/k8s/update-cert/kubernetes-1.18.20/_output/local/go/src/k8s.io/kubernetes
    go: warning: "k8s.io/kubernetes/vendor/github.com/go-bindata/go-bindata/..." matched no packages
    +++ [0515 08:41:24] Building go targets for linux/amd64:
        cmd/kubeadm
     
    # backup the old kubeadm
    [root@www.datang001.com kubernetes-1.18.20]# mv /usr/bin/kubeadm /usr/bin/kubeadm.old
    [root@www.datang001.com kubernetes-1.18.20]# cp _output/bin/kubeadm /usr/bin/kubeadm
    [root@www.datang001.com kubernetes-1.18.20]# cd /etc/kubernetes/pki/
    [root@www.datang001.com pki]# ls -lah
    total 60K
    drwxr-xr-x 3 root root 4.0K May 14 23:06 .
    drwxr-xr-x 4 root root  125 May 14 23:06 ..
    -rw-r--r-- 1 root root 1.3K May 14 23:06 apiserver.crt
    -rw-r--r-- 1 root root 1.1K May 14 23:06 apiserver-etcd-client.crt
    -rw------- 1 root root 1.7K May 14 23:06 apiserver-etcd-client.key
    -rw------- 1 root root 1.7K May 14 23:06 apiserver.key
    -rw-r--r-- 1 root root 1.1K May 14 23:06 apiserver-kubelet-client.crt
    -rw------- 1 root root 1.7K May 14 23:06 apiserver-kubelet-client.key
    -rw-r--r-- 1 root root 1.1K May 14 23:06 ca.crt
    -rw------- 1 root root 1.7K May 14 23:06 ca.key
    drwxr-xr-x 2 root root  162 May 14 23:06 etcd
    -rw-r--r-- 1 root root 1.1K May 14 23:06 front-proxy-ca.crt
    -rw------- 1 root root 1.7K May 14 23:06 front-proxy-ca.key
    -rw-r--r-- 1 root root 1.1K May 14 23:06 front-proxy-client.crt
    -rw------- 1 root root 1.7K May 14 23:06 front-proxy-client.key
    -rw------- 1 root root 1.7K May 14 23:06 sa.key
    -rw------- 1 root root  451 May 14 23:06 sa.pub
    [root@phx11-gliws-u23 pki]# kubeadm alpha certs renew all
    [renew] Reading configuration from the cluster...
    [renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
     
    certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
    certificate for serving the Kubernetes API renewed
    certificate the apiserver uses to access etcd renewed
    certificate for the API server to connect to kubelet renewed
    certificate embedded in the kubeconfig file for the controller manager to use renewed
    certificate for liveness probes to healthcheck etcd renewed
    certificate for etcd nodes to communicate with each other renewed
    certificate for serving etcd renewed
    certificate for the front proxy client renewed
    certificate embedded in the kubeconfig file for the scheduler manager to use renewed
    [root@phx11-gliws-u23 pki]#
    [root@phx11-gliws-u23 pki]# kubeadm alpha certs check-expiration
    [check-expiration] Reading configuration from the cluster...
    [check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
     
    CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
    admin.conf                 May 12, 2032 15:46 UTC   9y                                      no
    apiserver                  May 12, 2032 15:46 UTC   9y              ca                      no
    apiserver-etcd-client      May 12, 2032 15:46 UTC   9y              etcd-ca                 no
    apiserver-kubelet-client   May 12, 2032 15:46 UTC   9y              ca                      no
    controller-manager.conf    May 12, 2032 15:46 UTC   9y                                      no
    etcd-healthcheck-client    May 12, 2032 15:46 UTC   9y              etcd-ca                 no
    etcd-peer                  May 12, 2032 15:46 UTC   9y              etcd-ca                 no
    etcd-server                May 12, 2032 15:46 UTC   9y              etcd-ca                 no
    front-proxy-client         May 12, 2032 15:46 UTC   9y              front-proxy-ca          no
    scheduler.conf             May 12, 2032 15:46 UTC   9y                                      no
     
    CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
    ca                      May 12, 2032 06:06 UTC   9y              no
    etcd-ca                 May 12, 2032 06:06 UTC   9y              no
    front-proxy-ca          May 12, 2032 06:06 UTC   9y              no
    
    2.3.15 Change the kube-proxy proxy mode to ipvs
    2.3.15.1 Check the current proxy mode of the kube-proxy component
    [root@www.datang001.com ~]# kubectl get pods -n kube-system |grep proxy
    kube-proxy-2qc7r                          1/1     Running   13         182d
    kube-proxy-6nfzm                          1/1     Running   13         182d
    kube-proxy-frwcg                          1/1     Running   15         182d
    kube-proxy-l6xg2                          1/1     Running   13         182d
    kube-proxy-r96hz                          1/1     Running   13         182d
    kube-proxy-sgwfh                          1/1     Running   13         182d
    

    Check the logs of any one of these pods:

    [shutang@phx11-gliws-u23 ~]$ kubectl logs -f kube-proxy-2qc7r -n kube-system
    I1021 04:57:16.251139       1 node.go:136] Successfully retrieved node IP: 10.222.175.237
    I1021 04:57:16.251166       1 server_others.go:259] Using iptables Proxier.
    I1021 04:57:16.251599       1 server.go:583] Version: v1.18.20
    I1021 04:57:16.252018       1 conntrack.go:52] Setting nf_conntrack_max to 131072
    I1021 04:57:16.252241       1 config.go:133] Starting endpoints config controller
    I1021 04:57:16.252270       1 shared_informer.go:223] Waiting for caches to sync for endpoints config
    I1021 04:57:16.252300       1 config.go:315] Starting service config controller
    I1021 04:57:16.252306       1 shared_informer.go:223] Waiting for caches to sync for service config
    I1021 04:57:16.352464       1 shared_informer.go:230] Caches are synced for endpoints config
    I1021 04:57:16.352528       1 shared_informer.go:230] Caches are synced for service config
    E1103 07:57:22.505665       1 graceful_termination.go:89] Try delete
    

    At this point kube-proxy is using its default proxy mode, iptables.

    2.3.15.2 Set the kube-proxy proxy mode to ipvs

    Make sure the ipvs kernel modules are loaded:

    [shutang@www.datang001.com ~]# lsmod |grep ip_vs
    ip_vs_sh               12688  0
    ip_vs_wrr              12697  0
    ip_vs_rr               12600  167
    ip_vs                 145458  173 ip_vs_rr,ip_vs_sh,ip_vs_wrr
    nf_conntrack          139264  7 ip_vs,nf_nat,nf_nat_ipv4,xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_netlink,nf_conntrack_ipv4
    libcrc32c              12644  4 xfs,ip_vs,nf_nat,nf_conntrack
    

    If the ipvs modules are not loaded, run the following:

    cat > /etc/sysconfig/modules/ipvs.modules <<EOF
    #!/bin/bash
    modprobe -- ip_vs
    modprobe -- ip_vs_rr
    modprobe -- ip_vs_wrr
    modprobe -- ip_vs_sh
    modprobe -- nf_conntrack
    EOF
    

    Then run:

    chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
    

    Modify the kube-proxy ConfigMap:
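    The ConfigMap created by kubeadm is named kube-proxy and lives in the kube-system namespace; it can be opened for editing with:

    kubectl edit configmap kube-proxy -n kube-system

    Then set the mode field of the embedded KubeProxyConfiguration as shown below: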

    ...
    ...
    ...
     kind: KubeProxyConfiguration
        metricsBindAddress: ""
        mode: "ipvs" # change this; it is empty (iptables) by default
        nodePortAddresses: null
        oomScoreAdj: null
        portRange: ""
    ...
    ...
    ...
    

    Restart kube-proxy:

    kubectl rollout restart daemonset kube-proxy -n kube-system 
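    After the rollout, the new mode can be confirmed from a recreated pod's log (it should report "Using ipvs Proxier") or, if the ipvsadm tool is installed, by listing the virtual servers:

    kubectl logs -n kube-system $(kubectl get pod -n kube-system -l k8s-app=kube-proxy -o jsonpath='{.items[0].metadata.name}') | grep -i proxier
    ipvsadm -Ln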
    

    2.4 Install the cluster web management tool KubeSphere
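    One commonly documented way to install KubeSphere on an existing cluster is to apply the ks-installer manifests; the release version below is an assumption, and its compatibility with Kubernetes 1.18 should be confirmed before use:

    kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.1.1/kubesphere-installer.yaml
    kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.1.1/cluster-configuration.yaml

    # Follow the installer log until the console address and default credentials are printed
    kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath='{.items[0].metadata.name}') -f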

    3 Troubleshooting Summary
