Cloud Native | Kubernetes | Deploying a Highly Available Cluster with kubeadm (Part 2): kube-apiserver HA + External etcd Cluster


    Preface:

    The previous article, 云原生|kubernetes|kubeadm部署高可用集群(一)使用外部etcd集群 (on the 晚风_END CSDN blog), described how to use an external, standalone etcd cluster when deploying a cluster with kubeadm. That decouples the overall architecture somewhat and improves the cluster's capacity. This article builds on that and shows how to make kube-apiserver highly available as well, producing a Kubernetes cluster that is fully usable in production.

    Let's get straight to the practical part.

    1. Cluster Environment Overview

    This walkthrough uses haproxy + keepalived to build a load balancer in front of kube-apiserver. Why not an nginx + keepalived stack? Mainly because the host machine is not powerful enough, so there are not enough virtual machines; in addition, when nginx is used as the load balancer it occupies the cluster's default port 6443, which causes unnecessary trouble, while haproxy is a more specialized proxy.

    Inside the Kubernetes cluster there are three kube-apiservers plus an external, standalone etcd cluster, so every component is highly available and the setup can be used in real production work.

    The Kubernetes cluster consists of three master nodes and one worker node. Three masters are used because of the odd-number requirement; two or four would not show the high-availability characteristics as clearly. Worker nodes are easy to scale out, and adding more of them in production is straightforward, so only one worker node is used here.

    Below is the overall configuration of the cluster.

    Kubernetes high-availability cluster configuration table (columns: server IP, cluster role, hardware, installed components, component versions, deployment method):

    192.168.217.19  (control-plane, master)
        Hardware: CPU 2c2u, memory 4G, disk 100G, single partition /
        Components: kubelet, kubeadm (the other key Kubernetes components run as static pods); load balancer: haproxy and keepalived; etcd cluster member; docker runtime
        Versions: kubernetes-1.22.2, docker-ce-20.10.5, haproxy-1.5.18-9, keepalived-1.3.5, etcd 3.4.9
        Deployment: docker and etcd from binaries; everything else via yum

    192.168.217.20  (control-plane, master)
        Same hardware, components, versions, and deployment method as 192.168.217.19.

    192.168.217.21  (control-plane, master)
        Same hardware, components, versions, and deployment method as 192.168.217.19.

    192.168.217.22  (node, worker)
        Hardware: CPU 2c2u, memory 4G, disk 100G, single partition /
        Components: kubelet, kubeadm (the other key Kubernetes components run as static pods); docker runtime
        Versions: kubernetes-1.22.2, docker-ce-20.10.5, etcd 3.4.9
        Deployment: docker from binaries; everything else via yum
        Note: the etcd cluster has three members, so etcd itself is not installed on this node.

    VIP: 192.168.217.100  (virtual IP on the load-balancer nodes, created by keepalived)

    2. Building the Base Cluster Environment

    (1) Configure the host names:

    k8s-master1

    [root@master1 ~]# cat /etc/hostname
    master1

    k8s-master2

    [root@master2 manifests]# cat /etc/hostname
    master2

    k8s-master3

    [root@master3 ~]# cat /etc/hostname
    master3

    node1

    [root@node1 ~]# cat /etc/hostname
    node1

    All servers share the same /etc/hosts file:

    [root@master1 ~]# cat /etc/hosts
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    192.168.217.19 master1 k8s-master1
    192.168.217.20 master2 k8s-master2
    192.168.217.21 master3 k8s-master3
    192.168.217.22 node1 k8s-node1

    (2) Time server

    Install ntp on all nodes and enable and start the ntpd service:

    yum install ntp -y
    systemctl enable ntpd && systemctl start ntpd

    192.168.217.19 acts as the time server; the key part of its configuration file is:

    # Use public servers from the pool.ntp.org project.
    # Please consider joining the pool (http://www.pool.ntp.org/join.html).
    server 127.127.1.0 prefer
    fudge 127.127.1.0 stratum 10

    Time-server configuration on the other nodes:

    # Please consider joining the pool (http://www.pool.ntp.org/join.html).
    server 192.168.217.19

    Output like the following on any node shows that the time server is working:

    [root@master2 ~]# ntpq -p
    remote refid st t when poll reach delay offset jitter
    ==============================================================================
    *master1 LOCAL(0) 11 u 26 128 377 0.390 0.014 0.053
    [root@master2 ~]# ntpstat
    synchronised to NTP server (192.168.217.19) at stratum 12
    time correct to within 23 ms
    polling server every 128 s

    (3) Disable the firewall

    systemctl disable firewalld && systemctl stop firewalld

    (4) Disable selinux

    Edit the /etc/selinux/config file:

    [root@master2 ~]# cat /etc/selinux/config
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=disabled
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted

    Output like this shows that it is disabled:

    [root@master2 ~]# getenforce
    Disabled
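    For reference, the same change can be made from the shell instead of editing the file by hand. This is a minimal sketch that assumes the stock CentOS 7 config file with SELINUX=enforcing as the current value; a reboot is still required before getenforce reports Disabled:

    setenforce 0                                                           # switch to permissive immediately, no reboot needed
    sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config   # persist the change for the next boot
    getenforce                                                             # verify again after rebooting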

    (5) Disable swap

    This is very basic, so it is not covered in detail; a quick sketch is included below for completeness.
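    A minimal sketch of what disabling swap usually looks like on every node (the sed pattern is an assumption and should be checked against the actual swap entry in /etc/fstab):

    swapoff -a                               # turn swap off for the running system
    cp /etc/fstab /etc/fstab.bak             # keep a backup before editing
    sed -ri '/\sswap\s/s/^/#/' /etc/fstab    # comment out the swap entry so it stays off after a reboot
    free -h                                  # the Swap line should now show 0B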

    (6) Passwordless SSH between the servers

    This is also not covered in detail here; how key-based SSH works and how to set it up is explained in my blog post 科普扫盲---ssh免密登陆(ssh的一些小秘密)_晚风_END的博客-CSDN博客_ssh免密不带端口号. A rough sketch follows below.

    Passwordless SSH is configured because the cluster initialization later needs the servers to transfer certificates to each other automatically.
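    The gist of the setup, for anyone who does not want to open the linked post (run on master1 and repeat on any other node that needs to push files; the host names are the ones from /etc/hosts above):

    ssh-keygen -t rsa -b 2048 -N '' -f ~/.ssh/id_rsa    # generate a key pair without a passphrase
    for host in master1 master2 master3 node1; do
        ssh-copy-id root@$host                          # push the public key to every node (asks for the password once)
    done
    ssh root@master2 hostname                           # should now log in without a password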

    (7) Set up the docker environment

    See docker的离线安装以及本地化配置_晚风_END的博客-CSDN博客. Be sure to complete the local configuration described there, otherwise pulling images later will become a nightmare; an example of that kind of configuration is sketched below.

    After deployment, test the docker environment as described in that tutorial.
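    The linked post has the details; as a reminder of what that "local configuration" typically involves, here is a hedged example of /etc/docker/daemon.json with a registry mirror and the systemd cgroup driver. The mirror URL is only a placeholder; use whichever mirror or local registry is actually available:

    cat /etc/docker/daemon.json
    {
      "registry-mirrors": ["https://registry.docker-cn.com"],
      "exec-opts": ["native.cgroupdriver=systemd"],
      "log-driver": "json-file",
      "log-opts": { "max-size": "100m" }
    }
    systemctl restart docker
    docker info | grep -i cgroup    # should report: Cgroup Driver: systemd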

    (8) Build the etcd cluster

    See centos7操作系统 ---ansible剧本离线快速部署etcd集群_晚风_END的博客-CSDN博客_etcd离线安装. Deploy the etcd cluster exactly as described there; the cluster initialization later in this article is based on that etcd cluster.

    After deployment, test the etcd cluster as described in that tutorial; a quick health check is also shown below.
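    Before moving on it is worth confirming that the external etcd cluster answers over TLS. A quick check, assuming etcdctl is on the PATH and the certificates live under /opt/etcd/ssl as in the linked tutorial (the same files are copied into /etc/kubernetes/pki/etcd later in this article):

    ETCDCTL_API=3 etcdctl \
      --endpoints="https://192.168.217.19:2379,https://192.168.217.20:2379,https://192.168.217.21:2379" \
      --cacert=/opt/etcd/ssl/ca.pem \
      --cert=/opt/etcd/ssl/server.pem \
      --key=/opt/etcd/ssl/server-key.pem \
      endpoint health    # every endpoint should report "is healthy"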

    (9) Install kubelet, kubectl, and kubeadm

    See 云原生|kubernetes|kubeadm五分钟内部署完成集群(完全离线部署---适用于centos7全系列)_晚风_END的博客-CSDN博客.
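    The linked post uses an offline approach. For an online install, a common alternative (an assumption here, not what the linked post does) is the Alibaba Cloud Kubernetes yum repository, pinned to the 1.22.2 version used throughout this article:

    cat /etc/yum.repos.d/kubernetes.repo
    [kubernetes]
    name=Kubernetes
    baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
    enabled=1
    gpgcheck=0

    yum install -y kubelet-1.22.2 kubeadm-1.22.2 kubectl-1.22.2 --disableexcludes=kubernetes
    systemctl enable kubelet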




    In fact, the troublesome part of building a cluster is exactly this base environment work; it takes time and effort, but with a good start the rest goes much more smoothly.

    3. Creating the Load Balancer (HAProxy + Keepalived)

    When there are multiple control planes there are also multiple kube-apiservers, and tools such as Nginx + Keepalived or HAProxy + Keepalived can be used to load-balance them and make them highly available.
    The HAProxy + Keepalived combination is recommended here, because HAProxy provides higher-performance layer-4 load balancing; it is also what most people choose.

    Load balancer architecture diagram (not reproduced here).

    Install haproxy + keepalived on all three master nodes:

    yum install haproxy keepalived -y

    The haproxy configuration (it binds port 9443 and lists the three apiservers as the backend; copy this configuration file to the other two master nodes):

    cat /etc/haproxy/haproxy.cfg
    #---------------------------------------------------------------------
    # Example configuration for a possible web application. See the
    # full configuration options online.
    #
    # http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
    #
    #---------------------------------------------------------------------
    #---------------------------------------------------------------------
    # Global settings
    #---------------------------------------------------------------------
    global
        # to have these messages end up in /var/log/haproxy.log you will
        # need to:
        #
        # 1) configure syslog to accept network log events. This is done
        #    by adding the '-r' option to the SYSLOGD_OPTIONS in
        #    /etc/sysconfig/syslog
        #
        # 2) configure local2 events to go to the /var/log/haproxy.log
        #    file. A line like the following can be added to
        #    /etc/sysconfig/syslog
        #
        #    local2.* /var/log/haproxy.log
        #
        log 127.0.0.1 local2
        chroot /var/lib/haproxy
        pidfile /var/run/haproxy.pid
        maxconn 4000
        user haproxy
        group haproxy
        daemon
        # turn on stats unix socket
        stats socket /var/lib/haproxy/stats
    #---------------------------------------------------------------------
    # common defaults that all the 'listen' and 'backend' sections will
    # use if not designated in their block
    #---------------------------------------------------------------------
    defaults
        mode http
        log global
        option httplog
        option dontlognull
        option http-server-close
        option forwardfor except 127.0.0.0/8
        option redispatch
        retries 3
        timeout http-request 10s
        timeout queue 1m
        timeout connect 10s
        timeout client 1m
        timeout server 1m
        timeout http-keep-alive 10s
        timeout check 10s
        maxconn 3000
    #---------------------------------------------------------------------
    # main frontend which proxys to the backends
    #---------------------------------------------------------------------
    frontend apiserver
        bind *:9443
        mode tcp
        option tcplog
        default_backend apiserver
    #---------------------------------------------------------------------
    # round robin balancing between the three kube-apiserver backends
    #---------------------------------------------------------------------
    backend apiserver
        balance roundrobin
        server k8s-master1 192.168.217.19:6443 check
        server k8s-master2 192.168.217.20:6443 check
        server k8s-master3 192.168.217.21:6443 check
    #---------------------------------------------------------------------
    # haproxy statistics page
    #---------------------------------------------------------------------
    listen admin_stats
        bind 0.0.0.0:9188                     # address and port the stats page binds to
        mode http                             # monitoring mode
        log 127.0.0.1 local0 err              # error log level
        stats refresh 30s
        stats uri /haproxy-status             # stats page URI; IP:9188/haproxy-status is the login URL
        stats realm welcome login\ Haproxy
        stats auth admin:admin123             # username and password for the web page
        stats hide-version
        stats admin if TRUE

    Copy the haproxy configuration file to the other masters:

    scp /etc/haproxy/haproxy.cfg master2:/etc/haproxy/
    scp /etc/haproxy/haproxy.cfg master3:/etc/haproxy/
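    Before starting the service it does not hurt to let haproxy validate the copied configuration on each node (a quick sanity check, not a required step):

    haproxy -c -f /etc/haproxy/haproxy.cfg    # prints "Configuration file is valid" when the syntax is correct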

    Open any browser and go to <server IP>:9188 to reach the haproxy web interface (user: admin, password: admin123); the screenshot is not reproduced here.

    Keepalived configuration:

    The keepalived configuration file on the master1 node:

    [root@master1 ~]# cat /etc/keepalived/keepalived.conf
    ! /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived
    global_defs {
        router_id 192.168.217.19
    }
    vrrp_script check_haproxy {
        script "bash -c 'if [ $(ss -alnupt |grep 9443|wc -l) -eq 0 ];then exit 1;fi'"
        interval 3
        weight -2
        fall 3
        rise 3
    }
    vrrp_instance VI_1 {
        state MASTER
        interface ens33
        virtual_router_id 50
        priority 100
        authentication {
            auth_type PASS
            auth_pass 123456
        }
        virtual_ipaddress {
            192.168.217.100
        }
        track_script {
            check_haproxy
        }
    }

    The keepalived configuration file on the master2 node:

    [root@master2 ~]# cat /etc/keepalived/keepalived.conf
    ! /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived
    global_defs {
        router_id 192.168.217.20
    }
    vrrp_script check_haproxy {
        script "bash -c 'if [ $(ss -alnupt |grep 9443|wc -l) -eq 0 ];then exit 1;fi'"
        interval 3
        weight -2
        fall 3
        rise 3
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface ens33
        virtual_router_id 50
        priority 99
        authentication {
            auth_type PASS
            auth_pass 123456
        }
        virtual_ipaddress {
            192.168.217.100
        }
        track_script {
            check_haproxy
        }
    }

    The keepalived configuration file on the master3 node:

    [root@master3 ~]# cat /etc/keepalived/keepalived.conf
    ! /etc/keepalived/keepalived.conf
    ! Configuration File for keepalived
    global_defs {
        router_id 192.168.217.21
    }
    vrrp_script check_haproxy {
        script "bash -c 'if [ $(ss -alnupt |grep 9443|wc -l) -eq 0 ];then exit 1;fi'"
        interval 3
        weight -2
        fall 3
        rise 3
    }
    vrrp_instance VI_1 {
        state BACKUP
        interface ens33
        virtual_router_id 50
        priority 98
        authentication {
            auth_type PASS
            auth_pass 123456
        }
        virtual_ipaddress {
            192.168.217.100
        }
        track_script {
            check_haproxy
        }
    }

    Enable and start both services:

    systemctl enable keepalived haproxy && systemctl start keepalived haproxy

    Testing the load balancer:

    Check whether the port is listening:

    [root@master1 ~]# netstat -antup |grep 9443
    tcp 0 0 0.0.0.0:9443 0.0.0.0:* LISTEN 1011/haproxy

    Check the VIP on master1 (why master1? because its priority is 100):

    [root@master1 ~]# ip a
    1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
    valid_lft forever preferred_lft forever
    2: ens33: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:5c:1e:66 brd ff:ff:ff:ff:ff:ff
    inet 192.168.217.19/24 brd 192.168.217.255 scope global ens33
    valid_lft forever preferred_lft forever
    inet 192.168.217.100/32 scope global ens33
    valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe5c:1e66/64 scope link
    valid_lft forever preferred_lft forever
    3: docker0: mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:f3:9c:2f:92 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
    valid_lft forever preferred_lft forever

    Stop the haproxy service on master1 and check whether the VIP fails over to master2:

    [root@master1 ~]# systemctl stop haproxy

    Check on master2:

    [root@master2 ~]# ip a
    1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
    valid_lft forever preferred_lft forever
    2: ens33: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:86:30:e6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.217.20/24 brd 192.168.217.255 scope global ens33
    valid_lft forever preferred_lft forever
    inet 192.168.217.100/32 scope global ens33
    valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe86:30e6/64 scope link
    valid_lft forever preferred_lft forever
    3: docker0: mtu 1500 qdisc noqueue state DOWN
    link/ether 02:42:8a:27:b6:7f brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
    valid_lft forever preferred_lft forever

    The VIP failed over successfully. This step must succeed before moving on.

    Restore the haproxy service on master1; the VIP comes back very quickly:

    [root@master1 ~]# systemctl start haproxy
    [root@master1 ~]# ip a
    1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
    valid_lft forever preferred_lft forever
    2: ens33: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
    link/ether 00:0c:29:5c:1e:66 brd ff:ff:ff:ff:ff:ff
    inet 192.168.217.19/24 brd 192.168.217.255 scope global ens33
    valid_lft forever preferred_lft forever
    inet 192.168.217.100/32 scope global ens33
    valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe5c:1e66/64 scope link
    valid_lft forever preferred_lft forever

    Look at master1's system log to see the VIP failover process:

    Oct 27 19:01:28 master1 Keepalived_vrrp[1037]: /usr/bin/bash -c 'if [ $(ss -alnupt |grep 9443|wc -l) -eq 0 ];then exit 1;fi' exited with status 1
    Oct 27 19:01:31 master1 Keepalived_vrrp[1037]: /usr/bin/bash -c 'if [ $(ss -alnupt |grep 9443|wc -l) -eq 0 ];then exit 1;fi' exited with status 1
    Oct 27 19:01:34 master1 Keepalived_vrrp[1037]: /usr/bin/bash -c 'if [ $(ss -alnupt |grep 9443|wc -l) -eq 0 ];then exit 1;fi' exited with status 1
    Oct 27 19:01:37 master1 Keepalived_vrrp[1037]: /usr/bin/bash -c 'if [ $(ss -alnupt |grep 9443|wc -l) -eq 0 ];then exit 1;fi' exited with status 1
    Oct 27 19:01:38 master1 systemd: Started HAProxy Load Balancer.
    Oct 27 19:01:38 master1 systemd: Starting HAProxy Load Balancer...
    Oct 27 19:01:38 master1 haproxy-systemd-wrapper: [WARNING] 299/190138 (31426) : config : 'option forwardfor' ignored for frontend 'apiserver' as it requires HTTP mode.
    Oct 27 19:01:46 master1 Keepalived_vrrp[1037]: VRRP_Script(check_haproxy) succeeded
    Oct 27 19:01:47 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) Changing effective priority from 98 to 100
    Oct 27 19:01:47 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) forcing a new MASTER election
    Oct 27 19:01:48 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) Transition to MASTER STATE
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) Entering MASTER STATE
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) setting protocol VIPs.
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: Sending gratuitous ARP on ens33 for 192.168.217.100
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens33 for 192.168.217.100
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: Sending gratuitous ARP on ens33 for 192.168.217.100
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: Sending gratuitous ARP on ens33 for 192.168.217.100
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: Sending gratuitous ARP on ens33 for 192.168.217.100
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: Sending gratuitous ARP on ens33 for 192.168.217.100
    Oct 27 19:01:50 master1 ntpd[757]: Listen normally on 10 ens33 192.168.217.100 UDP 123

    The election-related part of the log reads like this:

    • The check script succeeds
    • Because the script succeeds, the effective priority jumps from 98 back to 100
    • A new MASTER election is forced
    • master1 wins the election
    • keepalived on master1 enters the MASTER state
    Oct 27 19:01:46 master1 Keepalived_vrrp[1037]: VRRP_Script(check_haproxy) succeeded
    Oct 27 19:01:47 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) Changing effective priority from 98 to 100
    Oct 27 19:01:47 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) forcing a new MASTER election
    Oct 27 19:01:48 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) Transition to MASTER STATE
    Oct 27 19:01:49 master1 Keepalived_vrrp[1037]: VRRP_Instance(VI_1) Entering MASTER STATE

    4. Distributing the etcd Cluster Certificates

    To use an external etcd cluster, the etcd cluster must be installed first, and its certificates must then be distributed to every node; in this example that means all four nodes. The commands are as follows.

    Create the target directory on all nodes (run on all four servers):

    mkdir -p /etc/kubernetes/pki/etcd/

    On the master1 node, copy the etcd certificates into the directory created above:

    cp /opt/etcd/ssl/ca.pem /etc/kubernetes/pki/etcd/
    cp /opt/etcd/ssl/server.pem /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem
    cp /opt/etcd/ssl/server-key.pem /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem

    Distribute the etcd certificates to the other nodes:

    scp /etc/kubernetes/pki/etcd/* master2:/etc/kubernetes/pki/etcd/
    scp /etc/kubernetes/pki/etcd/* master3:/etc/kubernetes/pki/etcd/
    scp /etc/kubernetes/pki/etcd/* node1:/etc/kubernetes/pki/etcd/

    5. Initializing the Cluster

    kubeadm can deploy a cluster in two ways: with command-line flags, or with a config file. This article uses the config-file approach.

    Generate the template file on the 192.168.217.19 server:

    kubeadm config print init-defaults > kubeadm-init-ha.yaml

    Edit the template file; the final content should look like this:

    [root@master1 ~]# cat kubeadm-init-ha.yaml
    apiVersion: kubeadm.k8s.io/v1beta3
    bootstrapTokens:
    - groups:
      - system:bootstrappers:kubeadm:default-node-token
      token: abcdef.0123456789abcdef
      ttl: "0"
      usages:
      - signing
      - authentication
    kind: InitConfiguration
    localAPIEndpoint:
      advertiseAddress: 192.168.217.19
      bindPort: 6443
    nodeRegistration:
      criSocket: /var/run/dockershim.sock
      imagePullPolicy: IfNotPresent
      name: master1
      taints: null
    ---
    controlPlaneEndpoint: "192.168.217.100"
    apiServer:
      timeoutForControlPlane: 4m0s
    apiVersion: kubeadm.k8s.io/v1beta3
    certificatesDir: /etc/kubernetes/pki
    clusterName: kubernetes
    controllerManager: {}
    dns: {}
    etcd:
      external:
        endpoints:    # custom etcd cluster endpoints below
        - https://192.168.217.19:2379
        - https://192.168.217.20:2379
        - https://192.168.217.21:2379
        caFile: /etc/kubernetes/pki/etcd/ca.pem
        certFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem
        keyFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem
    imageRepository: registry.aliyuncs.com/google_containers
    kind: ClusterConfiguration
    kubernetesVersion: 1.22.2
    networking:
      dnsDomain: cluster.local
      podSubnet: "10.244.0.0/16"
      serviceSubnet: "10.96.0.0/12"
    scheduler: {}

    Notes on the kubeadm-init-ha.yaml configuration file:

    name: must match the hostname set in the shared /etc/hosts; since initialization is planned on the master1 node, this is master1's hostname.
    controlPlaneEndpoint: the VIP address. The port is optional, but note that when no port is given kubeadm falls back to the default bindPort 6443, so API traffic goes to VIP:6443 (the apiserver on whichever node currently holds the VIP) rather than through the haproxy frontend on 9443; to send it through haproxy instead, write "192.168.217.100:9443" here.
    imageRepository: the Google registry k8s.gcr.io is not reachable from mainland China, so the Alibaba Cloud mirror registry.aliyuncs.com/google_containers is used (the images can also be pre-pulled, as shown below).
    podSubnet: the address range must match the network plugin deployed later; flannel will be used, so it is set to 10.244.0.0/16.
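    Since imageRepository points at the Alibaba Cloud mirror, the control-plane images can be pulled ahead of time with the same config file, which makes the actual initialization faster. This is an optional step, not part of the original walkthrough:

    kubeadm config images list --config kubeadm-init-ha.yaml    # show the images the init step will need
    kubeadm config images pull --config kubeadm-init-ha.yaml    # pre-pull them on master1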

     

    The initialization command and its output log (run on the master1 node):

    [root@master1 ~]# kubeadm init --config=kubeadm-init-ha.yaml --upload-certs
    [init] Using Kubernetes version: v1.22.2
    [preflight] Running pre-flight checks
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "ca" certificate and key
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master1] and IPs [10.96.0.1 192.168.217.19 192.168.217.100]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "front-proxy-ca" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] External etcd mode: Skipping etcd/ca certificate authority generation
    [certs] External etcd mode: Skipping etcd/server certificate generation
    [certs] External etcd mode: Skipping etcd/peer certificate generation
    [certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
    [certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
    [certs] Generating "sa" key and public key
    [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
    [kubeconfig] Writing "admin.conf" kubeconfig file
    [kubeconfig] Writing "kubelet.conf" kubeconfig file
    [kubeconfig] Writing "controller-manager.conf" kubeconfig file
    [kubeconfig] Writing "scheduler.conf" kubeconfig file
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Starting the kubelet
    [control-plane] Using manifest folder "/etc/kubernetes/manifests"
    [control-plane] Creating static Pod manifest for "kube-apiserver"
    [control-plane] Creating static Pod manifest for "kube-controller-manager"
    [control-plane] Creating static Pod manifest for "kube-scheduler"
    [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
    [apiclient] All control plane components are healthy after 11.508717 seconds
    [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
    [kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
    [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
    [upload-certs] Using certificate key:
    d14a06a73a7aabf5cce805880c085f7427247d507ba01bf2d2f21c294aeb8643
    [mark-control-plane] Marking the node master1 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
    [mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
    [bootstrap-token] Using token: abcdef.0123456789abcdef
    [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
    [bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
    [bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
    [bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
    [bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
    [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
    [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy
    Your Kubernetes control-plane has initialized successfully!
    To start using your cluster, you need to run the following as a regular user:
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    Alternatively, if you are the root user, you can run:
      export KUBECONFIG=/etc/kubernetes/admin.conf
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    You can now join any number of the control-plane node running the following command on each as root:
      kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e \
        --control-plane --certificate-key d14a06a73a7aabf5cce805880c085f7427247d507ba01bf2d2f21c294aeb8643
    Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
    As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
    "kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
    Then you can join any number of worker nodes by running the following on each as root:
      kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e

    Notice that there are two join commands, and they serve different purposes.

    The first join command:

    This one is for joining master nodes, that is, nodes that run the apiserver; in this example it should be used on master2 and master3.

    kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
      --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e \
      --control-plane --certificate-key d14a06a73a7aabf5cce805880c085f7427247d507ba01bf2d2f21c294aeb8643

    Running this join command on master2 produces the following output log:

    [root@master2 ~]# kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e --control-plane --certificate-key d14a06a73a7aabf5cce805880c085f7427247d507ba01bf2d2f21c294aeb8643
    [preflight] Running pre-flight checks
    [preflight] Reading configuration from the cluster...
    [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
    [preflight] Running pre-flight checks before initializing the new control plane instance
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master2] and IPs [10.96.0.1 192.168.217.20 192.168.217.100]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
    [certs] Using the existing "sa" key
    [kubeconfig] Generating kubeconfig files
    [kubeconfig] Using kubeconfig folder "/etc/kubernetes"
    [kubeconfig] Writing "admin.conf" kubeconfig file
    [kubeconfig] Writing "controller-manager.conf" kubeconfig file
    [kubeconfig] Writing "scheduler.conf" kubeconfig file
    [control-plane] Using manifest folder "/etc/kubernetes/manifests"
    [control-plane] Creating static Pod manifest for "kube-apiserver"
    [control-plane] Creating static Pod manifest for "kube-controller-manager"
    [control-plane] Creating static Pod manifest for "kube-scheduler"
    [check-etcd] Skipping etcd check in external mode
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Starting the kubelet
    [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
    [control-plane-join] using external etcd - no local stacked instance added
    The 'update-status' phase is deprecated and will be removed in a future release. Currently it performs no operation
    [mark-control-plane] Marking the node master2 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
    [mark-control-plane] Marking the node master2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
    This node has joined the cluster and a new control plane instance was created:
    * Certificate signing request was sent to apiserver and approval was received.
    * The Kubelet was informed of the new secure connection details.
    * Control plane (master) label and taint were applied to the new node.
    * The Kubernetes control plane instances scaled up.
    To start administering your cluster from this node, you need to run the following as a regular user:
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    Run 'kubectl get nodes' to see this node join the cluster.

    The key lines:

    At this point master2 has joined the cluster in the control-plane role and has been given the NoSchedule taint.

    [control-plane-join] using external etcd - no local stacked instance added
    The 'update-status' phase is deprecated and will be removed in a future release. Currently it performs no operation
    [mark-control-plane] Marking the node master2 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
    [mark-control-plane] Marking the node master2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

    Run the same command on master3; the output is essentially the same, so it is not repeated here.
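    As a side note, if you ever want ordinary workloads to run on the masters as well, the taint mentioned above can be inspected and removed with kubectl. This is not something this deployment does, and it is generally not recommended for production; a hedged example:

    kubectl describe node master2 | grep -i taint                             # shows node-role.kubernetes.io/master:NoSchedule
    kubectl taint nodes master2 node-role.kubernetes.io/master:NoSchedule-    # the trailing "-" removes the taint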

    The worker-node join command:

    kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
      --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e

    Run this command on node1. This example has only one worker node; if more worker nodes are added later, the same command shown above is used on them as well.


    The command's output log:

    [root@node1 ~]# kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e
    [preflight] Running pre-flight checks
    [preflight] Reading configuration from the cluster...
    [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
    [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
    [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
    [kubelet-start] Starting the kubelet
    [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
    This node has joined the cluster:
    * Certificate signing request was sent to apiserver and a response was received.
    * The Kubelet was informed of the new secure connection details.
    Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

    The key lines:

    The node joined the cluster; it sent a certificate signing request to the apiserver, the request was approved, and a response was returned to the node.

    This node has joined the cluster:
    * Certificate signing request was sent to apiserver and a response was received.
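    The initialization output also reminds you to deploy a pod network add-on. The pod listing later in this article shows kube-flannel-ds pods, which matches the podSubnet of 10.244.0.0/16 configured earlier; the flannel deployment step itself is not shown in the original text. A sketch of the usual approach (the manifest URL is the commonly used upstream one and would need to be replaced with a local copy in an offline environment):

    kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
    kubectl -n kube-system get pods -l app=flannel -o wide    # wait until every node runs a flannel pod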



    Appendix 1: The kubeconfig environment after kubeadm finishes

    The master initialization log above contains the following output:

    To start using your cluster, you need to run the following as a regular user:
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    Alternatively, if you are the root user, you can run:
      export KUBECONFIG=/etc/kubernetes/admin.conf

    It is recommended to run these commands on only one node. For example, if cluster administration is done only on the master1 node, run the following on master1:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    If cluster administration should instead be done from a worker node, copy the /etc/kubernetes/admin.conf file from any master node to the same directory on the worker node; in this example that is node1.

    Copy the file from master1:

    [root@master1 ~]# scp /etc/kubernetes/admin.conf node1:/etc/kubernetes/
    admin.conf

    Then run the following commands on node1:

    [root@node1 ~]# mkdir -p $HOME/.kube
    [root@node1 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    [root@node1 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config

    Cluster administration now works from node1:

    [root@node1 ~]# kubectl get no,po -A
    NAME STATUS ROLES AGE VERSION
    node/master1 Ready control-plane,master 22h v1.22.2
    node/master2 Ready control-plane,master 22h v1.22.2
    node/master3 Ready control-plane,master 21h v1.22.2
    node/node1 Ready 21h v1.22.2
    NAMESPACE NAME READY STATUS RESTARTS AGE
    default pod/dns-test 0/1 Completed 0 6h11m
    default pod/nginx-7fb9867-jfg2r 1/1 Running 2 6h6m
    kube-system pod/coredns-7f6cbbb7b8-7c85v 1/1 Running 4 20h
    kube-system pod/coredns-7f6cbbb7b8-h9wtb 1/1 Running 4 20h
    kube-system pod/kube-apiserver-master1 1/1 Running 7 22h
    kube-system pod/kube-apiserver-master2 1/1 Running 6 22h
    kube-system pod/kube-apiserver-master3 0/1 Running 5 21h
    kube-system pod/kube-controller-manager-master1 1/1 Running 0 71m
    kube-system pod/kube-controller-manager-master2 1/1 Running 0 67m
    kube-system pod/kube-controller-manager-master3 0/1 Running 0 3s
    kube-system pod/kube-flannel-ds-7dmgh 1/1 Running 6 21h
    kube-system pod/kube-flannel-ds-b99h7 1/1 Running 19 (70m ago) 21h
    kube-system pod/kube-flannel-ds-hmzj7 1/1 Running 5 21h
    kube-system pod/kube-flannel-ds-vvld8 0/1 PodInitializing 4 (140m ago) 21h
    kube-system pod/kube-proxy-nkgdf 1/1 Running 5 21h
    kube-system pod/kube-proxy-rb9zk 1/1 Running 5 21h
    kube-system pod/kube-proxy-rvbb7 1/1 Running 7 22h
    kube-system pod/kube-proxy-xmrp5 1/1 Running 6 22h
    kube-system pod/kube-scheduler-master1 1/1 Running 0 71m
    kube-system pod/kube-scheduler-master2 1/1 Running 0 67m
    kube-system pod/kube-scheduler-master3 0/1 Running 0 3s

    Appendix 2: Certificate lifetimes in a kubeadm deployment

    [root@master1 ~]# kubeadm certs check-expiration
    [check-expiration] Reading configuration from the cluster...
    [check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
    CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
    admin.conf Oct 27, 2023 11:19 UTC 364d no
    apiserver Oct 27, 2023 11:19 UTC 364d ca no
    apiserver-kubelet-client Oct 27, 2023 11:19 UTC 364d ca no
    controller-manager.conf Oct 27, 2023 11:19 UTC 364d no
    front-proxy-client Oct 27, 2023 11:19 UTC 364d front-proxy-ca no
    scheduler.conf Oct 27, 2023 11:19 UTC 364d no
    CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
    ca Oct 24, 2032 11:19 UTC 9y no
    front-proxy-ca Oct 24, 2032 11:19 UTC 9y no

    How to change these certificate lifetimes will be explained in a separate post.

    Appendix 3: A kubeadm deployment bug

    A status check of a cluster deployed with kubeadm shows an incorrect result:

    [root@node1 ~]# kubectl get cs
    Warning: v1 ComponentStatus is deprecated in v1.19+
    NAME                 STATUS      MESSAGE                                                                                      ERROR
    scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
    controller-manager   Healthy     ok
    etcd-1               Healthy     {"health":"true"}
    etcd-2               Healthy     {"health":"true"}
    etcd-0               Healthy     {"health":"true"}

    Do the following once on each of the three master nodes:

    [root@master1 ~]# cd /etc/kubernetes/manifests/
    [root@master1 manifests]# ll
    total 12
    -rw------- 1 root root 3452 Oct 27 19:19 kube-apiserver.yaml
    -rw------- 1 root root 2893 Oct 27 19:19 kube-controller-manager.yaml
    -rw------- 1 root root 1479 Oct 27 19:19 kube-scheduler.yaml

    Edit the kube-controller-manager.yaml and kube-scheduler.yaml files and delete the --port=0 line; no service needs to be restarted, and the cluster status immediately shows healthy (a one-line way to make the same edit is sketched below):
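    This sed command is only a sketch of the manual edit described above; kubelet notices the manifest change and restarts the static pods on its own:

    cd /etc/kubernetes/manifests/
    sed -i '/- --port=0/d' kube-controller-manager.yaml kube-scheduler.yaml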

    Note that this command is only a quick way to look at the cluster status and see whether the components are healthy (and, of course, which components exist); you can see that this cluster has a three-node etcd cluster.

    [root@node1 ~]# kubectl get cs
    Warning: v1 ComponentStatus is deprecated in v1.19+
    NAME                 STATUS    MESSAGE             ERROR
    scheduler            Healthy   ok
    controller-manager   Healthy   ok
    etcd-2               Healthy   {"health":"true"}
    etcd-0               Healthy   {"health":"true"}
    etcd-1               Healthy   {"health":"true"}
