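At this point there was still no clue about the cause, so the only option left was to read the calico node startup code.
While reading it, I found that calico node sets its log level at startup from the CALICO_STARTUP_LOGLEVEL environment variable. Since the failing calico node printed very little log output, I edited the calico node daemonset and set the log level to DEBUG, the most verbose level: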
spec:
  containers:
  - env:
    - name: CALICO_STARTUP_LOGLEVEL
      value: DEBUG
After restarting calico-node on the problem node, it did print more log output:
2022-07-31 13:24:50.867 [DEBUG][9] startup/interfaces.go 78: Querying interface addresses Interface="ens192"
2022-07-31 13:24:50.867 [DEBUG][9] startup/interfaces.go 98: Found valid IP address and network CIDR=fe80::250:56ff:fe91:4143/64
2022-07-31 13:24:50.867 [DEBUG][9] startup/filtered.go 42: Check interface Name="ens192"
2022-07-31 13:24:50.867 [DEBUG][9] startup/filtered.go 44: Check address CIDR=fe80::250:56ff:fe91:4143/64
2022-07-31 13:24:50.867 [WARNING][9] startup/startup.go 709: Unable to auto-detect an IPv6 address: no valid IPv6 addresses found on the host interfaces
2022-07-31 13:24:50.867 [WARNING][9] startup/startup.go 509: Couldn't autodetect an IPv6 address. If auto-detecting, choose a different autodetection method. Otherwise provide an explicit address.
2022-07-31 13:24:50.867 [INFO][9] startup/startup.go 360: Clearing out-of-date IPv4 address from this node IP="10.10.32.23/24"
2022-07-31 13:24:50.867 [INFO][9] startup/startup.go 364: Clearing out-of-date IPv6 address from this node IP=""
2022-07-31 13:24:50.867 [DEBUG][9] startup/k8s.go 533: Performing 'Update' for &{Key:Node(node01) Value:0xc0000cfa00 Revision:12283 UID: TTL:0s}
2022-07-31 13:24:50.867 [DEBUG][9] startup/node.go 75: Received Update request on Node type
2022-07-31 13:24:50.869 [DEBUG][9] startup/node.go 471: Loaded label annotations k8s=map[string]string{"beta.kubernetes.io/arch":"amd64", "beta.kubernetes.io/os":"linux", "kubernetes.io/arch":"amd64", "kubernetes.io/hostname":"node01", "kubernetes.io/os":"linux"}
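The log above shows the following:
- The NIC does have an ipv6 address, but calico-node fails to pick it up. The reason is not clear yet.
- I therefore tried turning off ipv6 support, after which calico-node became healthy.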
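
One detail in the log may be worth checking: the only ipv6 address reported on ens192 is fe80::250:56ff:fe91:4143/64, which falls in the link-local range fe80::/10. A possible explanation, not confirmed here, is that autodetection does not accept link-local addresses. The sketch below only classifies an address with Python's standard `ipaddress` module; it does not reproduce calico's filter logic, and `describe` is a hypothetical helper introduced for illustration.

```python
import ipaddress


def describe(cidr: str) -> str:
    """Return a coarse scope label for an address given in CIDR notation.

    Hypothetical helper: it only mirrors the standard address classes,
    not whatever rules calico-node applies during autodetection.
    """
    ip = ipaddress.ip_interface(cidr).ip
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:
        return "link-local"
    if ip.is_multicast:
        return "multicast"
    return "other (possibly routable)"


# Address copied from the DEBUG log of the failing node.
print(describe("fe80::250:56ff:fe91:4143/64"))  # link-local
```

If the address comes back as link-local, comparing the interface addresses on a node from the working cluster is a reasonable next step before concluding anything about the root cause.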
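# 6. Key point: adjust the deployment parameters for ipv6 support:
# !!! Enabling ipv6 works on cluster-1 but causes problems on cluster-2, so the recommendation here is to leave ipv6 disabled for now, i.e. false
# !!! Cause still to be investigated
# Using kubespray version 1.18
vi inventory/mizarcluster/group_vars/k8s_cluster/k8s-cluster.yml   # replace mizarcluster with your own inventory name
enable_dual_stack_networks: false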