• Enabling the Istio service mesh in KubeSphere and fixing the ContainerCreating problem


    KubeSphere Service Mesh

    From the official introduction:

    KubeSphere Service Mesh is built on Istio and visualizes microservice governance and traffic management. It provides a powerful toolkit including circuit breaking, blue-green deployment, canary release, traffic mirroring, distributed tracing, observability, and traffic control. KubeSphere Service Mesh supports code-free, non-intrusive microservice governance, helping developers get started quickly and greatly flattening Istio's learning curve. All features of KubeSphere Service Mesh are designed to meet users' business needs.

    As of version 3.3.0, KubeSphere does not yet support creating grayscale release strategies for multi-cluster applications. A single cluster, or even the simplest All-in-one VM, still works fine. If you need multi-cluster grayscale releases, KubeSphere is not an option for now; you will have to build the K8S cluster and components such as Istio yourself and work out your own solution.

    Istio

    Official site: https://istio.io/

    Chinese site: https://istio.io/latest/zh/

    Istio is a very widely used service mesh component, mainly providing load balancing, traffic management, and related capabilities.

    Grayscale Release

    KubeSphere docs overview: https://kubesphere.com.cn/docs/v3.3/project-user-guide/grayscale-release/overview/

    A grayscale (staged) release simply means replacing the old microservices with a new version through different release strategies, so that if problems occur during the upgrade the risk is smaller and the impact on the prod environment is minimized.

    Blue-Green Deployment

    KubeSphere docs: https://kubesphere.com.cn/docs/v3.3/project-user-guide/grayscale-release/blue-green-deployment/

    Blue-green deployment creates an identical standby environment in which the new application version runs, providing an efficient way to release new versions without downtime or service interruption. With this approach, KubeSphere routes all traffic to one version, i.e. only one environment receives traffic at any given time. If any problem shows up in the new build, you can immediately roll back to the previous version.

    This release strategy is easy to understand: keep a backup, and if the new version is unstable or does not meet functional or performance targets, fall back to the old version at once. Flipping the traffic-forwarding ratio with Istio is certainly faster than manually redeploying and re-releasing.
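    Under the hood, such an instant switch maps onto Istio traffic-routing rules. The following is only a rough sketch of the idea, not what KubeSphere literally generates: a VirtualService that sends 100% of traffic to one subset at a time. The service name demo-service and the subsets v1/v2 are placeholder assumptions, and a DestinationRule defining those subsets is presumed to exist.

    apiVersion: networking.istio.io/v1beta1
    kind: VirtualService
    metadata:
      name: demo-service
    spec:
      hosts:
        - demo-service
      http:
        - route:
            - destination:
                host: demo-service
                subset: v2      # "green"; switch back to v1 to roll back instantly
              weight: 100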

    Canary Release

    KubeSphere docs: https://kubesphere.com.cn/docs/v3.3/project-user-guide/grayscale-release/canary-release/

    A canary release slowly rolls the change out to a small subset of users, minimizing the risk of a version upgrade. Concretely, you can define on a highly responsive dashboard that the new application version is exposed to a portion of production traffic. After the canary deployment is performed, KubeSphere monitors the requests and provides a visualized view of real-time traffic. Throughout the process, you can analyze the behavior of the new application version and gradually increase the share of traffic sent to it. Once you are confident in the build, you can route all traffic to it.

    This release strategy is similar to the closed beta, open beta and official launch phases of online games: let a group of power users try it first, then gradually widen the test scope until it runs stably.
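    Again as a rough, hedged sketch of the underlying Istio mechanism (not necessarily what KubeSphere emits verbatim), a canary boils down to weighted routing between the stable subset and the new subset; demo-service, v1 and v2 are placeholder names.

    apiVersion: networking.istio.io/v1beta1
    kind: VirtualService
    metadata:
      name: demo-service
    spec:
      hosts:
        - demo-service
      http:
        - route:
            - destination:
                host: demo-service
                subset: v1      # stable version keeps most of the traffic
              weight: 90
            - destination:
                host: demo-service
                subset: v2      # canary version gets a small share; raise it gradually
              weight: 10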

    Traffic Mirroring

    KubeSphere docs: https://kubesphere.com.cn/docs/v3.3/project-user-guide/grayscale-release/traffic-mirroring/

    Traffic mirroring copies live production traffic and sends it to a mirror service. By default, KubeSphere mirrors all traffic, but you can also specify a value to manually define the percentage of mirrored traffic. Common use cases include:

    • Testing a new application version. You can compare the real-time output of the mirrored traffic and the production traffic.
    • Testing a cluster. You can use an instance's production traffic for cluster testing.
    • Testing a database. You can use an empty database to store and load data.

    This release strategy sends the same prod traffic to a mirror service, similar to one-to-many message fan-out in an MQ. It only consumes extra network bandwidth, CPU time, memory, disk and other resources; as long as resources are sufficient and no performance bottleneck is hit, the prod environment is not affected. Once the same requests are forwarded to the mirror service, you can use prod data for functional testing and performance/load testing. See the sketch below.
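    In Istio terms, mirroring is expressed with the mirror and mirrorPercentage fields of a VirtualService. The sketch below is illustrative only (demo-service, v1 and v2 are placeholder names); responses from the mirrored subset are discarded by the sidecar, so callers only ever see the production version's responses.

    apiVersion: networking.istio.io/v1beta1
    kind: VirtualService
    metadata:
      name: demo-service
    spec:
      hosts:
        - demo-service
      http:
        - route:
            - destination:
                host: demo-service
                subset: v1      # production version; only its responses reach the caller
              weight: 100
          mirror:
            host: demo-service
            subset: v2          # mirror target; its responses are discarded
          mirrorPercentage:
            value: 100.0        # share of requests to copy to the mirror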

    Enabling the Istio Service Mesh

    KubeKey GitHub documentation (Chinese): https://github.com/kubesphere/kubekey/blob/master/README_zh-CN.md

    KubeSphere official documentation: https://kubesphere.com.cn/docs/v3.3/installing-on-linux/introduction/multioverview/

    The configuration can be modified before KubeSphere is installed, so that once KubeSphere is up, Istio is normally started automatically. This is done through KubeKey.

    Enabling it before installing KubeSphere

    When installing KubeSphere on Linux

    root@zhiyong-ksp1:/home/zhiyong/kubesphereinstall# ./kk create config
    Generate KubeKey config file successfully
    root@zhiyong-ksp1:/home/zhiyong/kubesphereinstall# ll
    total 70344
    drwxrwxr-x  3 zhiyong zhiyong     4096 Aug 16 23:14 ./
    drwxr-xr-x 16 zhiyong zhiyong     4096 Aug 16 23:12 ../
    -rw-r--r--  1 root    root        1065 Aug 16 23:14 config-sample.yaml
    -rwxr-xr-x  1 zhiyong zhiyong 54910976 Jul 26 14:17 kk*
    drwxr-xr-x 12 root    root        4096 Aug  8 10:04 kubekey/
    -rw-rw-r--  1 zhiyong zhiyong 17102249 Aug  8 01:03 kubekey-v2.2.2-linux-amd64.tar.gz
    root@zhiyong-ksp1:/home/zhiyong/kubesphereinstall# cat config-sample.yaml
    
    apiVersion: kubekey.kubesphere.io/v1alpha2
    kind: Cluster
    metadata:
      name: sample
    spec:
      hosts:
      - {name: node1, address: 172.16.0.2, internalAddress: 172.16.0.2, user: ubuntu, password: "Qcloud@123"}
      - {name: node2, address: 172.16.0.3, internalAddress: 172.16.0.3, user: ubuntu, password: "Qcloud@123"}
      roleGroups:
        etcd:
        - node1
        control-plane:
        - node1
        worker:
        - node1
        - node2
      controlPlaneEndpoint:
        ## Internal loadbalancer for apiservers
        # internalLoadbalancer: haproxy
    
        domain: lb.kubesphere.local
        address: ""
        port: 6443
      kubernetes:
        version: v1.23.8
        clusterName: cluster.local
        autoRenewCerts: true
        containerManager: docker
      etcd:
        type: kubekey
      network:
        plugin: calico
        kubePodsCIDR: 10.233.64.0/18
        kubeServiceCIDR: 10.233.0.0/18
        ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
        multusCNI:
          enabled: false
      registry:
        privateRegistry: ""
        namespaceOverride: ""
        registryMirrors: []
        insecureRegistries: []
      addons: []
    

    I created the default YAML configuration file here.

    It also needs to contain the following section:

    servicemesh:
      enabled: true # Change "false" to "true".
      istio: # Customizing the istio installation configuration, refer to https://istio.io/latest/docs/setup/additional-setup/customize-installation/
        components:
          ingressGateways:
          - name: istio-ingressgateway # Exposes services outside the service mesh. Disabled by default.
            enabled: false
          cni:
            enabled: false # When enabled, Istio's pod traffic forwarding is set up during the network-setup phase of the Kubernetes pod lifecycle.
    

    Then run:

    ./kk create cluster -f config-sample.yaml
    

    KubeKey will automatically create a K8S cluster on the Linux server that includes the Istio components.
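    As a quick sanity check once kk finishes (assuming the default namespaces used by KubeSphere and Istio, kubesphere-system and istio-system), the control-plane pods should show up as Running:

    kubectl get pods -n kubesphere-system
    kubectl get pods -n istio-system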

    Of course, the Istio CNI plugin can also be installed manually.

    Istio installation docs (Chinese): https://istio.io/latest/zh/docs/setup/additional-setup/cni/

    When installing KubeSphere on a K8S cluster

    Since KubeSphere can run either on Linux servers or directly in K8S pods, if you already have a K8S cluster you can also enable the Istio service mesh component while installing KubeSphere. The approach is much the same.

    vim cluster-configuration.yaml
    
    ---
    apiVersion: installer.kubesphere.io/v1alpha1
    kind: ClusterConfiguration
    metadata:
      name: ks-installer
      namespace: kubesphere-system
      labels:
        version: v3.3.0
    spec:
      persistence:
        storageClass: ""        # If there is no default StorageClass in your cluster, you need to specify an existing StorageClass here.
      authentication:
        jwtSecret: ""           # Keep the jwtSecret consistent with the Host Cluster. Retrieve the jwtSecret by executing "kubectl -n kubesphere-system get cm kubesphere-config -o yaml | grep -v "apiVersion" | grep jwtSecret" on the Host Cluster.
      local_registry: ""        # Add your private registry address if it is needed.
      # dev_tag: ""               # Add your kubesphere image tag you want to install, by default it's same as ks-installer release version.
      etcd:
        monitoring: false       # Enable or disable etcd monitoring dashboard installation. You have to create a Secret for etcd before you enable it.
        endpointIps: localhost  # etcd cluster EndpointIps. It can be a bunch of IPs here.
        port: 2379              # etcd port.
        tlsEnable: true
      common:
        core:
          console:
            enableMultiLogin: true  # Enable or disable simultaneous logins. It allows different users to log in with the same account at the same time.
            port: 30880
            type: NodePort
        # apiserver:            # Enlarge the apiserver and controller manager's resource requests and limits for the large cluster
        #  resources: {}
        # controllerManager:
        #  resources: {}
        redis:
          enabled: false
          enableHA: false
          volumeSize: 2Gi # Redis PVC size.
        openldap:
          enabled: false
          volumeSize: 2Gi   # openldap PVC size.
        minio:
          volumeSize: 20Gi # Minio PVC size.
        monitoring:
          # type: external   # Whether to specify the external prometheus stack, and need to modify the endpoint at the next line.
          endpoint: http://prometheus-operated.kubesphere-monitoring-system.svc:9090 # Prometheus endpoint to get metrics data.
          GPUMonitoring:     # Enable or disable the GPU-related metrics. If you enable this switch but have no GPU resources, Kubesphere will set it to zero.
            enabled: false
        gpu:                 # Install GPUKinds. The default GPU kind is nvidia.com/gpu. Other GPU kinds can be added here according to your needs.
          kinds:
          - resourceName: "nvidia.com/gpu"
            resourceType: "GPU"
            default: true
        es:   # Storage backend for logging, events and auditing.
          # master:
          #   volumeSize: 4Gi  # The volume size of Elasticsearch master nodes.
          #   replicas: 1      # The total number of master nodes. Even numbers are not allowed.
          #   resources: {}
          # data:
          #   volumeSize: 20Gi  # The volume size of Elasticsearch data nodes.
          #   replicas: 1       # The total number of data nodes.
          #   resources: {}
          logMaxAge: 7             # Log retention time in built-in Elasticsearch. It is 7 days by default.
          elkPrefix: logstash      # The string making up index names. The index name will be formatted as ks--log.
          basicAuth:
            enabled: false
            username: ""
            password: ""
          externalElasticsearchHost: ""
          externalElasticsearchPort: ""
      alerting:                # (CPU: 0.1 Core, Memory: 100 MiB) It enables users to customize alerting policies to send messages to receivers in time with different time intervals and alerting levels to choose from.
        enabled: false         # Enable or disable the KubeSphere Alerting System.
        # thanosruler:
        #   replicas: 1
        #   resources: {}
      auditing:                # Provide a security-relevant chronological set of records,recording the sequence of activities happening on the platform, initiated by different tenants.
        enabled: false         # Enable or disable the KubeSphere Auditing Log System.
        # operator:
        #   resources: {}
        # webhook:
        #   resources: {}
      devops:                  # (CPU: 0.47 Core, Memory: 8.6 G) Provide an out-of-the-box CI/CD system based on Jenkins, and automated workflow tools including Source-to-Image & Binary-to-Image.
        enabled: false             # Enable or disable the KubeSphere DevOps System.
        # resources: {}
        jenkinsMemoryLim: 2Gi      # Jenkins memory limit.
        jenkinsMemoryReq: 1500Mi   # Jenkins memory request.
        jenkinsVolumeSize: 8Gi     # Jenkins volume size.
        jenkinsJavaOpts_Xms: 1200m  # The following three fields are JVM parameters.
        jenkinsJavaOpts_Xmx: 1600m
        jenkinsJavaOpts_MaxRAM: 2g
      events:                  # Provide a graphical web console for Kubernetes Events exporting, filtering and alerting in multi-tenant Kubernetes clusters.
        enabled: false         # Enable or disable the KubeSphere Events System.
        # operator:
        #   resources: {}
        # exporter:
        #   resources: {}
        # ruler:
        #   enabled: true
        #   replicas: 2
        #   resources: {}
      logging:                 # (CPU: 57 m, Memory: 2.76 G) Flexible logging functions are provided for log query, collection and management in a unified console. Additional log collectors can be added, such as Elasticsearch, Kafka and Fluentd.
        enabled: false         # Enable or disable the KubeSphere Logging System.
        logsidecar:
          enabled: true
          replicas: 2
          # resources: {}
      metrics_server:                    # (CPU: 56 m, Memory: 44.35 MiB) It enables HPA (Horizontal Pod Autoscaler).
        enabled: false                   # Enable or disable metrics-server.
      monitoring:
        storageClass: ""                 # If there is an independent StorageClass you need for Prometheus, you can specify it here. The default StorageClass is used by default.
        node_exporter:
          port: 9100
          # resources: {}
        # kube_rbac_proxy:
        #   resources: {}
        # kube_state_metrics:
        #   resources: {}
        # prometheus:
        #   replicas: 1  # Prometheus replicas are responsible for monitoring different segments of data source and providing high availability.
        #   volumeSize: 20Gi  # Prometheus PVC size.
        #   resources: {}
        #   operator:
        #     resources: {}
        # alertmanager:
        #   replicas: 1          # AlertManager Replicas.
        #   resources: {}
        # notification_manager:
        #   resources: {}
        #   operator:
        #     resources: {}
        #   proxy:
        #     resources: {}
        gpu:                           # GPU monitoring-related plug-in installation.
          nvidia_dcgm_exporter:        # Ensure that gpu resources on your hosts can be used normally, otherwise this plug-in will not work properly.
            enabled: false             # Check whether the labels on the GPU hosts contain "nvidia.com/gpu.present=true" to ensure that the DCGM pod is scheduled to these nodes.
            # resources: {}
      multicluster:
        clusterRole: none  # host | member | none  # You can install a solo cluster, or specify it as the Host or Member Cluster.
      network:
        networkpolicy: # Network policies allow network isolation within the same cluster, which means firewalls can be set up between certain instances (Pods).
          # Make sure that the CNI network plugin used by the cluster supports NetworkPolicy. There are a number of CNI network plugins that support NetworkPolicy, including Calico, Cilium, Kube-router, Romana and Weave Net.
          enabled: false # Enable or disable network policies.
        ippool: # Use Pod IP Pools to manage the Pod network address space. Pods to be created can be assigned IP addresses from a Pod IP Pool.
          type: none # Specify "calico" for this field if Calico is used as your CNI plugin. "none" means that Pod IP Pools are disabled.
        topology: # Use Service Topology to view Service-to-Service communication based on Weave Scope.
          type: none # Specify "weave-scope" for this field to enable Service Topology. "none" means that Service Topology is disabled.
      openpitrix: # An App Store that is accessible to all platform tenants. You can use it to manage apps across their entire lifecycle.
        store:
          enabled: false # Enable or disable the KubeSphere App Store.
      servicemesh:         # (0.3 Core, 300 MiB) Provide fine-grained traffic management, observability and tracing, and visualized traffic topology.
        enabled: false     # Base component (pilot). Enable or disable KubeSphere Service Mesh (Istio-based).
        istio:  # Customizing the istio installation configuration, refer to https://istio.io/latest/docs/setup/additional-setup/customize-installation/
          components:
            ingressGateways:
            - name: istio-ingressgateway
              enabled: false
            cni:
              enabled: false
      edgeruntime:          # Add edge nodes to your cluster and deploy workloads on edge nodes.
        enabled: false
        kubeedge:        # kubeedge configurations
          enabled: false
          cloudCore:
            cloudHub:
              advertiseAddress: # At least a public IP address or an IP address which can be accessed by edge nodes must be provided.
                - ""            # Note that once KubeEdge is enabled, CloudCore will malfunction if the address is not provided.
            service:
              cloudhubNodePort: "30000"
              cloudhubQuicNodePort: "30001"
              cloudhubHttpsNodePort: "30002"
              cloudstreamNodePort: "30003"
              tunnelNodePort: "30004"
            # resources: {}
            # hostNetWork: false
          iptables-manager:
            enabled: true 
            mode: "external"
            # resources: {}
          # edgeService:
          #   resources: {}
      gatekeeper:        # Provide admission policy and rule management, A validating (mutating TBA) webhook that enforces CRD-based policies executed by Open Policy Agent.
        enabled: false   # Enable or disable Gatekeeper.
        # controller_manager:
        #   resources: {}
        # audit:
        #   resources: {}
      terminal:
        # image: 'alpine:3.15' # There must be an nsenter program in the image
        timeout: 600         # Container timeout, if set to 0, no timeout will be used. The unit is seconds
    
    

    Again, modify this section:

    servicemesh:
      enabled: true # Change "false" to "true".
      istio: # Customizing the istio installation configuration, refer to https://istio.io/latest/docs/setup/additional-setup/customize-installation/
        components:
          ingressGateways:
          - name: istio-ingressgateway # Exposes services outside the service mesh. Disabled by default.
            enabled: false
          cni:
            enabled: false # When enabled, Istio's pod traffic forwarding is set up during the network-setup phase of the Kubernetes pod lifecycle.
    

    Then:

    kubectl apply -f https://github.com/kubesphere/ks-installer/releases/download/v3.3.0/kubesphere-installer.yaml
       
    kubectl apply -f cluster-configuration.yaml
    

    K8S will automatically install KubeSphere according to the configuration items in these YAML files, and will install and configure Istio along the way.
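    If you want to keep an eye on progress from the terminal while the installer works, a simple option (assuming the default kubesphere-system namespace) is to watch the pods come up:

    # press Ctrl+C to stop watching
    kubectl get pods -n kubesphere-system -w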

    Enabling it after installing KubeSphere

    Since the Istio components run in K8S pods, Istio can be used wherever a K8S environment is available to start pods. Once KubeSphere is installed, whether KubeSphere runs on Linux or in K8S pods, and whether the K8S cluster is All-in-one or multi-node, enabling Istio from within KubeSphere is easy.
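    As a side note, the same ClusterConfiguration object that the web-console steps below edit can also be opened directly from the command line, assuming kubectl access to the cluster; the rest of this section sticks to the web console:

    kubectl -n kubesphere-system edit clusterconfiguration ks-installer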

    Following my All-in-one environment: https://lizhiyong.blog.csdn.net/article/details/126236516

    First, log in as the administrator:

    http://192.168.88.20:30880
    admin
    Aa123456
    

    [Screenshot]

    Go to Platform → Cluster Management:

    [Screenshot]

    Under Custom Resource Definitions (CRDs), search for clusterconf:

    [Screenshot]

    After clicking into this ClusterConfiguration:

    [Screenshot]

    you can edit the YAML.

    The current YAML content is:

    apiVersion: installer.kubesphere.io/v1alpha1
    kind: ClusterConfiguration
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: >
          {"apiVersion":"installer.kubesphere.io/v1alpha1","kind":"ClusterConfiguration","metadata":{"annotations":{},"labels":{"version":"v3.3.0"},"name":"ks-installer","namespace":"kubesphere-system"},"spec":{"alerting":{"enabled":false},"auditing":{"enabled":false},"authentication":{"jwtSecret":""},"common":{"core":{"console":{"enableMultiLogin":true,"port":30880,"type":"NodePort"}},"es":{"basicAuth":{"enabled":false,"password":"","username":""},"elkPrefix":"logstash","externalElasticsearchHost":"","externalElasticsearchPort":"","logMaxAge":7},"gpu":{"kinds":[{"default":true,"resourceName":"nvidia.com/gpu","resourceType":"GPU"}]},"minio":{"volumeSize":"20Gi"},"monitoring":{"GPUMonitoring":{"enabled":false},"endpoint":"http://prometheus-operated.kubesphere-monitoring-system.svc:9090"},"openldap":{"enabled":false,"volumeSize":"2Gi"},"redis":{"enabled":false,"volumeSize":"2Gi"}},"devops":{"enabled":false,"jenkinsJavaOpts_MaxRAM":"2g","jenkinsJavaOpts_Xms":"1200m","jenkinsJavaOpts_Xmx":"1600m","jenkinsMemoryLim":"2Gi","jenkinsMemoryReq":"1500Mi","jenkinsVolumeSize":"8Gi"},"edgeruntime":{"enabled":false,"kubeedge":{"cloudCore":{"cloudHub":{"advertiseAddress":[""]},"service":{"cloudhubHttpsNodePort":"30002","cloudhubNodePort":"30000","cloudhubQuicNodePort":"30001","cloudstreamNodePort":"30003","tunnelNodePort":"30004"}},"enabled":false,"iptables-manager":{"enabled":true,"mode":"external"}}},"etcd":{"endpointIps":"192.168.88.20","monitoring":false,"port":2379,"tlsEnable":true},"events":{"enabled":false},"logging":{"enabled":false,"logsidecar":{"enabled":true,"replicas":2}},"metrics_server":{"enabled":false},"monitoring":{"gpu":{"nvidia_dcgm_exporter":{"enabled":false}},"node_exporter":{"port":9100},"storageClass":""},"multicluster":{"clusterRole":"none"},"network":{"ippool":{"type":"none"},"networkpolicy":{"enabled":false},"topology":{"type":"none"}},"openpitrix":{"store":{"enabled":false}},"persistence":{"storageClass":""},"servicemesh":{"enabled":false,"istio":{"components":{"cni":{"enabled":false},"ingressGateways":[{"enabled":false,"name":"istio-ingressgateway"}]}}},"terminal":{"timeout":600},"zone":"cn"}}
      labels:
        version: v3.3.0
      name: ks-installer
      namespace: kubesphere-system
    spec:
      alerting:
        enabled: false
      auditing:
        enabled: false
      authentication:
        jwtSecret: ''
      common:
        core:
          console:
            enableMultiLogin: true
            port: 30880
            type: NodePort
        es:
          basicAuth:
            enabled: false
            password: ''
            username: ''
          elkPrefix: logstash
          externalElasticsearchHost: ''
          externalElasticsearchPort: ''
          logMaxAge: 7
        gpu:
          kinds:
            - default: true
              resourceName: nvidia.com/gpu
              resourceType: GPU
        minio:
          volumeSize: 20Gi
        monitoring:
          GPUMonitoring:
            enabled: false
          endpoint: 'http://prometheus-operated.kubesphere-monitoring-system.svc:9090'
        openldap:
          enabled: false
          volumeSize: 2Gi
        redis:
          enabled: false
          volumeSize: 2Gi
      devops:
        enabled: false
        jenkinsJavaOpts_MaxRAM: 2g
        jenkinsJavaOpts_Xms: 1200m
        jenkinsJavaOpts_Xmx: 1600m
        jenkinsMemoryLim: 2Gi
        jenkinsMemoryReq: 1500Mi
        jenkinsVolumeSize: 8Gi
      edgeruntime:
        enabled: false
        kubeedge:
          cloudCore:
            cloudHub:
              advertiseAddress:
                - ''
            service:
              cloudhubHttpsNodePort: '30002'
              cloudhubNodePort: '30000'
              cloudhubQuicNodePort: '30001'
              cloudstreamNodePort: '30003'
              tunnelNodePort: '30004'
          enabled: false
          iptables-manager:
            enabled: true
            mode: external
      etcd:
        endpointIps: 192.168.88.20
        monitoring: false
        port: 2379
        tlsEnable: true
      events:
        enabled: false
      logging:
        enabled: false
        logsidecar:
          enabled: true
          replicas: 2
      metrics_server:
        enabled: false
      monitoring:
        gpu:
          nvidia_dcgm_exporter:
            enabled: false
        node_exporter:
          port: 9100
        storageClass: ''
      multicluster:
        clusterRole: none
      network:
        ippool:
          type: none
        networkpolicy:
          enabled: false
        topology:
          type: none
      openpitrix:
        store:
          enabled: false
      persistence:
        storageClass: ''
      servicemesh:
        enabled: false
        istio:
          components:
            cni:
              enabled: false
            ingressGateways:
              - enabled: false
                name: istio-ingressgateway
      terminal:
        timeout: 600
      zone: cn
    

    Obviously, per the official docs, the servicemesh section near the end should be changed to:

    servicemesh:
      enabled: true # Change "false" to "true".
      istio: # Customizing the istio installation configuration, refer to https://istio.io/latest/docs/setup/additional-setup/customize-installation/
        components:
          ingressGateways:
          - name: istio-ingressgateway # Exposes services outside the service mesh. Disabled by default.
            enabled: false
          cni:
            enabled: false # When enabled, Istio's pod traffic forwarding is set up during the network-setup phase of the Kubernetes pod lifecycle.
    

    According to the YAML spec, the indentation (spaces) before true absolutely must not be omitted!!!
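    To double-check that the edit was saved with the structure you intended, one option (a sketch assuming kubectl access; the field path mirrors the YAML above) is to read the value back:

    kubectl -n kubesphere-system get clusterconfiguration ks-installer \
      -o jsonpath='{.spec.servicemesh.enabled}'
    # expected output: true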

    After confirming and saving, you can check the installation progress of the Istio components (on Ubuntu 20.04 this must be run as root):

    kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l 'app in (ks-install, ks-installer)' -o jsonpath='{.items[0].metadata.name}') -f
    

    Then:

    [Screenshot]

    The familiar sight of one core struggling while the other fifteen look on... If possible, go for a CPU with a high clock speed, although a high core count also matters.

    [Screenshot]

    top shows that python3 is consuming 99.7% of a CPU...

    After waiting a while:

    Waiting for all tasks to be completed ...
    task network status is successful  (1/5)
    task openpitrix status is successful  (2/5)
    task multicluster status is successful  (3/5)
    task monitoring status is successful  (4/5)
    task servicemesh status is successful  (5/5)
    **************************************************
    Collecting installation results ...
    #####################################################
    ###              Welcome to KubeSphere!           ###
    #####################################################
    
    Console: http://192.168.88.20:30880
    Account: admin
    Password: P@88w0rd
    
    NOTES:
      1. After you log into the console, please check the
         monitoring status of service components in
         "Cluster Management". If any service is not
         ready, please wait patiently until all components
         are up and running.
      2. Please change the default password after login.
    

    This means the installation and initialization of Istio is complete. Ignore the default password shown here; when logging in again, use the password that has already been changed.

    Verifying the Istio installation

    In the web UI you can see:

    [Screenshot]

    The Istio components now show up under System Components. But clicking into them reveals:

    [Screenshot]

    At this point not only is Istio unhealthy, but Prometheus, which was fine before, has become unhealthy as well:

    [Screenshot]

    Run:

    root@zhiyong-ksp1:/home/zhiyong# kubectl get pod -n istio-system
    NAME                              READY   STATUS              RESTARTS   AGE
    istiod-1-11-2-54dd699c87-99krn    0/1     ContainerCreating   0          27m
    jaeger-operator-fccc48b86-vtcr8   0/1     ContainerCreating   0          7m10s
    kiali-operator-c459985f7-sttfs    0/1     ContainerCreating   0          7m5s
    
    root@zhiyong-ksp1:/home/zhiyong# kubectl get pod --all-namespaces
    NAMESPACE                      NAME                                                              READY   STATUS              RESTARTS     AGE
    istio-system                   istiod-1-11-2-54dd699c87-99krn                                    0/1     ContainerCreating   0            30m
    istio-system                   jaeger-operator-fccc48b86-vtcr8                                   0/1     ContainerCreating   0            9m53s
    istio-system                   kiali-operator-c459985f7-sttfs                                    0/1     ContainerCreating   0            9m48s
    kube-system                    calico-kube-controllers-f9f9bbcc9-2v7lm                           1/1     Running             1 (8d ago)   8d
    kube-system                    calico-node-4mgc7                                                 1/1     Running             1 (8d ago)   8d
    kube-system                    coredns-f657fccfd-2gw7h                                           1/1     Running             1 (8d ago)   8d
    kube-system                    coredns-f657fccfd-pflwf                                           1/1     Running             1 (8d ago)   8d
    kube-system                    kube-apiserver-zhiyong-ksp1                                       1/1     Running             1 (8d ago)   8d
    kube-system                    kube-controller-manager-zhiyong-ksp1                              1/1     Running             1 (8d ago)   8d
    kube-system                    kube-proxy-cn68l                                                  1/1     Running             1 (8d ago)   8d
    kube-system                    kube-scheduler-zhiyong-ksp1                                       1/1     Running             1 (8d ago)   8d
    kube-system                    nodelocaldns-96gtw                                                1/1     Running             1 (8d ago)   8d
    kube-system                    openebs-localpv-provisioner-68db4d895d-p9527                      1/1     Running             0            8d
    kube-system                    snapshot-controller-0                                             1/1     Running             1 (8d ago)   8d
    kubesphere-controls-system     default-http-backend-587748d6b4-ccg59                             1/1     Running             1 (8d ago)   8d
    kubesphere-controls-system     kubectl-admin-5d588c455b-82cnk                                    1/1     Running             1 (8d ago)   8d
    kubesphere-logging-system      elasticsearch-logging-curator-elasticsearch-curator-2767784rhhk   0/1     ContainerCreating   0            15m
    kubesphere-logging-system      elasticsearch-logging-data-0                                      0/1     Pending             0            32m
    kubesphere-logging-system      elasticsearch-logging-discovery-0                                 0/1     Pending             0            32m
    kubesphere-monitoring-system   alertmanager-main-0                                               2/2     Running             2 (8d ago)   8d
    kubesphere-monitoring-system   kube-state-metrics-6d6786b44-bbb4f                                3/3     Running             3 (8d ago)   8d
    kubesphere-monitoring-system   node-exporter-8sz74                                               2/2     Running             2 (8d ago)   8d
    kubesphere-monitoring-system   notification-manager-deployment-6f8c66ff88-pt4l8                  2/2     Running             2 (8d ago)   8d
    kubesphere-monitoring-system   notification-manager-operator-6455b45546-nkmx8                    2/2     Running             2 (8d ago)   8d
    kubesphere-monitoring-system   prometheus-k8s-0                                                  0/2     Terminating         0            8d
    kubesphere-monitoring-system   prometheus-operator-66d997dccf-c968c                              2/2     Running             2 (8d ago)   8d
    kubesphere-system              ks-apiserver-6b9bcb86f4-hsdzs                                     1/1     Running             1 (8d ago)   8d
    kubesphere-system              ks-console-599c49d8f6-ngb6b                                       1/1     Running             1 (8d ago)   8d
    kubesphere-system              ks-controller-manager-66747fcddc-r7cpt                            1/1     Running             1 (8d ago)   8d
    kubesphere-system              ks-installer-5fd8bd46b8-dzhbb                                     1/1     Running             1 (8d ago)   8d
    

    Wait patiently for a while...

    [Screenshot]

    The monitoring view in the KubeSphere web UI shows the pods are still stuck in ContainerCreating. But they cannot stay like this forever...

    Fixing ContainerCreating

    Checking the logs

    root@zhiyong-ksp1:/home/zhiyong# kubectl describe pod istiod-1-11-2-54dd699c87-99krn -n istio-system
    Name:           istiod-1-11-2-54dd699c87-99krn
    Namespace:      istio-system
    Priority:       0
    Node:           zhiyong-ksp1/192.168.88.20
    Start Time:     Wed, 17 Aug 2022 00:44:55 +0800
    Labels:         app=istiod
                    install.operator.istio.io/owning-resource=unknown
                    istio=istiod
                    istio.io/rev=1-11-2
                    operator.istio.io/component=Pilot
                    pod-template-hash=54dd699c87
                    sidecar.istio.io/inject=false
    Annotations:    prometheus.io/port: 15014
                    prometheus.io/scrape: true
                    sidecar.istio.io/inject: false
    Status:         Pending
    IP:
    IPs:            <none>
    Controlled By:  ReplicaSet/istiod-1-11-2-54dd699c87
    Containers:
      discovery:
        Container ID:
        Image:         registry.cn-beijing.aliyuncs.com/kubesphereio/pilot:1.11.1
        Image ID:
        Ports:         8080/TCP, 15010/TCP, 15017/TCP
        Host Ports:    0/TCP, 0/TCP, 0/TCP
        Args:
          discovery
          --monitoringAddr=:15014
          --log_output_level=default:info
          --domain
          cluster.local
          --keepaliveMaxServerConnectionAge
          30m
        State:          Waiting
          Reason:       ContainerCreating
        Ready:          False
        Restart Count:  0
        Requests:
          cpu:      500m
          memory:   2Gi
        Readiness:  http-get http://:8080/ready delay=1s timeout=5s period=3s #success=1 #failure=3
        Environment:
          REVISION:                                     1-11-2
          JWT_POLICY:                                   first-party-jwt
          PILOT_CERT_PROVIDER:                          istiod
          POD_NAME:                                     istiod-1-11-2-54dd699c87-99krn (v1:metadata.name)
          POD_NAMESPACE:                                istio-system (v1:metadata.namespace)
          SERVICE_ACCOUNT:                               (v1:spec.serviceAccountName)
          KUBECONFIG:                                   /var/run/secrets/remote/config
          ENABLE_LEGACY_FSGROUP_INJECTION:              false
          PILOT_TRACE_SAMPLING:                         1
          PILOT_ENABLE_PROTOCOL_SNIFFING_FOR_OUTBOUND:  true
          PILOT_ENABLE_PROTOCOL_SNIFFING_FOR_INBOUND:   true
          ISTIOD_ADDR:                                  istiod-1-11-2.istio-system.svc:15012
          PILOT_ENABLE_ANALYSIS:                        false
          CLUSTER_ID:                                   Kubernetes
        Mounts:
          /etc/cacerts from cacerts (ro)
          /var/run/secrets/istio-dns from local-certs (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-l54jm (ro)
          /var/run/secrets/remote from istio-kubeconfig (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      local-certs:
        Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:     Memory
        SizeLimit:  <unset>
      cacerts:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  cacerts
        Optional:    true
      istio-kubeconfig:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  istio-kubeconfig
        Optional:    true
      kube-api-access-l54jm:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   Burstable
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason                  Age                   From               Message
      ----     ------                  ----                  ----               -------
      Normal   Scheduled               43m                   default-scheduler  Successfully assigned istio-system/istiod-1-11-2-54dd699c87-99krn to zhiyong-ksp1
      Warning  FailedCreatePodSandBox  43m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5d0a3bdb6dea937aa5b118bbd00305a1542111c97af84a3cbdd8f188b1681687": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  43m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "ff84de82acfd944be7f3804c96f39ab976ae4d6810b7e0364c90560a4b4070e7": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  42m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "6337bea6f7c16cd9adcff0d2b75238beb4365dc4b880d4c8e4f4535885d59d30": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  42m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "42e08603d4d7e7d1713eecbb21af258022e3fb50c6f5611808b3e2755d50d980": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  42m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "51a6b5b8ea5a63f4be828a0c855802e42640324c440fcc3487c535123d7b3372": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  42m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "dada948b2a416a0ec925b7f67a101b8fd48fdad9fb20d6c41eaf1bbad0a18e57": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  41m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "df3487e020c1e7eb527cc0fce1fe990873bd20f46cbf04de99005e0da5896abe": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  41m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "92e739549a96aa03ea864188abc1b91c9a45394dae28ad97234fa1caf4d52240": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  41m                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "bc5d1999a2d5ad4d7cf5c1e1c3c7c1a80dee02b806d0be2e15c326e2d82f4af5": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  3m2s (x176 over 41m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "be41a317c2e14b4096f2f8f0d4bfaa8a80572f7365ab3d92c20be75fe97304f4": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
    

    Clearly this is a network authorization failure: the CNI calls are being rejected as Unauthorized.
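    Since the kubelet events blame the calico CNI plugin, a reasonable next step is to look at the Calico pods and their logs. A sketch, assuming the usual k8s-app=calico-node label from the upstream manifests:

    kubectl get pods -n kube-system -o wide | grep calico
    kubectl -n kube-system logs -l k8s-app=calico-node --tail=100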

    Next, check the Prometheus pod's details as well:

    root@zhiyong-ksp1:/home/zhiyong# kubectl describe pod prometheus-k8s-0 -n kubesphere-monitoring-system
    Name:                      prometheus-k8s-0
    Namespace:                 kubesphere-monitoring-system
    Priority:                  0
    Node:                      zhiyong-ksp1/192.168.88.20
    Start Time:                Mon, 08 Aug 2022 20:42:21 +0800
    Labels:                    app.kubernetes.io/component=prometheus
                               app.kubernetes.io/instance=k8s
                               app.kubernetes.io/managed-by=prometheus-operator
                               app.kubernetes.io/name=prometheus
                               app.kubernetes.io/part-of=kube-prometheus
                               app.kubernetes.io/version=2.34.0
                               controller-revision-hash=prometheus-k8s-557cc865c4
                               operator.prometheus.io/name=k8s
                               operator.prometheus.io/shard=0
                               prometheus=k8s
                               statefulset.kubernetes.io/pod-name=prometheus-k8s-0
    Annotations:               cni.projectcalico.org/containerID: 1d4064f425cad8043d3b38e60155e778e9a1390bc2486b76ac29ad14fb589b40
                               cni.projectcalico.org/podIP: 10.233.107.36/32
                               cni.projectcalico.org/podIPs: 10.233.107.36/32
                               kubectl.kubernetes.io/default-container: prometheus
    Status:                    Terminating (lasts 41m)
    Termination Grace Period:  600s
    IP:                        10.233.107.36
    IPs:
      IP:           10.233.107.36
    Controlled By:  StatefulSet/prometheus-k8s
    Init Containers:
      init-config-reloader:
        Container ID:  containerd://f29630d87dccf60dc8bd065f53ad5187d2f7600a35500a4fa4bfd71a2118daa6
        Image:         registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus-config-reloader:v0.55.1
        Image ID:      registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus-config-reloader@sha256:7743c7ef48f9c0ae6f5c0de4b26e7ff6ae9ece4917a4e139acb21a0d8e77aa3c
        Port:          8080/TCP
        Host Port:     0/TCP
        Command:
          /bin/prometheus-config-reloader
        Args:
          --watch-interval=0
          --listen-address=:8080
          --config-file=/etc/prometheus/config/prometheus.yaml.gz
          --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
          --watched-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
        State:          Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Mon, 08 Aug 2022 20:42:21 +0800
          Finished:     Mon, 08 Aug 2022 20:42:22 +0800
        Ready:          True
        Restart Count:  0
        Limits:
          cpu:     100m
          memory:  50Mi
        Requests:
          cpu:     100m
          memory:  50Mi
        Environment:
          POD_NAME:  prometheus-k8s-0 (v1:metadata.name)
          SHARD:     0
        Mounts:
          /etc/prometheus/config from config (rw)
          /etc/prometheus/config_out from config-out (rw)
          /etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vcb4c (ro)
    Containers:
      prometheus:
        Container ID:  containerd://2b913fb7dadcc7342759437d2068d0a9cbdcd96fadbb567c0ca5212ca72fb372
        Image:         registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus:v2.34.0
        Image ID:      registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus@sha256:b37103e03399e90c9b7b1b2940894d3634915cf9df4aa2e5402bd85b4377808c
        Port:          9090/TCP
        Host Port:     0/TCP
        Args:
          --web.console.templates=/etc/prometheus/consoles
          --web.console.libraries=/etc/prometheus/console_libraries
          --storage.tsdb.retention.time=7d
          --config.file=/etc/prometheus/config_out/prometheus.env.yaml
          --storage.tsdb.path=/prometheus
          --web.enable-lifecycle
          --query.max-concurrency=1000
          --web.route-prefix=/
          --web.config.file=/etc/prometheus/web_config/web-config.yaml
        State:          Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Mon, 08 Aug 2022 20:42:51 +0800
          Finished:     Wed, 17 Aug 2022 00:45:37 +0800
        Ready:          False
        Restart Count:  0
        Limits:
          cpu:     4
          memory:  16Gi
        Requests:
          cpu:        200m
          memory:     400Mi
        Liveness:     http-get http://:web/-/healthy delay=0s timeout=3s period=5s #success=1 #failure=6
        Readiness:    http-get http://:web/-/ready delay=0s timeout=3s period=5s #success=1 #failure=3
        Startup:      http-get http://:web/-/ready delay=0s timeout=3s period=15s #success=1 #failure=60
        Environment:  <none>
        Mounts:
          /etc/prometheus/certs from tls-assets (ro)
          /etc/prometheus/config_out from config-out (ro)
          /etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
          /etc/prometheus/web_config/web-config.yaml from web-config (ro,path="web-config.yaml")
          /prometheus from prometheus-k8s-db (rw,path="prometheus-db")
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vcb4c (ro)
      config-reloader:
        Container ID:  containerd://215303f25ece01ad28e56a8d94c19b00cbd9429d10cddc1b1db9981802e74011
        Image:         registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus-config-reloader:v0.55.1
        Image ID:      registry.cn-beijing.aliyuncs.com/kubesphereio/prometheus-config-reloader@sha256:7743c7ef48f9c0ae6f5c0de4b26e7ff6ae9ece4917a4e139acb21a0d8e77aa3c
        Port:          8080/TCP
        Host Port:     0/TCP
        Command:
          /bin/prometheus-config-reloader
        Args:
          --listen-address=:8080
          --reload-url=http://localhost:9090/-/reload
          --config-file=/etc/prometheus/config/prometheus.yaml.gz
          --config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
          --watched-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
        State:      Terminated
          Reason:   Error
          Message:  level=info ts=2022-08-08T12:42:51.99954274Z caller=main.go:111 msg="Starting prometheus-config-reloader" version="(version=0.55.1, branch=refs/tags/v0.55.1, revision=08c846115c67195bc821018168040db6f3e236e3)"
    level=info ts=2022-08-08T12:42:51.999646088Z caller=main.go:112 build_context="(go=go1.17.7, user=Action-Run-ID-2045821452, date=20220326-21:47:32)"
    level=info ts=2022-08-08T12:42:52.093230589Z caller=main.go:149 msg="Starting web server for metrics" listen=:8080
    level=info ts=2022-08-08T12:42:52.195172719Z caller=reloader.go:373 msg="Reload triggered" cfg_in=/etc/prometheus/config/prometheus.yaml.gz cfg_out=/etc/prometheus/config_out/prometheus.env.yaml watched_dirs=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
    level=info ts=2022-08-08T12:42:52.195306486Z caller=reloader.go:235 msg="started watching config file and directories for changes" cfg=/etc/prometheus/config/prometheus.yaml.gz out=/etc/prometheus/config_out/prometheus.env.yaml dirs=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
    
          Exit Code:    2
          Started:      Mon, 08 Aug 2022 20:42:51 +0800
          Finished:     Wed, 17 Aug 2022 00:45:36 +0800
        Ready:          False
        Restart Count:  0
        Limits:
          cpu:     100m
          memory:  50Mi
        Requests:
          cpu:     100m
          memory:  50Mi
        Environment:
          POD_NAME:  prometheus-k8s-0 (v1:metadata.name)
          SHARD:     0
        Mounts:
          /etc/prometheus/config from config (rw)
          /etc/prometheus/config_out from config-out (rw)
          /etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vcb4c (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      prometheus-k8s-db:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  prometheus-k8s-db-prometheus-k8s-0
        ReadOnly:   false
      config:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  prometheus-k8s
        Optional:    false
      tls-assets:
        Type:                Projected (a volume that contains injected data from multiple sources)
        SecretName:          prometheus-k8s-tls-assets-0
        SecretOptionalName:  <nil>
      config-out:
        Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
        Medium:
        SizeLimit:  <unset>
      prometheus-k8s-rulefiles-0:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      prometheus-k8s-rulefiles-0
        Optional:  false
      web-config:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  prometheus-k8s-web-config
        Optional:    false
      kube-api-access-vcb4c:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   Burstable
    Node-Selectors:              kubernetes.io/os=linux
    Tolerations:                 dedicated=monitoring:NoSchedule
                                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason         Age                  From     Message
      ----     ------         ----                 ----     -------
      Normal   Killing        51m                  kubelet  Stopping container prometheus
      Normal   Killing        51m                  kubelet  Stopping container config-reloader
      Warning  FailedKillPod  63s (x231 over 51m)  kubelet  error killing pod: failed to "KillPodSandbox" for "35e28d63-59c1-4860-a9bc-924123478928" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for sandbox \"1d4064f425cad8043d3b38e60155e778e9a1390bc2486b76ac29ad14fb589b40\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: connection is unauthorized: Unauthorized"
    

    Locating the problem

    Based on the error logs, this is almost certainly a Calico problem.

    root@zhiyong-ksp1:/etc/cni/net.d# pwd
    /etc/cni/net.d
    root@zhiyong-ksp1:/etc/cni/net.d# ll
    total 16
    drwxr-xr-x 2 kube root 4096 Aug  8 10:05 ./
    drwxr-xr-x 3 kube root 4096 Aug  8 10:02 ../
    -rw-r--r-- 1 root root  663 Aug  8 19:23 10-calico.conflist
    -rw------- 1 root root 2713 Aug  8 20:34 calico-kubeconfig
    
    root@zhiyong-ksp1:/etc/cni/net.d# cat 10-calico.conflist
    {
      "name": "k8s-pod-network",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "log_level": "info",
          "log_file_path": "/var/log/calico/cni/cni.log",
          "datastore_type": "kubernetes",
          "nodename": "zhiyong-ksp1",
          "mtu": 0,
          "ipam": {
              "type": "calico-ipam"
          },
          "policy": {
              "type": "k8s"
          },
          "kubernetes": {
              "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
          }
        },
        {
          "type": "portmap",
          "snat": true,
          "capabilities": {"portMappings": true}
        },
        {
          "type": "bandwidth",
          "capabilities": {"bandwidth": true}
        }
      ]
    }
    root@zhiyong-ksp1:/etc/cni/net.d# cat calico-kubeconfig
    # Kubeconfig file for Calico CNI plugin. Installed by calico/node.
    apiVersion: v1
    kind: Config
    clusters:
    - name: local
      cluster:
        server: https://10.233.0.1:443
        certificate-authority-data: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUMvakNDQWVhZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJeU1EZ3dPREF5TURRek5Wb1hEVE15TURnd05UQXlNRFF6TlZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTHM5ClcxTkxMWGNHNlhIdzZ0VEVyV1pWTXdlUUdTV2IzU3UrMTN0V2REcUlhcm16YW1BWGNNbnlValRoNWhQdFZVVjcKNVdjYldXcFh3VTNOaWhpSXRmOXhoZ2tsMy9KVElycFBSdlRBc3VUVUo1RW9yb3BNLzNpRWpBZUc0d0RNQURtYwpKNHArSjlJSzZWekV4UUI3VTA2L1F6eWhRT3RQQS83dFlhbjM2dFE3eFRJYmJvQ3AvQXRSNHdqOXBBRHVSV1M2CnQ0ZlFZMUh4NHpaS1pmeEpBaXF5MXl5Ylg0ckxSektYMzJ0MXlsYk9ET21kWjZXVjJLZEgzYjV2V3ZrZThzQy8KcHhMT0JvRmRVdU0ra3hkUHgxMitHaVVtbUM0NDFEdU02MVZiQ0o0NlJ4QVlDenY4bmxoQUhrTDMrL3JQZ0U1dgpaYTZuSVoxdWVabFBRRXRqL3FFQ0F3RUFBYU5aTUZjd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZDMkk4MldLNEJjSWpieEQvVjl4U0VnblNhc1pNQlVHQTFVZEVRUU8KTUF5Q0NtdDFZbVZ5Ym1WMFpYTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBRWlLbklrendTaXpKL0ZhRmd4SQpPRlNoaTNTQ0NaNHNLVXliZVhkZkIwV3FLRHpialBteEZ3LzQ0SFMwUUhaNU5TVGp6WGtHQ1kyTlpDRTE3dldWCmtDYjFVM1czQmdaM05CSmZtV29sTEJQTCtnSkovYlRuRVJUTVY4MDYrTWN6d1RBeEhWcllXcU5BT2o5R3pEdFMKc3FwVWxQZDc1MDdhZmluRmZMVFpORnF4SDV4Y0VTUDNETVF1L21GUXNxMnYyeW9XTXY4dHluVGs2V3VSa0xVQgoxd1JXdUNSeXF1OCs3dEVzMHlCNklTODF0cDBGMHZPekpoakw4bTBxQWhLbUNKUFlGTUFZRFMvNXJuZDBCb3NLClhabHlyUUxtV0ZLRDRWL2Z3T0Vua1hMS3R3VnkrdlFJYXVEWjZTaVM1ODcxMURmdlhTTWFCU1lkL0hwZW1OYmQKVFBFPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg=="
    users:
    - name: calico
      user:
        token: eyJhbGciOiJSUzI1NiIsImtpZCI6IkNRb0VCZDRGY21PQjBSYktnYzVuSkV6UVVVY0VvOE1Jd0NCOFRYbEQ5XzQifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNjYwMDQ4NDYxLCJpYXQiOjE2NTk5NjIwNjEsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJjYWxpY28tbm9kZSIsInVpZCI6IjFhNDk4MWY1LWVmMWQtNDk5OC05YTA1LTk4OGU0MmMyN2Q4OCJ9fSwibmJmIjoxNjU5OTYyMDYxLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZS1zeXN0ZW06Y2FsaWNvLW5vZGUifQ.Qa0KSAJGgNSA9lvND2Ivf9qxZsieI2r1FwCGvwzvXw_d4Nrw5WSygK-9t6tJKCnsXgCQSXijRBFPqiamJYZUx1dhgbPQp8KZF1seqtafCLRNnPS1TUrYJO_SRrp37UizmQzdOQOh7m_SGktcqdViZAyIGapjeMc7P8gU3v1HA93SflnR1keUo5rbXJjpaj2b6F0SBUCVyQnuORopD9cdCH-jIunyp4y_GhOtutV71ZmxcZeCdDqaBAE5OTnIwGYwz5yZqCOJZGqRxI74EX1B06iFgOQs8yksFiEpp5JdFUaCWNnxAeYo5cpH72l2XzF7rb7A2Ob0Rk96wJSSEMJq8g
    contexts:
    - name: calico-context
      context:
        cluster: local
        user: calico
    current-context: calico-context
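
    As a quick sanity check (a hedged sketch, not from the original session), this kubeconfig can be handed to kubectl directly; if the embedded service-account token has expired, the call fails with the same "Unauthorized" error seen in the pod events (assuming the calico-node service account may read nodes):

    # Exercise the exact credentials the Calico CNI plugin uses
    kubectl --kubeconfig=/etc/cni/net.d/calico-kubeconfig get nodes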
    

    Moving the config files out of the way

    Following this StackOverflow thread: https://stackoverflow.com/questions/61672804/after-uninstalling-calico-new-pods-are-stuck-in-container-creating-state

    and this page of the Kubernetes documentation: https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/

    these two config files need to be removed. Instead of deleting them, I simply mv them to a backup directory.

    root@zhiyong-ksp1:/home/zhiyong# mkdir -p /fileback/20220817
    root@zhiyong-ksp1:/etc/cni/net.d# ll
    总用量 16
    drwxr-xr-x 2 kube root 4096 88 10:05 ./
    drwxr-xr-x 3 kube root 4096 88 10:02 ../
    -rw-r--r-- 1 root root  663 88 19:23 10-calico.conflist
    -rw------- 1 root root 2713 88 20:34 calico-kubeconfig
    root@zhiyong-ksp1:/etc/cni/net.d# mv ./10-calico.conflist /fileback/20220817
    root@zhiyong-ksp1:/etc/cni/net.d# ll
    总用量 12
    drwxr-xr-x 2 kube root 4096 817 01:43 ./
    drwxr-xr-x 3 kube root 4096 88 10:02 ../
    -rw------- 1 root root 2713 88 20:34 calico-kubeconfig
    root@zhiyong-ksp1:/etc/cni/net.d# mv ./calico-kubeconfig /fileback/20220817
    root@zhiyong-ksp1:/etc/cni/net.d# ll
    总用量 8
    drwxr-xr-x 2 kube root 4096 817 01:43 ./
    drwxr-xr-x 3 kube root 4096 88 10:02 ../
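
    For reference, the same backup-and-move could be done in a single line (a sketch, not what was actually run above):

    mkdir -p /fileback/20220817 && mv /etc/cni/net.d/{10-calico.conflist,calico-kubeconfig} /fileback/20220817/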
    

    Rebooting the machine

    The purpose of the reboot is to force Calico to reload its configuration.
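
    A lighter-weight alternative (hedged; the walkthrough below simply reboots) is to restart kubelet and roll the calico-node DaemonSet, which normally regenerates the files under /etc/cni/net.d without taking the node down:

    # Not what was done here, just a possible shortcut
    systemctl restart kubelet
    kubectl -n kube-system rollout restart daemonset calico-node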

    After the reboot:

    root@zhiyong-ksp1:/home/zhiyong# kubectl get pod --all-namespaces
    NAMESPACE                      NAME                                                              READY   STATUS              RESTARTS        AGE
    istio-system                   istiod-1-11-2-54dd699c87-99krn                                    1/1     Running             0               65m
    istio-system                   jaeger-operator-fccc48b86-vtcr8                                   0/1     ContainerCreating   0               44m
    istio-system                   kiali-75c777bdf6-xhbq7                                            0/1     ContainerCreating   0               12s
    istio-system                   kiali-operator-c459985f7-sttfs                                    1/1     Running             0               44m
    kube-system                    calico-kube-controllers-f9f9bbcc9-2v7lm                           1/1     Running             2 (2m54s ago)   8d
    kube-system                    calico-node-4mgc7                                                 1/1     Running             2 (2m54s ago)   8d
    kube-system                    coredns-f657fccfd-2gw7h                                           1/1     Running             2 (2m54s ago)   8d
    kube-system                    coredns-f657fccfd-pflwf                                           1/1     Running             2 (2m54s ago)   8d
    kube-system                    kube-apiserver-zhiyong-ksp1                                       1/1     Running             2 (2m54s ago)   8d
    kube-system                    kube-controller-manager-zhiyong-ksp1                              1/1     Running             2 (2m54s ago)   8d
    kube-system                    kube-proxy-cn68l                                                  1/1     Running             2 (2m54s ago)   8d
    kube-system                    kube-scheduler-zhiyong-ksp1                                       1/1     Running             2 (2m54s ago)   8d
    kube-system                    nodelocaldns-96gtw                                                1/1     Running             2 (2m54s ago)   8d
    kube-system                    openebs-localpv-provisioner-68db4d895d-p9527                      1/1     Running             1 (2m54s ago)   8d
    kube-system                    snapshot-controller-0                                             1/1     Running             2 (2m54s ago)   8d
    kubesphere-controls-system     default-http-backend-587748d6b4-ccg59                             1/1     Running             2 (2m54s ago)   8d
    kubesphere-controls-system     kubectl-admin-5d588c455b-82cnk                                    1/1     Running             2 (2m54s ago)   8d
    kubesphere-logging-system      elasticsearch-logging-curator-elasticsearch-curator-2767784rhhk   0/1     ContainerCreating   0               50m
    kubesphere-logging-system      elasticsearch-logging-data-0                                      0/1     Pending             0               67m
    kubesphere-logging-system      elasticsearch-logging-discovery-0                                 0/1     Pending             0               67m
    kubesphere-monitoring-system   alertmanager-main-0                                               2/2     Running             4 (2m54s ago)   8d
    kubesphere-monitoring-system   kube-state-metrics-6d6786b44-bbb4f                                3/3     Running             6 (2m54s ago)   8d
    kubesphere-monitoring-system   node-exporter-8sz74                                               2/2     Running             4 (2m54s ago)   8d
    kubesphere-monitoring-system   notification-manager-deployment-6f8c66ff88-pt4l8                  2/2     Running             4 (2m54s ago)   8d
    kubesphere-monitoring-system   notification-manager-operator-6455b45546-nkmx8                    2/2     Running             4 (2m54s ago)   8d
    kubesphere-monitoring-system   prometheus-k8s-0                                                  2/2     Running             0               2m5s
    kubesphere-monitoring-system   prometheus-operator-66d997dccf-c968c                              2/2     Running             4 (2m54s ago)   8d
    kubesphere-system              ks-apiserver-6b9bcb86f4-hsdzs                                     0/1     Unknown             1               8d
    kubesphere-system              ks-console-599c49d8f6-ngb6b                                       1/1     Running             2 (2m54s ago)   8d
    kubesphere-system              ks-controller-manager-66747fcddc-r7cpt                            0/1     Unknown             1               8d
    kubesphere-system              ks-installer-5fd8bd46b8-dzhbb                                     1/1     Running             2 (2m54s ago)   8d
    

    After the reboot, Calico's network configuration has been regenerated and the pods that were failing before now look fairly normal.

    kubesphere-logging-system      elasticsearch-logging-data-0                                      0/1     Init:1/2            0               69m
    kubesphere-logging-system      elasticsearch-logging-discovery-0                                 0/1     Init:1/2            0               69m
    

    These two pods are still running their init containers.
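
    To keep an eye on them without re-running the command, a watch works (a minimal sketch):

    kubectl get pod -n kubesphere-logging-system -w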

    At this point some Java processes are still using CPU:

    (screenshot: the Java processes in question)
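
    The same can be seen from a shell (a hedged sketch; the JVMs are most likely the Elasticsearch pods warming up):

    # Show the ten heaviest processes by CPU
    ps -eo pid,comm,%cpu --sort=-%cpu | head -n 10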

    After waiting a little longer:

    root@zhiyong-ksp1:/home/zhiyong# kubectl get pod --all-namespaces
    NAMESPACE                      NAME                                                              READY   STATUS      RESTARTS        AGE
    istio-system                   istiod-1-11-2-54dd699c87-99krn                                    1/1     Running     0               72m
    istio-system                   jaeger-collector-67cfc55477-7757f                                 1/1     Running     5 (3m41s ago)   6m58s
    istio-system                   jaeger-operator-fccc48b86-vtcr8                                   1/1     Running     0               52m
    istio-system                   jaeger-query-8497bdbfd7-csbts                                     2/2     Running     0               102s
    istio-system                   kiali-75c777bdf6-xhbq7                                            1/1     Running     0               7m37s
    istio-system                   kiali-operator-c459985f7-sttfs                                    1/1     Running     0               52m
    kube-system                    calico-kube-controllers-f9f9bbcc9-2v7lm                           1/1     Running     2 (10m ago)     8d
    kube-system                    calico-node-4mgc7                                                 1/1     Running     2 (10m ago)     8d
    kube-system                    coredns-f657fccfd-2gw7h                                           1/1     Running     2 (10m ago)     8d
    kube-system                    coredns-f657fccfd-pflwf                                           1/1     Running     2 (10m ago)     8d
    kube-system                    kube-apiserver-zhiyong-ksp1                                       1/1     Running     2 (10m ago)     8d
    kube-system                    kube-controller-manager-zhiyong-ksp1                              1/1     Running     2 (10m ago)     8d
    kube-system                    kube-proxy-cn68l                                                  1/1     Running     2 (10m ago)     8d
    kube-system                    kube-scheduler-zhiyong-ksp1                                       1/1     Running     2 (10m ago)     8d
    kube-system                    nodelocaldns-96gtw                                                1/1     Running     2 (10m ago)     8d
    kube-system                    openebs-localpv-provisioner-68db4d895d-p9527                      1/1     Running     1 (10m ago)     8d
    kube-system                    snapshot-controller-0                                             1/1     Running     2 (10m ago)     8d
    kubesphere-controls-system     default-http-backend-587748d6b4-ccg59                             1/1     Running     2 (10m ago)     8d
    kubesphere-controls-system     kubectl-admin-5d588c455b-82cnk                                    1/1     Running     2 (10m ago)     8d
    kubesphere-logging-system      elasticsearch-logging-curator-elasticsearch-curator-2767784rhhk   0/1     Completed   0               57m
    kubesphere-logging-system      elasticsearch-logging-data-0                                      1/1     Running     0               74m
    kubesphere-logging-system      elasticsearch-logging-discovery-0                                 1/1     Running     0               74m
    kubesphere-monitoring-system   alertmanager-main-0                                               2/2     Running     4 (10m ago)     8d
    kubesphere-monitoring-system   kube-state-metrics-6d6786b44-bbb4f                                3/3     Running     6 (10m ago)     8d
    kubesphere-monitoring-system   node-exporter-8sz74                                               2/2     Running     4 (10m ago)     8d
    kubesphere-monitoring-system   notification-manager-deployment-6f8c66ff88-pt4l8                  2/2     Running     4 (10m ago)     8d
    kubesphere-monitoring-system   notification-manager-operator-6455b45546-nkmx8                    2/2     Running     4 (10m ago)     8d
    kubesphere-monitoring-system   prometheus-k8s-0                                                  2/2     Running     0               9m30s
    kubesphere-monitoring-system   prometheus-operator-66d997dccf-c968c                              2/2     Running     4 (10m ago)     8d
    kubesphere-system              ks-apiserver-6b9bcb86f4-hsdzs                                     1/1     Running     2 (10m ago)     8d
    kubesphere-system              ks-console-599c49d8f6-ngb6b                                       1/1     Running     2 (10m ago)     8d
    kubesphere-system              ks-controller-manager-66747fcddc-r7cpt                            1/1     Running     2 (10m ago)     8d
    kubesphere-system              ks-installer-5fd8bd46b8-dzhbb                                     1/1     Running     2 (10m ago)     8d
    

    Apart from elasticsearch-logging-curator-elasticsearch-curator-2767784rhhk, which shows Completed, every pod is now Running.


    The web UI also shows everything green with no errors; the Calico, Istio, and Prometheus pods have clearly all recovered.
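
    A quick command-line double check (hedged sketch) that nothing is left in a state other than Running or Succeeded:

    kubectl get pod -A --field-selector=status.phase!=Running,status.phase!=Succeeded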

    Checking the Completed pod's status

    root@zhiyong-ksp1:/home/zhiyong# kubectl describe pod elasticsearch-logging-curator-elasticsearch-curator-2767784rhhk -n kubesphere-logging-system
    Name:         elasticsearch-logging-curator-elasticsearch-curator-2767784rhhk
    Namespace:    kubesphere-logging-system
    Priority:     0
    Node:         zhiyong-ksp1/192.168.88.20
    Start Time:   Wed, 17 Aug 2022 01:00:00 +0800
    Labels:       app=elasticsearch-curator
                  controller-uid=d95b480d-abb9-42ed-9c1e-873127f96dc1
                  job-name=elasticsearch-logging-curator-elasticsearch-curator-27677820
                  release=elasticsearch-logging-curator
    Annotations:  cni.projectcalico.org/containerID: 584387ef1390db6f2d17ee0e2bc92951178cdb373c34544ecf150151253f4766
                  cni.projectcalico.org/podIP:
                  cni.projectcalico.org/podIPs:
    Status:       Succeeded
    IP:           10.233.107.51
    IPs:
      IP:           10.233.107.51
    Controlled By:  Job/elasticsearch-logging-curator-elasticsearch-curator-27677820
    Containers:
      elasticsearch-curator:
        Container ID:  containerd://a2b7da0a34df9601acc062b10691dbbfad5bc22a838e18d9b95f3bd57633479e
        Image:         registry.cn-beijing.aliyuncs.com/kubesphereio/elasticsearch-curator:v5.7.6
        Image ID:      registry.cn-beijing.aliyuncs.com/kubesphereio/elasticsearch-curator@sha256:0fdc68b2a211f753238f9d54734b331141a9ade5bf31eef801ea0d056c9ab1c1
        Port:          <none>
        Host Port:     <none>
        Command:
          curator/curator
        Args:
          --config
          /etc/es-curator/config.yml
          /etc/es-curator/action_file.yml
        State:          Terminated
          Reason:       Completed
          Exit Code:    0
          Started:      Wed, 17 Aug 2022 01:51:12 +0800
          Finished:     Wed, 17 Aug 2022 01:51:12 +0800
        Ready:          False
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /etc/es-curator from config-volume (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kvk6g (ro)
    Conditions:
      Type              Status
      Initialized       True
      Ready             False
      ContainersReady   False
      PodScheduled      True
    Volumes:
      config-volume:
        Type:      ConfigMap (a volume populated by a ConfigMap)
        Name:      elasticsearch-logging-curator-elasticsearch-curator-config
        Optional:  false
      kube-api-access-kvk6g:
        Type:                    Projected (a volume that contains injected data from multiple sources)
        TokenExpirationSeconds:  3607
        ConfigMapName:           kube-root-ca.crt
        ConfigMapOptional:       <nil>
        DownwardAPI:             true
    QoS Class:                   BestEffort
    Node-Selectors:              <none>
    Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
    Events:
      Type     Reason                  Age                  From               Message
      ----     ------                  ----                 ----               -------
      Normal   Scheduled               64m                  default-scheduler  Successfully assigned kubesphere-logging-system/elasticsearch-logging-curator-elasticsearch-curator-2767784rhhk to zhiyong-ksp1
      Warning  FailedCreatePodSandBox  64m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "01c36acd52449dcec6b1bcac2a1f3c57577195fd915aef6ca8d1ff53ed9b5a35": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  64m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "c0754d78516e0b4a99993dd31a5608da1b424e558560ea2c66f98856928604a9": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  64m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "0dc2bab36922b4a73c35f3b35ffd4ef46f825fd5b053454c47665d028cd89d61": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  63m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "cc132b632133dbc2ef32eed74bbfb9e64923530467ccd085d67907542a4cfea8": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  63m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "8b2f3f1f0d0ebac8a0b43025d22de1c0e1b55edbc72fec6930477061f0b46bbd": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  63m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d6ea17333ad9c2d549f439a25b83fdb8b7338f8e4a00e5fd7adbbab1bc7c78e2": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  63m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d6cafd828a3fa61977ca2423bf953b7aab8f114af042fb272e7172d7f55078a6": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  62m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a555a64631dab504aeacecd828e512b84a5396f0c779b42a1398518740c858d0": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  62m                  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a204a8cd54ce4aa97875269c3475c48266de84a214633ee9eaca8b505df52735": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  FailedCreatePodSandBox  24m (x175 over 62m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "269a0272273b83edeb22c573c3bceeeb40d48bef4cafd0b91da1aa6617b1f3d4": plugin type="calico" failed (add): error getting ClusterInformation: connection is unauthorized: Unauthorized
      Warning  NetworkNotReady         19m (x55 over 21m)   kubelet            network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
      Warning  NetworkNotReady         16m (x5 over 16m)    kubelet            network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
      Warning  FailedMount             16m (x4 over 16m)    kubelet            MountVolume.SetUp failed for volume "kube-api-access-kvk6g" : object "kubesphere-logging-system"/"kube-root-ca.crt" not registered
      Warning  FailedMount             16m (x5 over 16m)    kubelet            MountVolume.SetUp failed for volume "config-volume" : object "kubesphere-logging-system"/"elasticsearch-logging-curator-elasticsearch-curator-config" not registered
      Normal   Pulling                 16m                  kubelet            Pulling image "registry.cn-beijing.aliyuncs.com/kubesphereio/elasticsearch-curator:v5.7.6"
      Normal   Pulled                  13m                  kubelet            Successfully pulled image "registry.cn-beijing.aliyuncs.com/kubesphereio/elasticsearch-curator:v5.7.6" in 3m3.099253003s
      Normal   Created                 13m                  kubelet            Created container elasticsearch-curator
      Normal   Started                 13m                  kubelet            Started container elasticsearch-curator
    

    The events show that after failing for a long time, this pod finally pulled its image from registry.cn-beijing.aliyuncs.com/kubesphereio/elasticsearch-curator:v5.7.6, then created and started the elasticsearch-curator container, which completed its one-off task and exited normally.
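
    The Controlled By field shows the pod belongs to a Job; in the KubeSphere logging stack that Job is usually created on a schedule by a CronJob, which can be confirmed like this (a hedged check, not from the original session):

    kubectl get cronjob,job -n kubesphere-logging-system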

    At this point, KubeSphere has successfully started the Istio service mesh.
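
    As a final smoke test (a hedged sketch, not part of the original article), one can confirm that istiod is ready and that the sidecar-injection webhook is registered:

    kubectl -n istio-system get deploy
    kubectl get mutatingwebhookconfigurations | grep -i istio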

  • Original article: https://blog.csdn.net/qq_41990268/article/details/126380224