• Istio automatic sidecar injection fails: the webhook service is unreachable


            Recently, while deploying an Istio environment at work, I found that the pods started from the official samples could not reach Istio's webhook. The problem stumped me for a whole day, so I am recording it here for future reference.

            I classify this as a sidecar injection failure. The errors are as follows:

            1. First possibility (the one I ran into)

            If automatic injection reports the following errors:

    2023-10-26T02:15:22.051580Z error installer Internal error occurred: failed calling webhook "rev.validation.istio.io": failed to call webhook: Post "https://istiod.istio-system.svc:443/validate?timeout=10s": context deadline exceeded
    2023-10-26T02:15:32.149109Z error installer failed to create "EnvoyFilter/istio-system/stats-filter-1.14": Internal error occurred: failed calling webhook "rev.validation.istio.io": failed to call webhook: Post "https://istiod.istio-system.svc:443/validate?timeout=10s": context deadline exceeded

            The cause of the errors above is that the kube-apiserver's --enable-admission-plugins flag is missing the MutatingAdmissionWebhook and ValidatingAdmissionWebhook plugins. The fix is to edit the kube-apiserver.yaml manifest. I installed with kubeadm, so the file lives under /etc/kubernetes/manifests. Change --enable-admission-plugins to NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook, save the file, then delete the kube-apiserver pod so it restarts from the new manifest. (Caution: on a kubeadm cluster, a mistake in this file can take down every kube-apiserver in the cluster and make the cluster unreachable. In that case you have to download the kube-apiserver binary from the official site, start a process manually to regain access, and then delete the pod.)

            Note: I also added one more flag here, --feature-gates=ServerSideApply=false, which turns off the server-side apply feature gate, because I was hitting another error as well: failed to update resource with server-side apply for obj

    - --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,NodeRestriction
    - --enable-aggregator-routing=true
    - --feature-gates=ServerSideApply=false
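
            A quick sketch of how to verify the change took effect (the label selector assumes a standard kubeadm control plane; the kubelet restarts a static pod on its own when the manifest changes, so deleting the mirror pod object just forces re-registration):

    # Confirm the running apiserver picked up the new admission plugins
    $ kubectl -n kube-system get pod -l component=kube-apiserver -o yaml | grep enable-admission-plugins

    # If needed, delete the mirror pod; the kubelet recreates it from the manifest
    $ kubectl -n kube-system delete pod -l component=kube-apiserver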

            The full manifest after modification:

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        kubeadm.kubernetes.io/kube-apiserver.advertise-address.endpoint: 192.168.1.61:6443
      creationTimestamp: null
      labels:
        component: kube-apiserver
        tier: control-plane
      name: kube-apiserver
      namespace: kube-system
    spec:
      containers:
      - command:
        - kube-apiserver
        - --advertise-address=192.168.1.61
        - --allow-privileged=true
        - --authorization-mode=Node,RBAC
        - --client-ca-file=/etc/kubernetes/pki/ca.crt
        - --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota,NodeRestriction
        - --enable-aggregator-routing=true
        - --enable-bootstrap-token-auth=true
        - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
        - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
        - --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
        - --etcd-servers=https://127.0.0.1:2379
        - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
        - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
        - --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
        - --requestheader-allowed-names=front-proxy-client
        - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
        - --requestheader-extra-headers-prefix=X-Remote-Extra-
        - --requestheader-group-headers=X-Remote-Group
        - --requestheader-username-headers=X-Remote-User
        - --secure-port=6443
        - --service-account-issuer=https://kubernetes.default.svc.cluster.local
        - --service-account-key-file=/etc/kubernetes/pki/sa.pub
        - --service-account-signing-key-file=/etc/kubernetes/pki/sa.key
        - --service-cluster-ip-range=10.96.0.0/12
        - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
        - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
        - --feature-gates=RemoveSelfLink=false,ServerSideApply=false
        image: registry.aliyuncs.com/google_containers/kube-apiserver:v1.23.0
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 8
          httpGet:
            host: 192.168.1.61
            path: /livez
            port: 6443
            scheme: HTTPS
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 15
        name: kube-apiserver
        readinessProbe:
          failureThreshold: 3
          httpGet:
            host: 192.168.1.61
            path: /readyz
            port: 6443
            scheme: HTTPS
          periodSeconds: 1
          timeoutSeconds: 15
        resources:
          requests:
            cpu: 250m
        startupProbe:
          failureThreshold: 24
          httpGet:
            host: 192.168.1.61
            path: /livez
            port: 6443
            scheme: HTTPS
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 15
        volumeMounts:
        - mountPath: /etc/ssl/certs
          name: ca-certs
          readOnly: true
        - mountPath: /etc/pki
          name: etc-pki
          readOnly: true
        - mountPath: /etc/kubernetes/pki
          name: k8s-certs
          readOnly: true
      hostNetwork: true
      priorityClassName: system-node-critical
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      volumes:
      - hostPath:
          path: /etc/ssl/certs
          type: DirectoryOrCreate
        name: ca-certs
      - hostPath:
          path: /etc/pki
          type: DirectoryOrCreate
        name: etc-pki
      - hostPath:
          path: /etc/kubernetes/pki
          type: DirectoryOrCreate
        name: k8s-certs
    status: {}
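
            If the apiserver fails to come back after the edit (the broken-manifest scenario warned about above), kubectl itself is unavailable, so inspect the container directly on the control-plane node. A sketch, assuming a containerd runtime with crictl installed:

    # Find the (possibly exited) kube-apiserver container
    $ crictl ps -a | grep kube-apiserver

    # Read its logs to see which flag it rejected
    $ crictl logs <container-id>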

            If you have already configured all of the above and still get the error failed to create "EnvoyFilter/istio-system/stats-filter-1.16": Internal error occurred: failed calling webhook "rev.validation.istio.io": failed to call webhook: Post "https://istiod.istio-system.svc:443/validate?timeout=10s": context deadline exceeded, use the following procedure (quoted from the official Istio troubleshooting docs) to rule out an Istio network problem:

            Injection works by the API server connecting to the webhook deployment (Istiod). This may cause issues if there are connectivity issues, such as firewalls, blocking this call. Depending on the Kubernetes configuration, this may require a firewall rule on port 443 or port 15017; instructions for doing so on GKE can be found here.

            In order to check if the API server can access the pod, we can send a request proxied through the API server:

            An example of a request that succeeds ("no body found" is returned from the service and indicates we do have connectivity):

    $ kubectl get --raw /api/v1/namespaces/istio-system/services/https:istiod:https-webhook/proxy/inject -v4
    I0618 07:39:46.663871 36880 helpers.go:216] server response object: [{
      "metadata": {},
      "status": "Failure",
      "message": "the server rejected our request for an unknown reason",
      "reason": "BadRequest",
      "details": {
        "causes": [
          {
            "reason": "UnexpectedServerResponse",
            "message": "no body found"
          }
        ]
      },
      "code": 400
    }]
    F0618 07:39:46.663940 36880 helpers.go:115] Error from server (BadRequest): the server rejected our request for an unknown reason

            Similarly, we can send a request from another pod:

    $ curl https://istiod.istio-system:443/inject -k
    no body found

             And from the istiod pod directly (note: the port here is 15017, as this is the targetPort for the Service):

    $ curl https://localhost:15017/inject -k
    no body found

            With this information you should be able to isolate where the breakage occurs.
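
            As a supplementary check, it is also worth confirming that the istiod Service has ready endpoints and seeing where the webhook configurations actually point (a sketch; the exact webhook configuration names vary by Istio version and revision):

    # Does istiod have ready endpoints behind the Service?
    $ kubectl -n istio-system get endpoints istiod

    # List the webhook configurations, then inspect the clientConfig of each
    $ kubectl get validatingwebhookconfigurations
    $ kubectl get mutatingwebhookconfigurations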

            If the Istio network checks out and the error failed to create "EnvoyFilter/istio-system/stats-filter-1.16": Internal error occurred: failed calling webhook "rev.validation.istio.io": failed to call webhook: Post "https://istiod.istio-system.svc:443/validate?timeout=10s": context deadline exceeded still appears, the problem is almost certainly insufficient server resources. When the Docker/containerd daemon is handling a large number of requests or running many containers, its responses can be delayed enough to trigger the "context deadline exceeded" error. Scale up the memory, disk, and CPU of your Kubernetes cluster nodes and the problem goes away.

            In my case, the investigation ultimately showed that the cluster was short on memory/disk/CPU; after re-running the Istio install command a few more times, the installation succeeded.
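
            Before retrying the install, a quick way to confirm resource pressure (a sketch; kubectl top requires metrics-server to be installed):

    # Current CPU/memory usage per node (requires metrics-server)
    $ kubectl top nodes

    # Look for MemoryPressure / DiskPressure conditions on each node
    $ kubectl describe nodes | grep -A 6 "Conditions:"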

            2. Second possibility

            Istio was installed with enableNamespacesByDefault: false configured:

    sidecarInjectorWebhook:
      enabled: true
      # If true, automatic injection is enabled for all namespaces. If false,
      # only namespaces carrying the istio-injection label get automatic injection.
      enableNamespacesByDefault: false
      rewriteAppHTTPProbe: false

            Solution:

    # Label the namespace
    $ kubectl label namespace default istio-injection=enabled --overwrite
    # Verify
    $ kubectl get namespace -L istio-injection
    NAME      STATUS   AGE    ISTIO-INJECTION
    default   Active   374d   enabled
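
            Note that the label only affects pods created after it is set. A quick sketch to confirm injection actually lands (assumes you have a Deployment in the default namespace to restart):

    # Recreate the pods so the injector sees them
    $ kubectl -n default rollout restart deployment

    # Injected pods show an extra istio-proxy container, e.g. READY 2/2
    $ kubectl -n default get pods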

            To disable istio sidecar injection again later, run:

    $ kubectl label namespace default istio-injection=disabled --overwrite

            3. Third possibility

            Istio was installed with autoInject: disabled set:

    proxy:
      includeIPRanges: 192.168.16.0/20,192.168.32.0/20
      # Whether automatic injection is enabled. With enabled, a pod is injected
      # as long as it is not annotated sidecar.istio.io/inject: "false". With
      # disabled, a pod is injected only if it is annotated
      # sidecar.istio.io/inject: "true".
      autoInject: disabled

            Solutions:

    • Option 1: set autoInject: enabled
    • Option 2: declare sidecar.istio.io/inject: "true" on the Pod or Deployment (see the sketch below)
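
            A minimal sketch of the second option (the workload name and image are illustrative; the annotation must sit on the pod template, not on the Deployment's own metadata):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: httpbin
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: httpbin
      template:
        metadata:
          labels:
            app: httpbin
          annotations:
            sidecar.istio.io/inject: "true"   # opt this pod in to injection
        spec:
          containers:
          - name: httpbin
            image: docker.io/kennethreitz/httpbin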
    Original article: https://blog.csdn.net/qq_19734597/article/details/134036682