• Performance Optimization for Istio at Large Scale


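    Introduction

    Istio currently pushes xDS configuration in full by default: every sidecar (Envoy) in the mesh holds the service discovery data for all services in the mesh in memory. As a result, the memory footprint of each sidecar grows as the mesh grows.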
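    To make the growth concrete, here is a minimal Go sketch that counts how many service entries are held across the whole mesh under full push versus a scoped push where each sidecar only receives the services it calls. The function names and the mesh sizes are hypothetical and chosen only for illustration; they are not measurements.

```go
package main

import "fmt"

// fullPushEntries returns how many service entries are held across the mesh
// when every sidecar receives every service (full push).
func fullPushEntries(sidecars, services int) int {
	return sidecars * services
}

// scopedEntries returns how many entries are held when each sidecar only
// receives the services it actually depends on.
func scopedEntries(sidecars, depsPerSidecar int) int {
	return sidecars * depsPerSidecar
}

func main() {
	// Hypothetical mesh: 1000 sidecars, 1000 services, 10 dependencies each.
	fmt.Println("full push:", fullPushEntries(1000, 1000))
	fmt.Println("scoped:   ", scopedEntries(1000, 10))
}
```

    Under these assumed numbers the full push holds one hundred times as many entries as the scoped push, which is the gap that lazy loading tries to close.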

    Aeraki-mesh

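    The aeraki-mesh project has a subproject dedicated to Istio's configuration distribution performance problem:
    https://github.com/aeraki-mesh/lazyxds

    From the project's deployment YAML, we can see that it adds two components to the mesh: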

    Egress

    The corresponding configuration file is lazyxds-egress.yaml.
    Let's go through the parts of this component one by one.

    Component configuration
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: istio-egressgateway-lazyxds
      namespace: istio-system
      labels:
        app: istio-egressgateway-lazyxds
        istio: egressgateway
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: istio-egressgateway-lazyxds
          istio: egressgateway
      template:
        metadata:
          annotations:
            sidecar.istio.io/discoveryAddress: istiod.istio-system.svc:15012
            sidecar.istio.io/inject: "false"
          labels:
            app: istio-egressgateway-lazyxds
            istio: egressgateway
        spec:
          containers:
            - args:
                ......
              image: docker.io/istio/proxyv2:1.10.0
              imagePullPolicy: IfNotPresent
              name: istio-proxy
              ports:
                - containerPort: 8080
                  protocol: TCP
                - containerPort: 15090
                  name: http-envoy-prom
                  protocol: TCP
              ......
              volumeMounts:
                - mountPath: /etc/istio/custom-bootstrap
                  name: custom-bootstrap-volume
                ......
          volumes:
            - configMap:
                defaultMode: 420
                name: lazyxds-als-bootstrap
              name: custom-bootstrap-volume

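    Because the configuration is long, only the main parts are shown here. In essence it starts an Istio proxy whose bootstrap configuration is mounted from a ConfigMap.

    Bootstrap configuration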
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: lazyxds-als-bootstrap
      namespace: istio-system
    data:
      custom_bootstrap.json: |
        {
          "static_resources": {
            "clusters": [{
              "name": "lazyxds-accesslog-service",
              "type": "STRICT_DNS",
              "connect_timeout": "1s",
              "http2_protocol_options": {},
              "dns_lookup_family": "V4_ONLY",
              "load_assignment": {
                "cluster_name": "lazyxds-accesslog-service",
                "endpoints": [{
                  "lb_endpoints": [{
                    "endpoint": {
                      "address": {
                        "socket_address": {
                          "address": "lazyxds.istio-system",
                          "port_value": 8080
                        }
                      }
                    }
                  }]
                }]
              },
              "respect_dns_ttl": true
            }]
          }
        }

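    From the configuration above we can tell:

    • It defines a static cluster for the proxy, named "lazyxds-accesslog-service".

    • The backend address of that cluster is "lazyxds.istio-system" on port 8080.
      This backend is the lazyxds controller, which is covered in detail below.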
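    As a quick way to check which backend a static cluster points at, the following Go sketch parses a trimmed copy of custom_bootstrap.json and lists each cluster's "host:port" backends. The struct and the function name clusterBackends are hypothetical helpers written for this illustration; only the JSON field names come from the configuration above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bootstrap mirrors only the fields of custom_bootstrap.json used here.
type bootstrap struct {
	StaticResources struct {
		Clusters []struct {
			Name           string `json:"name"`
			LoadAssignment struct {
				Endpoints []struct {
					LbEndpoints []struct {
						Endpoint struct {
							Address struct {
								SocketAddress struct {
									Address   string `json:"address"`
									PortValue int    `json:"port_value"`
								} `json:"socket_address"`
							} `json:"address"`
						} `json:"endpoint"`
					} `json:"lb_endpoints"`
				} `json:"endpoints"`
			} `json:"load_assignment"`
		} `json:"clusters"`
	} `json:"static_resources"`
}

// clusterBackends maps each static cluster name to its "host:port" backends.
func clusterBackends(raw []byte) (map[string][]string, error) {
	var b bootstrap
	if err := json.Unmarshal(raw, &b); err != nil {
		return nil, err
	}
	out := make(map[string][]string)
	for _, c := range b.StaticResources.Clusters {
		for _, e := range c.LoadAssignment.Endpoints {
			for _, lb := range e.LbEndpoints {
				sa := lb.Endpoint.Address.SocketAddress
				out[c.Name] = append(out[c.Name], fmt.Sprintf("%s:%d", sa.Address, sa.PortValue))
			}
		}
	}
	return out, nil
}

// sample is a trimmed copy of the bootstrap JSON shown above.
const sample = `{
  "static_resources": {
    "clusters": [{
      "name": "lazyxds-accesslog-service",
      "load_assignment": {
        "cluster_name": "lazyxds-accesslog-service",
        "endpoints": [{
          "lb_endpoints": [{
            "endpoint": {"address": {"socket_address": {
              "address": "lazyxds.istio-system", "port_value": 8080}}}
          }]
        }]
      }
    }]
  }
}`

func main() {
	backends, err := clusterBackends([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(backends)
}
```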

    EnvoyFilter

    The YAML file also defines an EnvoyFilter that modifies the traffic configuration of the proxy:

    apiVersion: networking.istio.io/v1alpha3
    kind: EnvoyFilter
    metadata:
      name: lazyxds-egress-als
      namespace: istio-system
    spec:
      workloadSelector:
        labels:
          app: istio-egressgateway-lazyxds
      configPatches:
        - applyTo: NETWORK_FILTER
          match:
            context: GATEWAY
            listener:
              filterChain:
                filter:
                  name: "envoy.filters.network.http_connection_manager"
          patch:
            operation: MERGE
            value:
              typed_config:
                "@type": "type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager"
                access_log:
                  ......
                  - name: envoy.access_loggers.http_grpc
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.access_loggers.grpc.v3.HttpGrpcAccessLogConfig
                      common_config:
                        log_name: http_envoy_accesslog
                        transport_api_version: "V3"
                        grpc_service:
                          envoy_grpc:
                            cluster_name: lazyxds-accesslog-service

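    This configuration shows that an access log service, that is, a collector for Envoy's access logs, is merged into the HTTP connection manager of this Envoy, and that the service is the lazyxds-accesslog-service cluster.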

    Controller

    The lazy xDS logic itself is implemented by this controller:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: lazyxds
      name: lazyxds
      namespace: istio-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: lazyxds
      template:
        metadata:
          labels:
            app: lazyxds
        spec:
          serviceAccountName: lazyxds
          containers:
            - image: aeraki/lazyxds:latest
              imagePullPolicy: Always
              name: app
              ports:
                - containerPort: 8080
                  protocol: TCP
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: lazyxds
      name: lazyxds
      namespace: istio-system
    spec:
      ports:
        - name: grpc-als
          port: 8080
          protocol: TCP
      selector:
        app: lazyxds
      type: ClusterIP

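    The Egress section showed that the backend address of the proxy is lazyxds.istio-system, which matches the Service of this controller.

    Envoy's access logs are therefore ultimately sent to this controller for processing. This is the key to pushing Envoy configuration incrementally, and it is how lazyxds addresses Istio's performance problem.

    Incremental push

    Access log interface

    To receive Envoy's access logs, the server must implement the interface that Envoy defines: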

    type AccessLogServiceServer interface {
        // Envoy will connect and send StreamAccessLogsMessage messages forever. It does not expect any
        // response to be sent as nothing would be done in the case of failure. The server should
        // disconnect if it expects Envoy to reconnect. In the future we may decide to add a different
        // API for "critical" access logs in which Envoy will buffer access logs for some period of time
        // until it gets an ACK so it could then retry. This API is designed for high throughput with the
        // expectation that it might be lossy.
        StreamAccessLogs(AccessLogService_StreamAccessLogsServer) error
    }

    Log parsing

    lazyxds implements it as follows:

    func (server *Server) StreamAccessLogs(logStream als.AccessLogService_StreamAccessLogsServer) error {
        for {
            // Block until Envoy sends the next message on the stream.
            data, err := logStream.Recv()
            if err != nil {
                return err
            }
            httpLog := data.GetHttpLogs()
            if httpLog != nil {
                for _, entry := range httpLog.LogEntry {
                    server.log.V(4).Info("http log entry", "entry", entry)
                    // Skip entries whose downstream (caller) IP cannot be determined.
                    fromIP := getDownstreamIP(entry)
                    if fromIP == "" {
                        continue
                    }
                    // Map the upstream cluster name to a service ID.
                    upstreamCluster := entry.CommonProperties.UpstreamCluster
                    svcID := utils.UpstreamCluster2ServiceID(upstreamCluster)
                    toIP := getUpstreamIP(entry)
                    // Hand the observed access (caller, service, callee) to the handler.
                    if err := server.handler.HandleAccess(fromIP, svcID, toIP); err != nil {
                        server.log.Error(err, "handle access error")
                    }
                }
            }
        }
    }

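    The main logic above parses Envoy's access logs and then processes them:

    • The lazyxds controller analyzes the access relationships in the logs it receives and writes any new dependency into the Sidecar CRD.

    • The controller also updates the Egress rules: deleting, updating, or creating them.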
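    The dependency analysis described above can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not the lazyxds source: serviceIDFromCluster is a hypothetical stand-in for utils.UpstreamCluster2ServiceID that assumes Istio's "outbound|<port>|<subset>|<hostname>" cluster naming, depTracker is a hypothetical in-memory structure keyed by caller IP, and egressHosts only shows the "namespace/dnsName" host format that a Sidecar resource's egress section expects.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// serviceIDFromCluster extracts the hostname from an Istio outbound cluster
// name of the form "outbound|<port>|<subset>|<hostname>".
// Hypothetical stand-in for utils.UpstreamCluster2ServiceID.
func serviceIDFromCluster(cluster string) (string, bool) {
	parts := strings.Split(cluster, "|")
	if len(parts) != 4 || parts[0] != "outbound" || parts[3] == "" {
		return "", false
	}
	return parts[3], true
}

// depTracker records which services each caller has accessed (hypothetical).
type depTracker struct {
	deps map[string]map[string]struct{}
}

func newDepTracker() *depTracker {
	return &depTracker{deps: make(map[string]map[string]struct{})}
}

// record stores the dependency and reports whether it is new. A controller
// would only need to update the caller's Sidecar resource when it is new.
func (t *depTracker) record(source, svc string) bool {
	set, ok := t.deps[source]
	if !ok {
		set = make(map[string]struct{})
		t.deps[source] = set
	}
	if _, seen := set[svc]; seen {
		return false
	}
	set[svc] = struct{}{}
	return true
}

// egressHosts returns the caller's dependencies as sorted "*/<hostname>"
// entries, the namespace/dnsName format used by Sidecar egress hosts.
func (t *depTracker) egressHosts(source string) []string {
	hosts := make([]string, 0, len(t.deps[source]))
	for svc := range t.deps[source] {
		hosts = append(hosts, "*/"+svc)
	}
	sort.Strings(hosts)
	return hosts
}

func main() {
	t := newDepTracker()
	// Hypothetical access log entries: caller IP and upstream cluster name.
	entries := [][2]string{
		{"10.0.0.5", "outbound|9080||reviews.default.svc.cluster.local"},
		{"10.0.0.5", "outbound|9080||reviews.default.svc.cluster.local"},
		{"10.0.0.5", "outbound|9080||details.default.svc.cluster.local"},
	}
	for _, e := range entries {
		svc, ok := serviceIDFromCluster(e[1])
		if !ok {
			continue
		}
		if t.record(e[0], svc) {
			fmt.Println("new dependency:", e[0], "->", svc)
		}
	}
	fmt.Println(t.egressHosts("10.0.0.5"))
}
```

    Reporting only new dependencies keeps the number of Sidecar updates proportional to the number of distinct caller and service pairs rather than to the volume of access logs.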

    Slime

    NetEase's Slime takes the same approach as Tencent Cloud's Aeraki.
    Documentation: https://cloudnative.to/blog/netease-slime/
    GitHub: https://github.com/slime-io/slime/tree/master/staging/src/slime.io/slime/modules/lazyload

    https://cloud.tencent.com/developer/article/1922778

    https://www.zhaohuabing.com/post/2018-09-25-istio-traffic-management-impl-intro/

  • Original article: https://blog.csdn.net/m0_47495420/article/details/132595373