• 【云原生】Apache Livy on k8s 讲解与实战操作


    一、概述

    Livy是一个提供Rest接口和spark集群交互的服务。它可以提交Spark Job或者Spark一段代码,同步或者异步的返回结果;也提供Sparkcontext的管理,通过Restful接口或RPC客户端库。Livy也简化了与Spark与应用服务的交互,这允许通过web/mobile与Spark的使用交互。

    官网:https://livy.incubator.apache.org/
    GitHub地址:https://github.com/apache/incubator-livy
    关于Apache Livy更多介绍也可以参考我这篇文章:Spark开源REST服务——Apache Livy(Spark 客户端)

    二、开始编排部署

    1)部署包准备

    这里也提供上面编译好的livy部署包,有需要的小伙伴可以自行下载:

    链接:https://pan.baidu.com/s/1pPCbe0lUJ6ji8rvQYsVw9A?pwd=qn7i
    提取码:qn7i

    1)构建镜像

    Dockerfile

    FROM myharbor.com/bigdata/centos:7.9.2009
    
    RUN rm -f /etc/localtime && ln -sv /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo "Asia/Shanghai" > /etc/timezone
    RUN export LANG=zh_CN.UTF-8
    
    ### install tools
    RUN yum install -y vim tar wget curl less telnet net-tools lsof
    
    RUN groupadd --system --gid=9999 admin && useradd --system -m /home/admin --uid=9999 --gid=admin admin
    
    RUN mkdir -p /opt/apache
    
    ADD apache-livy-0.8.0-incubating-SNAPSHOT-bin.zip /opt/apache/
    ENV LIVY_HOME=/opt/apache/apache-livy
    RUN ln -s /opt/apache/apache-livy-0.8.0-incubating-SNAPSHOT-bin $LIVY_HOME
    
    ADD hadoop-3.3.2.tar.gz /opt/apache/
    ENV HADOOP_HOME=/opt/apache/hadoop
    RUN ln -s /opt/apache/hadoop-3.3.2 $HADOOP_HOME
    ENV HADOOP_CONFIG_DIR=${HADOOP_HOME}/etc/hadoop
    
    ADD spark-3.3.0-bin-hadoop3.tar.gz /opt/apache/
    ENV SPARK_HOME=/opt/apache/spark
    RUN ln -s /opt/apache/spark-3.3.0-bin-hadoop3 $SPARK_HOME
    
    ENV PATH=${LIVY_HOME}/bin:${HADOOP_HOME}/bin:${SPARK_HOME}/bin:$PATH
    
    RUN chown -R admin:admin /opt/apache
    
    WORKDIR $LIVY_HOME
    
    ENTRYPOINT ${LIVY_HOME}/bin/livy-server start;tail -f ${LIVY_HOME}/logs/livy-root-server.out
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32

    【注意】hadoop包里的core-site.xmlhdfs-site.xmlyarn-site.xml

    开始构建镜像

    docker build -t myharbor.com/bigdata/livy:0.8.0 . --no-cache
    
    ### 参数解释
    # -t:指定镜像名称
    # . :当前目录Dockerfile
    # -f:指定Dockerfile路径
    #  --no-cache:不缓存
    
    # 推送到harbor
    docker push myharbor.com/bigdata/livy:0.8.0
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    2)创建livy chart模板

    helm create livy
    
    • 1

    3)修改yaml编排

    • livy/values.yaml
    replicaCount: 1
    
    image:
      repository: myharbor.com/bigdata/livy
      pullPolicy: IfNotPresent
      # Overrides the image tag whose default is the chart appVersion.
      tag: "0.8.0"
    
    securityContext:
      runAsUser: 9999
      runAsGroup: 9999
      privileged: true
    
    service:
      type: NodePort
      port: 8998
      nodePort: 31998
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • livy/templates/configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: {{ include "livy.fullname" . }}
      labels:
        {{- include "livy.labels" . | nindent 4 }}
    data:
      livy.conf: |-
        livy.spark.master = yarn
        livy.spark.deploy-mode = client
        livy.environment = production
        livy.impersonation.enabled = true
        livy.server.csrf_protection.enabled = false
        livy.server.port = {{ .Values.service.port }}
        livy.server.session.timeout = 3600000
        livy.server.recovery.mode = recovery
        livy.server.recovery.state-store = filesystem
        livy.server.recovery.state-store.url = /tmp/livy
        livy.repl.enable-hive-context = true
      livy-env.sh: |-
        export JAVA_HOME=/opt/apache/jdk1.8.0_212
        export HADOOP_HOME=/opt/apache/hadoop
        export HADOOP_CONF_DIR=/opt/apache/hadoop/etc/hadoop
        export SPARK_HOME=/opt/apache/spark
        export SPARK_CONF_DIR=/opt/apache/spark/conf
        export LIVY_LOG_DIR=/opt/apache/livy/logs
        export LIVY_PID_DIR=/opt/apache/livy/pid-dir
        export LIVY_SERVER_JAVA_OPTS="-Xmx512m"
      spark-blacklist.conf: |-
        spark.master
        spark.submit.deployMode
    
        # Disallow overriding the location of Spark cached jars.
        spark.yarn.jar
        spark.yarn.jars
        spark.yarn.archive
    
        # Don't allow users to override the RSC timeout.
        livy.rsc.server.idle-timeout
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • livy/templates/deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: {{ include "livy.fullname" . }}
      labels:
        {{- include "livy.labels" . | nindent 4 }}
    spec:
      {{- if not .Values.autoscaling.enabled }}
      replicas: {{ .Values.replicaCount }}
      {{- end }}
      selector:
        matchLabels:
          {{- include "livy.selectorLabels" . | nindent 6 }}
      template:
        metadata:
          {{- with .Values.podAnnotations }}
          annotations:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          labels:
            {{- include "livy.selectorLabels" . | nindent 8 }}
        spec:
          {{- with .Values.imagePullSecrets }}
          imagePullSecrets:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          serviceAccountName: {{ include "livy.serviceAccountName" . }}
          securityContext:
            {{- toYaml .Values.podSecurityContext | nindent 8 }}
          containers:
            - name: {{ .Chart.Name }}
              securityContext:
                {{- toYaml .Values.securityContext | nindent 12 }}
              image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
              imagePullPolicy: {{ .Values.image.pullPolicy }}
              ports:
                - name: http
                  containerPort: 8998
                  protocol: TCP
              livenessProbe:
                httpGet:
                  path: /
                  port: http
              readinessProbe:
                httpGet:
                  path: /
                  port: http
              resources:
                {{- toYaml .Values.resources | nindent 12 }}
              {{- with .Values.securityContext }}
              securityContext:
                runAsUser: {{ .runAsUser }}
                runAsGroup: {{ .runAsGroup }}
                privileged: {{ .privileged }}
              {{- end }}
              volumeMounts:
                - name: {{ .Release.Name }}-livy-conf
                  mountPath: /opt/apache/livy/conf/livy.conf
                  subPath: livy.conf
                - name: {{ .Release.Name }}-livy-env
                  mountPath: /opt/apache/livy/conf/livy-env.sh
                  subPath: livy-env.sh
                - name: {{ .Release.Name }}-spark-blacklist-conf
                  mountPath: /opt/apache/livy/conf/spark-blacklist.conf
                  subPath: spark-blacklist.conf
          {{- with .Values.nodeSelector }}
          nodeSelector:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          {{- with .Values.affinity }}
          affinity:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          {{- with .Values.tolerations }}
          tolerations:
            {{- toYaml . | nindent 8 }}
          {{- end }}
          volumes:
            - name: {{ .Release.Name }}-livy-conf
              configMap:
                name: {{ include "livy.fullname" . }}
            - name: {{ .Release.Name }}-livy-env
              configMap:
                name: {{ include "livy.fullname" . }}
            - name: {{ .Release.Name }}-spark-blacklist-conf
              configMap:
                name: {{ include "livy.fullname" . }}
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41
    • 42
    • 43
    • 44
    • 45
    • 46
    • 47
    • 48
    • 49
    • 50
    • 51
    • 52
    • 53
    • 54
    • 55
    • 56
    • 57
    • 58
    • 59
    • 60
    • 61
    • 62
    • 63
    • 64
    • 65
    • 66
    • 67
    • 68
    • 69
    • 70
    • 71
    • 72
    • 73
    • 74
    • 75
    • 76
    • 77
    • 78
    • 79
    • 80
    • 81
    • 82
    • 83
    • 84
    • 85
    • 86
    • 87

    4)开始部署

    helm install livy ./livy -n livy --create-namespace
    
    • 1

    NOTES

    NOTES:
    1. Get the application URL by running these commands:
      export NODE_PORT=$(kubectl get --namespace livy -o jsonpath="{.spec.ports[0].nodePort}" services livy)
      export NODE_IP=$(kubectl get nodes --namespace livy -o jsonpath="{.items[0].status.addresses[0].address}")
      echo http://$NODE_IP:$NODE_PORT
    
    • 1
    • 2
    • 3
    • 4
    • 5

    在这里插入图片描述

    查看

    kubectl get pods,svc -n livy -owide
    
    • 1

    在这里插入图片描述

    web地址:http://192.168.182.110:31998/ui
    在这里插入图片描述

    5)测试验证

    curl -s -XPOST -d '{"file":"hdfs://myhdfs/tmp/spark-examples_2.12-3.3.0.jar","className":"org.apache.spark.examples.SparkPi","name":"SparkPi-test"}'  -H "Content-Type: application/json"  http://local-168-182-110:31998/batches|python -m json.tool
    
    • 1

    在这里插入图片描述
    在这里插入图片描述

    6)卸载

    helm uninstall livy -n livy
    
    • 1

    git地址:https://gitee.com/hadoop-bigdata/livy-on-k8s

    Apache Livy on k8s 编排部署讲解就先到这里了,有疑问的小伙伴欢迎给我留言,后续会持续更新【云原生+大数据】相关的文章,请小伙伴耐心等待~

  • 相关阅读:
    07_面向对象高级_内部类
    【Docker】简易版harbor部署
    【Vue入门】MVVM数据双向绑定与Vue的生命周期
    C++算法 —— 动态规划(7)两个数组的dp
    10-Node.js入门
    笔记本电脑热点手机无法连接解决方案
    JavaSE进阶21天---第十七天---JavaSE(打印流、FileUtils工具类、 ImageIO(验证码、二维码))
    ‘vue‘ 不是内部或外部命令,也不是可运行的程序
    Nacos整合到SpringCloud中
    hadoop 3.3大数据集群搭建系列3-安装Hive
  • 原文地址:https://blog.csdn.net/qq_35745940/article/details/127718546