• Apache DolphinScheduler 如何实现自动化打包+单机/集群部署?


    file

    Apache DolphinScheduler 是一款开源的分布式任务调度系统,旨在帮助用户实现复杂任务的自动化调度和管理。DolphinScheduler 支持多种任务类型,可以在单机或集群环境下运行。下面将介绍如何实现 DolphinScheduler 的自动化打包和单机/集群部署。

    自动化打包

    所需环境:maven、jdk

    执行以下shell完成代码拉取及打包,打包路径:/opt/action/dolphinscheduler/dolphinscheduler-dist/target/apache-dolphinscheduler-dev-SNAPSHOT-bin.tar.gz

    sudo su - root <
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16

    单机部署

    1、DolphinScheduler运行所需环境

    所需环境jdk、zookeeper、mysql

    初始化zookeeper(高版本zookeeper推荐使用v3.8及以上版本)环境

    安装包官网下载地址:https://zookeeper.apache.org/

    sudo su - root <
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22

    jdk、mysql这里不做过多赘述。

    2、初始化配置

    2.1 配置文件初始化

    初始化文件要放到指定目录(本文章以/opt/action/tool举例)

    • 2.1.1新建文件夹
      mkdir -p /opt/action/tool
      mkdir -p /opt/Dsrelease
      • 1
    • 2.1.2新建初始化文件common.properties
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    
    # user data local directory path, please make sure the directory exists and have read write permissions
    data.basedir.path=/tmp/dolphinscheduler
    
    # resource storage type: HDFS, S3, NONE
    resource.storage.type=HDFS
    
    # resource store on HDFS/S3 path, resource file will store to this hadoop hdfs path, self configuration, please make sure the directory exists on hdfs and have read write permissions. "/dolphinscheduler" is recommended
    resource.upload.path=/dolphinscheduler
    
    # whether to startup kerberos
    hadoop.security.authentication.startup.state=false
    
    # java.security.krb5.conf path
    java.security.krb5.conf.path=/opt/krb5.conf
    
    # login user from keytab username
    login.user.keytab.username=hdfs-mycluster@ESZ.COM
    
    # login user from keytab path
    login.user.keytab.path=/opt/hdfs.headless.keytab
    
    # kerberos expire time, the unit is hour
    kerberos.expire.time=2
    # resource view suffixs
    #resource.view.suffixs=txt,log,sh,bat,conf,cfg,py,java,sql,xml,hql,properties,json,yml,yaml,ini,js
    # if resource.storage.type=HDFS, the user must have the permission to create directories under the HDFS root path
    hdfs.root.user=root
    # if resource.storage.type=S3, the value like: s3a://dolphinscheduler; if resource.storage.type=HDFS and namenode HA is enabled, you need to copy core-site.xml and hdfs-site.xml to conf dir
    fs.defaultFS=file:///
    aws.access.key.id=minioadmin
    aws.secret.access.key=minioadmin
    aws.region=us-east-1
    aws.endpoint=http://localhost:9000
    # resourcemanager port, the default value is 8088 if not specified
    resource.manager.httpaddress.port=8088
    # if resourcemanager HA is enabled, please set the HA IPs; if resourcemanager is single, keep this value empty
    yarn.resourcemanager.ha.rm.ids=192.168.xx.xx,192.168.xx.xx
    # if resourcemanager HA is enabled or not use resourcemanager, please keep the default value; If resourcemanager is single, you only need to replace aws2 to actual resourcemanager hostname
    yarn.application.status.address=http://aws2:%s/ws/v1/cluster/apps/%s
    # job history status url when application number threshold is reached(default 10000, maybe it was set to 1000)
    yarn.job.history.status.address=http://aws2:19888/ws/v1/history/mapreduce/jobs/%s
    
    # datasource encryption enable
    datasource.encryption.enable=false
    
    # datasource encryption salt
    datasource.encryption.salt=!@#$%^&*
    
    # data quality option
    data-quality.jar.name=dolphinscheduler-data-quality-dev-SNAPSHOT.jar
    
    #data-quality.error.output.path=/tmp/data-quality-error-data
    
    # Network IP gets priority, default inner outer
    
    # Whether hive SQL is executed in the same session
    support.hive.oneSession=false
    
    # use sudo or not, if set true, executing user is tenant user and deploy user needs sudo permissions; if set false, executing user is the deploy user and doesn't need sudo permissions
    sudo.enable=true
    
    # network interface preferred like eth0, default: empty
    #dolphin.scheduler.network.interface.preferred=
    
    # network IP gets priority, default: inner outer
    #dolphin.scheduler.network.priority.strategy=default
    
    # system env path
    #dolphinscheduler.env.path=dolphinscheduler_env.sh
    
    # development state
    development.state=false
    
    # rpc port
    alert.rpc.port=50052
    
    # Url endpoint for zeppelin RESTful API
    zeppelin.rest.url=http://localhost:8080
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41
    • 42
    • 43
    • 44
    • 45
    • 46
    • 47
    • 48
    • 49
    • 50
    • 51
    • 52
    • 53
    • 54
    • 55
    • 56
    • 57
    • 58
    • 59
    • 60
    • 61
    • 62
    • 63
    • 64
    • 65
    • 66
    • 67
    • 68
    • 69
    • 70
    • 71
    • 72
    • 73
    • 74
    • 75
    • 76
    • 77
    • 78
    • 79
    • 80
    • 81
    • 82
    • 83
    • 84
    • 85
    • 86
    • 87
    • 88
    • 89
    • 90
    • 91
    • 92
    • 93
    • 2.1.3新建初始化文件core-site.xml
    
    
    
    
    
    
    
    fs.defaultFS
    hdfs://aws1
    
    
    ha.zookeeper.quorum
    aws1:2181
    
    
    hadoop.proxyuser.root.hosts
    *
    
    
    hadoop.proxyuser.root.groups
    *
    
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 2.1.4新建初始化文件hdfs-site.xml
    
    
    
    
    
    
    
    dfs.replication
    1
    
    
    dfs.namenode.name.dir
    /opt/bigdata/hadoop/ha/dfs/name
    
    
    dfs.datanode.data.dir
    /opt/bigdata/hadoop/ha/dfs/data
    
    
    dfs.namenode.secondary.http-address
    aws2:50090
    
    
    dfs.namenode.checkpoint.dir
    /opt/bigdata/hadoop/ha/dfs/secondary
    
    
    dfs.nameservices
    aws1
    
    
    dfs.ha.namenodes.aws1
    nn1,nn2
    
    
    dfs.namenode.rpc-address.aws1.nn1
    aws1:8020
    
    
    dfs.namenode.rpc-address.aws1.nn2
    aws2:8020
    
    
    dfs.namenode.http-address.aws1.nn1
    aws1:50070
    
    
    dfs.namenode.http-address.aws1.nn2
    aws2:50070
    
    
    
    
    
    dfs.datanode.address
    aws1:50010
    
    
    dfs.datanode.ipc.address
    aws1:50020
    
    
    dfs.datanode.http.address
    aws1:50075
    
    
    dfs.datanode.https.address
    aws1:50475
    
    
    dfs.namenode.shared.edits.dir
    qjournal://aws1:8485;aws2:8485;aws3:8485/mycluster
    
    
    dfs.journalnode.edits.dir
    /opt/bigdata/hadoop/ha/dfs/jn
    
    
    dfs.client.failover.proxy.provider.aws1
    org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
    
    
    dfs.ha.fencing.methods
    sshfence
    
    
    dfs.ha.fencing.ssh.private-key-files
    /root/.ssh/id_dsa
    
    
    dfs.ha.automatic-failover.enabled
    true
    
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41
    • 42
    • 43
    • 44
    • 45
    • 46
    • 47
    • 48
    • 49
    • 50
    • 51
    • 52
    • 53
    • 54
    • 55
    • 56
    • 57
    • 58
    • 59
    • 60
    • 61
    • 62
    • 63
    • 64
    • 65
    • 66
    • 67
    • 68
    • 69
    • 70
    • 71
    • 72
    • 73
    • 74
    • 75
    • 76
    • 77
    • 78
    • 79
    • 80
    • 81
    • 82
    • 83
    • 84
    • 85
    • 86
    • 87
    • 88
    • 89
    • 90
    • 91
    • 92
    • 93
    • 94
    • 95
    • 96
    • 97
    • 98
    • 99
    • 100
    • 101
    • 102
    • 103
    • 104
    • 105
    • 2.1.5上传初始化jar包mysql-connector-java-8.0.16.jar
    • 2.1.6上传初始化jar包ojdbc8.jar

    2.2 初始化文件替换

    cd /opt/Dsrelease
    sudo rm -r $today/
    
    echo "rm -r $today"
    
    cd /opt/release
    
    cp $packge_tar /opt/Dsrelease
    
    cd /opt/Dsrelease
    
    tar -zxvf $packge_tar
    mv $packge  $today
    p_api_lib=/opt/Dsrelease/$today/api-server/libs/
    p_master_lib=/opt/Dsrelease/$today/master-server/libs/
    p_worker_lib=/opt/Dsrelease/$today/worker-server/libs/
    p_alert_lib=/opt/Dsrelease/$today/alert-server/libs/
    p_tools_lib=/opt/Dsrelease/$today/tools/libs/
    p_st_lib=/opt/Dsrelease/$today/standalone-server/libs/
    
    
    p_api_conf=/opt/Dsrelease/$today/api-server/conf/
    p_master_conf=/opt/Dsrelease/$today/master-server/conf/
    p_worker_conf=/opt/Dsrelease/$today/worker-server/conf/
    p_alert_conf=/opt/Dsrelease/$today/alert-server/conf/
    p_tools_conf=/opt/Dsrelease/$today/tools/conf/
    p_st_conf=/opt/Dsrelease/$today/standalone-server/conf/
    
    
    cp $p0 $p4 $p_api_lib
    cp $p0 $p4 $p_master_lib
    cp $p0 $p4 $p_worker_lib
    cp $p0 $p4 $p_alert_lib
    cp $p0 $p4 $p_tools_lib
    cp $p0 $p4 $p_st_lib
    
    echo "cp $p0 $p_api_lib"
    
    cp $p1 $p2 $p3 $p_api_conf
    cp $p1 $p2 $p3 $p_master_conf
    cp $p1 $p2 $p3 $p_worker_conf
    cp $p1 $p2 $p3 $p_alert_conf
    cp $p1 $p2 $p3 $p_tools_conf
    cp $p1 $p2 $p3 $p_st_conf
    
    
    
    echo "cp $p1 $p2 $p3 $p_api_conf"
    }
    
    define_param(){
    
    packge_tar=apache-dolphinscheduler-dev-SNAPSHOT-bin.tar.gz
    packge=apache-dolphinscheduler-dev-SNAPSHOT-bin
    p0=/opt/action/tool/mysql-connector-java-8.0.16.jar
    p1=/opt/action/tool/common.properties
    p2=/opt/action/tool/core-site.xml
    p3=/opt/action/tool/hdfs-site.xml
    p4=/opt/action/tool/ojdbc8.jar
    
    today=`date +%m%d`
    
    
    }
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41
    • 42
    • 43
    • 44
    • 45
    • 46
    • 47
    • 48
    • 49
    • 50
    • 51
    • 52
    • 53
    • 54
    • 55
    • 56
    • 57
    • 58
    • 59
    • 60
    • 61
    • 62
    • 63

    2.3 配置文件内容替换

    sed  -i 's/spark2/spark/g' /opt/Dsrelease/$today/worker-server/conf/dolphinscheduler_env.sh
    
    cd /opt/Dsrelease/$today/bin/env/
    sed -i '$a\export SPRING_PROFILES_ACTIVE=permission_shiro' dolphinscheduler_env.sh
    sed -i '$a\export DATABASE="mysql"' dolphinscheduler_env.sh
    sed -i '$a\export SPRING_DATASOURCE_DRIVER_CLASS_NAME="com.mysql.jdbc.Driver"' dolphinscheduler_env.sh
    #自定义修改mysql配置
    sed -i '$a\export SPRING_DATASOURCE_URL="jdbc:mysql://ctyun6:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8&allowMultiQueries=true&allowPublicKeyRetrieval=true"' dolphinscheduler_env.sh
    sed -i '$a\export SPRING_DATASOURCE_USERNAME="root"' dolphinscheduler_env.sh
    sed -i '$a\export SPRING_DATASOURCE_PASSWORD="root@123"' dolphinscheduler_env.sh
    echo "替换jdbc配置成功"
    #自定义修改zookeeper配置
    sed -i '$a\export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}' dolphinscheduler_env.sh
    sed -i '$a\export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-ctyun6:2181}' dolphinscheduler_env.sh
    echo "替换zookeeper配置成功"
    sed -i 's/resource.storage.type=HDFS/resource.storage.type=NONE/' /opt/Dsrelease/$today/master-server/conf/common.properties
    sed -i 's/resource.storage.type=HDFS/resource.storage.type=NONE/' /opt/Dsrelease/$today/worker-server/conf/common.properties
    sed -i 's/resource.storage.type=HDFS/resource.storage.type=NONE/' /opt/Dsrelease/$today/alert-server/conf/common.properties
    sed -i 's/resource.storage.type=HDFS/resource.storage.type=NONE/' /opt/Dsrelease/$today/api-server/conf/common.properties
    sed -i 's/hdfs.root.user=root/resource.hdfs.root.user=root/' /opt/Dsrelease/$today/master-server/conf/common.properties
    sed -i 's/hdfs.root.user=root/resource.hdfs.root.user=root/' /opt/Dsrelease/$today/worker-server/conf/common.properties
    sed -i 's/hdfs.root.user=root/resource.hdfs.root.user=root/' /opt/Dsrelease/$today/alert-server/conf/common.properties
    sed -i 's/hdfs.root.user=root/resource.hdfs.root.user=root/' /opt/Dsrelease/$today/api-server/conf/common.properties
    sed -i 's/fs.defaultFS=file:/resource.fs.defaultFS=file:/' /opt/Dsrelease/$today/master-server/conf/common.properties
    sed -i 's/fs.defaultFS=file:/resource.fs.defaultFS=file:/' /opt/Dsrelease/$today/worker-server/conf/common.properties
    sed -i 's/fs.defaultFS=file:/resource.fs.defaultFS=file:/' /opt/Dsrelease/$today/alert-server/conf/common.properties
    sed -i 's/fs.defaultFS=file:/resource.fs.defaultFS=file:/' /opt/Dsrelease/$today/api-server/conf/common.properties
    sed -i '$a\resource.hdfs.fs.defaultFS=file:///' /opt/Dsrelease/$today/api-server/conf/common.properties
    echo "替换common.properties配置成功"
    # 替换master worker内存 api alert也可进行修改,具体根据当前服务器硬件配置而定,但要遵循Xms=Xmx=2Xmn的规律
    cd /opt/Dsrelease/$today/
    sed -i 's/Xms4g/Xms2g/g' worker-server/bin/start.sh
    sed -i 's/Xmx4g/Xmx2g/g' worker-server/bin/start.sh
    sed -i 's/Xmn2g/Xmn1g/g' worker-server/bin/start.sh
    sed -i 's/Xms4g/Xms2g/g' master-server/bin/start.sh
    sed -i 's/Xmx4g/Xmx2g/g' master-server/bin/start.sh
    sed -i 's/Xmn2g/Xmn1g/g' master-server/bin/start.sh
    echo "master worker内存修改完成"
    
    }
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39

    3、删除HDFS配置

    echo "开始删除hdfs配置"
    sudo rm /opt/Dsrelease/$today/api-server/conf/core-site.xml
    sudo rm /opt/Dsrelease/$today/api-server/conf/hdfs-site.xml
    sudo rm /opt/Dsrelease/$today/worker-server/conf/core-site.xml
    sudo rm /opt/Dsrelease/$today/worker-server/conf/hdfs-site.xml
    sudo rm /opt/Dsrelease/$today/master-server/conf/core-site.xml
    sudo rm /opt/Dsrelease/$today/master-server/conf/hdfs-site.xml
    sudo rm /opt/Dsrelease/$today/alert-server/conf/core-site.xml
    sudo rm /opt/Dsrelease/$today/alert-server/conf/hdfs-site.xml
    echo "结束删除hdfs配置"
    }
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10

    4、MySQL初始化

    init_mysql(){
    
    sql_path="/opt/Dsrelease/$today/tools/sql/sql/dolphinscheduler_mysql.sql"
    sourceCommand="source $sql_path"
    echo $sourceCommand
    echo "开始source:"
    mysql -hlocalhost -uroot -proot@123 -D "dolphinscheduler" -e "$sourceCommand"
    echo "结束source:"
    }
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8

    5、启动DolphinScheduler服务

    stop_all_server(){
    cd /opt/Dsrelease/$today
    ./bin/dolphinscheduler-daemon.sh stop api-server
    ./bin/dolphinscheduler-daemon.sh stop master-server
    ./bin/dolphinscheduler-daemon.sh stop worker-server
    ./bin/dolphinscheduler-daemon.sh stop alert-server
    ps -ef|grep api-server|grep -v grep|awk '{print "kill -9 " $2}' |sh
    ps -ef|grep master-server |grep -v grep|awk '{print "kill -9 " $2}' |sh
    ps -ef|grep worker-server |grep -v grep|awk '{print "kill -9 " $2}' |sh
    ps -ef|grep alert-server |grep -v grep|awk '{print "kill -9 " $2}' |sh
    }
    
    run_all_server(){
    cd /opt/Dsrelease/$today
    ./bin/dolphinscheduler-daemon.sh start api-server
    ./bin/dolphinscheduler-daemon.sh start master-server
    ./bin/dolphinscheduler-daemon.sh start worker-server
    ./bin/dolphinscheduler-daemon.sh start alert-server
    }
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18

    集群部署

    1、开放mysql和zookeeper对外端口 2、集群部署及启动

    复制完成初始化的文件夹到指定的服务器,启动指定服务即可完成集群部署,要连同一个Zookeeper和MySQL。

    本文由 白鲸开源科技 提供发布支持!

  • 相关阅读:
    2022_08_13__106期__排序
    Java23种设计模式-结构型模式之外观模式
    C#基础:类class与结构struct的区别
    仿美团外卖微信小程序源码/美团外卖优惠券领劵小程序-自带流量主模式
    外业精灵,在水土流失监测野外调查工作中的应用
    ROS C++程序终止/结束进程&& 多线程终止运行程序
    NVIDIA DALI学习:数据加载
    【SIFT】超详详详解 - 实现细节记录
    hugo-stack for github
    简明docker安装
  • 原文地址:https://blog.csdn.net/DolphinScheduler/article/details/132838583