• 9. Installing and deploying seatunnel-incubating-2.1.2


    Environment:

    Hostname: cmcc01 (used as the example throughout)

    Operating system: centos7

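    Software deployed, with versions and deployment modes (all on centos7):

    Software          Version                          Deployment mode
    zookeeper         zookeeper-3.4.10                 pseudo-distributed
    hadoop            hadoop-3.1.3                     pseudo-distributed
    hive              hive-3.1.3-bin                   pseudo-distributed
    clickhouse        21.11.10.1-2                     single node, multiple instances
    dolphinscheduler  3.0.0                            single node
    kettle            pdi-ce-9.3.0.0                   single node
    sqoop             sqoop-1.4.7                      single node
    seatunnel         seatunnel-incubating-2.1.2       single node
    spark             spark-2.4.8-bin-hadoop2.7.tgz    single node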

    1. Download seatunnel and spark

    seatunnel: https://seatunnel.incubator.apache.org/download

    spark: https://archive.apache.org/dist/spark/

    Extract:

    # Extract both archives into /opt/software/
    tar zxf apache-seatunnel-incubating-2.1.2-bin.tar.gz -C /opt/software/
    tar zxf spark-2.4.8-bin-hadoop2.7.tgz -C /opt/software/
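
    Re-running the extraction over an existing installation can overwrite edited configuration files. A minimal sketch of a guard around the tar calls above; extract_once is a hypothetical helper name, and the expected top-level directory names are taken from the paths used later in this post.

```shell
#!/bin/sh
# Sketch: extract an archive only if its target directory is not there yet.
# extract_once is a hypothetical helper, not part of SeaTunnel or Spark.
extract_once() {
    archive=$1   # path to the .tar.gz / .tgz file
    dest=$2      # directory to extract into, e.g. /opt/software
    expected=$3  # top-level directory the archive is expected to create

    if [ -d "$dest/$expected" ]; then
        echo "skip: $dest/$expected already exists"
        return 0
    fi
    if [ ! -f "$archive" ]; then
        echo "error: $archive not found" >&2
        return 1
    fi
    mkdir -p "$dest" && tar zxf "$archive" -C "$dest"
}

# Usage with the archives from this step:
# extract_once apache-seatunnel-incubating-2.1.2-bin.tar.gz /opt/software apache-seatunnel-incubating-2.1.2
# extract_once spark-2.4.8-bin-hadoop2.7.tgz /opt/software spark-2.4.8-bin-hadoop2.7
```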

    2. Configure environment variables

    vim ~/.bash_profile
    # Add the following lines
    # spark
    export SPARK_HOME=/opt/software/spark-2.4.8-bin-hadoop2.7
    export PATH=$PATH:${SPARK_HOME}/bin
    # seatunnel
    export SEATUNNEL_HOME=/opt/software/apache-seatunnel-incubating-2.1.2
    export PATH=$PATH:${SEATUNNEL_HOME}/bin

    # Reload the profile so the environment variables take effect
    source ~/.bash_profile
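
    After sourcing the profile it is worth confirming that both variables point at real directories and that their bin directories are on PATH. The following is only a sketch; check_home is a hypothetical helper name, not something shipped with either tool.

```shell
#!/bin/sh
# Sketch: sanity-check a *_HOME style variable after sourcing the profile.
# check_home is a hypothetical helper name.
check_home() {
    name=$1   # variable name, e.g. SPARK_HOME
    dir=$2    # its value, e.g. "$SPARK_HOME"
    if [ -z "$dir" ]; then
        echo "$name is not set"
        return 1
    fi
    if [ ! -d "$dir/bin" ]; then
        echo "$name=$dir has no bin directory"
        return 1
    fi
    # Wrap PATH in colons so the first and last entries match as well
    case ":$PATH:" in
        *":$dir/bin:"*) echo "$name ok" ;;
        *) echo "$name/bin is not on PATH"; return 1 ;;
    esac
}

# Usage:
# check_home SPARK_HOME "$SPARK_HOME"
# check_home SEATUNNEL_HOME "$SEATUNNEL_HOME"
```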

    3. Modify the Spark configuration

    cd /opt/software/spark-2.4.8-bin-hadoop2.7/conf
    # Copy the configuration template
    cp spark-env.sh.template spark-env.sh
    vim spark-env.sh
    # Add the following lines
    export HADOOP_CONF_DIR=/opt/software/hadoop-3.1.3/etc/hadoop
    export YARN_CONF_DIR=/opt/software/hadoop-3.1.3/etc/hadoop
    export HADOOP_OPTS="-Djava.library.path=/opt/software/hadoop-3.1.3/lib/native"

    # Edit the Hive configuration file
    vim /opt/software/hive-3.1.3-bin/conf/hive-site.xml
    # Add the metastore URI configuration
    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://cmcc01:9083</value>
    </property>
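
    A quick way to confirm the property was saved correctly is to read it back from the file. The sketch below uses tr and sed rather than a real XML parser, so it assumes a plainly formatted hive-site.xml (no comments or CDATA around the property); get_hive_property is a hypothetical helper name, and the dots in the property name are treated as regex wildcards.

```shell
#!/bin/sh
# Sketch: print the <value> that follows <name>NAME</name> in a Hadoop-style
# XML configuration file. get_hive_property is a hypothetical helper.
get_hive_property() {
    file=$1   # e.g. /opt/software/hive-3.1.3-bin/conf/hive-site.xml
    prop=$2   # e.g. hive.metastore.uris
    # Flatten the file to one line so <name> and <value> may sit on
    # separate lines, then capture the value right after the requested name.
    tr -d '\n' < "$file" |
        sed -n "s|.*<name>[[:space:]]*$prop[[:space:]]*</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}

# Usage:
# get_hive_property /opt/software/hive-3.1.3-bin/conf/hive-site.xml hive.metastore.uris
```

    An empty result means the property is missing or malformed, for example because a closing tag was mistyped.
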

    # Symlink the Hive configuration file into Spark's conf directory
    ln -s /opt/software/hive-3.1.3-bin/conf/hive-site.xml /opt/software/spark-2.4.8-bin-hadoop2.7/conf
    # Copy the MySQL driver jar into Spark's jars directory
    cp /opt/package/mysql-connector-java-8.0.20.jar /opt/software/spark-2.4.8-bin-hadoop2.7/jars
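
    If the link target is mistyped, ln -s still succeeds and leaves a dangling link that Spark silently ignores. A small sketch for checking the result; link_status is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report whether a path is a symlink and whether its target exists.
# link_status is a hypothetical helper name.
link_status() {
    path=$1
    if [ ! -L "$path" ]; then
        echo "not a symlink"
        return 1
    fi
    # -e follows the link, so it fails when the target is missing
    if [ ! -e "$path" ]; then
        echo "dangling symlink"
        return 1
    fi
    echo "ok"
}

# Usage:
# link_status /opt/software/spark-2.4.8-bin-hadoop2.7/conf/hive-site.xml
```
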

    # Start the metastore
    nohup hive --service metastore > ${HIVE_HOME}/logs/metastore.log 2>&1 &
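
    The metastore starts in the background and needs some time before it accepts connections on port 9083. A generic retry loop can be used to wait for it; this is a sketch, retry is a hypothetical helper, and the port probe in the usage line relies on bash's /dev/tcp feature being available on the host.

```shell
#!/bin/sh
# Sketch: run a command until it succeeds or the attempts run out.
# retry is a hypothetical helper name.
retry() {
    attempts=$1   # maximum number of tries
    delay=$2      # seconds to sleep between tries
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Usage: wait up to about 60 seconds for the metastore port to open
# (assumes bash with /dev/tcp support is installed):
# retry 30 2 bash -c 'exec 3<>/dev/tcp/cmcc01/9083' 2>/dev/null && echo "metastore is up"
```
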

    # Put the start commands into a startup script
    vim /opt/software/start_hiveserver2.sh
    # Add the following lines
    #!/bin/bash
    # Start hiveserver2
    nohup ${HIVE_HOME}/bin/hiveserver2 > ${HIVE_HOME}/logs/hiveserver2.log 2>&1 &
    # Start the metastore
    nohup hive --service metastore > ${HIVE_HOME}/logs/metastore.log 2>&1 &
    # beeline -u jdbc:hive2://cmcc01:10000/default -n root

    4. Test seatunnel

      (1) Create the test configuration file

    vim /opt/software/apache-seatunnel-incubating-2.1.2/config/hive-console.conf
    # Add the following content
    env {
      spark.app.name = "SeaTunnel"
      spark.executor.instances = 1
      spark.executor.cores = 1
      spark.num.executors = 1
      spark.executor.memory = "1g"
      execution.parallelism = 1
    }
    source {
      hive {
        pre_sql = "select id, name, age from stg.student01"
        result_table_name = "student01_log"
      }
    }
    transform {
    }
    sink {
      Console {}
    }
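
    Before submitting a job, a cheap structural check can catch a config file that was pasted incompletely. The sketch below only looks for the four top-level blocks used in the file above (env, source, transform, sink); it does not parse the HOCON syntax, and check_seatunnel_blocks is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check that a config file opens the env, source, transform and sink
# blocks. check_seatunnel_blocks is a hypothetical helper name.
check_seatunnel_blocks() {
    conf=$1
    missing=""
    for block in env source transform sink; do
        # A block opener: the block name at line start, optional spaces, then "{"
        if ! grep -Eq "^[[:space:]]*$block[[:space:]]*\{" "$conf"; then
            missing="$missing $block"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing blocks:$missing"
        return 1
    fi
    echo "ok"
}

# Usage:
# check_seatunnel_blocks /opt/software/apache-seatunnel-incubating-2.1.2/config/hive-console.conf
```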

      (2) Run the test

    Prepare the Hive dataset:

    CREATE TABLE `stg`.`student01`
    (
      `id` int,
      `name` string,
      `age` int
    )
    row format delimited fields terminated by ","
    STORED AS textfile;
    INSERT INTO `stg`.`student01` VALUES (1, '张三', 20),(2, '李四', 21),(3, '五王', 22);

    Submit the job to YARN in client mode:

    start-seatunnel-spark.sh --master yarn --deploy-mode client --config /opt/software/apache-seatunnel-incubating-2.1.2/config/hive-console.conf
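
    When the same job is submitted repeatedly, it can help to assemble the command line in one place and check that the config file exists first. This is a sketch only; build_submit_cmd is a hypothetical helper that prints the command shown above rather than running it, with yarn and client as defaults.

```shell
#!/bin/sh
# Sketch: print the start-seatunnel-spark.sh command line for a config file.
# build_submit_cmd is a hypothetical helper; it only prints, it does not submit.
build_submit_cmd() {
    conf=$1
    master=${2:-yarn}     # default taken from the command in this post
    mode=${3:-client}     # default taken from the command in this post
    if [ ! -f "$conf" ]; then
        echo "config not found: $conf" >&2
        return 1
    fi
    echo "start-seatunnel-spark.sh --master $master --deploy-mode $mode --config $conf"
}

# Usage: review the printed line, then run it.
# build_submit_cmd /opt/software/apache-seatunnel-incubating-2.1.2/config/hive-console.conf
```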

    [Figure: screenshot of the successful test run]

  • Original article: https://blog.csdn.net/QYmufeng/article/details/126027011