• Installing and Deploying KETTLE-9.3.0 on Linux CentOS 7


    Environment:

    Hostname: cmcc01 (used as the example throughout)

    Operating system: CentOS 7

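    Software          Version                       Deployment mode
    centos7
    zookeeper         zookeeper-3.4.10              pseudo-distributed
    hadoop            hadoop-3.1.3                  pseudo-distributed
    hive              hive-3.1.3-bin                pseudo-distributed
    clickhouse        21.11.10.1-2                  single node, multiple instances
    dolphinscheduler  3.0.0                         single node
    kettle            pdi-ce-9.3.0.0                single node
    sqoop             sqoop-1.4.7                   single node
    seatunnel         seatunnel-incubating-2.1.2    single node
    spark             spark-2.4.8                   single node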

    Integration with MySQL and Hive

    1. Download Kettle

    Official download site: https://sourceforge.net/projects/pentaho/files/

    2. Unzip

    unzip /opt/package/pdi-ce-9.3.0.0-428.zip -d /opt/software/

    3. Configure the Java environment variables

    vim ~/.bash_profile
    # Add the following lines
    # JDK
    export JAVA_HOME=/opt/software/jdk1.8.0_321
    export PATH=$PATH:${JAVA_HOME}/bin

    Apply the configuration (reload the same file that was edited):

    source ~/.bash_profile
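
    Before moving on, it can help to confirm that JAVA_HOME really points at a JDK. The following is a minimal sketch; check_java_home is a hypothetical helper name, not part of Kettle or the JDK.

```shell
#!/usr/bin/env bash
# check_java_home DIR: succeed only if DIR contains an executable bin/java.
# (Hypothetical helper for illustration.)
check_java_home() {
    local home="$1"
    if [ -z "$home" ]; then
        echo "JAVA_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$home/bin/java" ]; then
        echo "no executable java found under $home/bin" >&2
        return 1
    fi
    echo "JAVA_HOME looks usable: $home"
}
```

    Usage: check_java_home "$JAVA_HOME" && java -version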

    4. Grant execute permission to users in the same group

    chmod g+x /opt/software/data-integration/kitchen.sh
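
    The later steps also call pan.sh, and in the distributions I have seen kitchen.sh and pan.sh delegate to spoon.sh, so group users probably need execute permission on those scripts as well. A minimal sketch, assuming that behaviour; grant_group_exec is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# grant_group_exec DIR: add group execute permission to every *.sh file in DIR.
# (Hypothetical helper; it does not touch subdirectories or other file types.)
grant_group_exec() {
    local dir="$1" script
    for script in "$dir"/*.sh; do
        [ -f "$script" ] || continue   # the glob matched nothing
        chmod g+x "$script"
    done
}
```

    Usage: grant_group_exec /opt/software/data-integration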

    5. Run the command

    [root@cmcc01 data-integration]# ./kitchen.sh
    #######################################################################
    WARNING: no libwebkitgtk-1.0 detected, some features will be unavailable
    Consider installing the package with apt-get or yum.
    e.g. 'sudo apt-get install libwebkitgtk-1.0-0'
    #######################################################################
    Options:
    -rep = Repository name
    -user = Repository username
    -trustuser = !Kitchen.ComdLine.RepUsername!
    -pass = Repository password
    -job = The name of the job to launch
    -dir = The directory (dont forget the leading /)
    -file = The filename (Job XML) to launch
    -level = The logging level (Basic, Detailed, Debug, Rowlevel, Error, Minimal, Nothing)
    -logfile = The logging file to write to
    -listdir = List the directories in the repository
    -listjobs = List the jobs in the specified directory
    -listrep = List the available repositories
    -norep = Do not log into the repository
    -version = show the version, revision and build date
    -param = Set a named parameter <NAME>=<VALUE>. For example -param:FILE=customers.csv
    -listparam = List information concerning the defined parameters in the specified job.
    -export = Exports all linked resources of the specified job. The argument is the name of a ZIP file.
    -custom = Set a custom plugin specific option as a String value in the job using <NAME>=<VALUE>, for example: -custom:COLOR=Red
    -maxloglines = The maximum number of log lines that are kept internally by Kettle. Set to 0 to keep all rows (default)
    -maxlogtimeout = The maximum age (in minutes) of a log line while being kept internally by Kettle. Set to 0 to keep all rows indefinitely (default)
    [root@cmcc01 data-integration]#

    Note the libwebkitgtk warning at the top of the output.

    6. Resolve the warning

    wget ftp://ftp.pbone.net/mirror/ftp5.gwdg.de/pub/opensuse/repositories/home:/matthewdva:/build:/EPEL:/el7/RHEL_7/x86_64/webkitgtk-2.4.9-1.el7.x86_64.rpm
    yum -y install webkitgtk-2.4.9-1.el7.x86_64.rpm
    # Run the command again; the warning is gone
    [root@cmcc01 package]# /opt/software/data-integration/kitchen.sh
    Options:
    -rep = Repository name
    -user = Repository username
    -trustuser = !Kitchen.ComdLine.RepUsername!
    -pass = Repository password
    -job = The name of the job to launch
    -dir = The directory (dont forget the leading /)
    -file = The filename (Job XML) to launch
    -level = The logging level (Basic, Detailed, Debug, Rowlevel, Error, Minimal, Nothing)
    -logfile = The logging file to write to
    -listdir = List the directories in the repository
    -listjobs = List the jobs in the specified directory
    -listrep = List the available repositories
    -norep = Do not log into the repository
    -version = show the version, revision and build date
    -param = Set a named parameter <NAME>=<VALUE>. For example -param:FILE=customers.csv
    -listparam = List information concerning the defined parameters in the specified job.
    -export = Exports all linked resources of the specified job. The argument is the name of a ZIP file.
    -custom = Set a custom plugin specific option as a String value in the job using <NAME>=<VALUE>, for example: -custom:COLOR=Red
    -maxloglines = The maximum number of log lines that are kept internally by Kettle. Set to 0 to keep all rows (default)
    -maxlogtimeout = The maximum age (in minutes) of a log line while being kept internally by Kettle. Set to 0 to keep all rows indefinitely (default)
    [root@cmcc01 package]#
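
    Rather than reading the output by eye, the check can be scripted. A minimal sketch; has_webkit_warning is a hypothetical helper that simply searches text on standard input for the warning line shown in step 5.

```shell
#!/usr/bin/env bash
# has_webkit_warning: read text on stdin and succeed if it contains the
# libwebkitgtk warning. (Hypothetical helper for illustration.)
has_webkit_warning() {
    grep -q 'WARNING: no libwebkitgtk'
}
```

    Usage: /opt/software/data-integration/kitchen.sh 2>&1 | has_webkit_warning && echo "warning still present"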

    7. Test

    # Run a transformation
    # Write a test transformation, then run it with the following command
    /opt/software/data-integration/pan.sh -file=/opt/kettle-spoon/ktr/test/test1.ktr -logfile=test1.log
    # Run a job
    /opt/software/data-integration/kitchen.sh -file=/opt/kettle-spoon/ktr/test/SechuldUpdate.kjb -logfile=timeLogUpdate.log
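
    When these commands are run from a script or scheduler, the exit status is more useful than the console output. The sketch below maps the Kitchen status codes as I understand them from the Pentaho documentation (verify them against your version); run_job and the PDI_DIR variable are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env bash
# describe_kitchen_status CODE: print a short description of a kitchen.sh exit code.
# The mapping follows the Pentaho documentation as far as I know; treat it as an assumption.
describe_kitchen_status() {
    case "$1" in
        0) echo "job ran without a problem" ;;
        1) echo "errors occurred during processing" ;;
        2) echo "unexpected error while loading or running the job" ;;
        7) echo "job could not be loaded from XML or the repository" ;;
        8) echo "error loading steps or plugins" ;;
        9) echo "command line usage was printed" ;;
        *) echo "unknown status $1" ;;
    esac
}

# run_job KJB LOGFILE: run a job with kitchen.sh and report its exit status.
# PDI_DIR is a hypothetical override for the install directory used in this article.
run_job() {
    local kitchen="${PDI_DIR:-/opt/software/data-integration}/kitchen.sh"
    local status=0
    "$kitchen" -file="$1" -level=Basic -logfile="$2" || status=$?
    echo "kitchen.sh exited with $status: $(describe_kitchen_status "$status")"
    return "$status"
}
```

    Usage: run_job /opt/kettle-spoon/ktr/test/SechuldUpdate.kjb timeLogUpdate.log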

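    8. Integrate Kettle with MySQL

    After Kettle has been run once, a new file appears under the current user's home directory: ~/.kettle/kettle.properties

    If it does not exist, create it yourself.

    (1) Set the MySQL connection information:

    vim ~/.kettle/kettle.properties
    # Add the following:
    ##MYSQL
    MYSQL_HOST=localhost
    MYSQL_DB_PORT=3306
    MYSQL_DB_USER=root
    MYSQL_DB_PASSWORD=123qwe
    MYSQL_DB_NAME=flinkcdc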
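
    Inside Kettle these values are referenced as variables such as ${MYSQL_HOST} in the connection fields. To sanity-check the file from the shell, the sketch below reads the properties and prints the MySQL JDBC URL they correspond to. get_prop and mysql_jdbc_url are hypothetical helpers, and the parsing assumes plain KEY=value lines without escapes or continuations.

```shell
#!/usr/bin/env bash
# get_prop FILE KEY: print the value of KEY from a simple KEY=value properties file.
# Comment lines are skipped; if KEY is defined more than once, the last one wins.
get_prop() {
    grep -v '^[[:space:]]*#' "$1" | grep "^$2=" | tail -n 1 | cut -d= -f2-
}

# mysql_jdbc_url FILE: build the JDBC URL from the MYSQL_* keys used in this article.
mysql_jdbc_url() {
    local f="$1"
    echo "jdbc:mysql://$(get_prop "$f" MYSQL_HOST):$(get_prop "$f" MYSQL_DB_PORT)/$(get_prop "$f" MYSQL_DB_NAME)"
}
```

    Usage: mysql_jdbc_url ~/.kettle/kettle.properties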

    (2) Copy the driver into data-integration/lib

    cp /opt/package/mysql-connector-java-8.0.20.jar /opt/software/data-integration/lib

    (3) Create a database connection and test it

    (4) Create the job kettle_job_test.kjb

    (5) Upload the job and run it

    # Run the job
    /opt/software/data-integration/kitchen.sh -file=/opt/package/kettle_job_test.kjb

    9. Integrate Kettle with Hive

    # Create symbolic links to the Hive jars
    ln -s /opt/software/hive-3.1.3-bin/lib/*.jar /opt/software/data-integration/lib

    ln may report "File exists" for jars that are already present in lib; this can be ignored.
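
    If the "File exists" noise is unwanted, the links can be created one at a time, skipping any jar name that is already present. A minimal sketch; link_missing_jars is a hypothetical helper, and it keeps the existing file in lib rather than replacing it.

```shell
#!/usr/bin/env bash
# link_missing_jars SRC DEST: symlink each SRC/*.jar into DEST unless a file
# with the same name already exists there. (Hypothetical helper for illustration.)
link_missing_jars() {
    local src="$1" dest="$2" jar name linked=0
    for jar in "$src"/*.jar; do
        [ -e "$jar" ] || continue                 # the glob matched nothing
        name=$(basename "$jar")
        if [ -e "$dest/$name" ] || [ -L "$dest/$name" ]; then
            echo "skip $name (already in $dest)"
        else
            ln -s "$jar" "$dest/$name"
            linked=$((linked + 1))
        fi
    done
    echo "linked $linked jar(s)"
}
```

    Usage: link_missing_jars /opt/software/hive-3.1.3-bin/lib /opt/software/data-integration/lib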

    Create a test job

    Run the job

    /opt/software/data-integration/kitchen.sh -file=/opt/package/kettle_job_hive_test.kjb

  • Original article: https://blog.csdn.net/QYmufeng/article/details/125866729