    Installing Hive, a Data Warehouse Built on Hadoop

    1. Install Hive

    1.1 Download the Hive distribution

    Apache official site: https://www.apache.org/dyn/closer.cgi/hive/

    Tsinghua University mirror: https://mirrors.tuna.tsinghua.edu.cn/apache/hive/

    On Ubuntu, download it with the wget command:

    wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz
    

    That seems to have failed (network speed issue), so the file was transferred over with Xshell instead!!
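    If the mirror download keeps failing, one fallback (a sketch; it assumes Hive 3.1.3 is still published on archive.apache.org, which retains all past Apache releases) is to pull the tarball from the Apache archive:

    # Fallback: the Apache archive keeps every historical release
    wget https://archive.apache.org/dist/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz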

    1.2 Extract and rename

    sudo tar -zxvf ./apache-hive-3.1.3-bin.tar.gz -C /usr/local # extract into /usr/local
    cd /usr/local
    sudo mv apache-hive-3.1.3-bin hive # rename the directory to hive
    

    1.3 Change file ownership

    sudo chown -R hadoop:hadoop hive
    

    Note: hadoop:hadoop above specifies the owning user and group. If you are logged in to Linux as user_name, replace hadoop with user_name.
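    A version that works for whichever account you are logged in as (a small sketch using the shell's $USER variable and id -gn for the primary group):

    # Hand ownership of the Hive tree to the current user and their primary group
    sudo chown -R "$USER:$(id -gn)" /usr/local/hive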

    1.4 Configure environment variables

    For convenience, add the hive command to your PATH. Open the .bashrc file with the vim editor:

    vim ~/.bashrc # no sudo needed for a file in your own home directory
    

    Add the following lines:

    export HIVE_HOME=/usr/local/hive
    export PATH=$PATH:$HIVE_HOME/bin
    export HADOOP_HOME=/usr/local/hadoop
    

    HADOOP_HOME must point to the Hadoop installation path on your system; here Hadoop is installed in /usr/local/hadoop.

    After saving and exiting, run the following command to make the settings take effect immediately:

    source ~/.bashrc
    
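    As a quick sanity check that the variables are picked up (note that hive --version may itself hit the guava problem described in Bug 1 below until that fix is applied):

    echo "$HIVE_HOME"   # should print /usr/local/hive
    hive --version      # should report Hive 3.1.3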

    1.5 Configure hive-site.xml

    Set up hive-site.xml under /usr/local/hive/conf. First run the following commands:

    cd /usr/local/hive/conf
    sudo mv hive-default.xml.template hive-default.xml
    

    The commands above rename hive-default.xml.template to hive-default.xml.

    Then use vim to create a new configuration file hive-site.xml:

    cd /usr/local/hive/conf
    vim hive-site.xml
    

    Add the following configuration to hive-site.xml:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string for a JDBC metastore</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name for a JDBC metastore</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>username to use against metastore database</description>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
        <description>password to use against metastore database</description>
      </property>
    </configuration>
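    Two notes on these values. First, with MySQL Connector/J 8.x the current driver class is com.mysql.cj.jdbc.Driver; the old com.mysql.jdbc.Driver shown above still works but logs a deprecation warning. Second, it is easy to break the XML while hand-editing, so a quick well-formedness check helps (assuming xmllint from the libxml2-utils package is available):

    # Verify the file parses as XML (install with: sudo apt install libxml2-utils)
    xmllint --noout /usr/local/hive/conf/hive-site.xml && echo "hive-site.xml is well-formed"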

    2. Install and configure MySQL

    We use a MySQL database to store Hive's metadata rather than Hive's bundled Derby database.

    For installing MySQL on Ubuntu, see the earlier post: Ubuntu安装MySQL及常用操作 (Installing MySQL on Ubuntu and common operations).

    2.1 Download the MySQL JDBC connector

    下载地址:https://dev.mysql.com/downloads/connector/j/

    Upload it to the server with Xshell.

    2.2 Extract and copy

    tar -zxvf mysql-connector-j-8.0.31.tar.gz
    

    Copy mysql-connector-j-8.0.31.jar into the /usr/local/hive/lib directory:

    cd ~/下载   # the "Downloads" folder on a Chinese-locale Ubuntu desktop
    cd mysql-connector-j-8.0.31
    sudo cp mysql-connector-j-8.0.31.jar /usr/local/hive/lib
    
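    To confirm the jar really landed in Hive's lib directory:

    ls -l /usr/local/hive/lib/mysql-connector-j-8.0.31.jar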

    2.3 Start MySQL and log in to its shell

    sudo service mysql start # start the MySQL service
    mysql -u root -p         # log in to the MySQL shell as root
    

    2.4 Create the hive database

    create database hive;
    

    This hive database corresponds to the hive in localhost:3306/hive in hive-site.xml; it stores Hive's metadata.

    2.5 Configure MySQL to allow Hive to connect

    grant all on *.* to hive@localhost identified by 'hive'; # grant the hive user all privileges on all tables of all databases; the trailing 'hive' is the connection password configured in hive-site.xml
    flush privileges; # reload MySQL's privilege tables
    

    This fails on MySQL 8.0 with ERROR 1064 (42000): You have an error in your SQL syntax, because the GRANT ... IDENTIFIED BY shortcut for creating users was removed in MySQL 8.0; the user has to be created first.

    Use the following statements instead:

    create user 'hive'@'localhost' identified by 'hive';
    grant all on *.* to 'hive'@'localhost';
    flush privileges;
    
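    Before pointing Hive at the new account, it is worth confirming that it can actually log in (enter hive at the password prompt):

    mysql -u hive -p -e "SHOW DATABASES;"   # the hive database should appear in the list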

    2.6 Start Hadoop

    Before starting Hive, start the Hadoop cluster first:

    cd /usr/local/hadoop
    ./sbin/start-all.sh
    jps # list Java processes (6 is normal on a typical single-node setup: NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, and Jps itself)
    
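    If jps shows fewer processes, confirm that HDFS is actually responding before launching Hive (one simple probe):

    /usr/local/hadoop/bin/hdfs dfsadmin -safemode get   # expect: Safe mode is OFF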

    2.7 Start Hive

    cd /usr/local/hive
    ./bin/hive
    

    If that fails, try the following:

    ./bin/schematool -dbType mysql -initSchema
    
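    schematool creates the metastore tables inside the MySQL hive database. Once it succeeds, they are visible from the MySQL shell (an illustrative check; DBS and TBLS are two of the standard metastore tables):

    mysql -u hive -p hive -e "SHOW TABLES;"   # tables such as DBS and TBLS should appear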

    Still not working!!! [See Bug 1]

    2.8 Exit Hive

    exit;
    

    3. Bug 1 (resolved)

    Reference: a blog post on the Hive initialization error Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base...

    Cause: Hadoop and Hive ship two different, incompatible versions of guava.jar.
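    The mismatch is easy to see by listing the two copies side by side:

    ls /usr/local/hive/lib/ | grep guava                         # e.g. guava-19.0.jar
    ls /usr/local/hadoop/share/hadoop/common/lib/ | grep guava   # e.g. guava-27.0-jre.jar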

    Solution:

    (1) Delete the guava jar that ships with Hive:

    cd /usr/local/hive/lib
    sudo rm guava-19.0.jar 
    

    (2) Copy Hadoop's guava jar into Hive:

    cd /usr/local/hadoop/share/hadoop/common/lib # go to Hadoop's common lib directory
    cp guava-27.0-jre.jar /usr/local/hive/lib # copy the jar into Hive's lib directory
    

    (3) Initialize the Hive metastore:

    cd /usr/local/hive
    ./bin/schematool -dbType mysql -initSchema
    


    (4) Start Hive again:

    cd /usr/local/hive
    ./bin/hive
    

    4. Bug 2 (unresolved)

    When starting either Hadoop or Hive, the following warning is printed:

    SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
    SLF4J: Defaulting to no-operation (NOP) logger implementation
    SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
    
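    This is only a warning: SLF4J found no logging binding on the classpath and falls back to a no-op logger, so log output is silently discarded but nothing else breaks. A diagnostic sketch to see which SLF4J jars each installation actually ships:

    # List SLF4J API/binding jars under both installations
    find /usr/local/hadoop /usr/local/hive -name "slf4j*.jar" 2>/dev/null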

    5. Bug 3 (unresolved)

    When starting Hive, the following warnings are printed:

    WARN DataNucleus.MetaData: Metadata has jdbc-type of null yet this is not valid. Ignored
    (the same WARN line is repeated about a dozen times)
    
