• 0002 - Hadoop Cluster Setup


    Virtual Machine Preparation

    Creating the Virtual Machines

    Install three virtual machines on the same LAN; detailed tutorials are easy to find via Google.

    Here I created three CentOS virtual machines, with IPs 192.168.1.101 through 192.168.1.103.

    Configuring /etc/hosts

    Also configure /etc/hosts on all three machines so that the hosts can ping one another by hostname.
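    The hostnames node1 through node3 used in the rest of this post map onto the three IPs above; the exact IP-to-name pairing is my assumption (the original screenshot is unavailable), but a minimal /etc/hosts would look like:

    ```
    192.168.1.101 node1
    192.168.1.102 node2
    192.168.1.103 node3
    ```

    Add these three lines to /etc/hosts on every node.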

    Disabling the Firewall

    Run the command below to disable the firewall so that cluster communication is not blocked. Note that disabling only keeps firewalld from starting at the next boot; run `systemctl stop firewalld.service` as well if you want it stopped immediately.

    systemctl disable firewalld.service

    Passwordless SSH Login Between Servers

    • Generate a key pair on each server:
      ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    • Append each server's public key to the /root/.ssh/authorized_keys file on the other servers.
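    One convenient way to do the copy step (a sketch; it assumes root password login over SSH is still enabled and uses the node1–node3 hostnames from /etc/hosts):

    ```shell
    # Run on each node: ssh-copy-id appends ~/.ssh/id_rsa.pub to the remote
    # /root/.ssh/authorized_keys and sets the correct permissions. You will be
    # prompted for the root password once per host.
    for host in node1 node2 node3; do
      ssh-copy-id -i ~/.ssh/id_rsa.pub root@"$host"
    done
    ```

    Afterwards, `ssh node2` from node1 (and vice versa) should log in without a password.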

    JDK Installation

    To get the companion software resources, click here

    Extract the JDK archive.
    Configure the JDK environment variables (e.g. in /etc/profile) by adding the following:

    export JAVA_HOME=/home/software/jdk1.8.0_301
    export PATH=.:$PATH:$JAVA_HOME/bin

    Verify that the JDK was installed successfully.
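    To verify, reload the environment and check the version (assuming the exports above were added to /etc/profile):

    ```shell
    source /etc/profile
    java -version     # should report version 1.8.0_301
    echo $JAVA_HOME   # /home/software/jdk1.8.0_301
    ```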

    HADOOP Installation

    Extraction & Environment Variables

    To get the companion software resources, click here

    Extract the HADOOP archive.
    Configure the HADOOP environment variables (e.g. in /etc/profile) by adding the following:

    export HADOOP_HOME=/home/software/hadoop-2.7.4
    export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
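    A quick sanity check that the variables took effect (assuming the exports were added to /etc/profile):

    ```shell
    source /etc/profile
    hadoop version    # should report Hadoop 2.7.4
    ```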

    HADOOP Configuration Files

    Configuration file directory: /home/software/hadoop-2.7.4/etc/hadoop

    hadoop-env.sh

    Set JAVA_HOME as follows:

    export JAVA_HOME=/home/software/jdk1.8.0_301

    hdfs-site.xml

    Create the data storage directories (data/datanode and data/namenode under the Hadoop home), then configure hdfs-site.xml on node1 as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>dfs.datanode.data.dir</name>
    <value>file:///home/software/hadoop-2.7.4/data/datanode</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/software/hadoop-2.7.4/data/namenode</value>
      </property>
      <!-- 
      This property is wrong and should not be set; see the link below for why:
      https://stackoverflow.com/questions/34410181/why-i-cant-access-http-hadoop-master50070-when-i-define-dfs-namenode-http-ad
    
      <property>
        <name>dfs.namenode.http-address</name>
        <value>node1:50070</value>
      </property>
      -->
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node2:50090</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>
    

    yarn-site.xml

    <?xml version="1.0"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    <configuration>
      <!-- Site specific YARN configuration properties -->
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>node1:8025</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>node1:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>node1:8050</value>
      </property>
    </configuration>
    

    core-site.xml

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node1/</value>
      </property>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>node1:2181,node2:2181,node3:2181</value>
      </property>
    </configuration>
    

    slaves

    In the slaves file, list the worker (DataNode/NodeManager) hostnames, one per line.
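    The original screenshot of the file is unavailable; one layout consistent with the rest of the configuration (node1 as master, replication factor 1) would be:

    ```
    node2
    node3
    ```

    If node1 should also store data and run tasks, add node1 as well.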

    Copying the node1 Configuration to the Other Nodes

    First extract Hadoop and configure the environment variables on the other nodes, then run the following commands on node1 to sync the configuration files:

    scp /home/software/hadoop-2.7.4/etc/hadoop/* root@node2:/home/software/hadoop-2.7.4/etc/hadoop/
    scp /home/software/hadoop-2.7.4/etc/hadoop/* root@node3:/home/software/hadoop-2.7.4/etc/hadoop/

    Formatting the File System

    Run the command below on the master node (node1) to format the file system:

    hdfs namenode -format

    Starting & Stopping the HADOOP Cluster

    Startup

    Run the following command on node1 to start the HADOOP cluster:

    start-all.sh

    On success, check the HADOOP processes on each node with jps.
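    With this configuration, `jps` output should look roughly like the following (SecondaryNameNode runs on node2 per hdfs-site.xml; which nodes run DataNode/NodeManager depends on your slaves file):

    ```shell
    jps   # on node1: NameNode, ResourceManager, Jps
    jps   # on node2: SecondaryNameNode, DataNode, NodeManager, Jps
    jps   # on node3: DataNode, NodeManager, Jps
    ```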

    Web UI

    The NameNode web UI is available at http://node1:50070 (the Hadoop 2.x default port).

    Stopping the Cluster

    stop-all.sh

  • Original article: https://blog.csdn.net/jintaohahahaha/article/details/125470800