• 0002 - Hadoop Cluster Setup


    Virtual machine preparation

    Create the virtual machines

    Install three virtual machines on the same LAN; detailed tutorials are easy to find via Google.

    I created three CentOS virtual machines here, with IPs 192.168.1.101 through 192.168.1.103.

    Configure /etc/hosts

    Also configure /etc/hosts so that the three hosts can ping each other by hostname; a sketch follows.
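    A minimal sketch of the entries, assuming the hostnames node1, node2, and node3 (used throughout the configuration below) map to the three IPs in order; append the same lines on every machine:

    # append to /etc/hosts on all three machines
    192.168.1.101 node1
    192.168.1.102 node2
    192.168.1.103 node3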

    Disable the firewall

    Run the following commands to stop the firewall and disable it at boot, so it does not block cluster communication:

    systemctl stop firewalld.service
    systemctl disable firewalld.service
    

    Passwordless SSH between servers

    • Generate a public/private key pair on each server:
      ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
      
    • Copy the public key into the /root/.ssh/authorized_keys file on the other servers, e.g. with ssh-copy-id as sketched below.
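    A sketch using ssh-copy-id, assuming root logins and the node1–node3 hostnames from /etc/hosts (run on every server; it prompts for each password once):

    for host in node1 node2 node3; do
      ssh-copy-id -i ~/.ssh/id_rsa.pub root@$host
    done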

    JDK installation

    To download the companion software resources, click here.

    Extract the JDK archive.
    Configure the JDK environment variables: edit the shell profile (typically /etc/profile) and append the following:

    export JAVA_HOME=/home/software/jdk1.8.0_301
    export PATH=.:$PATH:$JAVA_HOME/bin
    

    Verify that the JDK installed successfully; a quick check follows.
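    A minimal sketch, assuming the paths above; reload the profile first so the new variables take effect:

    source /etc/profile
    java -version    # expect: java version "1.8.0_301"
    echo $JAVA_HOME  # expect: /home/software/jdk1.8.0_301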

    Hadoop installation

    Extract & configure environment variables

    To download the companion software resources, click here.

    Extract the Hadoop archive.
    Configure the Hadoop environment variables by appending the following to the same profile:

    export HADOOP_HOME=/home/software/hadoop-2.7.4
    export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
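    Similarly, a quick sanity check for Hadoop, assuming the paths above:

    source /etc/profile
    hadoop version   # should report Hadoop 2.7.4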

    Hadoop configuration files

    Configuration file directory: /home/software/hadoop-2.7.4/etc/hadoop

    hadoop-env.sh

    Set JAVA_HOME as follows:

    export JAVA_HOME=/home/software/jdk1.8.0_301
    

    hdfs-site.xml

    Create the data storage directories, as sketched below.
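    A one-liner for the two directories referenced in the configuration below:

    mkdir -p /home/software/hadoop-2.7.4/data/{namenode,datanode}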
    Then configure hdfs-site.xml on node1 as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///home/software/hadoop-2.7.4/data/datanode</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/software/hadoop-2.7.4/data/namenode</value>
      </property>
      <!-- 
      This property is wrong and should not be set; see the link below for why:
      https://stackoverflow.com/questions/34410181/why-i-cant-access-http-hadoop-master50070-when-i-define-dfs-namenode-http-ad
    
      <property>
        <name>dfs.namenode.http-address</name>
        <value>node1:50070</value>
      </property>
      -->
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node2:50090</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>
    

    yarn-site.xml

    Configure yarn-site.xml on node1 as follows:

    <?xml version="1.0"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    <configuration>
      <!-- Site specific YARN configuration properties -->
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
      </property>
      <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>node1:8025</value>
      </property>
      <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>node1:8030</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>node1:8050</value>
      </property>
    </configuration>
    

    core-site.xml

    Configure core-site.xml on node1 as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <!--
      Licensed under the Apache License, Version 2.0 (the "License");
      you may not use this file except in compliance with the License.
      You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      See the License for the specific language governing permissions and
      limitations under the License. See accompanying LICENSE file.
    -->
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node1/</value>
      </property>
      <property>
        <name>ha.zookeeper.quorum</name>
        <value>node1:2181,node2:2181,node3:2181</value>
      </property>
    </configuration>
    

    slaves

    Edit the slaves file so it lists the DataNode hostnames, one per line; a sketch follows.
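    A sketch of etc/hadoop/slaves, assuming all three machines run DataNodes (the hostnames come from /etc/hosts above):

    node1
    node2
    node3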

    Copy the configuration from node1 to the other nodes

    First extract Hadoop and configure the environment variables on the other nodes as above, then run the following commands on node1 to sync the configuration files to them:

    scp /home/software/hadoop-2.7.4/etc/hadoop/* root@node2:/home/software/hadoop-2.7.4/etc/hadoop/
    scp /home/software/hadoop-2.7.4/etc/hadoop/* root@node3:/home/software/hadoop-2.7.4/etc/hadoop/
    

    Format the filesystem

    Run the following command on the master node (node1) to format the filesystem. A successful format logs that the NameNode storage directory "has been successfully formatted":

    hdfs namenode -format
    

    Hadoop cluster start & stop

    Start

    Run the following command on node1 to start the Hadoop cluster (start-all.sh is deprecated in Hadoop 2.x but still works; it simply invokes start-dfs.sh and start-yarn.sh):

    start-all.sh
    

    After a successful start, check the Hadoop processes on each node, as sketched below.
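    A quick check with jps (it ships with the JDK). Assuming the slaves file above, roughly these daemons are expected:

    jps
    # node1: NameNode, DataNode, ResourceManager, NodeManager
    # node2: SecondaryNameNode, DataNode, NodeManager
    # node3: DataNode, NodeManager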

    Check the web UIs

    Assuming the default Hadoop 2.x web ports, the NameNode UI is served at http://node1:50070, the SecondaryNameNode UI at http://node2:50090 (as configured above), and the ResourceManager UI at http://node1:8088.

    Stop the cluster

    stop-all.sh
    


  • Original article: https://blog.csdn.net/jintaohahahaha/article/details/125470800