• hadoop01 -- Building a Fully Distributed Hadoop Cluster


    • Nodes required: the master node is hadoop1; the other nodes are hadoop2 and hadoop3.
    • JDK used: jdk-8u144-linux-x64.tar.gz ------------------- link: https://pan.baidu.com/s/1ap1NypdGaD8OoTv5cUEXxw?pwd=av48 access code: av48
    • CentOS used: CentOS-7-x86_64-DVD-1511 --------------- link: https://pan.baidu.com/s/1XQ7N2oVf6Z9ouCtpn98EVw?pwd=3i2h access code: 3i2h
    • Software used: VMware 15 -------------------------- link: https://pan.baidu.com/s/1Rfg6o3xCbkfY90Fkmbcu2A?pwd=7ih5 access code: 7ih5; VM connection tool: Xshell --------------------------- link: https://pan.baidu.com/s/1Rfg6o3xCbkfY90Fkmbcu2A?pwd=7ih5 access code: 7ih5
    • Username: HadoopColony; hostname (i.e. node name): hadoop1

[All nodes run on the same host machine.]

Contents

I. Configure the Linux cluster environment

1. Create a user

2. Grant the user sudoers privileges

3. Change the hostname

4. Disable the firewall

5. Change the IP address

6. Create resource directories

7. Install the JDK

8. Configure the JDK environment

9. Clone the virtual machines

10. Change the clones' IP addresses and hostnames

11. Configure host IP mappings

II. Build the Hadoop cluster

1. Configure passwordless SSH login between cluster nodes

2. Configure Hadoop

1) Install Hadoop and configure its environment variables

2) Point hadoop-env, mapred-env, and yarn-env at the JDK

3. Configure HDFS

1) Edit the core-site.xml file

2) Edit the hdfs-site.xml file

3) Edit the slaves file

4. Configure YARN

1) Edit the mapred-site.xml file

2) Edit the yarn-site.xml file

5. Copy Hadoop to the other virtual machines

1) Format the NameNode

6. Start the Hadoop cluster

1) Check the processes started on each node

2) Test HDFS


Next we need to prepare the virtual machine: create a user, change the hostname, configure the JDK, clone the virtual machine, and so on.

I. Configure the Linux cluster environment

As shown below, we are still logged in as the root user and the hostname is still localhost. Let's change them so that the username is HadoopColony (i.e. "Hadoop colony", the cluster user) and the hostname is hadoop1.

1. Create a user
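A minimal sketch of this step, run as root and using the same username as the rest of this guide:

    useradd HadoopColony      --create the HadoopColony user
    passwd HadoopColony       --set its password (you are prompted twice)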

    2,添加用户的sudoers权限

3. Change the hostname

First switch to the new user with su - <username>: su - HadoopColony.

Then run sudo vi /etc/hostname → press "i" to enter edit mode → move the cursor with the arrow keys → when finished, press "Esc" to leave edit mode and type ":wq!" to save and quit. The edit looks like this when complete:
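As a sketch, on CentOS 7 the same change can also be made with a single command instead of editing the file by hand:

    sudo hostnamectl set-hostname hadoop1   --writes the new name into /etc/hostname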

Then reboot the system so that the change takes effect.

After logging in again, you can see that the change was successful:

4. Disable the firewall
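A minimal sketch for CentOS 7, where the default firewall service is firewalld:

    sudo systemctl stop firewalld      --stop the firewall immediately
    sudo systemctl disable firewalld   --keep it from starting again at boot
    sudo systemctl status firewalld    --confirm it is now inactive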

5. Change the IP address

Before changing the IP address, please read the following article first and set up the "Virtual Network Editor" in advance:

    linux001--初次体验vmware虚拟机_码到成龚的博客-CSDN博客

     

The above is the IP address of my host machine, so when I assign IPs to my virtual machines the first three octets must be 192.168.37, and the last octet can be anything that does not exceed 255.

After editing the network interface file, save and quit, then run sudo service network restart to restart the network service so the change takes effect. Finally, use the ifconfig command to check the network information; if the command is not found, install the net-tools package with yum first.
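For example (a sketch; these are the CentOS 7 package and service names):

    sudo yum install -y net-tools      --provides the ifconfig command
    sudo service network restart       --re-read the interface configuration
    ifconfig                           --confirm the new IP address is in effect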

Once the virtual machine's IP address is configured as above, we can connect to it with the VM connection tool (Xshell).

6. Create resource directories

[HadoopColony@hadoop1 ~]$ cd /opt --change to the directory where software is kept
[HadoopColony@hadoop1 opt]$ ls --list the files in the current directory
[HadoopColony@hadoop1 opt]$ sudo mkdir softwares modules --create two directories
[sudo] password for HadoopColony:
[HadoopColony@hadoop1 opt]$ ll --shows the files in the current directory along with their owner and group
total 0
drwxr-xr-x. 2 root root 6 Sep 5 11:48 modules --both are owned by root by default
drwxr-xr-x. 2 root root 6 Sep 5 11:48 softwares
[HadoopColony@hadoop1 opt]$ sudo chown -R HadoopColony:HadoopColony /opt/* --change the ownership
[HadoopColony@hadoop1 opt]$ ll
total 0
drwxr-xr-x. 2 HadoopColony HadoopColony 6 Sep 5 11:48 modules --ownership changed successfully
drwxr-xr-x. 2 HadoopColony HadoopColony 6 Sep 5 11:48 softwares
[HadoopColony@hadoop1 opt]$

7. Install the JDK

Before installing, we need to check whether a JDK is already installed on the virtual machine, so that an existing JDK does not interfere with the later steps.

[HadoopColony@hadoop1 ~]$ rpm -qa | grep java --list installed JDK packages; nothing is printed, so none is installed
[HadoopColony@hadoop1 ~]$

If rpm -qa | grep java does show a JDK, remove it with: sudo rpm -e --nodeps <jdk package name>. Once you have confirmed there is no JDK, transfer the JDK archive from the host to the virtual machine with a file-transfer tool such as Xftp or FinalShell. Below I use FinalShell for the transfer.

After the upload completes, the JDK archive appears under /opt/softwares.

Next, extract the JDK archive in /opt/softwares into /opt/modules:

[HadoopColony@hadoop1 ~]$ cd /opt/softwares/
[HadoopColony@hadoop1 softwares]$ ll
total 181168
-rw-rw-r--. 1 HadoopColony HadoopColony 185515842 Sep 5 12:03 jdk-8u144-linux-x64.tar.gz
[HadoopColony@hadoop1 softwares]$ tar -zxf jdk-8u144-linux-x64.tar.gz -C /opt/modules/ --extract
[HadoopColony@hadoop1 softwares]$ cd ../modules/
[HadoopColony@hadoop1 modules]$ ll --check whether the JDK is now under /opt/modules
total 4
drwxr-xr-x. 8 HadoopColony HadoopColony 4096 Jul 22 2017 jdk1.8.0_144
[HadoopColony@hadoop1 modules]$

8. Configure the JDK environment

Just as you need to configure the JDK's environment variables after installing it on Windows when learning Java, the same applies on the virtual machine.

[HadoopColony@hadoop1 ~]$ sudo vi /etc/profile --edit /etc/profile; press Esc when done editing
[sudo] password for HadoopColony:
[HadoopColony@hadoop1 ~]$ tail /etc/profile --the last four lines are what we added
unset i
unset -f pathmunge
#JDK environment variables (for the hadoop user)
export JAVA_HOME=/opt/modules/jdk1.8.0_144
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
[HadoopColony@hadoop1 ~]$

Then reload the modified profile:

[HadoopColony@hadoop1 ~]$ source /etc/profile --reload /etc/profile
[HadoopColony@hadoop1 ~]$ java -version --show the Java version information
java version "1.8.0_144" --Java installed successfully
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)
[HadoopColony@hadoop1 ~]$

With the JDK configured as above, we can start cloning the virtual machine. Note that the virtual machine must be shut down before cloning; you can simply type sudo poweroff in Xshell to shut it down.

9. Clone the virtual machines

Select the virtual machine that already has the JDK configured, right-click it, choose "Manage" → "Clone", and when the following dialog appears, click "Next".

     

     

     

     

Perform the steps above twice; in total we need to clone two virtual machines, as shown below:

10. Change the clones' IP addresses and hostnames

To change the IP address, refer to step 5 above; to change the hostname, refer to step 3.

TYPE=Ethernet
BOOTPROTO=static #static IP assignment instead of DHCP at boot
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
NAME=eno16777736 #interface name
UUID=ef189aea-cdbd-4d76-90f1-e1e1b7a208c6
DEVICE=eno16777736 #physical device name
ONBOOT=yes #activate the device at boot
PEERDNS=yes
PEERROUTES=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_PRIVACY=no
IPADDR=192.168.37.100 #IP address
GATEWAY=192.168.37.2 #gateway
NETMASK=255.255.255.0 #netmask
DNS1=114.114.114.114 #primary DNS server

Result of the IP change:

Result of the hostname change:

11. Configure host IP mappings

We do not want to have to reach a virtual machine by its IP address every time; a hostname mapping can be used instead. To do this, edit the /etc/hosts file and append the IP-to-hostname mappings at the end of the file:

[HadoopColony@hadoop1 ~]$ sudo vi /etc/hosts
[sudo] password for HadoopColony:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
#IP mappings for the hadoop cluster
192.168.37.100 hadoop1
192.168.37.101 hadoop2
192.168.37.102 hadoop3

Next, on each of the three virtual machines, ping the other two; if the pings succeed, the mappings work.
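For example, from hadoop1 (a sketch; run the analogous commands on hadoop2 and hadoop3):

    ping -c 3 hadoop2    --three packets are enough to confirm the hostname resolves
    ping -c 3 hadoop3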

After the tests above, all nodes can ping each other, so we can now start building the Hadoop cluster.

II. Build the Hadoop cluster

The nodes of a cluster need to exchange information, and every machine has a password. With only a few machines and little traffic you could type the password every time, but with many machines and frequent exchanges that would be an unimaginable amount of work. Therefore the cluster nodes need passwordless SSH login.

1. Configure passwordless SSH login between cluster nodes

[HadoopColony@hadoop1 ~]$ cd ~/.ssh --change to the .ssh directory
[HadoopColony@hadoop1 .ssh]$ ssh-keygen -t rsa --generate the RSA private/public key pair

Use the command ssh-copy-id <hostname> to copy this node's public key and append it to the authorized-keys file of the specified node.

[Note that a node's public key must also be copied to the node itself.]

[HadoopColony@hadoop1 ~]$ ssh-copy-id hadoop1 --copy hadoop1's public key and append it to hadoop1's own authorized keys
The authenticity of host 'hadoop1 (192.168.37.100)' can't be established.
ECDSA key fingerprint is 95:7f:45:c8:4c:07:6c:89:e5:2b:6f:ec:1e:fa:7f:7e.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
HadoopColony@hadoop1's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop1'"
and check to make sure that only the key(s) you wanted were added. --added successfully
[HadoopColony@hadoop1 ~]$ ssh-copy-id hadoop2 --copy hadoop1's public key and append it to hadoop2's authorized keys
The authenticity of host 'hadoop2 (192.168.37.101)' can't be established.
ECDSA key fingerprint is 95:7f:45:c8:4c:07:6c:89:e5:2b:6f:ec:1e:fa:7f:7e.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
HadoopColony@hadoop2's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop2'"
and check to make sure that only the key(s) you wanted were added. --added successfully
[HadoopColony@hadoop1 ~]$ ssh-copy-id hadoop3 --copy hadoop1's public key and append it to hadoop3's authorized keys
The authenticity of host 'hadoop3 (192.168.37.102)' can't be established.
ECDSA key fingerprint is 95:7f:45:c8:4c:07:6c:89:e5:2b:6f:ec:1e:fa:7f:7e.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
HadoopColony@hadoop3's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'hadoop3'"
and check to make sure that only the key(s) you wanted were added. --added successfully

The remaining two worker nodes follow exactly the same steps. [The same point as before applies: each node also needs to copy its public key to itself.]

    hadoop2:

[HadoopColony@hadoop2 .ssh]$ ssh-copy-id hadoop1
Are you sure you want to continue connecting (yes/no)? yes
...
Now try logging into the machine, with: "ssh 'hadoop1'"
and check to make sure that only the key(s) you wanted were added.
[HadoopColony@hadoop2 .ssh]$ ssh-copy-id hadoop3
Are you sure you want to continue connecting (yes/no)? yes
...
Now try logging into the machine, with: "ssh 'hadoop3'"
and check to make sure that only the key(s) you wanted were added.

    hadoop3:

[HadoopColony@hadoop3 .ssh]$ ssh-copy-id hadoop2
Are you sure you want to continue connecting (yes/no)? yes
...
Now try logging into the machine, with: "ssh 'hadoop2'"
and check to make sure that only the key(s) you wanted were added.
[HadoopColony@hadoop3 .ssh]$ ssh-copy-id hadoop1
Are you sure you want to continue connecting (yes/no)? yes
...
Now try logging into the machine, with: "ssh 'hadoop1'"
and check to make sure that only the key(s) you wanted were added.

Next we test the passwordless logins:

[HadoopColony@hadoop1 ~]$ ssh hadoop3 --hadoop1 → hadoop3
Last login: Mon Sep 5 17:32:22 2022 from hadoop2
[HadoopColony@hadoop3 ~]$ ssh hadoop1 --hadoop3 → hadoop1
Last failed login: Mon Sep 5 17:55:16 CST 2022 from hadoop3 on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Mon Sep 5 17:44:01 2022 from 192.168.37.1
[HadoopColony@hadoop1 ~]$ ssh hadoop2 --hadoop1 → hadoop2
Last login: Mon Sep 5 17:44:25 2022 from 192.168.37.1
[HadoopColony@hadoop2 ~]$ ssh hadoop1 --hadoop2 → hadoop1
Last login: Mon Sep 5 17:58:41 2022 from hadoop3
[HadoopColony@hadoop1 ~]$ ssh hadoop3 --hadoop1 → hadoop3
Last login: Mon Sep 5 17:45:03 2022 from 192.168.37.1
[HadoopColony@hadoop3 ~]$ ssh hadoop1 --hadoop3 → hadoop1
Last login: Mon Sep 5 17:59:05 2022 from hadoop2
[HadoopColony@hadoop1 ~]$

From the above we can see that hadoop1 can log in to both hadoop2 and hadoop3 without a password.

[HadoopColony@hadoop2 ~]$ ssh hadoop1
Last login: Mon Sep 5 18:08:36 2022 from hadoop2
[HadoopColony@hadoop1 ~]$ ssh hadoop2
Last login: Mon Sep 5 18:08:45 2022 from hadoop1
[HadoopColony@hadoop2 ~]$ ssh hadoop3
Last login: Mon Sep 5 17:59:20 2022 from hadoop1
[HadoopColony@hadoop3 ~]$

hadoop2 can also log in to hadoop1 and hadoop3 without a password.

[HadoopColony@hadoop3 .ssh]$ ssh hadoop1
Last login: Mon Sep 5 18:08:57 2022 from hadoop2
[HadoopColony@hadoop1 ~]$ ssh hadoop3
Last login: Mon Sep 5 18:09:07 2022 from hadoop2
[HadoopColony@hadoop3 ~]$ ssh hadoop2
Last login: Mon Sep 5 18:09:03 2022 from hadoop1
[HadoopColony@hadoop2 ~]$

hadoop3 can also log in to hadoop1 and hadoop2 without a password.

2. Configure Hadoop

1) Install Hadoop and configure its environment variables

[HadoopColony@hadoop1 ~]$ cd /opt/softwares/ --change to the softwares directory
[HadoopColony@hadoop1 softwares]$ tar -zxf hadoop-2.8.2.tar.gz -C /opt/modules/ --extract the Hadoop archive into /opt/modules
[HadoopColony@hadoop1 softwares]$ sudo vi /etc/profile --edit /etc/profile and add the Hadoop paths
[HadoopColony@hadoop1 softwares]$ tail -5 /etc/profile --show the Hadoop settings that were added
#Hadoop environment variables (for the hadoop cluster)
export HADOOP_HOME=/opt/modules/hadoop-2.8.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
[HadoopColony@hadoop1 softwares]$ source /etc/profile --reload the file so the changes take effect
[HadoopColony@hadoop1 softwares]$ hadoop version --check the Hadoop version
Hadoop 2.8.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 66c47f2a01ad9637879e95f80c41f798373828fb
Compiled by jdu on 2017-10-19T20:39Z
Compiled with protoc 2.5.0
From source with checksum dce55e5afe30c210816b39b631a53b1d
This command was run using /opt/modules/hadoop-2.8.2/share/hadoop/common/hadoop-common-2.8.2.jar

2) Point hadoop-env, mapred-env, and yarn-env at the JDK

[HadoopColony@hadoop1 ~]$ cd /opt/modules/hadoop-2.8.2/etc/hadoop/ --change to Hadoop's configuration directory
[HadoopColony@hadoop1 hadoop]$ vi hadoop-env.sh --configure the Hadoop environment
[HadoopColony@hadoop1 hadoop]$ tail -3 hadoop-env.sh --show what was added
export HADOOP_IDENT_STRING=$USER
#point Hadoop at the JDK
export JAVA_HOME=/opt/modules/jdk1.8.0_144
[HadoopColony@hadoop1 hadoop]$ vi mapred-env.sh --configure the MapReduce environment
[HadoopColony@hadoop1 hadoop]$ tail -3 mapred-env.sh
#export HADOOP_MAPRED_NICENESS= #The scheduling priority for daemons. Defaults to 0.
#point MapReduce at the JDK
export JAVA_HOME=/opt/modules/jdk1.8.0_144
[HadoopColony@hadoop1 hadoop]$ vi yarn-env.sh --configure the YARN environment
[HadoopColony@hadoop1 hadoop]$ tail -3 yarn-env.sh
#point YARN at the JDK
export JAVA_HOME=/opt/modules/jdk1.8.0_144

3. Configure HDFS

1) Edit the core-site.xml file

[HadoopColony@hadoop1 hadoop]$ vi core-site.xml
[HadoopColony@hadoop1 hadoop]$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
...
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/opt/modules/hadoop-2.8.2/tmp</value>
    </property>
</configuration>

2) Edit the hdfs-site.xml file

[HadoopColony@hadoop1 hadoop]$ vi hdfs-site.xml
[HadoopColony@hadoop1 hadoop]$ tail -18 hdfs-site.xml
<!-- settings required for the Hadoop cluster -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/opt/modules/hadoop-2.8.2/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/opt/modules/hadoop-2.8.2/tmp/dfs/data</value>
    </property>
</configuration>

3) Edit the slaves file

[HadoopColony@hadoop1 hadoop]$ vi slaves
[HadoopColony@hadoop1 hadoop]$ cat slaves
hadoop1
hadoop2
hadoop3

4. Configure YARN

1) Edit the mapred-site.xml file

[HadoopColony@hadoop1 hadoop]$ ls
capacity-scheduler.xml hadoop-metrics2.properties httpfs-signature.secret log4j.properties ssl-client.xml.example
configuration.xsl hadoop-metrics.properties httpfs-site.xml mapred-env.cmd ssl-server.xml.example
container-executor.cfg hadoop-policy.xml kms-acls.xml mapred-env.sh yarn-env.cmd
core-site.xml hdfs-site.xml kms-env.sh mapred-queues.xml.template yarn-env.sh
hadoop-env.cmd httpfs-env.sh kms-log4j.properties mapred-site.xml.template yarn-site.xml
hadoop-env.sh httpfs-log4j.properties kms-site.xml slaves
[HadoopColony@hadoop1 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[HadoopColony@hadoop1 hadoop]$ vi mapred-site.xml
[HadoopColony@hadoop1 hadoop]$ tail -8 mapred-site.xml
<configuration>
    <!-- settings for the Hadoop cluster -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

2) Edit the yarn-site.xml file

[HadoopColony@hadoop1 hadoop]$ vi yarn-site.xml
[HadoopColony@hadoop1 hadoop]$ tail yarn-site.xml
<!-- settings for the Hadoop cluster, 2022/9/5 -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop1:8032</value>
    </property>
</configuration>

5. Copy Hadoop to the other virtual machines

[HadoopColony@hadoop1 modules]$ scp -r hadoop-2.8.2 HadoopColony@hadoop2:/opt/modules/
[HadoopColony@hadoop1 modules]$ scp -r hadoop-2.8.2 HadoopColony@hadoop3:/opt/modules/

(This takes two to three minutes; time enough to make a cup of coffee.)

As shown above, the Hadoop files were copied over successfully with scp. Next we need to format the NameNode.

1) Format the NameNode

[HadoopColony@hadoop1 modules]$ hadoop namenode -format
...
22/09/05 20:56:48 INFO common.Storage: Storage directory /opt/modules/hadoop-2.8.2/tmp/dfs/name has been successfully formatted. --this line means the format succeeded
...

6. Start the Hadoop cluster

[HadoopColony@hadoop1 modules]$ cd /opt/modules/hadoop-2.8.2/tmp/dfs/name/current --look at the files created by the format
[HadoopColony@hadoop1 current]$ ls --(this directory stores the HDFS metadata)
fsimage_0000000000000000000 fsimage_0000000000000000000.md5 seen_txid VERSION
[HadoopColony@hadoop1 current]$ start-all.sh

Alternatively, the cluster can be started from the sbin directory.
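A minimal sketch of that alternative, assuming the same installation path used throughout (start-dfs.sh and start-yarn.sh together are equivalent to start-all.sh):

    cd /opt/modules/hadoop-2.8.2/sbin
    ./start-dfs.sh    --starts the NameNode, DataNodes and SecondaryNameNode
    ./start-yarn.sh   --starts the ResourceManager and NodeManagers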

1) Check the processes started on each node

    hadoop1:

[HadoopColony@hadoop1 current]$ cd /opt/modules/hadoop-2.8.2/sbin
[HadoopColony@hadoop1 sbin]$ jps
30483 ResourceManager
30614 NodeManager
30235 SecondaryNameNode
30077 DataNode
29966 NameNode
30910 Jps

    hadoop2:

[HadoopColony@hadoop2 ~]$ jps
13010 DataNode
13250 Jps
13115 NodeManager
[HadoopColony@hadoop2 ~]$

    hadoop3:

[HadoopColony@hadoop3 ~]$ jps
10016 NodeManager
9910 DataNode
10153 Jps
[HadoopColony@hadoop3 ~]$

With the output above, the cluster is configured successfully. If you want further confirmation, you can also check with HDFS itself.

2) Test HDFS

[HadoopColony@hadoop1 ~]$ hdfs dfs -mkdir /hello2022
[HadoopColony@hadoop1 ~]$ hdfs dfs -ls /
Found 1 items
drwxr-xr-x - HadoopColony supergroup 0 2022-09-05 22:12 /hello2022
[HadoopColony@hadoop1 ~]$
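As an extra check that is not part of the original steps, you could also upload a small file into the new directory and list it:

    hdfs dfs -put /etc/hosts /hello2022      --copy a local file into HDFS
    hdfs dfs -ls /hello2022                  --the file should now show up here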

The following shows how to check from outside, i.e. through a browser:

You can simply open: http://<IP address>:50070

Open a browser → enter 'http://<IP address>:50070' → press Enter → click 'Utilities' → choose 'Browse the file system' → you will see the directory we just created.

    http://192.168.37.100:50070/

If typing the IP address feels too long and cumbersome, you can also use http://<hostname>:50070 [but this only works after adding the IP mapping to the host machine's hosts file].

That hosts file is located at: C:\Windows\System32\drivers\etc\hosts

[Note that opening and editing this file as an ordinary user is pointless because the changes cannot be saved; open Notepad as an administrator first.]
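The entries to append to that file are the same mappings used on the Linux side:

    192.168.37.100 hadoop1
    192.168.37.101 hadoop2
    192.168.37.102 hadoop3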


That completes the setup of a Hadoop cluster with one master node and two worker nodes. If you run into any problems, please leave a comment.

• Original article: https://blog.csdn.net/weixin_53046747/article/details/126697644