Install three virtual machines on the same LAN; detailed tutorials are easy to find via Google.
Here I created three CentOS VMs with the IPs 192.168.1.101~192.168.1.103.

Also configure /etc/hosts on each machine so that the three hosts can ping each other by hostname.
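For reference, a minimal /etc/hosts sketch; the mapping of the node1~node3 hostnames (used throughout the rest of this guide) to the three IPs above is an assumption:
# append to /etc/hosts on all three machines
192.168.1.101 node1
192.168.1.102 node2
192.168.1.103 node3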

Run the following commands to stop and disable the firewall so that cluster communication is not blocked by it.
systemctl stop firewalld.service
systemctl disable firewalld.service
Configure passwordless SSH between the nodes. Generate an RSA key pair on each node:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Then append each node's public key to the /root/.ssh/authorized_keys file on every node.
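A convenient way to distribute the keys is ssh-copy-id; a minimal sketch, assuming the node1~node3 hostnames from /etc/hosts and the root account:
# run on every node; prompts once for each target host's root password
ssh-copy-id root@node1
ssh-copy-id root@node2
ssh-copy-id root@node3
# verify that no password is asked for
ssh root@node2 hostname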


Configure the JDK environment variables

Edit the environment file (e.g. /etc/profile) and insert the following configuration
export JAVA_HOME=/home/software/jdk1.8.0_301
export PATH=.:$PATH:$JAVA_HOME/bin
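To make the variables take effect and verify the JDK, assuming they were added to /etc/profile:
source /etc/profile
java -version    # should report java version "1.8.0_301"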



Configure the HADOOP environment variables

Insert the following configuration as well
export HADOOP_HOME=/home/software/hadoop-2.7.4
export PATH=.:$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Configuration file directory: /home/software/hadoop-2.7.4/etc/hadoop
In hadoop-env.sh, configure JAVA_HOME as follows
export JAVA_HOME=/home/software/jdk1.8.0_301
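As a quick sanity check that both variables resolve, assuming the profile has been reloaded:
source /etc/profile
hadoop version    # should print Hadoop 2.7.4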
Create the data storage directories
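A sketch of the directory layout, matching the dfs.namenode.name.dir and dfs.datanode.data.dir values used in hdfs-site.xml below:
mkdir -p /home/software/hadoop-2.7.4/data/namenode    # only used on node1
mkdir -p /home/software/hadoop-2.7.4/data/datanode    # needed on every node that runs a DataNode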

Configure hdfs-site.xml on node1 as follows
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/software/hadoop-2.7.4/data/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/software/hadoop-2.7.4/data/namenode</value>
  </property>
  <!--
  This property is wrong and does not need to be set. See the link below for the reason.
  https://stackoverflow.com/questions/34410181/why-i-cant-access-http-hadoop-master50070-when-i-define-dfs-namenode-http-ad
  <property>
    <name>dfs.namenode.http-address</name>
    <value>node1:50070</value>
  </property>
  -->
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node2:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

Configure yarn-site.xml on node1 as follows
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>node1:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>node1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>node1:8050</value>
  </property>
</configuration>

Configure core-site.xml on node1 as follows
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://node1/</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>node1:2181,node2:2181,node3:2181</value>
  </property>
</configuration>

First extract Hadoop and configure the environment variables on the other nodes, then run the following commands on node1 to sync the configuration files to them
scp /home/software/hadoop-2.7.4/etc/hadoop/* root@node2:/home/software/hadoop-2.7.4/etc/hadoop/
scp /home/software/hadoop-2.7.4/etc/hadoop/* root@node3:/home/software/hadoop-2.7.4/etc/hadoop/
On the master node (node1), run the following command to format the HDFS filesystem
hdfs namenode -format
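If the format succeeds, the NameNode metadata directory configured above should now be populated; a quick check, assuming the paths from hdfs-site.xml:
ls /home/software/hadoop-2.7.4/data/namenode/current    # should contain fsimage and VERSION files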
Run the following command on node1 to start the HADOOP cluster
start-all.sh
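Note that start-all.sh is deprecated in Hadoop 2.x; the equivalent two-step start, using the scripts shipped with the release, is:
start-dfs.sh     # starts the NameNode, DataNodes and SecondaryNameNode
start-yarn.sh    # starts the ResourceManager and NodeManagers
Once everything is up, the NameNode web UI should be reachable at http://node1:50070 and the SecondaryNameNode UI at http://node2:50090.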

Check the HADOOP processes on each node
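A typical way to do this is jps on every node; which daemons appear on which host depends on how the slaves file is set up, so the split below is only an assumption:
jps
# node1: NameNode, DataNode, ResourceManager, NodeManager
# node2: SecondaryNameNode, DataNode, NodeManager
# node3: DataNode, NodeManager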


When you are done, run the following command on node1 to stop the cluster
stop-all.sh
