Install Spark 2.1.1


    Contents

    Installing Spark

    1. Installation

    1. Upload the installation package

    2. Extract the archive

    2. Configuration

    1. Configure slaves

    2. Edit the spark-env.sh file

    3. Configure the spark-config.sh file

    3. Sync to the other nodes

    4. Start the cluster

    5. Check the web UI, port 8080

    6. Configure the JobHistoryServer

    1. Rename spark-defaults.conf.template

    2. Edit spark-defaults.conf to enable event logging

    3. Edit spark-env.sh and add the following settings

    4. Sync to the other nodes

    5. Start the history server

    6. Run a job again

    7. Check the history server, port 18080


    Installing Spark

    1. Installation

    1. Upload the installation package

    spark-2.1.1-bin-hadoop2.7.tgz, downloaded from the official website

    2. Extract the archive

    [root@hadoop01 software]# tar -zxvf spark-2.1.1-bin-hadoop2.7.tgz -C /opt/module/
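
    A quick, optional check that the archive unpacked where expected (the directory name comes straight from the tarball):

    [root@hadoop01 software]# ls /opt/module/ | grep spark
    spark-2.1.1-bin-hadoop2.7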
    

    2. Configuration

    1. Configure slaves

    [root@hadoop01 module]# mv spark-2.1.1-bin-hadoop2.7 spark
    [root@hadoop01 conf]# mv slaves.template slaves
    [root@hadoop01 conf]# vim slaves
    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements. See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License. You may obtain a copy of the License at
    #
    # http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    # A Spark Worker will be started on each of the machines listed below.
    hadoop01
    hadoop02
    hadoop03
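
    Only the non-comment lines matter here; to double-check the effective worker list:

    [root@hadoop01 conf]# grep -v '^#' slaves
    hadoop01
    hadoop02
    hadoop03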

    2. Edit the spark-env.sh file
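
    Spark ships only a spark-env.sh.template out of the box; if conf/spark-env.sh does not exist yet, create it from the template first:

    [root@hadoop01 conf]# mv spark-env.sh.template spark-env.sh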

    [root@hadoop01 conf]# vim spark-env.sh
    SPARK_MASTER_HOST=hadoop01
    SPARK_MASTER_PORT=7077

    3. Configure the spark-config.sh file

    The start scripts launch the daemons over non-interactive SSH sessions, which may not load JAVA_HOME from the login profile, so export it explicitly in sbin/spark-config.sh:

    [root@hadoop01 sbin]# vim spark-config.sh
    export JAVA_HOME=/opt/module/jdk1.8.0_144
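
    You can reproduce the environment those SSH sessions actually see, to confirm whether the export is needed on your machines:

    [root@hadoop01 sbin]# ssh hadoop02 'echo JAVA_HOME=$JAVA_HOME; java -version'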

    3. Sync to the other nodes

    [root@hadoop01 module]# xsync spark/
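
    xsync here is a custom distribution script, not a Spark or Linux built-in; if you don't have it, a plain rsync loop over the worker hosts does the same job:

    [root@hadoop01 module]# for host in hadoop02 hadoop03; do rsync -av /opt/module/spark/ root@$host:/opt/module/spark/; done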

    4. Start the cluster

    start-all.sh launches the Master on this host and a Worker on every host listed in slaves; util.sh is another custom helper that runs jps across the cluster, so hadoop01 should show a Master plus a Worker, and the other nodes a Worker each:

    [root@hadoop01 spark]# sbin/start-all.sh
    [root@hadoop01 spark]# util.sh
    ================root@hadoop01================
    3330 Jps
    3238 Worker
    3163 Master
    ================root@hadoop02================
    2966 Jps
    2908 Worker
    ================root@hadoop03================
    2978 Worker
    3036 Jps
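
    As an optional smoke test, attach a spark-shell to the new master; while it is open, the application should show up as running on the master's web UI:

    [root@hadoop01 spark]# bin/spark-shell --master spark://hadoop01:7077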

    5. Check the web UI at http://hadoop01:8080 (8080 is the standalone Master's default web UI port)

    6. Configure the JobHistoryServer

    1. Rename spark-defaults.conf.template

    [root@hadoop01 conf]# mv spark-defaults.conf.template spark-defaults.conf

    2. Edit the spark-defaults.conf file to enable event logging:

    Note: the log directory must already exist on HDFS, and its address must match the NameNode (fs.defaultFS, here hdfs://hadoop01:9000). Create it if it is missing:

    [root@hadoop01 conf]# hdfs dfs -mkdir /directory
    
    [root@hadoop01 conf]# vim spark-defaults.conf
    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements. See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License. You may obtain a copy of the License at
    #
    # http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    # Default system properties included when running spark-submit.
    # This is useful for setting default environmental settings.
    # Example:
    # spark.master                     spark://master:7077
    # spark.eventLog.enabled           true
    # spark.eventLog.dir               hdfs://namenode:8021/directory
    # spark.serializer                 org.apache.spark.serializer.KryoSerializer
    # spark.driver.memory              5g
    # spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
    spark.eventLog.enabled           true
    spark.eventLog.dir               hdfs://hadoop01:9000/directory

    3. Edit the spark-env.sh file and add the following settings:

    [root@hadoop01 conf]# vim spark-env.sh
    export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=18080 -Dspark.history.retainedApplications=30 -Dspark.history.fs.logDirectory=hdfs://hadoop01:9000/directory"

    Here 18080 is the history server's web UI port, retainedApplications caps how many applications' UI data the server keeps in memory (evicted entries are reloaded from disk when accessed), and logDirectory must point at the same HDFS path as spark.eventLog.dir above.

    4. Sync to the other nodes

    [root@hadoop01 conf]# xsync /opt/module/spark/conf
    

    5. Start the history server

    [root@hadoop01 spark]# sbin/start-history-server.sh
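
    If it came up cleanly, jps on hadoop01 should now also list a HistoryServer process (the PIDs below are illustrative):

    [root@hadoop01 spark]# jps
    3163 Master
    3238 Worker
    3412 HistoryServer
    3468 Jps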

    6. Run a job again

    Submit the bundled SparkPi example (estimating pi with 100 partitions) to the standalone master; once it finishes, the run is written to the event log:

    [root@hadoop01 spark]# bin/spark-submit \
    --class org.apache.spark.examples.SparkPi \
    --master spark://hadoop01:7077 \
    --executor-memory 1G \
    --total-executor-cores 2 \
    ./examples/jars/spark-examples_2.11-2.1.1.jar \
    100
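
    To confirm event logging works end to end, list the log directory after the job finishes; an entry named after the application ID (of the form app-...-0000) should appear:

    [root@hadoop01 spark]# hdfs dfs -ls /directory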

    7. Check the history server at http://hadoop01:18080; the SparkPi run just submitted should appear in the list

    Original article: https://blog.csdn.net/m0_55834564/article/details/126898643