http://localhost:50070/ (NameNode web UI)
http://localhost:8088/cluster (YARN ResourceManager web UI)
➜ /Users/zhaoshuai11/work/hadoop-2.7.3 hadoop fs -ls /
22/07/30 09:43:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
The warning can be silenced by adding this to log4j.properties:
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=ERROR
22/07/30 09:58:59 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
Why does running an MR job contact YARN first? Because YARN owns the cluster's resources: the client first asks the ResourceManager (default port 8032) for a container to launch the MRAppMaster, which in turn requests containers for the map and reduce tasks.
Large data volume: scale out horizontally across many machines; in principle the cluster can grow without limit.
Metadata records: metadata tracks each file and where its blocks are stored, so a file can be located quickly.
Block storage: a file is split into blocks stored on different machines, and blocks can be processed in parallel for better throughput.
Replication: copies of each block are kept on different machines; the redundancy keeps data safe.
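The block-storage idea can be sketched locally (this is not HDFS itself; HDFS 2.x uses 128 MB blocks, and the 300-byte file and 128-byte blocks here are made up for illustration):

```shell
# Simulate splitting one file into fixed-size blocks, as HDFS does at 128 MB granularity.
head -c 300 /dev/zero > demo.bin   # a 300-byte "file"
split -b 128 demo.bin block_       # three "blocks": block_aa, block_ab, block_ac
ls block_* | wc -l                 # prints 3 (128 + 128 + 44 bytes)
```

HDFS would then place replicas of each such block on different DataNodes.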

➜ /Users/zhaoshuai11 hadoop fs -put java_error_in_idea_14849.log /itcast
➜ /Users/zhaoshuai11 hadoop fs -ls -h /itcast
Found 4 items
-rw-r--r-- 1 zhaoshuai11 supergroup 21 2022-07-30 09:55 /itcast/hello.txt
-rw-r--r-- 1 zhaoshuai11 supergroup 641.9 K 2022-07-30 09:48 /itcast/java_error_in_idea_11861.log
-rw-r--r-- 1 zhaoshuai11 supergroup 98.9 K 2022-07-30 10:34 /itcast/java_error_in_idea_14849.log
drwxr-xr-x - zhaoshuai11 supergroup 0 2022-07-30 09:59 /itcast/wordcount
➜ /Users/zhaoshuai11 hadoop fs -cat /itcast/hello.txt
hadoop hadoop hadoop
➜ /Users/zhaoshuai11 hadoop fs -get /itcast/wordcount/output/part-r-00000 ./
➜ /Users/zhaoshuai11 ll
-rw-r--r-- 1 zhaoshuai11 staff 9B 7 30 10:41 part-r-00000
➜ /Users/zhaoshuai11 cat part-r-00000
hadoop 3
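What the wordcount job did to hello.txt can be mimicked in one local pipeline (a sketch of the map → shuffle → reduce flow, not what the MR framework actually executes):

```shell
# map: emit one word per line; shuffle: sort groups equal keys; reduce: count each group
echo "hadoop hadoop hadoop" | tr -s ' ' '\n' | sort | uniq -c | awk '{print $2 "\t" $1}'
# → hadoop	3
```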
Merging small files:
➜ /Users/zhaoshuai11 hadoop fs -appendToFile login.sh login-ext.sh /itcast/hello.txt
➜ /Users/zhaoshuai11 hadoop fs -cat /itcast/hello.txt
hadoop hadoop hadoop
#!/usr/bin/expect -f
set host "relay.baidu-int.com"
set username "zhaoshuai11"
set password "zs19961211."
spawn ssh $username@$host
expect "*Please input user's password*" {send "$password\r"}
interact#!/bin/sh
basepath=$(cd `dirname $0`; pwd)
export LC_CTYPE=en_US
#expect脚本所在位置
filepath=$1
if [ -f $filepath ]; then
expect $filepath
else
echo "$filepath not exits"
fi%
Note: login.sh has no trailing newline, so after the append its last word `interact` runs together with the `#!/bin/sh` of login-ext.sh; the `%` after `fi` is zsh marking the missing final newline.
➜ /Users/zhaoshuai11
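`-appendToFile` concatenates local files onto the end of an HDFS file, which is one way to pack small files together. A local analogue (the file names below are invented for illustration):

```shell
printf 'hadoop hadoop hadoop\n' > hello.txt
printf 'part one\n' > a.txt
printf 'part two\n' > b.txt
cat a.txt b.txt >> hello.txt   # hadoop fs -appendToFile a.txt b.txt /itcast/hello.txt is the HDFS analogue
wc -l < hello.txt              # prints 3
```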

➜ /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce hadoop jar hadoop-mapreduce-examples-2.7.3.jar pi 2 2
Number of Maps = 2
Samples per Map = 2
Wrote input for Map #0
Wrote input for Map #1
Starting Job
22/07/30 15:38:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
22/07/30 15:38:57 INFO input.FileInputFormat: Total input paths to process : 2
22/07/30 15:38:58 INFO mapreduce.JobSubmitter: number of splits:2
22/07/30 15:38:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1659144775026_0003
22/07/30 15:38:59 INFO impl.YarnClientImpl: Submitted application application_1659144775026_0003
22/07/30 15:38:59 INFO mapreduce.Job: The url to track the job: http://127.0.0.1:8088/proxy/application_1659144775026_0003/
22/07/30 15:38:59 INFO mapreduce.Job: Running job: job_1659144775026_0003
22/07/30 15:39:09 INFO mapreduce.Job: Job job_1659144775026_0003 running in uber mode : false
22/07/30 15:39:09 INFO mapreduce.Job: map 0% reduce 0%
22/07/30 15:39:15 INFO mapreduce.Job: map 100% reduce 0%
22/07/30 15:39:24 INFO mapreduce.Job: map 100% reduce 100%
22/07/30 15:39:25 INFO mapreduce.Job: Job job_1659144775026_0003 completed successfully
22/07/30 15:39:25 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=50
FILE: Number of bytes written=357444
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=540
HDFS: Number of bytes written=215
HDFS: Number of read operations=11
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=8543
Total time spent by all reduces in occupied slots (ms)=4822
Total time spent by all map tasks (ms)=8543
Total time spent by all reduce tasks (ms)=4822
Total vcore-milliseconds taken by all map tasks=8543
Total vcore-milliseconds taken by all reduce tasks=4822
Total megabyte-milliseconds taken by all map tasks=8748032
Total megabyte-milliseconds taken by all reduce tasks=4937728
Map-Reduce Framework
Map input records=2
Map output records=4
Map output bytes=36
Map output materialized bytes=56
Input split bytes=304
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=56
Reduce input records=4
Reduce output records=0
Spilled Records=8
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=201
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=534773760
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=236
File Output Format Counters
Bytes Written=97
Job Finished in 28.108 seconds
Estimated value of Pi is 4.00000000000000000000
(The estimate is 4.0 because all 2 × 2 = 4 sample points fell inside the quarter circle; that sample size is far too small for a meaningful estimate.)
➜ /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce
22/07/30 15:38:57 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 (this is the client asking YARN for resources)
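For reference, the quantity the `pi` example estimates can be sketched with plain Monte Carlo in awk (the real example uses a Halton quasi-random sequence; the seed and sample count below are arbitrary):

```shell
awk 'BEGIN {
  srand(1); n = 200000
  # sample points in the unit square; count those inside the quarter circle
  for (i = 0; i < n; i++) { x = rand(); y = rand(); if (x*x + y*y <= 1) inside++ }
  printf "%.3f\n", 4 * inside / n   # close to 3.14 for large n
}'
```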


➜ /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /itcast/hello.txt /itcast/hello-count
22/07/30 15:49:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
22/07/30 15:49:26 INFO input.FileInputFormat: Total input paths to process : 1
22/07/30 15:49:27 INFO mapreduce.JobSubmitter: number of splits:1
22/07/30 15:49:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1659144775026_0004
22/07/30 15:49:27 INFO impl.YarnClientImpl: Submitted application application_1659144775026_0004
22/07/30 15:49:27 INFO mapreduce.Job: The url to track the job: http://127.0.0.1:8088/proxy/application_1659144775026_0004/
22/07/30 15:49:27 INFO mapreduce.Job: Running job: job_1659144775026_0004
22/07/30 15:49:37 INFO mapreduce.Job: Job job_1659144775026_0004 running in uber mode : false
22/07/30 15:49:37 INFO mapreduce.Job: map 0% reduce 0%
22/07/30 15:49:43 INFO mapreduce.Job: map 100% reduce 0%
22/07/30 15:49:49 INFO mapreduce.Job: map 100% reduce 100%
22/07/30 15:49:50 INFO mapreduce.Job: Job job_1659144775026_0004 completed successfully
22/07/30 15:49:50 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=607
FILE: Number of bytes written=238801
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=510
HDFS: Number of bytes written=441
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=3442
Total time spent by all reduces in occupied slots (ms)=3708
Total time spent by all map tasks (ms)=3442
Total time spent by all reduce tasks (ms)=3708
Total vcore-milliseconds taken by all map tasks=3442
Total vcore-milliseconds taken by all reduce tasks=3708
Total megabyte-milliseconds taken by all map tasks=3524608
Total megabyte-milliseconds taken by all reduce tasks=3796992
Map-Reduce Framework
Map input records=18
Map output records=47
Map output bytes=591
Map output materialized bytes=607
Input split bytes=103
Combine input records=47
Combine output records=40
Reduce input groups=40
Reduce shuffle bytes=607
Reduce input records=40
Reduce output records=40
Spilled Records=80
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=122
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=332398592
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=407
File Output Format Counters
Bytes Written=441
➜ /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce
➜ /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce hadoop fs -get /itcast/hello-count ./
➜ /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce/hello-count cat part-r-00000
"$filepath 1
"$password\r"} 1
"*Please 1
"relay.baidu-int.com" 1
"zhaoshuai11" 1
"zs19961211." 1
#!/usr/bin/expect 1
#expect脚本所在位置 1
$0`; 1
$filepath 2
$username@$host 1
-f 2
LC_CTYPE=en_US 1
[ 1
]; 1
`dirname 1
basepath=$(cd 1
echo 1
else 1
exits" 1
expect 2
export 1
fi 1
filepath=$1 1
hadoop 3
host 1
if 1
input 1
interact#!/bin/sh 1
not 1
password 1
password*" 1
pwd) 1
set 3
spawn 1
ssh 1
then 1
user's 1
username 1
{send 1
➜ /Users/zhaoshuai11/work/hadoop-2.7.3/share/hadoop/mapreduce/hello-count

The reduce task actively pulls (shuffles) the map tasks' output; the map side does not push it.
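A rough local picture of that pull-and-merge step (file names invented): each map task leaves sorted output, and the reducer fetches all the pieces and merge-sorts them.

```shell
printf 'apple\t1\nhadoop\t1\n' > map0.out   # sorted output of map task 0
printf 'hadoop\t1\nyarn\t1\n'  > map1.out   # sorted output of map task 1
sort -m map0.out map1.out                   # reduce-side merge of the pre-sorted runs
```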
