• [Cheat sheet / shell] Common Hadoop shell and process-management commands (work in progress)


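    This post collects the HDFS shell commands, the Hadoop format and start-up commands, and other common process-management commands, so they can be looked up quickly while working with Hadoop.

    I. HDFS shell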

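    # 1. Upload a file
    hdfs dfs -put ./wuguo.txt /sanguo
    
    # 2. Append a local file to an HDFS file
    hdfs dfs -appendToFile liubei.txt /sanguo/shuguo.txt
    
    # 3. Download a file
    hdfs dfs -get /sanguo/shuguo.txt ./shuguo2.txt
    
    # 4. Change file permissions and ownership
    hdfs dfs -chmod 666 /sanguo/shuguo.txt
    hdfs dfs -chown atguigu:atguigu /sanguo/shuguo.txt
    
    # 5. Show the total size of a directory
    hdfs dfs -du -s -h /jinguo
    
    # Note: in the output, 27 is the file size, 81 is the space used by 3 replicas (27*3), and /jinguo is the directory queried
    
    # 6. Delete recursively
    hdfs dfs -rm -r /sanguo
    # Delete a single file
    hdfs dfs -rm /sanguo/shuguo.txt
    
    # 7. Show the size of each entry under a path
    hadoop fs -du -h hdfs://nn1node:9000/file_path/test_performance_10m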
    6.7 G  20.0 G  hdfs://path/test_performance_10m/pday=20230101
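The relationship between the two size columns (27 and 81 in the /jinguo example) can be checked with a small filter. This is only a sketch: `du_replica_ratio` is a hypothetical helper name, and it assumes the three-column output described in this section, produced without `-h` so that both sizes are plain byte counts.

```shell
# Hypothetical helper: read `hdfs dfs -du -s` style lines (no -h) from stdin
# and print "<path> <ratio>", where ratio = space consumed / logical size.
# Assumes three whitespace-separated columns and a path without spaces.
du_replica_ratio() {
  awk 'NF >= 3 && $1 > 0 { printf "%s %.1f\n", $NF, $2 / $1 }'
}
```

On a cluster this could be used as `hdfs dfs -du -s /jinguo | du_replica_ratio`; a ratio close to 3 would suggest that the data is stored with three replicas.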
    

    Official reference: the shell documentation on the official site

     
     

    II. Common YARN commands

    1. Application operations

    1) List all applications: yarn application -list

    yarn application -list
    
    2021-02-06 10:21:19,238 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):0
                    Application-Id      Application-Name      Application-Type        User       Queue               State         Final-State         Progress                        Tracking-URL
    

    2) List applications filtered by state: yarn application -list -appStates <States>

    (All states: ALL, NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED)

    yarn application -list -appStates FINISHED
    
    2021-02-06 10:22:20,029 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total number of applications (application-types: [], states: [FINISHED] and tags: []):1
                    Application-Id      Application-Name      Application-Type        User       Queue               State         Final-State         Progress                        Tracking-URL
    application_1612577921195_0001            word count             MAPREDUCE     atguigu     default            FINISHED           SUCCEEDED             100% http://hadoop102:19888/jobhistory/job/job_1612577921195_0001
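When the list is long, it can help to pull out just the application IDs, for example to pass them to `yarn logs` or `yarn application -kill`. The filter below is a sketch: `app_ids_in_state` is a hypothetical name, and it assumes the column layout shown in the sample output above, with State as the fourth column from the right.

```shell
# Hypothetical helper: read `yarn application -list` output from stdin and
# print the IDs of applications whose State column equals $1.
app_ids_in_state() {
  awk -v want="$1" '
    $1 ~ /^application_[0-9]+_[0-9]+$/ {
      # Application-Name may contain spaces ("word count"), so count columns
      # from the right: ... State  Final-State  Progress  Tracking-URL
      if ($(NF - 3) == want) print $1
    }'
}
```

A possible use is `yarn application -list -appStates ALL | app_ids_in_state FINISHED`. Log lines and the header row are skipped because their first field is not an application ID.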
    

    3) Kill an application

    yarn application -kill application_1612577921195_0001
    
    2021-02-06 10:23:48,530 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Application application_1612577921195_0001 has already finished
    

     

    2. View logs

    1) Query application logs:

    yarn logs -applicationId <ApplicationId>
    

    2) Query container logs: yarn logs -applicationId <ApplicationId> -containerId <ContainerId>

    yarn logs -applicationId application_1612577921195_0001 -containerId container_1612577921195_0001_01_000001
    

     

    3. View containers

    1) List all containers: yarn container -list <ApplicationAttemptId>

    yarn container -list appattempt_1612577921195_0001_000001
    
    2021-02-06 10:28:41,396 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total number of containers :0
                      Container-Id	          Start Time	         Finish Time	               State	                Host	   Node Http Address	
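The attempt ID used above shares its timestamp and sequence number with the application ID, followed by a six-digit attempt number. The sketch below builds it from an application ID; `app_attempt_id` is a hypothetical helper, and defaulting to attempt 1 is an assumption. On a live cluster, `yarn applicationattempt -list <ApplicationId>` reports the real attempt IDs.

```shell
# Hypothetical helper: build the ID of attempt N (default 1) for an application.
# application_<timestamp>_<seq>  ->  appattempt_<timestamp>_<seq>_<6-digit attempt>
app_attempt_id() {
  app_id="$1"
  attempt="${2:-1}"   # assumption: the first attempt unless told otherwise
  case "$app_id" in
    application_*_*) ;;
    *) echo "not an application ID: $app_id" >&2; return 1 ;;
  esac
  printf 'appattempt_%s_%06d\n' "${app_id#application_}" "$attempt"
}
```

For example, `yarn container -list "$(app_attempt_id application_1612577921195_0001)"` would run the same query as the command shown above.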
    

    2) Print container status: yarn container -status <ContainerId>

    yarn container -status container_1612577921195_0001_01_000001
    
    2021-02-06 10:29:58,554 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Container with id 'container_1612577921195_0001_01_000001' doesn't exist in RM or Timeline Server.
    

    Note: a container's status is only visible while the job is running.

     

    4. Node status

    List all nodes: yarn node -list -all

    yarn node -list -all
    
    2021-02-06 10:31:36,962 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8032
    Total Nodes:3
             Node-Id	     Node-State	Node-Http-Address	Number-of-Running-Containers
     hadoop103:38168	        RUNNING	   hadoop103:8042	                           0
     hadoop102:42012	        RUNNING	   hadoop102:8042	                           0
     hadoop104:39702	        RUNNING	   hadoop104:8042
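For a quick health check, the node list can be reduced to a count per state. This is a sketch under the assumption that the output keeps the layout shown above (Node-Id as `host:port` in the first column, Node-State in the second); `node_state_counts` is a hypothetical name.

```shell
# Hypothetical helper: read `yarn node -list -all` output from stdin and
# print "<state> <count>" for every node state seen, sorted by state name.
node_state_counts() {
  awk '$1 ~ /^[^:]+:[0-9]+$/ && NF >= 3 { count[$2]++ }
       END { for (s in count) print s, count[s] }' | sort
}
```

It could be used as `yarn node -list -all | node_state_counts`. The log line, the `Total Nodes` line, and the header are ignored because their first field is not of the form `host:port`.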
    

     

    5. Refresh queues

    Reload the queue configuration: yarn rmadmin -refreshQueues

    yarn rmadmin -refreshQueues
    2021-02-06 10:32:03,331 INFO client.RMProxy: Connecting to ResourceManager at hadoop103/192.168.10.103:8033
    

     

    III. Start and stop commands

    1. Method one

    Start

    sbin/start-dfs.sh
    sbin/start-yarn.sh
    

    Stop

    sbin/stop-dfs.sh
    sbin/stop-yarn.sh
    

     

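    2. Method two

    # Pick one action (start or stop) and one daemon per command, e.g. hdfs --daemon start namenode
    hdfs --daemon start/stop journalnode/namenode/datanode/zkfc
    yarn --daemon start/stop resourcemanager/nodemanager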
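The slash notation above stands for one action and one daemon per command. A small sketch of how that choice maps to a concrete command line follows; `daemon_cmd` is a hypothetical helper that only prints the command and does not run it.

```shell
# Hypothetical helper: print the single-daemon command for an action and a
# daemon name, choosing `hdfs` or `yarn` from the two lists in this section.
daemon_cmd() {
  action="$1"
  daemon="$2"
  case "$action" in
    start|stop) ;;
    *) echo "action must be start or stop" >&2; return 1 ;;
  esac
  case "$daemon" in
    journalnode|namenode|datanode|zkfc) echo "hdfs --daemon $action $daemon" ;;
    resourcemanager|nodemanager)        echo "yarn --daemon $action $daemon" ;;
    *) echo "unknown daemon: $daemon" >&2; return 1 ;;
  esac
}
```

For instance, `daemon_cmd stop nodemanager` prints `yarn --daemon stop nodemanager`, which would then be run on the host where that daemon lives.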
    

     

    IV. Commands for common scenarios (work in progress)

    A running summary of common Hadoop scenarios.

    1. Leave safe mode

    hdfs dfsadmin -safemode leave
    

    2. Check which node is the active NameNode

    hdfs haadmin -getServiceState nn1
    #active
    hdfs haadmin -getServiceState nn2
    # standby
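Checking each NameNode by hand can be wrapped in a loop. The sketch below assumes the NameNode IDs (nn1 and nn2 in this post) are passed as arguments and that `hdfs haadmin -getServiceState` prints `active` or `standby` on standard output, as in the example above; `active_namenode` is a hypothetical name.

```shell
# Hypothetical helper: print the first NameNode ID whose state is "active".
# The IDs to check are passed as arguments, e.g. nn1 nn2.
active_namenode() {
  for nn in "$@"; do
    # Errors (for example an unreachable NameNode) are treated as "not active".
    state=$(hdfs haadmin -getServiceState "$nn" 2>/dev/null)
    if [ "$state" = "active" ]; then
      echo "$nn"
      return 0
    fi
  done
  return 1   # no active NameNode found among the given IDs
}
```

With the cluster from this post, `active_namenode nn1 nn2` would print `nn1`. A non-zero exit status means none of the listed IDs reported `active`.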
    
  • Original article: https://blog.csdn.net/hiliang521/article/details/126861500