• Oracle Exadata Disk Replacement: Replacing a Hard Disk Proactively


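    This procedure has been carried out in a production environment (on a cell node). These notes record the process in outline: some commands are taken from the documentation, and some are the commands that were actually run.

    References: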

    Maintaining Oracle Exadata Storage Servers

    3.3.6 Replacing a Hard Disk Proactively

    How to Replace a Hard Drive in an Exadata Storage Cell Server (Hard Failure) (Doc ID 1386147.1)
    How to Replace a Hard Drive in an Exadata Storage Cell Server (Predictive Failure) (Doc ID 1390836.1)
    Determining When a Hard Disk on an Exadata Server Should Be Replaced (Doc ID 2661785.1)

    Exadata ALTER PHYSICALDISK N:N DROP FOR REPLACEMENT is hung (Doc ID 2574663.1)  

    Exadata Storage software has a complete set of automated operations for hard disk maintenance, when a hard disk has failed or has been flagged as a problematic disk. But there are situations where a hard disk has to be removed proactively from the configuration.

    In the CellCLI ALTER PHYSICALDISK command, the drop for replacement option checks if a normal functioning hard disk can be removed safely without the risk of data loss. However, after the execution of the command, the grid disks on the hard disk are inactivated on the storage cell and set to offline in the Oracle ASM disk groups.

    The redundancy of the disk group is compromised until the hard disk has been replaced or re-enabled, and the subsequent rebalance completes. This is especially important for disk groups using normal redundancy.

    To reduce the risk of having a disk group without full redundancy and proactively replace a hard disk, follow this procedure:

    Identify the physical disk and its associated LUN, cell disk, and grid disks

    # cellcli -e "list diskmap" | grep 'X:Y'

    The output looks similar to the following:

    20:5 KEBTDJ 5 normal 559G
    CD_05_exaceladm01 /dev/sdf
    "DATAC1_CD_05_exaceladm01, DBFS_DG_CD_05_exaceladm01,
    RECOC1_CD_05_exaceladm01"
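
If the mapping needs to be captured in a script rather than read by eye, the row above can be split into fields. The following is only a minimal Python sketch: the function name `parse_diskmap_row` and the field names are hypothetical, and the column order (physical disk name, serial, slot, status, size, cell disk, device, grid disks) is assumed from the sample output shown above rather than taken from a documented format.

```python
import shlex


def parse_diskmap_row(text):
    """Split one `list diskmap` row, possibly wrapped over several lines, into fields.

    Assumption: the columns appear in the order seen in the sample output
    (name, serial, slot, status, size, cell disk, device, grid disks).
    """
    # Collapse the wrapped lines into one line; shlex keeps the quoted
    # grid disk list together as a single token.
    tokens = shlex.split(" ".join(text.split()))
    # Raises ValueError if the row has fewer than seven columns.
    name, serial, slot, status, size, celldisk, device = tokens[:7]
    griddisks = []
    if len(tokens) > 7:
        griddisks = [g.strip() for g in tokens[7].split(",") if g.strip()]
    return {
        "name": name,
        "serial": serial,
        "slot": slot,
        "status": status,
        "size": size,
        "celldisk": celldisk,
        "device": device,
        "griddisks": griddisks,
    }


sample = '''20:5 KEBTDJ 5 normal 559G
CD_05_exaceladm01 /dev/sdf
"DATAC1_CD_05_exaceladm01, DBFS_DG_CD_05_exaceladm01,
RECOC1_CD_05_exaceladm01"'''
row = parse_diskmap_row(sample)
print(row["celldisk"], row["device"], row["griddisks"])
```

The grid disk names returned here are the ones that have to be dropped from ASM in the next step.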

    Check the LUN information

    CellCLI> list lun where deviceName='/dev/sdf'
    0_5 0_5 normal

    Drop the grid disks at the ASM level

    SQL> ALTER DISKGROUP diskgroup_name DROP DISK asm_disk_name;
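
One disk usually carries several grid disks, so the statement above has to be repeated for each of them. As a rough sketch, the statements could be generated from the grid disk names. Two assumptions are baked in and should be checked against `v$asm_disk` before running anything: that grid disk names follow the `<diskgroup>_CD_<nn>_<cell>` pattern seen in the sample output, and that the ASM disk name is the grid disk name in upper case. The helper name `drop_disk_statements` is hypothetical.

```python
def drop_disk_statements(grid_disks):
    """Build one ALTER DISKGROUP ... DROP DISK statement per grid disk.

    Assumptions (verify in v$asm_disk first):
      * grid disk names look like <diskgroup>_CD_<nn>_<cell>
      * the ASM disk name is the upper-cased grid disk name
    """
    statements = []
    for grid_disk in grid_disks:
        # Everything before the first "_CD_" is taken as the disk group name.
        group, sep, _ = grid_disk.partition("_CD_")
        if not sep:
            raise ValueError(f"unexpected grid disk name: {grid_disk!r}")
        statements.append(
            f"ALTER DISKGROUP {group} DROP DISK {grid_disk.upper()};"
        )
    return statements


for stmt in drop_disk_statements([
    "DATAC1_CD_05_exaceladm01",
    "DBFS_DG_CD_05_exaceladm01",
    "RECOC1_CD_05_exaceladm01",
]):
    print(stmt)
```

The sketch only prints the SQL; reviewing it and running it in SQL*Plus by hand keeps a person in the loop for a destructive step.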

    Wait for the rebalance to complete (the query below returns no rows once it has finished)

    SQL> select * from v$asm_operation;
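
Re-running the query by hand works, but the wait can also be expressed as a small polling loop. The sketch below is an illustration only: `wait_for_rebalance` is a hypothetical helper, and the database access is deliberately left out. The caller passes in `fetch_operations`, any function that runs the `v$asm_operation` query and returns its rows; how that function connects to ASM is an assumption left to the reader.

```python
import time


def wait_for_rebalance(fetch_operations, poll_seconds=60, timeout_seconds=None,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until fetch_operations() returns no rows; return the number of polls.

    fetch_operations is a caller-supplied (hypothetical) function that runs
    `select * from v$asm_operation` and returns the rows as a list.
    """
    start = clock()
    polls = 0
    while True:
        polls += 1
        if not fetch_operations():
            # No rows: no rebalance operation is running any more.
            return polls
        if timeout_seconds is not None and clock() - start >= timeout_seconds:
            raise TimeoutError("rebalance still running after timeout")
        sleep(poll_seconds)


# Usage with a stand-in for the real query: two polls see a running
# operation, the third sees none.
fake_results = iter([[("REBAL", "RUN")], [("REBAL", "RUN")], []])
print(wait_for_rebalance(lambda: next(fake_results), poll_seconds=0))
```

Passing `sleep` and `clock` as parameters is a design choice that lets the loop be exercised without real waiting.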
    

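    Drop the disk for replacement

    CellCLI> alter physicaldisk 20:4 serviceled on -- the older method (turning on the service LED); it has been retired and no longer works
    CellCLI> ALTER PHYSICALDISK 20:4 DROP FOR REPLACEMENT; -- this is the command to use, but it may hang; for the fix, see Doc ID 2574663.1 in the references above

    Once the drop for replacement command has completed, the LED of that disk on the storage cell turns blue. (The cell has an HDD map showing which slot each disk is in, but to be certain of pulling the right disk, make sure its LED is lit.)

    Replace the disk: pull out the old disk, then insert the new one. The official documentation recommends waiting three minutes before inserting the new disk (in this case we did not wait the three minutes).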

    Check the LUN, cell disk, and grid disk information

    CellCLI> list lun lun_name
    CellCLI> list celldisk where lun=lun_name
    CellCLI> list griddisk where celldisk=celldisk_name

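    Confirm that the disks have been added back to ASM; if they have, the following query returns no rows. If they have not been added, add them manually. Normally the LUN, cell disk, and grid disks are created automatically (this can be seen in the cell alert log).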

    SQL> SELECT path,header_status FROM v$asm_disk WHERE group_number=0;

    Add the disks to ASM manually

    alter diskgroup DATA_ABC add disk 'o/192.168.0.1/DATA_ABC_CD_04_abccel02' rebalance power 4;
    alter diskgroup RECO_ABC add disk 'o/192.168.0.1/RECO_ABC_CD_04_abccel02' rebalance power 4;
    alter diskgroup DBFS_DG add disk 'o/192.168.0.1/DBFS_DG_CD_04_abccel02' rebalance power 4;
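
The three statements above differ only in the disk group and grid disk name, so they could be generated from the list of grid disks on the replaced disk. This is a sketch under the same naming assumption as before (grid disks named `<diskgroup>_CD_<nn>_<cell>`); `add_disk_statements` is a hypothetical helper, and the cell IP address and rebalance power are simply the example values from the statements above.

```python
def add_disk_statements(cell_ip, grid_disks, power=4):
    """Build one ALTER DISKGROUP ... ADD DISK statement per grid disk.

    Assumption: the disk group name is the part of the grid disk name
    before the first "_CD_". The path format o/<cell_ip>/<grid_disk>
    follows the example statements in these notes.
    """
    statements = []
    for grid_disk in grid_disks:
        group, sep, _ = grid_disk.partition("_CD_")
        if not sep:
            raise ValueError(f"unexpected grid disk name: {grid_disk!r}")
        statements.append(
            f"ALTER DISKGROUP {group} ADD DISK 'o/{cell_ip}/{grid_disk}' "
            f"REBALANCE POWER {power};"
        )
    return statements


for stmt in add_disk_statements("192.168.0.1", [
    "DATA_ABC_CD_04_abccel02",
    "RECO_ABC_CD_04_abccel02",
    "DBFS_DG_CD_04_abccel02",
]):
    print(stmt)
```

As with the drop step, the output is meant to be reviewed and then run by hand.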

    Check the rebalance. Done.

    Addendum: what if the wrong disk was pulled? Put it back in. The official documentation covers this case:

    3.3.9 Removing and Replacing the Same Hard Disk

    What happens if you accidentally remove the wrong hard disk?

    If you inadvertently remove the wrong hard disk, then put the disk back. It will automatically be added back in the Oracle ASM disk group, and its data is resynchronized.

    If a disk was inserted into the wrong slot and was rejected, the official documentation describes how to re-enable it:

    3.3.10 Re-Enabling a Hard Disk That Was Rejected

    If a physical disk was rejected because it was inserted into the wrong slot, you can re-enable the disk.

    Run the following command:

    Caution:

    The following command removes all data on the physical disk.

    CellCLI> ALTER PHYSICALDISK hard_disk_name reenable force
    

    The following is an example of the output from the command:

    Physical disk 20:0 was reenabled.

    END

  • Original article: https://blog.csdn.net/xxzhaobb/article/details/126389951