• The XDWK process on Exadata


    References:

    Auto disk management feature in Exadata (Doc ID 1484274.1)
    EXADATA AUTO MANAGEMENT INITIATE DROP AND ADD OF THE GRIDDISKS (Doc ID 1599448.1)

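    The last time I replaced a disk on Exadata, I did not pay much attention to the logs.

    This time I watched the related log messages and noticed a number of XDWK processes.

    Disk replacement steps:

    Monday: drop the grid disks at the ASM level.

    Friday: once the replacement disk arrived, pull the failed disk and insert the new one.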

    After the grid disks had been dropped normally on Monday, the ASM alert log showed the following messages, roughly once every 15 minutes.

    Thu Nov 03 22:50:43 2022
    XDWK started with pid=40, OS id=382866
    Thu Nov 03 23:05:46 2022
    Starting background process XDWK
    Thu Nov 03 23:05:46 2022
    XDWK started with pid=40, OS id=31173
    Thu Nov 03 23:20:49 2022
    Starting background process XDWK
    Thu Nov 03 23:20:49 2022
    XDWK started with pid=40, OS id=79593
    Thu Nov 03 23:35:52 2022
    Starting background process XDWK
    Thu Nov 03 23:35:52 2022
    XDWK started with pid=40, OS id=127598
    Thu Nov 03 23:50:55 2022
    Starting background process XDWK
    Thu Nov 03 23:50:55 2022
    XDWK started with pid=40, OS id=173112
    Fri Nov 04 00:05:58 2022
    Starting background process XDWK
    Fri Nov 04 00:05:58 2022
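
    The restart interval can be measured instead of estimated by differencing the timestamps that precede each "XDWK started" line. The following is a minimal Python sketch, assuming the alert log layout shown above (a timestamp line followed by a message line) and an English locale for the day and month names. The function names are hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Sample lines copied from the ASM alert log excerpt in this post.
ALERT_LOG = """\
Thu Nov 03 22:50:43 2022
XDWK started with pid=40, OS id=382866
Thu Nov 03 23:05:46 2022
Starting background process XDWK
Thu Nov 03 23:05:46 2022
XDWK started with pid=40, OS id=31173
Thu Nov 03 23:20:49 2022
Starting background process XDWK
Thu Nov 03 23:20:49 2022
XDWK started with pid=40, OS id=79593
"""

# Assumed timestamp layout, e.g. "Thu Nov 03 22:50:43 2022".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"


def xdwk_start_times(log_text):
    """Return the timestamp that precedes each 'XDWK started' line."""
    starts = []
    last_ts = None
    for raw in log_text.splitlines():
        line = raw.strip()
        try:
            # A line that parses as a timestamp updates the current time.
            last_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass
        if line.startswith("XDWK started") and last_ts is not None:
            starts.append(last_ts)
    return starts


def intervals_seconds(times):
    """Return the gaps, in whole seconds, between consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


starts = xdwk_start_times(ALERT_LOG)
gaps = intervals_seconds(starts)
print(gaps)  # [903, 903]
```

    For this excerpt each gap is 903 seconds, that is, 15 minutes and 3 seconds between consecutive XDWK starts.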

    At the same time, the XDWK trace file contained the following:

    *** 2022-11-02 11:30:33.033
    *** SESSION ID:(788.329) 2022-11-02 11:30:33.033
    *** CLIENT ID:() 2022-11-02 11:30:33.033
    *** SERVICE NAME:() 2022-11-02 11:30:33.033
    *** MODULE NAME:() 2022-11-02 11:30:33.033
    *** ACTION NAME:() 2022-11-02 11:30:33.033
    2022-11-02 11:30:33.032987 : kxdam_is_disk_offline: Operation ID: 614109: in diskgroup Failed.
    SQL : /* Exadata Auto Mgmt: Is Disk in the given MODE_STATUS */
    select count(disk_number) from v$asm_disk_stat
    where
    name='DATA_PROD_CD_02_PRODCEL03'
    and
    mode_status='OFFLINE'
    and
    group_number in
    (
    select group_number from v$asm_diskgroup_stat
    where
    name='DATA_PROD'
    and
    state in ('MOUNTED', 'RESTRICTED')
    )
    Cause : Disk not found in offline state.
    Action : Check if disk has been dropped from the diskgroup.
    If so manually add disk back to the diskgroup.
    Ignore this error if disk is part of the diskgroup.

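    Running a query of this kind by hand (here against the RECO_PROD disk group, with an extra path condition) found no matching disk: the count returned was 0.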

    [grid@xxxx01 trace]$ sqlplus /nolog
    SQL*Plus: Release 11.2.0.4.0 Production on Wed Nov 2 11:37:22 2022
    Copyright (c) 1982, 2013, Oracle. All rights reserved.
    SQL> conn / as sysasm
    Connected.
    SQL> select count(disk_number) from v$asm_disk_stat
      2 where
      3 name='RECO_PROD_CD_02_PRODCEL03'
      4 and
      5 (path='o/192.168.0.5/RECO_PROD_CD_02_prodcel03' or mode_status='OFFLINE')
      6 and
      7 group_number in
      8 (
      9 select group_number from v$asm_diskgroup_stat
     10 where
     11 name='RECO_PROD'
     12 and
     13 state in ('MOUNTED', 'RESTRICTED')
     14 );
    COUNT(DISK_NUMBER)
    ------------------
    0
    SQL>

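    MOS describes the XDWK process as follows. (According to the document below, the disk should be added back to ASM automatically. In practice it was not added automatically and had to be added by hand. This is probably because the disk was dropped manually, rather than Exadata auto management dropping the faulty disk itself.)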

    3. Automatic Storage Management

    The Automatic Storage Management (ASM) instance runs on the compute (database) node and has two processes that implement the auto disk management feature:

    • Exadata Automation Manager (XDMG) initiates automation tasks involved in managing Exadata storage. It monitors all configured storage cells for state changes, such as a failed disk getting replaced, and performs the required tasks for such events. Its primary tasks are to watch for inaccessible disks and cells and when they become accessible again, to initiate the ASM ONLINE operation.
    • Exadata Automation Manager (XDWK) performs automation tasks requested by XDMG. It gets started when asynchronous actions such as disk ONLINE, DROP and ADD are requested by XDMG. After a 5 minute period of inactivity, this process will shut itself down.

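    After the disk was pulled out, the ASM alert log no longer showed any XDWK messages.

    Next, replace the disk.

    After the replacement, the alert.log on the storage cell contained the following. It shows that the cell disk and grid disks were created automatically.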

    2022-11-04T09:37:34.508930+08:00
    create CELLDISK CD_02_prodcel03 on device /dev/sds
    2022-11-04T09:37:34.604668+08:00
    create GRIDDISK DATA_prod_CD_02_prodcel03 on CELLDISK CD_02_prodcel03 type 0
    GridDisk name=DATA_prod_CD_02_prodcel03 guid=5d3d5a26-b1fd-44c2-9e97-5cd3d640a107 (2749608172) status=GDISK_ACTIVE
    2022-11-04T09:37:34.655927+08:00
    create GRIDDISK RECO_prod_CD_02_prodcel03 on CELLDISK CD_02_prodcel03 type 0
    GridDisk name=RECO_prod_CD_02_prodcel03 guid=7743d6e9-1537-4bd8-bc6c-a43a10985ede (3983708148) status=GDISK_ACTIVE
    2022-11-04T09:37:34.685691+08:00
    create GRIDDISK DBFS_DG_CD_02_prodcel03 on CELLDISK CD_02_prodcel03 type 0
    GridDisk name=DBFS_DG_CD_02_prodcel03 guid=60463395-c176-4edd-8d57-875900f06b9e (4287003124) status=GDISK_ACTIVE

    Finally, add the grid disks back at the ASM level, and the replacement is complete.
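
    The post does not show the statements used for the manual add. As a rough sketch only, the Python below derives one `ALTER DISKGROUP ... ADD DISK` statement per grid disk from the cell alert.log lines above and prints them for review; it does not run them. Three things are assumptions: the cell IP is the one from the path in the manual query earlier in the post, each grid disk name is taken to be `<diskgroup>_<celldisk>`, and the ASM disk name is the upper-cased grid disk name (as in the trace file). The function name is hypothetical. Check the candidate disk paths in V$ASM_DISK before executing anything like this.

```python
import re

# Lines copied from the cell alert.log excerpt in this post.
CELL_ALERT_LOG = """\
create CELLDISK CD_02_prodcel03 on device /dev/sds
create GRIDDISK DATA_prod_CD_02_prodcel03 on CELLDISK CD_02_prodcel03 type 0
create GRIDDISK RECO_prod_CD_02_prodcel03 on CELLDISK CD_02_prodcel03 type 0
create GRIDDISK DBFS_DG_CD_02_prodcel03 on CELLDISK CD_02_prodcel03 type 0
"""

# Assumption: cell IP as it appears in the path of the manual query.
CELL_IP = "192.168.0.5"

GRIDDISK_RE = re.compile(
    r"^create GRIDDISK (\S+) on CELLDISK (\S+)", re.MULTILINE
)


def add_disk_statements(log_text, cell_ip):
    """Build ALTER DISKGROUP ... ADD DISK statements for review."""
    stmts = []
    for griddisk, celldisk in GRIDDISK_RE.findall(log_text):
        # Assumption: grid disk name is <diskgroup>_<celldisk>.
        suffix = "_" + celldisk
        if not griddisk.endswith(suffix):
            continue  # naming does not match the assumption; skip it
        diskgroup = griddisk[: -len(suffix)].upper()
        stmts.append(
            f"ALTER DISKGROUP {diskgroup} "
            f"ADD DISK 'o/{cell_ip}/{griddisk}' "
            f"NAME {griddisk.upper()};"
        )
    return stmts


statements = add_disk_statements(CELL_ALERT_LOG, CELL_IP)
for stmt in statements:
    print(stmt)
```

    Printing the statements rather than executing them leaves room to confirm the disk group names and paths, and to add a rebalance power clause if one is wanted.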

    END

  • Original article: https://blog.csdn.net/xxzhaobb/article/details/127689496