• Oracle 10g: handling a hung listener


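    A customer's single-instance database (10.2.0.1.0) suddenly stopped accepting connections. I logged in right away: lsnrctl did not respond at all (it hung), while logging in to the instance itself still worked. Because this happened during peak working hours, the emergency fix was to shut down the instance and reboot the host; everything was back to normal after the reboot.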

    [oracle@hydb ~]$ lsnrctl status

    LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 22-SEP-2023 09:15:45

    Copyright (c) 1991, 2005, Oracle.  All rights reserved.

    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.200.100.30)(PORT=1521)))
     
    [oracle@hydb admin]$ lsnrctl stop

    LSNRCTL for Linux: Version 10.2.0.1.0 - Production on 22-SEP-2023 09:16:51

    Copyright (c) 1991, 2005, Oracle.  All rights reserved.

    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.200.100.30)(PORT=1521)))

    Once service was restored, I started investigating and searching the documentation.

    MOS notes:

    LSNRCTL commands hang but listener process itself is running (Doc ID 979123.1)

    10g Listener: High CPU Utilization - Listener May Hang (Doc ID 284602.1)

    10gR2: TNS Listener Crash with Core Dump (Doc ID 549932.1)
    10g: Intermittent TNS Listener Hang, New Child Listener Process Forked (Doc ID 340091.1)
    Listener Hangs or Crashes or TNS-12518 & TNS-12540 Error When INBOUND_CONNECT_TIMEOUT_LISTENER = 0 (Doc ID 2830190.1)

    Listener Hangs - TNS-01181: Internal registration connection limit reached (Doc ID 549649.1)
    TNS-12518, TNS-12540, TNS-12582 and TNS-12615 Errors Reported in 11g Listener Log Under Heavy Load (Doc ID 1399677.1)

    Fix:

    1. As the oracle user, add the following parameter:

    [oracle@hydb ~]$  echo "SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER=OFF"  >> $ORACLE_HOME/network/admin/listener.ora

    The listener must be restarted for this to take effect; the restart will wait for the next maintenance window.
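
    Running the echo above twice leaves a duplicate line in listener.ora. A minimal sketch of an idempotent append follows; ensure_line is a hypothetical helper name, not an Oracle utility, and it assumes the file already ends with a newline.

```shell
# ensure_line FILE LINE
# Append LINE to FILE only if it is not already present as a whole line.
# (Hypothetical helper for illustration, not part of any Oracle tooling.)
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (as the oracle user):
# ensure_line "$ORACLE_HOME/network/admin/listener.ora" \
#     "SUBSCRIBE_FOR_NODE_DOWN_EVENT_LISTENER=OFF"
```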

    2. The next time the failure occurs, first check the listener process with the following command:

    [oracle@hydb admin]$   ps -ef |grep LISTENER
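
    Since lsnrctl itself hangs in this failure, it can help to wrap the check in a time limit so the shell is not blocked. The sketch below assumes timeout(1) from GNU coreutils is installed, which may not be the case on distributions of the 10g era; run_with_timeout is a hypothetical helper name.

```shell
# run_with_timeout SECONDS CMD [ARGS...]
# Prints OK, FAILED or HUNG depending on how CMD ends.
# Assumption: timeout(1) from GNU coreutils is available on the host.
run_with_timeout() {
    secs=$1
    shift
    timeout "$secs" "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo OK
    elif [ "$rc" -eq 124 ]; then
        echo HUNG      # timeout(1) exits with 124 when the time limit is hit
    else
        echo FAILED
    fi
}

# Usage: run_with_timeout 10 lsnrctl status
```

    A HUNG result while ps -ef still shows the tnslsnr process matches the symptom described here: the process is alive, but lsnrctl gets no answer.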

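    Update, 2023-09-22

    The same problem occurred again: lsnrctl could not be used at all, so the only option was to kill -9 the listener process and start the listener manually.

    Further tuning, performed as the oracle user: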

    $ cd $ORACLE_HOME/opmn/conf
    $ mv ons.config ons.config.orig
    $ lsnrctl stop ; lsnrctl start


    Update, 2023-09-24: the machine turned out to be infected with malware. Final fix: configure the iptables firewall.

    [root@hydb ~]# ps -ef |grep pmon    -- instance not started
    root      4813  4138  0 14:26 pts/1    00:00:00 grep pmon
    [root@hydb ~]# ps -ef |grep LISTENER   -- listener process is running
    oracle    4391     1  0 13:37 ?        00:00:02 /u01/app/oracle/product/10.2/db_1/bin/tnslsnr LISTENER -inherit
    root      4765  4138  0 14:24 pts/1    00:00:00 grep LISTENER
    [root@hydb ~]# lsof -Pani -p  4391|wc -l     -- count the listener's network connections
    1022

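    Listing the actual connections showed the listener's descriptors used up by established connections with just a few hosts, which led to the conclusion that malware was involved. The original diagnosis was that tnslsnr was acting as a client connecting to other hosts; note, however, that lsof prints each socket as local->remote, so the lines below are connections on the listener's own port 1521 coming from ephemeral ports on the remote hosts.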

    [root@hydb ~]# lsof -Pani -p  4391
    tnslsnr 4391 oracle  985u  IPv4  18143      0t0  TCP 172.200.100.30:1521->172.16.119.10:47594 (ESTABLISHED)
    tnslsnr 4391 oracle  986u  IPv4  18144      0t0  TCP 172.200.100.30:1521->172.200.32.183:33048 (ESTABLISHED)
    tnslsnr 4391 oracle  987u  IPv4  18157      0t0  TCP 172.200.100.30:1521->172.200.32.183:33050 (ESTABLISHED)
    tnslsnr 4391 oracle  988u  IPv4  18158      0t0  TCP 172.200.100.30:1521->172.200.32.183:33052 (ESTABLISHED)
    tnslsnr 4391 oracle  989u  IPv4  18160      0t0  TCP 172.200.100.30:1521->172.200.32.183:33054 (ESTABLISHED)
    tnslsnr 4391 oracle  990u  IPv4  18161      0t0  TCP 172.200.100.30:1521->172.200.32.183:33056 (ESTABLISHED)
    tnslsnr 4391 oracle  991u  IPv4  18162      0t0  TCP 172.200.100.30:1521->172.200.32.183:33058 (ESTABLISHED)
    tnslsnr 4391 oracle  992u  IPv4  18163      0t0  TCP 172.200.100.30:1521->172.200.32.183:33060 (ESTABLISHED)
    tnslsnr 4391 oracle  993u  IPv4  18164      0t0  TCP 172.200.100.30:1521->172.16.119.10:47596 (ESTABLISHED)
    tnslsnr 4391 oracle  994u  IPv4  18165      0t0  TCP 172.200.100.30:1521->172.200.32.183:33062 (ESTABLISHED)
    tnslsnr 4391 oracle  995u  IPv4  18166      0t0  TCP 172.200.100.30:1521->172.200.32.183:33064 (ESTABLISHED)
    tnslsnr 4391 oracle  996u  IPv4  18167      0t0  TCP 172.200.100.30:1521->172.200.32.183:33066 (ESTABLISHED)
    tnslsnr 4391 oracle  997u  IPv4  18168      0t0  TCP 172.200.100.30:1521->172.200.32.183:33068 (ESTABLISHED)
    tnslsnr 4391 oracle  998u  IPv4  18169      0t0  TCP 172.200.100.30:1521->172.200.32.183:33070 (ESTABLISHED)
    tnslsnr 4391 oracle  999u  IPv4  18482      0t0  TCP 172.200.100.30:1521->172.16.119.10:47598 (ESTABLISHED)
    tnslsnr 4391 oracle 1000u  IPv4  18503      0t0  TCP 172.200.100.30:1521->172.200.32.183:33072 (ESTABLISHED)
    tnslsnr 4391 oracle 1001u  IPv4  18504      0t0  TCP 172.200.100.30:1521->172.200.32.183:33074 (ESTABLISHED)
    tnslsnr 4391 oracle 1002u  IPv4  18505      0t0  TCP 172.200.100.30:1521->172.200.32.183:33078 (ESTABLISHED)
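
    With around a thousand lines of lsof output, a per-host tally makes the offending sources easier to spot. A minimal sketch, assuming the NAME column is the ninth field as in the output above; count_remote_hosts is a hypothetical helper name.

```shell
# count_remote_hosts
# Reads `lsof -Pani -p PID` output on stdin and prints "COUNT REMOTE_IP",
# busiest remote host first. Lines without "->" (for example the LISTEN
# socket and the header) are ignored.
count_remote_hosts() {
    awk '$9 ~ /->/ {
             split($9, ends, "->")         # ends[2] is remote_ip:port
             sub(/:[0-9]+$/, "", ends[2])  # drop the remote port
             n[ends[2]]++
         }
         END { for (h in n) print n[h], h }' | sort -rn
}

# Usage: lsof -Pani -p 4391 | count_remote_hosts
```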

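    The listener trace file shows the listener hanging once the event range reaches 1023, while lsof -Pani -p  4391|wc -l tops out at 1022 connections. This is consistent with the process running into a 1024 open-file limit, although the limit itself was not checked here.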
    [24-SEP-2023 14:22:43:560] nsevmute: entry
    [24-SEP-2023 14:22:43:560] nsevmute: cid=3
    [24-SEP-2023 14:22:43:560] nsevmute: normal exit
    [24-SEP-2023 14:22:43:560] nsevwait: 0 posted event(s)
    [24-SEP-2023 14:22:43:560] nsevwait: exit (0)
    [24-SEP-2023 14:22:43:560] nsevwait: entry
    [24-SEP-2023 14:22:43:560] nsevwait: 1022 registered connection(s)
    [24-SEP-2023 14:22:43:560] nsevwait: 0 pre-posted event(s)
    [24-SEP-2023 14:22:43:560] nsevwait: waiting for transport event (1 thru 1023)...
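
    To see how close the listener came to that ceiling over time, the peak "registered connection(s)" value can be pulled out of the trace file. A rough sketch, assuming the trace lines keep the format shown above; max_registered is a hypothetical helper name.

```shell
# max_registered
# Reads a listener trace on stdin and prints the highest value seen in
# "nsevwait: N registered connection(s)" lines (0 if there are none).
max_registered() {
    awk '/nsevwait: [0-9]+ registered connection/ {
             for (i = 1; i < NF; i++)
                 if ($(i + 1) == "registered" && $i + 0 > max)
                     max = $i + 0
         }
         END { print max + 0 }'
}

# Usage: max_registered < /path/to/listener.trc
# (take the real path from LSNRCTL> show trc_directory and show trc_file)
```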

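    ## Disable tracing  LSNRCTL>  set trc_level 0
    ## Enable tracing   LSNRCTL>  set trc_level 16
        off or 0: tracing is disabled for the current listener;
        support or 16: the level used for troubleshooting
    # Show the trace file name   LSNRCTL> show trc_file
    # Show the trace directory   LSNRCTL> show trc_directory
    # Show the trace level       LSNRCTL> show trc_level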

    Enable iptables at boot
    # chkconfig iptables on  &&  chkconfig --list|grep iptables
    Create the iptables script and run it
    # vi /opt/iptables.sh
    service iptables start
    # Flush the existing rules before rebuilding them
    iptables -F
    # Always allow loopback traffic
    iptables -A INPUT -i lo -j ACCEPT
    iptables -A OUTPUT -o lo -j ACCEPT
    iptables -A INPUT -s 127.0.0.1/32 -d 127.0.0.1/32 -j ACCEPT
    # Only these hosts may reach the listener on port 1521
    iptables -A INPUT -s 172.200.100.60/32 -p tcp -m tcp --dport 1521 -j ACCEPT
    iptables -A INPUT -s 172.200.100.94/32 -p tcp -m tcp --dport 1521 -j ACCEPT
    iptables -A INPUT -s 192.168.100.57/32 -p tcp -m tcp --dport 1521 -j ACCEPT
    # Only this host may use FTP (21) and SSH (22)
    iptables -A INPUT -s 172.200.100.42/32 -p tcp -m tcp --dport 21 -j ACCEPT
    iptables -A INPUT -s 172.200.100.42/32 -p tcp -m tcp --dport 22 -j ACCEPT
    iptables -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
    iptables -A INPUT -m state --state INVALID -j DROP
    iptables -A INPUT -p icmp -j ACCEPT
    iptables -A OUTPUT -p icmp -j ACCEPT
    iptables -A FORWARD -m state --state INVALID -j DROP
    iptables -A OUTPUT -m state --state INVALID -j DROP
    iptables -A INPUT -p tcp --dport 22 -j DROP
    # Reject everything that was not explicitly allowed above
    iptables -A INPUT -j REJECT --reject-with icmp-port-unreachable
    iptables -A FORWARD -j REJECT --reject-with icmp-port-unreachable
    service iptables save
    service iptables stop && service iptables start
    # Stop the firewall again after 10 minutes; this looks like a safety net
    # against locking yourself out while testing, so drop these two lines
    # once the rules are verified
    /bin/sleep 600
    service iptables stop

    # nohup sh /opt/iptables.sh &
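
    The three port 1521 rules in the script differ only in the source address, so the whitelist can also be generated from a list of IPs. The sketch below only prints the commands for review and does not touch the firewall; allow_rules is a hypothetical helper name.

```shell
# allow_rules PORT IP...
# Print (not execute) one ACCEPT rule per source IP, in the same form as the
# rules in /opt/iptables.sh. Review the output before running it as root.
allow_rules() {
    port=$1
    shift
    for ip in "$@"; do
        echo "iptables -A INPUT -s $ip/32 -p tcp -m tcp --dport $port -j ACCEPT"
    done
}

# Usage: allow_rules 1521 172.200.100.60 172.200.100.94 192.168.100.57
```

    Printing instead of executing keeps the helper safe to run as any user; the generated lines still have to be placed before the final REJECT rule to have any effect.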

  • Original article: https://blog.csdn.net/kevinyu998/article/details/133169180