• 记一次Netty堆外内存溢出OutOfDirectMemoryError


    简介

    最近项目上线,隔几天就出现一次OutOfDirectMemoryError,项目使用netty进行网络数据处理,笔者输出这篇文章记录分析与处理的过程,分享给所有读者,希望能帮助到遇到同样问题的人。(备注:本篇文章为笔者模拟当时的情况,还原问题处理过程。)

    错误信息

    10:14:02.857 [AlarmFile-7-2] WARN netty.channel.DefaultChannelPipeline[1152]An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
    io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 1073741824, max: 1073741824)
        at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:802) ~[netty-common-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:731) ~[netty-common-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:648) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:623) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:202) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.PoolArena.tcacheAllocateSmall(PoolArena.java:172) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:134) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:126) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:395) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        at io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:116) ~[netty-buffer-4.1.72.Final.jar:4.1.72.Final]
        ...
    

    日志显示OutOfDirectMemoryError,来自程序中netty分配堆外内存的时候。

    问题分析

    跟踪堆外内存

    先跟踪一下堆外内存使用情况,netty里提供了相应的api查询堆外内存使用情况,在程序中添加如下代码:

    LOG.info("directmemory use:{}, max:{}",
    PlatformDependent.usedDirectMemory(), PlatformDependent.maxDirectMemory());
    

    然后跟踪一下日志:

    10:17:48.178 [*******-4-4] INFO ****************[66]directmemory use:234881024, max:1073741824
    10:17:53.373 [*******-4-1] INFO ****************[66]directmemory use:234881024, max:1073741824
    10:17:57.467 [*******-4-4] INFO ****************[66]directmemory use:234881024, max:1073741824
    10:17:57.599 [*******-4-4] INFO ****************[66]directmemory use:234881024, max:1073741824
    10:17:57.617 [*******-4-2] INFO ****************[66]directmemory use:234881024, max:1073741824
    10:17:58.638 [*******-4-1] INFO ****************[66]directmemory use:234881024, max:1073741824
    10:17:58.748 [*******-4-4] INFO ****************[66]directmemory use:234881024, max:1073741824
    10:17:59.540 [*******-4-2] INFO ****************[66]directmemory use:234881024, max:1073741824
    ...
    10:50:51.873 [*******-4-2] INFO ****************[66]directmemory use:268435456, max:1073741824
    10:50:58.220 [*******-4-4] INFO ****************[66]directmemory use:268435456, max:1073741824
    10:50:58.973 [*******-4-5] INFO ****************[66]directmemory use:268435456, max:1073741824
    ...
    11:08:49.983 [*******-4-5] INFO ****************[66]directmemory use:301989888, max:1073741824
    11:08:52.401 [*******-4-1] INFO ****************[66]directmemory use:301989888, max:1073741824
    11:08:53.459 [*******-4-2] INFO ****************[66]directmemory use:301989888, max:1073741824
    

    观察到的现象是:堆外内存一直增长不降,没有回收,这种情况大概率是代码中出现了内存泄露。

    Dump分析

    先用arthas抓几个dump看看情况

    java -jar arthas-boot.jar
    ...
    heapdump /tmp/*****-08291045-dump.hprof
    

    在这里插入图片描述
    在这里插入图片描述
    抓取了2个时间点的dump文件,发现堆外内存中netty的PooledUnsafeDirectByteBuf数量增长很多。
    到这里只能知道是代码中ByteBuf引起的,但具体是哪里的ByteBuf引起的呢?

    定位泄露堆栈

    netty的官方文档,有两种方式可以 打印出泄漏的详细信息
    第一种是在netty定义时添加如下代码:

    ResourceLeakDetector.setLevel(ResourceLeakDetector.Level.ADVANCED);
    

    第二种是在jvm中添加如下参数:

    -Dio.netty.leakDetectionLevel=advanced
    

    这里笔者选用第二种方式,添加参数后重新启动项目,观察错误日志:

    16:48:33.249 [808-TCP-4-1] ERROR netty.util.ResourceLeakDetector[319]LEAK: ByteBuf.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
    Recent access records: 
    #1:
    	io.netty.buffer.AdvancedLeakAwareByteBuf.readUnsignedInt(AdvancedLeakAwareByteBuf.java:443)
    	com.*****.***********.******.protostar.schema.NumberPSchema$DWORD2Long.readAndSet(NumberPSchema.java:177)
    	com.*****.***********.******.protostar.schema.RuntimeSchema.mergeFrom(RuntimeSchema.java:63)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decryptMessage(JTMessageDecoder.java:160)
    	...
    #2:
    	io.netty.buffer.AdvancedLeakAwareByteBuf.readBytes(AdvancedLeakAwareByteBuf.java:497)
    	com.*****.***********.******.protostar.schema.StringSchema$STR.readFrom(StringSchema.java:58)
    	com.*****.***********.******.protostar.schema.StringSchema$STR.readFrom(StringSchema.java:38)
    	com.*****.***********.******.protostar.Schema.readFrom(Schema.java:17)
    	...
    #3:
    	io.netty.buffer.AdvancedLeakAwareByteBuf.getUnsignedByte(AdvancedLeakAwareByteBuf.java:161)
    	io.netty.buffer.ByteBufUtil$HexUtil.hexDump(ByteBufUtil.java:1496)
    	io.netty.buffer.ByteBufUtil$HexUtil.access$000(ByteBufUtil.java:1420)
    	io.netty.buffer.ByteBufUtil.hexDump(ByteBufUtil.java:143)
    	io.netty.buffer.ByteBufUtil.hexDump(ByteBufUtil.java:135)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decrypt(JTMessageDecoder.java:238)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decryptMessage(JTMessageDecoder.java:145)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decode(JTMessageDecoder.java:128)
    	...
    #4:
    	io.netty.buffer.AdvancedLeakAwareByteBuf.writeBytes(AdvancedLeakAwareByteBuf.java:611)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decrypt(JTMessageDecoder.java:237)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decryptMessage(JTMessageDecoder.java:145)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decode(JTMessageDecoder.java:128)
    	...
    Created at:
    	io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:402)
    	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
    	io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
    	io.netty.buffer.AbstractByteBufAllocator.buffer(AbstractByteBufAllocator.java:116)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decrypt(JTMessageDecoder.java:229)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decryptMessage(JTMessageDecoder.java:145)
    	com.*****.***********.******.protocol.codec.JTMessageDecoder.decode(JTMessageDecoder.java:128)
    	...
    

    打印的错误堆栈里就可以看到具体泄露的地方了。
    找到这些泄露的地方,我们做release释放就可以解决问题了。

    解决问题

    对泄露的bytebuf确保释放即可:

    private void decryptMessage(...) {
        // ...
        try {
            // ...
        } catch (Exception e) {
            throw e;
        } finally {
        	// 释放 release 或者 ReferenceCountUtil.safeRelease(buf);   
            buf.release(); 
        }
    }
    

    更新程序后上线,观察运行情况,堆外内存没有再出现异常情况。

  • 相关阅读:
    手把手教你前后分离架构(五) SpringBoot连接数据库
    C. Keshi Is Throwing a Party- Codeforces Global Round 17
    好用的电容笔有哪些?2022年电容笔十大品牌排行榜
    炎炎夏日,清凉上线:博客园T恤丝光棉款上架预售
    互联网大厂面试必问的40个SpringBoot面试题【建议收藏】
    贪心算法的高逼格应用——Huffman编码
    电子招标采购商城系统:优化传统采购业务,提速企业数字化升级
    让Unity打包AssetBundle更轻松
    MYSQL(基本篇)——一篇文章带你走进MYSQL的奇妙世界
    Nginx查找耗时的接口
  • 原文地址:https://blog.csdn.net/dzh284616172/article/details/127037798