• BE nodes keep going down: [IO_ERROR]failed to list /proc/27349/fd/: No such file or directory


    Recently the BE nodes have been going down frequently. The FE log shows:

    Caused by: java.lang.RuntimeException: Failed to execute internal SQL. org.apache.doris.common.UserException: errCode = 2, detailMessage = There is no scanNode Backend available.[10031: not alive] OriginStatement{originStmt='SELECT * FROM __internal_schema.column_statistics WHERE tbl_id=27273 AND idx_id=-1 AND col_id='CREATE_AID'', idx=0}
            at org.apache.doris.qe.StmtExecutor.executeInternalQuery(StmtExecutor.java:2509)
            at org.apache.doris.statistics.util.StatisticsUtil.execStatisticQuery(StatisticsUtil.java:131)
            at org.apache.doris.statistics.StatisticsRepository.loadColStats(StatisticsRepository.java:439)
            at org.apache.doris.statistics.ColumnStatisticsCacheLoader.loadFromStatsTable(ColumnStatisticsCacheLoader.java:56)
            at org.apache.doris.statistics.ColumnStatisticsCacheLoader.doLoad(ColumnStatisticsCacheLoader.java:38)
            at org.apache.doris.statistics.ColumnStatisticsCacheLoader.doLoad(ColumnStatisticsCacheLoader.java:31)
            at org.apache.doris.statistics.StatisticsCacheLoader.lambda$asyncLoad$0(StatisticsCacheLoader.java:48)
            at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
            ... 3 more
    Caused by: org.apache.doris.common.UserException: errCode = 2, detailMessage = There is no scanNode Backend available.[10031: not alive]
            at org.apache.doris.qe.SimpleScheduler.getHost(SimpleScheduler.java:147)
            at org.apache.doris.qe.Coordinator.computeFragmentHosts(Coordinator.java:1806)
            at org.apache.doris.qe.Coordinator.computeFragmentExecParams(Coordinator.java:1267)
            at org.apache.doris.qe.Coordinator.exec(Coordinator.java:573)
            at org.apache.doris.qe.StmtExecutor.executeInternalQuery(StmtExecutor.java:2505)
            ... 10 more
    

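    be.out contains no useful log output. Checking be.WARNING turned up the error below. I do not yet know how to fix it, so I am recording the problem here for now.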

    [IO_ERROR]failed to list /proc/27349/fd/: (2), No such file or directory

    W1121 09:36:26.929662 27477 doris_metrics.cpp:379] failed to count fd: [IO_ERROR]failed to list /proc/27349/fd/: (2), No such file or directory
    0. /root/src/doris-2.0/be/src/common/stack_trace.cpp:302: StackTrace::tryCapture() @ 0x000000000b9e64c7 in /xxsys/doris-2.0.2/be/lib/doris_be
    1. /root/src/doris-2.0/be/src/common/stack_trace.h:0: doris::get_stack_trace[abi:cxx11]() @ 0x000000000b9e4ae5 in /xxsys/doris-2.0.2/be/lib/doris_be
    2. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:173: doris::Status doris::Status::Error, std::allocator > const&, std::__cxx11::basic_string, std::allocator > >(int, std::basic_string_view >, std::__cxx11::basic_string, std::allocator > const&, std::__cxx11::basic_string, std::allocator >&&) @ 0x000000000aecc168 in /xxsys/doris-2.0.2/be/lib/doris_be
    3. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187: doris::io::LocalFileSystem::list_impl(std::filesystem::__cxx11::path const&, bool, std::vector >*, bool*) @ 0x000000000aec6eac in /xxsys/doris-2.0.2/be/lib/doris_be
    4. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:360: doris::io::LocalFileSystem::iterate_directory_impl(std::__cxx11::basic_string, std::allocator > const&, std::function const&) @ 0x000000000aec7fcf in /xxsys/doris-2.0.2/be/lib/doris_be
    5. /root/src/doris-2.0/be/src/common/status.h:348: doris::io::LocalFileSystem::iterate_directory(std::__cxx11::basic_string, std::allocator > const&, std::function const&) @ 0x000000000aec7e4d in /xxsys/doris-2.0.2/be/lib/doris_be
    6. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:244: doris::DorisMetrics::_update_process_fd_num() @ 0x000000000b97a65a in /xxsys/doris-2.0.2/be/lib/doris_be
    7. /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_tree.h:368: doris::MetricRegistry::trigger_all_hooks(bool) const @ 0x000000000b9ba69f in /xxsys/doris-2.0.2/be/lib/doris_be
    8. /root/src/doris-2.0/be/src/util/time.h:50: doris::Daemon::calculate_metrics_thread() @ 0x000000000ae9cc0c in /xxsys/doris-2.0.2/be/lib/doris_be
    9. /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562: doris::Thread::supervise_thread(void*) @ 0x000000000ba1819a in /xxsys/doris-2.0.2/be/lib/doris_be
    10. start_thread @ 0x00007f2f98172aa1 in ?
    11. __clone @ 0x00007f2f988f8c4d in ?
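
The trace above shows the metrics thread (`DorisMetrics::_update_process_fd_num`) failing while iterating `/proc/27349/fd/` with error code 2 (ENOENT), meaning the directory was not there at the moment it was listed. The following is only a minimal diagnostic sketch in Python, not Doris code: it repeats the same kind of check by hand for a given PID. The function name `count_open_fds` and the `proc_root` parameter are made up for illustration.

```python
import os
from typing import Optional


def count_open_fds(pid: int, proc_root: str = "/proc") -> Optional[int]:
    """Count entries in <proc_root>/<pid>/fd.

    Returns None when the directory cannot be found, which is the same
    condition (ENOENT) reported in the be.WARNING message.
    Hypothetical helper for manual diagnosis only.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    try:
        return len(os.listdir(fd_dir))
    except (FileNotFoundError, ProcessLookupError):
        # The process (or its fd directory) is not visible right now.
        return None


if __name__ == "__main__":
    # Check the current Python process as a harmless demonstration.
    pid = os.getpid()
    n = count_open_fds(pid)
    if n is None:
        print(f"/proc/{pid}/fd is not available")
    else:
        print(f"pid {pid} has {n} open file descriptors")
```

Calling `count_open_fds(27349)` on the affected host would return `None` if that PID no longer exists, which is one way to confirm whether the BE process had already exited when the warning was written. A `PermissionError` is deliberately not caught, so running the check as the wrong user fails loudly instead of looking like a missing process.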
    
    
  • Original article: https://blog.csdn.net/chengyuqiang/article/details/134525315