• Hadoop cluster: the master node's FsImage was not updating automatically because a failed image upload left the active NN's FsImage stale


    2023-10-08 17:44:06,189 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Going to retain 2 images with txid >= 174892162
    2023-10-08 17:44:06,189 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging old image FSImageFile(file=/data2/usdp/hadoop/hdfs/nn/current/fsimage_0000000000174888649, cpktTxId=0000000000174888649)
    2023-10-08 17:44:06,280 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging old image FSImageFile(file=/data3/usdp/hadoop/hdfs/nn/current/fsimage_0000000000174888649, cpktTxId=0000000000174888649)
    2023-10-08 17:44:06,619 ERROR org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Exception in doCheckpoint
    java.io.IOException: Exception during image upload
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:315)
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1300(StandbyCheckpointer.java:64)
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.doWork(StandbyCheckpointer.java:480)
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.access$600(StandbyCheckpointer.java:383)
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread$1.run(StandbyCheckpointer.java:403)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:360)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1855)
            at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:501)
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$CheckpointerThread.run(StandbyCheckpointer.java:399)
    Caused by: java.util.concurrent.ExecutionException: java.net.ConnectException: Error while authenticating with endpoint: https://ucd-prod-vdp-usdp-102.viatris.cc:9871/imagetransfer?imageFile=IMAGE&txid=174912350&storageInfo=-66%3A1517630335%3A1679815146170%3ACID-f1478d9f-9969-468d-93aa-02079de30057&File-Length=428077109
            at java.util.concurrent.FutureTask.report(FutureTask.java:122)
            at java.util.concurrent.FutureTask.get(FutureTask.java:192)
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:295)
            ... 9 more
    Caused by: java.net.ConnectException: Error while authenticating with endpoint: https://ucd-prod-vdp-usdp-102.viatris.cc:9871/imagetransfer?imageFile=IMAGE&txid=174912350&storageInfo=-66%3A1517630335%3A1679815146170%3ACID-f1478d9f-9969-468d-93aa-02079de30057&File-Length=428077109
            at sun.reflect.GeneratedConstructorAccessor83.newInstance(Unknown Source)
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
            at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
            at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:232)
            at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:216)
            at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:350)
            at org.apache.hadoop.hdfs.web.URLConnectionFactory.openConnection(URLConnectionFactory.java:186)
            at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImage(TransferFsImage.java:290)
            at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImageFromStorage(TransferFsImage.java:249)
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:277)
            at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:272)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:589)
            at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:666)
            at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
            at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
            at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
            at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264)
            at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367)
            at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191)
            at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1156)
            at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
            at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177)
            at sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:162)
            at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:189)
            ... 10 more
    
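The innermost cause in the trace above is `Connection refused` on the image-transfer URL, so a quick way to reproduce the symptom is to try a plain TCP connect to the same host and port from the standby NameNode host. The following is a minimal sketch, not part of the original post: the helper names `endpoint_from_url` and `can_connect` are hypothetical, and only the URL in the usage note is taken from the log.

```python
import socket
from urllib.parse import urlsplit


def endpoint_from_url(url):
    """Return (host, port) for an http(s) URL.

    Falls back to the scheme's default port when the URL carries none.
    """
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return parts.hostname, parts.port or default_port


def can_connect(host, port, timeout=3.0):
    """Attempt a plain TCP connect (no TLS, no Kerberos).

    Returns False on refusal, timeout, or name-resolution failure,
    all of which surface as OSError subclasses.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the URL in the log, `endpoint_from_url("https://ucd-prod-vdp-usdp-102.viatris.cc:9871/imagetransfer?imageFile=IMAGE")` yields the host and port `9871`; if `can_connect(host, port)` then returns `False`, nothing is accepting connections on that address, which would be consistent with the exception. The check says nothing about authentication, it only tests whether the port is open.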

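    Looking at the problem:
    The master node's FsImage was not being updated automatically. I checked everything I could think of: the configuration looked fine, and the FsImage on the slave (standby) node was being refreshed periodically. Only after spotting the exception above in the log did I realize that a failed upload was the reason the active NN's FsImage never updated.
    The cause: a port number was missing.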
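The post does not say which setting lacked the port, so the following is only a sketch of how one might look for such a gap: it scans the text of an `hdfs-site.xml` for properties whose name contains `-address` and whose value has no `:port` suffix. The function name `addresses_missing_port`, the name-matching rule, and the sample nameservice, NameNode IDs, and hostnames are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def addresses_missing_port(hdfs_site_xml):
    """Map property name -> value for address properties without ':<port>'.

    Assumption: address-type properties can be recognised by '-address'
    in their name, e.g. dfs.namenode.https-address.<nameservice>.<nn-id>.
    IPv6 literals are not handled.
    """
    root = ET.fromstring(hdfs_site_xml)
    missing = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if "-address" not in name or not value:
            continue
        _host, sep, port = value.rpartition(":")
        # No colon at all, or nothing numeric after it: treat as missing.
        if not sep or not port.isdigit():
            missing[name] = value
    return missing


# Hypothetical sample configuration, not taken from the original cluster.
SAMPLE = """<configuration>
  <property>
    <name>dfs.namenode.https-address.mycluster.nn1</name>
    <value>nn1.example.com</value>
  </property>
  <property>
    <name>dfs.namenode.https-address.mycluster.nn2</name>
    <value>nn2.example.com:9871</value>
  </property>
</configuration>"""

print(addresses_missing_port(SAMPLE))
```

With the sample above, only the `nn1` entry is reported, because its value is a bare hostname. On a real cluster the text would come from the actual `hdfs-site.xml`, and any property flagged this way still needs to be checked by hand.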

  • Original article: https://blog.csdn.net/qq_43688472/article/details/133687602