线上突然抛出下面的异常
2019-09-26 15:35:22,325 ERROR [DubboClientHandler-ip2:15196-thread-2] c.a.d.r.e.s.h.HeaderExchangeHandler:: {} [DUBBO] Dubbo client can not supported string message: in channel: NettyChannel [channel=[id: 0x5210fa0a, /ip1:43814 => /ip2:15196]], url: dubbo://ip2:15196/com.....alge.eta.service.EtaService?accesslog=false&anyhost=true&application=system-dispatch&check=false&codec=dubbo&default.executes=100&default.loadbalance=roundrobin&dubbo=2.5.3&heartbeat=60000&interface=com.....alge.eta.service.EtaService&logger=slf4j&methods=getModelResult,getXingTangModelResult&pid=1&retries=0&side=consumer&timeout=400×tamp=1568871901241, dubbo version: 2.5.3, current host: ip1
java.lang.Exception: Dubbo client can not supported string message: in channel: NettyChannel [channel=[id: 0x5210fa0a, /ip1:43814 => /ip2:15196]], url: dubbo://ip2:15196/com.....alge.eta.service.EtaService?accesslog=false&anyhost=true&application=system-dispatch&check=false&codec=dubbo&default.executes=100&default.loadbalance=roundrobin&dubbo=2.5.3&heartbeat=60000&interface=com.....alge.eta.service.EtaService&logger=slf4j&methods=getModelResult,getXingTangModelResult&pid=1&retries=0&side=consumer&timeout=400×tamp=1568871901241
at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.received(HeaderExchangeHandler.java:180) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.remoting.transport.DecodeHandler.received(DecodeHandler.java:52) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.remoting.transport.dispatcher.ChannelEventRunnable.run(ChannelEventRunnable.java:82) ~[dubbo-2.5.3.jar!/:2.5.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
这个异常有些少,于是我们查看应用日志期望可以发现更多信息,根据什么线索呢?线上的trace_id没有打印,那么只能根据线程日志查看,也就是:DubboClientHandler-ip2-thread-2。于是发现下面的异常日志
2019-09-26 15:35:22,310 WARN [DubboClientHandler-ip2:15196-thread-2] c.a.d.r.e.s.DefaultFuture:: {} [DUBBO] The timeout response finally returned at 2019-09-26 15:35:22.310, response Response [id=1254844, version=null, status=50, event=false, error=Failed to send response: Response [id=1254844, version=2.0.0, status=20, event=false, error=null, result=RpcResult [result=[com.....alge.eta.dto.ResultRiderDTO@17bdd760, com.....alge.eta.dto.ResultRiderDTO@198283d3, com.....alge.eta.dto.ResultRiderDTO@323b5e16, com.....alge.eta.dto.ResultRiderDTO@66d2204d, com.....alge.eta.dto.ResultRiderDTO@43fc4bb6, com.....alge.eta.dto.ResultRiderDTO@3b1ffa8c, com.....alge.eta.dto.ResultRiderDTO@241b5a75, com.....alge.eta.dto.ResultRiderDTO@59b14bb], exception=null]], cause: java.util.ConcurrentModificationException
java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
at java.util.ArrayList$Itr.next(ArrayList.java:859)
at com.alibaba.com.caucho.hessian.io.CollectionSerializer.writeObject(CollectionSerializer.java:100)
at com.alibaba.com.caucho.hessian.io.Hessian2Output.writeObject(Hessian2Output.java:408)
at com.alibaba.com.caucho.hessian.io.JavaSerializer.writeObject(JavaSerializer.java:203)
at com.alibaba.com.caucho.hessian.io.Hessian2Output.writeObject(Hessian2Output.java:408)
at com.alibaba.dubbo.common.serialize.support.hessian.Hessian2ObjectOutput.writeObject(Hessian2ObjectOutput.java:92)
at com.alibaba.dubbo.rpc.protocol.dubbo.DubboCodec.encodeResponseData(DubboCodec.java:200)
at com.alibaba.dubbo.remoting.exchange.codec.ExchangeCodec.encodeResponse(ExchangeCodec.java:276)
at com.alibaba.dubbo.remoting.exchange.codec.ExchangeCodec.encode(ExchangeCodec.java:77)
at com.alibaba.dubbo.rpc.protocol.dubbo.DubboCountCodec.encode(DubboCountCodec.java:39)
at com.alibaba.dubbo.remoting.transport.netty.NettyCodecAdapter$InternalEncoder.encode(NettyCodecAdapter.java:81)
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:66)
at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendDownstream(DefaultChannelPipeline.java:776)
at org.jboss.netty.channel.SimpleChannelHandler.writeRequested(SimpleChannelHandler.java:304)
at com.alibaba.dubbo.remoting.transport.netty.NettyHandler.writeRequested(NettyHandler.java:99)
at org.jboss.netty.channel.SimpleChannelHandler.handleDownstream(SimpleChannelHandler.java:266)
at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:591)
at org.jboss.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:582)
at org.jboss.netty.channel.Channels.write(Channels.java:611)
at org.jboss.netty.channel.Channels.write(Channels.java:578)
at org.jboss.netty.channel.AbstractChannel.write(AbstractChannel.java:251)
at com.alibaba.dubbo.remoting.transport.netty.NettyChannel.send(NettyChannel.java:98)
at com.alibaba.dubbo.remoting.transport.AbstractPeer.send(AbstractPeer.java:51)
at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.received(HeaderExchangeHandler.java:171)
at com.alibaba.dubbo.remoting.transport.DecodeHandler.received(DecodeHandler.java:52)
at com.alibaba.dubbo.remoting.transport.dispatcher.ChannelEventRunnable.run(ChannelEventRunnable.java:82)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
, result=null], channel: /ip1:43814 -> /ip2:15196, dubbo version: 2.5.3, current host: ip1
问题已经清楚,将异常信息反馈给接口提供方。根据响应response id(1254844)查找与之匹配的请求request id(1254844),查找到对应的业务日志定位到出现问题的trace_id(订单id),根据订单id定位到业务影响点,确认影响范围
2019-09-26 15:35:21,645 ERROR [immediately-dispatch-44] c.d.d.f.r.r.i.OrderSendTimeOutFilter:getModelResult:287 {orderId=483047323593828352} 调用送达时间预测模型异常,etaService#getModelResult
com.alibaba.dubbo.rpc.RpcException: Failed to invoke the method getXingTangModelResult in the service com.....alge.eta.service.EtaService. Tried 1 times of the providers [ip2:15196] (1/2) from the registry 172.24.0.6:2186 on the consumer ip1 using the dubbo version 2.5.3. Last error is: Invoke remote method timeout. method: getXingTangModelResult, provider: dubbo://ip2:15196/com.....alge.eta.service.EtaService?accesslog=false&anyhost=true&application=system-dispatch&check=false&default.executes=100&default.loadbalance=roundrobin&dubbo=2.5.3&interface=com.....alge.eta.service.EtaService&logger=slf4j&methods=getModelResult,getXingTangModelResult&pid=1&retries=0&side=consumer&timeout=400×tamp=1568871901241, cause: Waiting server-side response timeout. start time: 2019-09-26 15:35:21.244, end time: 2019-09-26 15:35:21.645, client elapsed: 0 ms, server elapsed: 401 ms, timeout: 400 ms, request: Request [id=1254844, version=2.0.0, twoway=true, event=false, broken=false, data=RpcInvocation [methodName=getXingTangModelResult, parameterTypes=[interface java.util.List], arguments=[[com.....alge.eta.dto.EtaRiderDTO@7de6164, com.....alge.eta.dto.EtaRiderDTO@44ac0621, com.....alge.eta.dto.EtaRiderDTO@60af7dfc, com.....alge.eta.dto.EtaRiderDTO@3940bef0, com.....alge.eta.dto.EtaRiderDTO@2bfb5818, com.....alge.eta.dto.EtaRiderDTO@380cd692, com.....alge.eta.dto.EtaRiderDTO@2ce11da9, com.....alge.eta.dto.EtaRiderDTO@37614eb1, com.....alge.eta.dto.EtaRiderDTO@7b618db0, com.....alge.eta.dto.EtaRiderDTO@2116bec, com.....alge.eta.dto.EtaRiderDTO@27733f2, com.....alge.eta.dto.EtaRiderDTO@1ed63260, com.....alge.eta.dto.EtaRiderDTO@593b5139, com.....alge.eta.dto.EtaRiderDTO@47a389e5, com.....alge.eta.dto.EtaRiderDTO@81e1, com.....alge.eta.dto.EtaRiderDTO@7dfd635c, com.....alge.eta.dto.EtaRiderDTO@669f1bed, com.....alge.eta.dto.EtaRiderDTO@7e872fed]], attachments={path=com.....alge.eta.service.EtaService, interface=com.....alge.eta.service.EtaService, version=0.0.0, timeout=400}]], channel: /ip1:43814 -> /ip2:15196
at com.alibaba.dubbo.rpc.cluster.support.FailoverClusterInvoker.doInvoke(FailoverClusterInvoker.java:101) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:227) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:72) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.cluster.support.AvailableCluster$1.doInvoke(AvailableCluster.java:43) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:227) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:72) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.proxy.InvokerInvocationHandler.invoke(InvokerInvocationHandler.java:52) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.common.bytecode.proxy18.getXingTangModelResult(proxy18.java) ~[?:2.5.3]
at com.....dispatch.filter.rules.riderfilter.impl.OrderSendTimeOutFilter.getModelResult(OrderSendTimeOutFilter.java:281) ~[dispatch-filter-rules-1.72.0.jar!/:1.72.0]
at com.....dispatch.filter.rules.riderfilter.impl.OrderSendTimeOutFilter.beforeFilter(OrderSendTimeOutFilter.java:101) ~[dispatch-filter-rules-1.72.0.jar!/:1.72.0]
at com.....dispatch.filter.rules.riderfilter.AbstractRiderFilter.filter(AbstractRiderFilter.java:34) ~[dispatch-filter-rules-1.72.0.jar!/:1.72.0]
at com.....dispatch.struct.dispatcher.AbstractSingleDispatcher.filterInner(AbstractSingleDispatcher.java:93) ~[dispatch-struct-1.0-SNAPSHOT.jar!/:1.0-SNAPSHOT]
at com.....dispatch.struct.dispatcher.AbstractSingleDispatcher.filterRiders(AbstractSingleDispatcher.java:87) ~[dispatch-struct-1.0-SNAPSHOT.jar!/:1.0-SNAPSHOT]
at com.....dispatch.dispatch.SystemSingleDispatcher.filterRiders(SystemSingleDispatcher.java:382) ~[system-dispatch-impl-0.0.1-SNAPSHOT.jar!/:0.0.1-SNAPSHOT]
at com.....dispatch.struct.dispatcher.ConfigurableAbstractSingleAsyncDispatcher.dispatch(ConfigurableAbstractSingleAsyncDispatcher.java:33) ~[dispatch-struct-1.0-SNAPSHOT.jar!/:1.0-SNAPSHOT]
at com.....dispatch.provider.SystemDispatchProviderImpl.startDispatch(SystemDispatchProviderImpl.java:112) ~[system-dispatch-impl-0.0.1-SNAPSHOT.jar!/:0.0.1-SNAPSHOT]
at com.....dispatch.provider.SystemDispatchProviderImpl.lambda$null$0(SystemDispatchProviderImpl.java:65) ~[system-dispatch-impl-0.0.1-SNAPSHOT.jar!/:0.0.1-SNAPSHOT]
at com.....wireless.threadpool.MonitoringTask.run(MonitoringTask.java:50) ~[wireless-threadpool-0.0.1-SNAPSHOT.jar!/:0.0.1-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
Caused by: com.alibaba.dubbo.remoting.TimeoutException: Waiting server-side response timeout. start time: 2019-09-26 15:35:21.244, end time: 2019-09-26 15:35:21.645, client elapsed: 0 ms, server elapsed: 401 ms, timeout: 400 ms, request: Request [id=1254844, version=2.0.0, twoway=true, event=false, broken=false, data=RpcInvocation [methodName=getXingTangModelResult, parameterTypes=[interface java.util.List], arguments=[[com.....alge.eta.dto.EtaRiderDTO@7de6164, com.....alge.eta.dto.EtaRiderDTO@44ac0621, com.....alge.eta.dto.EtaRiderDTO@60af7dfc, com.....alge.eta.dto.EtaRiderDTO@3940bef0, com.....alge.eta.dto.EtaRiderDTO@2bfb5818, com.....alge.eta.dto.EtaRiderDTO@380cd692, com.....alge.eta.dto.EtaRiderDTO@2ce11da9, com.....alge.eta.dto.EtaRiderDTO@37614eb1, com.....alge.eta.dto.EtaRiderDTO@7b618db0, com.....alge.eta.dto.EtaRiderDTO@2116bec, com.....alge.eta.dto.EtaRiderDTO@27733f2, com.....alge.eta.dto.EtaRiderDTO@1ed63260, com.....alge.eta.dto.EtaRiderDTO@593b5139, com.....alge.eta.dto.EtaRiderDTO@47a389e5, com.....alge.eta.dto.EtaRiderDTO@81e1, com.....alge.eta.dto.EtaRiderDTO@7dfd635c, com.....alge.eta.dto.EtaRiderDTO@669f1bed, com.....alge.eta.dto.EtaRiderDTO@7e872fed]], attachments={path=com.....alge.eta.service.EtaService, interface=com.....alge.eta.service.EtaService, version=0.0.0, timeout=400}]], channel: /ip1:43814 -> /ip2:15196
at com.alibaba.dubbo.remoting.exchange.support.DefaultFuture.get(DefaultFuture.java:107) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.remoting.exchange.support.DefaultFuture.get(DefaultFuture.java:84) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.protocol.dubbo.DubboInvoker.doInvoke(DubboInvoker.java:96) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.protocol.AbstractInvoker.invoke(AbstractInvoker.java:144) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.listener.ListenerInvokerWrapper.invoke(ListenerInvokerWrapper.java:74) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.monitor.support.MonitorFilter.invoke(MonitorFilter.java:75) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:91) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.....wireless.dubbo.filter.InvokeResultLogFilter.invoke(InvokeResultLogFilter.java:36) ~[dubbo-integration-0.0.1-SNAPSHOT.jar!/:0.0.1-SNAPSHOT]
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:91) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.protocol.dubbo.filter.FutureFilter.invoke(FutureFilter.java:53) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:91) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.filter.ConsumerContextFilter.invoke(ConsumerContextFilter.java:48) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.protocol.ProtocolFilterWrapper$1.invoke(ProtocolFilterWrapper.java:91) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.protocol.InvokerWrapper.invoke(InvokerWrapper.java:53) ~[dubbo-2.5.3.jar!/:2.5.3]
at com.alibaba.dubbo.rpc.cluster.support.FailoverClusterInvoker.doInvoke(FailoverClusterInvoker.java:77) ~[dubbo-2.5.3.jar!/:2.5.3]
... 20 more
为什么会出现发送字符串的异常?在dubbo官网查询到该问题的分析讨论以及解决方法:https://github.com/apache/dubbo/issues/2006
该bug已经在2.5.6中修复,修复方法为:在响应出现异常时复位Buffer。ExchangeCodec.encodeResponse方法中如果出现异常需要复位Buffer。
查看服务方使用的dubbo是2.5.3版本。为了验证结论使用该版本的dubbo成功复现了问题
复现步骤:
复现代码:
// 提供者实现
@Component
public class ITestProviderImpl implements ITestProvider {
@Override
public List test() {
List data = Lists.newLinkedList();
for (int i=0;i<1;i++){
ITestResultDTO result = new ITestResultDTO();
data.add(result);
try {
TimeUnit.MILLISECONDS.sleep(10);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
new Thread(() -> {
data.forEach(iTestResultDTO -> {
List subData = Lists.newLinkedList();
iTestResultDTO.setData(subData);
int j = 0;
while(true){
if (j%2==0) {
ITestResultSubDTO result = new ITestResultSubDTO();
subData.add(result);
} else {
subData.remove(0);
}
}
});
}).start();
return data;
}
}
// 消费者,一定确认消费者不要走injvm协议,一定走remote远程协议,不然无法复现
public class ITestProviderTest extends UnitTestBase {
@Resource
private ITestProvider myTestProviderImpl;
@Test
public void test() throws InterruptedException {
List data = null;
try {
data = myTestProviderImpl.test();
} catch (Exception e) {
e.printStackTrace();
}
System.out.println(data);
// 重中之重,没有这个休眠是不能复现问题的
Thread.sleep(1000 * 60);
}
}