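Jar conflicts are among the problems Java engineers run into most often.
I am a little embarrassed to admit that, despite many years of coding experience, I had always muddled through and never analyzed the problem carefully.
After a recent iteration, every feature passed verification in the development and test environments, yet errors appeared as soon as we went live. The key log excerpt is below: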
- Caused by: java.lang.NoSuchMethodError: reactor.core.publisher.Mono.contextWrite(Lreactor/util/context/ContextView;)Lreactor/core/publisher/Mono;
- at com.shaded.azure.core.http.rest.RestProxy.handleRestReturnType(RestProxy.java:548)
- at com.shaded.azure.core.http.rest.RestProxy.invoke(RestProxy.java:148)
- at com.sun.proxy.$Proxy182.upload(Unknown Source)
- at com.shaded.azure.storage.blob.implementation.BlockBlobsImpl.uploadWithResponseAsync(BlockBlobsImpl.java:395)
- ... 77 more
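Investigation showed that the error came from the storage feature added in this iteration. The project depends on an SDK that hides the differences between underlying storage backends (Ceph, Azure Blob, AWS S3, and so on), but the test and production environments use different backends, and testing never covered the backend used in production.
This is the classic way a jar conflict shows up in production: 1. the Java project compiles without errors; 2. it runs fine in environment A; 3. it fails after switching to environment B (NoSuchMethodError and ClassNotFoundException are the most common symptoms).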
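When a symptom like this appears, a useful first step is to ask the running JVM which jar a class actually came from and whether more than one copy is visible. The following is a minimal sketch using only JDK APIs; the class name ClassLocator and its two helper methods are hypothetical names chosen for illustration, and the default class name is simply the one from the stack trace above.

```java
import java.io.IOException;
import java.net.URL;
import java.security.CodeSource;
import java.util.Collections;
import java.util.List;

class ClassLocator {

    /** Lists every location visible to the class loader that contains a .class file for the name. */
    static List<URL> findAllCopies(String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = ClassLocator.class.getClassLoader();
        return Collections.list(loader.getResources(resource));
    }

    /** Returns the location the class in use was loaded from, or null for JDK bootstrap classes. */
    static URL loadedFrom(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    public static void main(String[] args) throws Exception {
        // Default to the class named in the NoSuchMethodError; pass another name as args[0].
        String name = args.length > 0 ? args[0] : "reactor.core.publisher.Mono";
        List<URL> copies = findAllCopies(name);
        System.out.println(copies.size() + " copies of " + name + " found:");
        for (URL url : copies) {
            System.out.println("  " + url);
        }
        try {
            System.out.println("in use: " + loadedFrom(Class.forName(name)));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

Running this inside the failing environment (for example from a debug endpoint or a one-off main) shows whether the class comes from the jar you expected. More than one copy usually points at a fat or shaded jar that bundles the same class.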
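I started by going through our company's current build and release process for Java projects, which is automated by an in-house tool. The process has two stages, build and deploy. The key output of the build stage is excerpted below: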
- ------------------------- [ Check Out Code ] -------------------------
- check out code 1660554686391
- /xxxxxx/281534
- Cloning into 'xxxxxxx'...
- Note: checking out 'f649b4102ff5266f5d213c9e6562831fb237adcd'.
-
- You are in 'detached HEAD' state. You can look around, make experimental
- changes and commit them, and you can discard any commits you make in this
- state without impacting any branches by performing another checkout.
-
- If you want to create a new branch to retain commits you create, you may
- do so (now or later) by using -b with the checkout command again. Example:
-
- git checkout -b
-
- HEAD is now at f649b410 refine es code
- ------------------------- [ Finish Check Out Code ] -------------------------
-
-
- ------------------------- [ Build Package ] -------------------------
- build package 1660554688525
-
- ------------------------- [ Excute mvn clean package ..... ] -------------------------
- mvn clean package -U -B -DskipTests
- [INFO] Scanning for projects...
- [INFO] Downloading from nexus: http://nexus.xxx.com/content/groups/public/com/envisioniot/parent-pom/1.0.0-SNAPSHOT/maven-metadata.xml
- [INFO] Downloaded from nexus: http://nexus.xxx.com/content/groups/public/com/envisioniot/parent-pom/1.0.0-SNAPSHOT/maven-metadata.xml (604 B at 3.2 kB/s)
-
- .......
-
- ------------------------- [ Finish Build Package ] -------------------------
- finish build package 1660554791437
-
-
- ------------------------- [ Build Docker Image ] -------------------------
- build docker image 1660554791441
- WARNING! Using --password via the CLI is insecure. Use --password-stdin.
- WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
- Configure a credential helper to remove this warning. See
- https://docs.docker.com/engine/reference/commandline/login/#credentials-store
-
- Login Succeeded
-
- ------------------------- [ Building Docker Image harbor-alpha1.xxx.io... ] -------------------------
- docker build -f Dockerfile -t harbor-alpha1.xxx.io/xxx/xxx:feature_branch_2208_20220815091119 . --pull
- Sending build context to Docker daemon 435.9MB
-
- ------------------------- [ Finish Build Docker Image ] -------------------------
- finish build docker image 1660554831194
-
- ------------------------- [ Push Docker Image ] -------------------------
- push docker image 1660554831200
- The push refers to repository [harbor-alpha1.xxx.io/xxx/xxx]
- 74c17a129617: Preparing
-
- ........
-
- 258ddc74925c: Pushed
- feature_branch_2208_20220815091119: digest: sha256:68ecae221c0881ca47180ce73c25671c501fd8f4fa1c32dc9843de875939ca58 size: 2422
-
- ------------------------- [ Finish push Docker Image ] -------------------------
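A quick read of the log shows that the build stage does roughly the following: it checks out the code, runs mvn clean package -U -B -DskipTests, builds a Docker image, and pushes that image to the Harbor registry.
The deploy stage rolls the image from the Harbor registry out to K8s, so I will not analyze it here.
Now for the question itself:
Why do jar conflicts go unnoticed at compile time and only surface at runtime?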
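Suppose our project depends on two jars, A and B, each with the following transitive dependencies:
A -> X -> Z(2.0)
B -> X -> Y -> Z(2.5)
Z is then in conflict between versions 2.0 and 2.5, but only one version of Z ends up on the classpath. Under the shortest-path-first rule for transitive dependencies, the version finally chosen is 2.0.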
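Let's start with Maven. According to public material, almost every jar conflict traces back to the rules for resolving transitive dependencies:
Shortest path first
Suppose we pull in two jars, A and B, and both transitively depend on Z:
A -> X -> Y -> Z(2.5)
B -> X -> Z(2.0)
The version that takes effect is Z(2.0), because its path is shorter. By the same logic, if the project itself declares Z(3.0) directly, 3.0 takes effect.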
First declaration wins
When the paths are the same length, the dependency declared first wins.
A -> Z(3.0)
B -> Z(2.5)
A is declared first, so the transitive Z resolves to 3.0.
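The two rules can be illustrated with a small simulation. This is only a sketch of the idea, not Maven's actual resolver: it walks a hand-written dependency graph breadth-first and keeps the first version it sees for each artifact, which yields both "shortest path first" and "first declaration wins". The class MediationSketch, the "artifact:version" string format, and the artifact names XA and XB (standing in for the two X nodes in the examples above) are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MediationSketch {

    /** Dependencies declared by each "artifact:version" coordinate, in declaration order. */
    private final Map<String, List<String>> declared = new HashMap<>();

    MediationSketch declare(String coordinate, String... dependencies) {
        declared.put(coordinate, Arrays.asList(dependencies));
        return this;
    }

    /** Breadth-first walk from the root: the first version seen for an artifact is kept. */
    Map<String, String> resolve(String root) {
        Map<String, String> chosen = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            List<String> deps = declared.getOrDefault(current, Collections.<String>emptyList());
            for (String dep : deps) {
                int colon = dep.indexOf(':');
                String artifact = dep.substring(0, colon);
                if (chosen.containsKey(artifact)) {
                    continue; // a nearer or earlier declaration already won
                }
                chosen.put(artifact, dep.substring(colon + 1));
                queue.add(dep); // only the winning version's own dependencies are followed
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Shortest path first: Z:2.0 sits at depth 3, Z:2.5 at depth 4.
        MediationSketch nearest = new MediationSketch()
                .declare("app:1", "A:1", "B:1")
                .declare("A:1", "XA:1").declare("XA:1", "Z:2.0")
                .declare("B:1", "XB:1").declare("XB:1", "Y:1").declare("Y:1", "Z:2.5");
        System.out.println("nearest wins: Z=" + nearest.resolve("app:1").get("Z"));

        // Same depth: A is declared before B, so A's Z is kept.
        MediationSketch firstDeclared = new MediationSketch()
                .declare("app:1", "A:1", "B:1")
                .declare("A:1", "Z:3.0")
                .declare("B:1", "Z:2.5");
        System.out.println("first declared wins: Z=" + firstDeclared.resolve("app:1").get("Z"));
    }
}
```

Note that the chosen version is not necessarily the newest one: in the first graph 2.0 beats 2.5 purely because of its position in the tree, which is exactly how a library can end up running against an older dependency than the one it was compiled with.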





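Finally, back to the original question: why don't jar conflicts fail at compile time? My take is this: the conflicting jars are archives of already compiled .class files, while compiling a Java project is a .java -> .class step that only checks our own .java sources. If a .java file calls method a of class A in some jar and a exists, compilation passes. But if a in turn calls a method b in another jar, whether b actually exists in the version that ends up on the classpath is outside what the compiler checks. The JVM only discovers that b is missing when it links that call at runtime, which is when NoSuchMethodError is thrown.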
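Because the compiler cannot see this kind of mismatch, one option is to probe for the required method when the application starts, so that every environment fails early and with a clear message. The sketch below uses plain reflection; LinkageProbe and hasPublicMethod are hypothetical names, and the probed method is the one reported as missing in the stack trace at the top of this article.

```java
import java.lang.reflect.Method;

class LinkageProbe {

    /** True if the class can be loaded and has a public method with this name and parameter count. */
    static boolean hasPublicMethod(String className, String methodName, int parameterCount) {
        try {
            // initialize=false: only load the class, do not run its static initializers.
            Class<?> type = Class.forName(className, false, LinkageProbe.class.getClassLoader());
            for (Method method : type.getMethods()) {
                if (method.getName().equals(methodName)
                        && method.getParameterCount() == parameterCount) {
                    return true;
                }
            }
            return false;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class, or a type one of its signatures refers to, is not available at runtime.
            return false;
        }
    }

    public static void main(String[] args) {
        boolean present = hasPublicMethod("reactor.core.publisher.Mono", "contextWrite", 1);
        System.out.println(present
                ? "Mono.contextWrite is available"
                : "Mono.contextWrite is missing: check which reactor-core version was resolved");
    }
}
```

A check like this does not fix the conflict, but it moves the failure from the first production request that touches the affected code path to application startup.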
To wrap up, here are some of the causes that can lead to NoClassDefFoundError: