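A while ago I ran into a small deadlock in production. After pushing a new release, the service's startup time was abnormally long; the log showed it just hanging, with no error. A quick look at the thread dump showed that the SkyWalking agent thread and the main thread had deadlocked, so here is a short write-up.
Background: JDK 21, Spring Boot 3.x, SkyWalking 9.4.0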
Found one Java-level deadlock:
=============================
"main":
waiting to lock monitor 0x00007fd8b588d4e0 (object 0x000000008b300030, a java.lang.Object),
which is held by "SkywalkingAgent-5-JVMService-produce-0"
"SkywalkingAgent-5-JVMService-produce-0":
waiting to lock monitor 0x00007fd7dc023e70 (object 0x0000000087d68b68, a java.util.jar.JarFile),
which is held by "main"
Java stack information for the threads listed above:
===================================================
"main":
at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull([email protected]/BuiltinClassLoader.java:651)
- waiting to lock <0x000000008b300030> (a java.lang.Object)
at jdk.internal.loader.BuiltinClassLoader.loadClass([email protected]/BuiltinClassLoader.java:639)
at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass([email protected]/ClassLoaders.java:188)
at java.lang.ClassLoader.loadClass([email protected]/ClassLoader.java:526)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$LazyTypeDescription$RecordComponentToken.<init>(TypePool.java:6117)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$TypeExtractor$RecordComponentExtractor.visitEnd(TypePool.java:8946)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.readRecordComponent(ClassReader.java:1053)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.accept(ClassReader.java:732)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.accept(ClassReader.java:425)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default.parse(TypePool.java:880)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default.doDescribe(TypePool.java:864)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution.access$001(TypePool.java:944)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution.doResolve(TypePool.java:1042)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution$LazyTypeDescription.delegate(TypePool.java:1111)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.description.type.TypeDescription$AbstractBase$OfSimpleType$WithDelegation.getModifiers(TypeDescription.java:8535)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.SWDescriptionStrategy$SWTypeDescriptionWrapper.getModifiers(SWDescriptionStrategy.java:353)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ModifierMatcher.doMatch(ModifierMatcher.java:60)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ModifierMatcher.doMatch(ModifierMatcher.java:27)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ElementMatcher$Junction$ForNonNullValues.matches(ElementMatcher.java:249)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ElementMatcher$Junction$Disjunction.matches(ElementMatcher.java:214)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$RawMatcher$ForElementMatchers.matches(AgentBuilder.java:1971)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.doTransform(AgentBuilder.java:12425)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12385)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.access$1800(AgentBuilder.java:12094)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$Java9CapableVmDispatcher.run(AgentBuilder.java:12872)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$Java9CapableVmDispatcher.run(AgentBuilder.java:12804)
at java.security.AccessController.executePrivileged([email protected]/AccessController.java:778)
at java.security.AccessController.doPrivileged([email protected]/AccessController.java:400)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.doPrivileged(AgentBuilder.java)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12328)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$ByteBuddy$ModuleSupport.transform(Unknown Source)
at sun.instrument.TransformerManager.transform([email protected]/TransformerManager.java:188)
at sun.instrument.InstrumentationImpl.transform([email protected]/InstrumentationImpl.java:610)
at jdk.internal.misc.ThreadTracker.begin([email protected]/ThreadTracker.java:70)
at java.util.jar.JarFile.beginInit([email protected]/JarFile.java:1055)
at java.util.jar.JarFile.ensureInitialization([email protected]/JarFile.java:1069)
- locked <0x0000000087d68b68> (a java.util.jar.JarFile)
at java.util.jar.JavaUtilJarAccessImpl.ensureInitialization([email protected]/JavaUtilJarAccessImpl.java:42)
at jdk.internal.loader.URLClassPath$JarLoader$2.getManifest([email protected]/URLClassPath.java:852)
at jdk.internal.loader.BuiltinClassLoader.defineClass([email protected]/BuiltinClassLoader.java:848)
at jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull([email protected]/BuiltinClassLoader.java:760)
at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull([email protected]/BuiltinClassLoader.java:681)
- locked <0x000000008b300d60> (a java.lang.Object)
at jdk.internal.loader.BuiltinClassLoader.loadClass([email protected]/BuiltinClassLoader.java:639)
at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass([email protected]/ClassLoaders.java:188)
at java.lang.ClassLoader.loadClass([email protected]/ClassLoader.java:526)
at java.lang.Class.forName0([email protected]/Native Method)
at java.lang.Class.forName([email protected]/Class.java:534)
at java.lang.Class.forName([email protected]/Class.java:513)
at sun.launcher.LauncherHelper.loadMainClass([email protected]/LauncherHelper.java:811)
at sun.launcher.LauncherHelper.checkAndLoadMain([email protected]/LauncherHelper.java:706)
"SkywalkingAgent-5-JVMService-produce-0":
at java.util.zip.ZipFile.getEntry([email protected]/ZipFile.java:337)
- waiting to lock <0x0000000087d68b68> (a java.util.jar.JarFile)
at java.util.jar.JarFile.getEntry([email protected]/JarFile.java:517)
at java.util.jar.JarFile.getJarEntry([email protected]/JarFile.java:472)
at jdk.internal.loader.URLClassPath$JarLoader.getResource([email protected]/URLClassPath.java:923)
at jdk.internal.loader.URLClassPath.getResource([email protected]/URLClassPath.java:316)
at jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull([email protected]/BuiltinClassLoader.java:757)
at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull([email protected]/BuiltinClassLoader.java:681)
- locked <0x000000008b300030> (a java.lang.Object)
at jdk.internal.loader.BuiltinClassLoader.loadClass([email protected]/BuiltinClassLoader.java:639)
at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass([email protected]/ClassLoaders.java:188)
at java.lang.ClassLoader.loadClass([email protected]/ClassLoader.java:526)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$LazyTypeDescription$RecordComponentToken.<init>(TypePool.java:6117)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$TypeExtractor$RecordComponentExtractor.visitEnd(TypePool.java:8946)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.readRecordComponent(ClassReader.java:1053)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.accept(ClassReader.java:732)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.accept(ClassReader.java:425)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default.parse(TypePool.java:880)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default.doDescribe(TypePool.java:864)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution.access$001(TypePool.java:944)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution.doResolve(TypePool.java:1042)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution$LazyTypeDescription.delegate(TypePool.java:1111)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.description.type.TypeDescription$AbstractBase$OfSimpleType$WithDelegation.getModifiers(TypeDescription.java:8535)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.SWDescriptionStrategy$SWTypeDescriptionWrapper.getModifiers(SWDescriptionStrategy.java:353)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ModifierMatcher.doMatch(ModifierMatcher.java:60)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ModifierMatcher.doMatch(ModifierMatcher.java:27)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ElementMatcher$Junction$ForNonNullValues.matches(ElementMatcher.java:249)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ElementMatcher$Junction$Disjunction.matches(ElementMatcher.java:214)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$RawMatcher$ForElementMatchers.matches(AgentBuilder.java:1971)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.doTransform(AgentBuilder.java:12425)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12385)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.access$1800(AgentBuilder.java:12094)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$Java9CapableVmDispatcher.run(AgentBuilder.java:12872)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$Java9CapableVmDispatcher.run(AgentBuilder.java:12804)
at java.security.AccessController.executePrivileged([email protected]/AccessController.java:778)
at java.security.AccessController.doPrivileged([email protected]/AccessController.java:400)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.doPrivileged(AgentBuilder.java)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12328)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$ByteBuddy$ModuleSupport.transform(Unknown Source)
at sun.instrument.TransformerManager.transform([email protected]/TransformerManager.java:188)
at sun.instrument.InstrumentationImpl.transform([email protected]/InstrumentationImpl.java:610)
at java.nio.DirectByteBuffer.<init>([email protected]/DirectByteBuffer.java:128)
at java.nio.ByteBuffer.allocateDirect([email protected]/ByteBuffer.java:360)
at sun.nio.ch.Util.getTemporaryDirectBuffer([email protected]/Util.java:242)
at sun.nio.ch.IOUtil.read([email protected]/IOUtil.java:303)
at sun.nio.ch.IOUtil.read([email protected]/IOUtil.java:283)
at sun.nio.ch.FileChannelImpl.read([email protected]/FileChannelImpl.java:234)
- locked <0x000000008b32ed08> (a java.lang.Object)
at sun.nio.ch.ChannelInputStream.read([email protected]/ChannelInputStream.java:74)
at sun.nio.ch.ChannelInputStream.read([email protected]/ChannelInputStream.java:103)
- locked <0x000000008b32ee78> (a sun.nio.ch.ChannelInputStream)
at sun.nio.cs.StreamDecoder.readBytes([email protected]/StreamDecoder.java:350)
at sun.nio.cs.StreamDecoder.implRead([email protected]/StreamDecoder.java:393)
at sun.nio.cs.StreamDecoder.lockedRead([email protected]/StreamDecoder.java:217)
at sun.nio.cs.StreamDecoder.read([email protected]/StreamDecoder.java:171)
at java.io.InputStreamReader.read([email protected]/InputStreamReader.java:188)
at java.io.BufferedReader.fill([email protected]/BufferedReader.java:160)
at java.io.BufferedReader.implReadLine([email protected]/BufferedReader.java:370)
at java.io.BufferedReader.readLine([email protected]/BufferedReader.java:347)
at java.io.BufferedReader.readLine([email protected]/BufferedReader.java:436)
at java.nio.file.Files.readAllLines([email protected]/Files.java:3395)
at java.nio.file.Files.readAllLines([email protected]/Files.java:3433)
at jdk.internal.platform.CgroupUtil.lambda$readAllLinesPrivileged$2([email protected]/CgroupUtil.java:83)
at jdk.internal.platform.CgroupUtil$$Lambda/0x00007fd8441f9ab0.run([email protected]/Unknown Source)
at java.security.AccessController.executePrivileged([email protected]/AccessController.java:809)
at java.security.AccessController.doPrivileged([email protected]/AccessController.java:571)
at jdk.internal.platform.CgroupUtil.readAllLinesPrivileged([email protected]/CgroupUtil.java:84)
at jdk.internal.platform.CgroupSubsystemFactory.determineType([email protected]/CgroupSubsystemFactory.java:143)
at jdk.internal.platform.CgroupSubsystemFactory.create([email protected]/CgroupSubsystemFactory.java:85)
at jdk.internal.platform.CgroupMetrics.getInstance([email protected]/CgroupMetrics.java:193)
at jdk.internal.platform.SystemMetrics.instance([email protected]/SystemMetrics.java:29)
at jdk.internal.platform.Metrics.systemMetrics([email protected]/Metrics.java:58)
at jdk.internal.platform.Container.metrics([email protected]/Container.java:43)
at com.sun.management.internal.OperatingSystemImpl.<init>([email protected]/OperatingSystemImpl.java:175)
at com.sun.management.internal.PlatformMBeanProviderImpl.getOperatingSystemMXBean([email protected]/PlatformMBeanProviderImpl.java:280)
- locked <0x000000008b335430> (a java.lang.Class for com.sun.management.internal.PlatformMBeanProviderImpl)
at com.sun.management.internal.PlatformMBeanProviderImpl$3.nameToMBeanMap([email protected]/PlatformMBeanProviderImpl.java:199)
at sun.management.spi.PlatformMBeanProvider$PlatformComponent.getMBeans([email protected]/PlatformMBeanProvider.java:195)
at java.lang.management.ManagementFactory.getPlatformMXBean([email protected]/ManagementFactory.java:691)
at java.lang.management.ManagementFactory.getOperatingSystemMXBean([email protected]/ManagementFactory.java:391)
at org.apache.skywalking.apm.agent.core.os.ProcessorUtil.getNumberOfProcessors(ProcessorUtil.java:25)
at org.apache.skywalking.apm.agent.core.jvm.cpu.CPUProvider.<init>(CPUProvider.java:31)
at org.apache.skywalking.apm.agent.core.jvm.cpu.CPUProvider.<clinit>(CPUProvider.java:27)
at org.apache.skywalking.apm.agent.core.jvm.JVMService.run(JVMService.java:101)
at org.apache.skywalking.apm.util.RunnableWithExceptionProtection.run(RunnableWithExceptionProtection.java:33)
at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:572)
at java.util.concurrent.FutureTask.runAndReset([email protected]/FutureTask.java:358)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run([email protected]/ScheduledThreadPoolExecutor.java:305)
at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1144)
at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:642)
at java.lang.Thread.runWith([email protected]/Thread.java:1596)
at java.lang.Thread.run([email protected]/Thread.java:1583)
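Found 1 deadlock.

Roughly, this is what happens, going by the dump above:

Thread A ("main", the application startup thread):
- It is loading the main class and holds both a class-loading lock of the app ClassLoader and the monitor of the JarFile (taken in JarFile.ensureInitialization).
- While it holds the JarFile monitor, a class load passes through the SkyWalking agent's ByteBuddy transformer, which calls ClassLoader.loadClass and then waits for the class-loading lock 0x000000008b300030 held by thread B.
Thread B (the SkyWalking JVMService thread):
- It runs the static initializer of CPUProvider.
- That calls ManagementFactory.getOperatingSystemMXBean().
- On JDK 21 this call triggers first-time loading of further classes along the way (in the dump, while a DirectByteBuffer is being constructed to read the cgroup files), and that class load also passes through the agent's transformer.
- Inside the transformer, thread B takes the class-loading lock 0x000000008b300030 and then needs JarFile.getEntry, so it waits for the JarFile monitor held by thread A.
Each thread holds the lock the other one is waiting for, and that is the deadlock.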
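
The lock-order inversion in this dump can be reproduced in miniature. The sketch below is only an illustration under assumptions: the class name, thread names and the two plain `Object` locks are hypothetical stand-ins for the class-loading lock and the `JarFile` monitor, and nothing here touches SkyWalking or the real class loader. It takes the two locks in opposite order on two threads and then uses `ThreadMXBean.findDeadlockedThreads()` to detect the cycle from inside the JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderDeadlockDemo {

    // Starts a daemon thread that locks `first`, waits until both workers
    // hold their first lock, and then tries to lock `second`.
    static Thread startWorker(String name, Object first, Object second,
                              CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other worker already holds `second`.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit although the thread stays stuck
        t.start();
        return t;
    }

    // Creates the deadlock and returns how many of the two workers the JVM
    // reports as deadlocked (0 if nothing was detected within about 5 seconds).
    static int countDeadlockedWorkers() {
        Object classLoadingLock = new Object(); // stand-in, not the real lock
        Object jarFileMonitor = new Object();   // stand-in, not a real JarFile
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread a = startWorker("demo-main", jarFileMonitor, classLoadingLock, bothHoldFirst);
        Thread b = startWorker("demo-agent", classLoadingLock, jarFileMonitor, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int attempt = 0; attempt < 100; attempt++) {
                long[] ids = mx.findDeadlockedThreads(); // null when there is no cycle
                int found = 0;
                if (ids != null) {
                    for (ThreadInfo info : mx.getThreadInfo(ids)) {
                        if (info == null) {
                            continue;
                        }
                        long id = info.getThreadId();
                        if (id == a.getId() || id == b.getId()) {
                            found++;
                        }
                    }
                }
                if (found == 2) {
                    return found;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked workers: " + countDeadlockedWorkers());
    }
}
```

The latch only makes the interleaving deterministic; in the real incident the same ordering depends on timing, which would fit with it showing up only on some startups.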
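My guess is that on JDK 21 the JarFile initialization path during class loading got longer, or something along those lines. It shows up on roughly 1 in 10 startups, at a guess.
I went looking through the issues, and it turns out they don't plan to fix it = =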

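A possible workaround is to configure the agent to start with a delay, which should mitigate it. Since it doesn't happen often, I'm leaving it alone for now ƪ(˘⌣˘)ʃ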
