细读源码是《马士兵教育》旗下赏析源码的栏目。我们赏析源码的目的不是为了炫技,而是为了去理解作者的设计思想,并取其精华,去其糟粕,从而写出更加优秀的代码。
另一方面,也可以给面试加分,代码好坏的评价,不可避免地会代入个人的主观色彩,大家和而不同。
【背景】
最近线上环境出了一次严重的事故,在用户访问高峰期,出现了服务不可用的情况,同时监控系统报出了大量工作线程Block的告警,通过对Block线程的堆栈的分析,最后发现是由于打印异常堆栈日志导致的,还是非常出乎意料的。
本文将从源码的角度分析一下问题发生的原因,主要从以下几个方法进行讲解:
1.模拟线上问题
2.Log4j打印异常堆栈机制
3.Classload加载class锁机制
4.反射调用优化
5.问题总结和解决方案
1.新建myfirstmvn的maven工程
pom.xml文件如下:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"><modelVersion>4.0.0</modelVersion><groupId>org.example</groupId><artifactId>myfirstmvn</artifactId><version>1.0-SNAPSHOT</version><packaging>jar</packaging><name>myfirstmvn</name><url>http://maven.apache.org</url><properties><project.build.sourceEncoding>UTF-8</project.build.sourceEncoding></properties></project>
2.在myfirstmvn的增加一个MathUtil类
package org.example.myfirstmvn;public class MathUtil {public static int div(int a, int b) {return a b;}}
3.新建log4jtest的maven工程
pom.xml文件如下:
<?xml version="1.0" encoding="UTF-8"?><project xmlns="http://maven.apache.org/POM/4.0.0"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"><modelVersion>4.0.0</modelVersion><groupId>org.example</groupId><artifactId>log4jtest</artifactId><version>1.0-SNAPSHOT</version><dependencies><dependency><groupId>org.example</groupId><artifactId>myfirstmvn</artifactId><version>1.0-SNAPSHOT</version></dependency><dependency><groupId>org.apache.logging.log4j</groupId><artifactId>log4j-api</artifactId><version>2.14.1</version></dependency><dependency><groupId>org.apache.logging.log4j</groupId><artifactId>log4j-core</artifactId><version>2.14.1</version></dependency></dependencies></project>
4.在log4jtest中增加LogTest类
LogTest类用来模拟,多个线程同时打印日志情况,代码如下:
package org.example.sourcecode.threadblock;import org.apache.logging.log4j.LogManager;import org.apache.logging.log4j.Logger;import org.example.myfirstmvn.MathUtil;import java.lang.reflect.Method;public class LogTest {private static Logger logger = LogManager.getLogger(LogManager.ROOT_LOGGER_NAME);public static void main(String[] args) {for (int i = 0; i < 100; i++) {Thread thread = new Thread(new Runnable() {@Overridepublic void run() {while (true) {LogTest.run();}}});thread.start();}}private static void run() {try {callDiv(1, 0);} catch (Exception e) {logger.error("div error", e);}}private static int callDiv(int a, int b) throws Exception {Method method = MathUtil.class.getMethod("div", new Class[]{int.class, int.class});return (int) method.invoke(null, a, b);}}
运行上面的代码,并使用jvisualvm工具来查看线程的运行状态,如下图所示:

从图上可以看出,打印日志的过程中,绝大多数线程处于BLOCKED (on object monitor)的状态,通过线程Dump,查看处于Block状态的线程堆栈,如下所示:
"Thread-100" #113 prio=5 os_prio=31 tid=0x00007f8d92021000 nid=0x11a03 waiting for monitor entry [0x000070000eed3000]java.lang.Thread.State: BLOCKED (on object monitor)at java.lang.ClassLoader.loadClass(ClassLoader.java:398)- waiting to lock <0x00000006c000ff28> (a java.lang.Object)at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)at java.lang.ClassLoader.loadClass(ClassLoader.java:351)at java.lang.Class.forName0(Native Method)at java.lang.Class.forName(Class.java:264)at org.apache.logging.log4j.util.LoaderUtil.loadClass(LoaderUtil.java:171)at org.apache.logging.log4j.core.impl.ThrowableProxyHelper.loadClass(ThrowableProxyHelper.java:216)at org.apache.logging.log4j.core.impl.ThrowableProxyHelper.toExtendedStackTrace(ThrowableProxyHelper.java:114)at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:110)at org.apache.logging.log4j.core.impl.ThrowableProxy.<init>(ThrowableProxy.java:94)at org.apache.logging.log4j.core.impl.MutableLogEvent.getThrownProxy(MutableLogEvent.java:347)at org.apache.logging.log4j.core.pattern.ExtendedThrowablePatternConverter.format(ExtendedThrowablePatternConverter.java:63)at org.apache.logging.log4j.core.pattern.PatternFormatter.format(PatternFormatter.java:38)at org.apache.logging.log4j.core.layout.PatternLayout$PatternSerializer.toSerializable(PatternLayout.java:344)at org.apache.logging.log4j.core.layout.PatternLayout.toText(PatternLayout.java:244)at org.apache.logging.log4j.core.layout.PatternLayout.encode(PatternLayout.java:229)at org.apache.logging.log4j.core.layout.PatternLayout.encode(PatternLayout.java:59)at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.directEncodeEvent(AbstractOutputStreamAppender.java:197)at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.tryAppend(AbstractOutputStreamAppender.java:190)at org.apache.logging.log4j.core.appender.AbstractOutputStreamAppender.append(AbstractOutputStreamAppender.java:181)at org.apache.logging.log4j.core.config.AppenderControl.tryCallAppender(AppenderControl.java:156)at org.apache.logging.log4j.core.config.AppenderControl.callAppender0(AppenderControl.java:129)at org.apache.logging.log4j.core.config.AppenderControl.callAppenderPreventRecursion(AppenderControl.java:120)at org.apache.logging.log4j.core.config.AppenderControl.callAppender(AppenderControl.java:84)at org.apache.logging.log4j.core.config.LoggerConfig.callAppenders(LoggerConfig.java:540)at org.apache.logging.log4j.core.config.LoggerConfig.processLogEvent(LoggerConfig.java:498)at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:481)at org.apache.logging.log4j.core.config.LoggerConfig.log(LoggerConfig.java:456)at org.apache.logging.log4j.core.config.DefaultReliabilityStrategy.log(DefaultReliabilityStrategy.java:63)at org.apache.logging.log4j.core.Logger.log(Logger.java:161)at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2205)at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2159)at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2142)at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2017)at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1983)at org.apache.logging.log4j.spi.AbstractLogger.error(AbstractLogger.java:750)at org.example.sourcecode.threadblock.LogTest.run(LogTest.java:34)at org.example.sourcecode.threadblock.LogTest.access$000(LogTest.java:13)at org.example.sourcecode.threadblock.LogTest$1.run(LogTest.java:22)at java.lang.Thread.run(Thread.java:748)
从上面的堆栈可以看出,是调用ClassLoader.loadClass方法,导致线程Block。
但是为什么Log4j打印异常堆栈,会触发ClassLoader.loadClass的调用呢,ClassLoader.loadClass的锁又是怎么回事呢,下面的内容会一一进行讲解。
首先我们看一下先看一下Log4j的Logger.error和e.printStackTrace输出内容的差异。
1.Logger.error
package org.example.sourcecode.threadblock;import org.apache.logging.log4j.LogManager;import org.apache.logging.log4j.Logger;import org.example.myfirstmvn.MathUtil;import java.lang.reflect.Method;public class Log2Test {private static Logger logger = LogManager.getLogger(LogManager.ROOT_LOGGER_NAME);public static void main(String[] args) {try {callDiv(1, 0);} catch (Exception e) {logger.error("div error", e);}}private static int callDiv(int a, int b) throws Exception {Method method = MathUtil.class.getMethod("div", new Class[]{int.class, int.class});return (int) method.invoke(null, a, b);}}
上面代码执行结果:
java.lang.reflect.InvocationTargetException: nullat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_261]at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_261]at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_261]at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_261]at org.example.sourcecode.threadblock.Log2Test.callDiv(Log2Test.java:23) ~[classes/:?]at org.example.sourcecode.threadblock.Log2Test.main(Log2Test.java:15) [classes/:?]Caused by: java.lang.ArithmeticException: by zeroat org.example.myfirstmvn.MathUtil.div(MathUtil.java:9) ~[myfirstmvn-1.0-SNAPSHOT.jar:?]... 6 more
2. e.printStackTrace
package org.example.sourcecode.threadblock;import org.example.myfirstmvn.MathUtil;import java.lang.reflect.Method;public class Log3Test {public static void main(String[] args) {try {callDiv(1, 0);} catch (Exception e) {e.printStackTrace();}}private static int callDiv(int a, int b) throws Exception {Method method = MathUtil.class.getMethod("div", new Class[]{int.class, int.class});return (int) method.invoke(null, a, b);}}
上面代码执行结果:
java.lang.reflect.InvocationTargetExceptionat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)at java.lang.reflect.Method.invoke(Method.java:498)at org.example.sourcecode.threadblock.Log3Test.callDiv(Log3Test.java:18)at org.example.sourcecode.threadblock.Log3Test.main(Log3Test.java:10)Caused by: java.lang.ArithmeticException: by zeroat org.example.myfirstmvn.MathUtil.div(MathUtil.java:9)... 6 more
对比上面两部分代码的执行结果,发现Logger.error打印的堆栈信息更加完善。JDK包中的类打印了JDK版本[?:1.8.0_261],class path下的类打印了[classes/:?],其他jar文件中的类打印了类所在的jar文件名[myfirstmvn-1.0-SNAPSHOT.jar:?]。
而这些信息是通过log4j-core.jar包中的
ThrowableProxyHelper.toCacheEntry生成的,不同的Log4j版本会有差异。
低版本的Log4j没有ThrowableProxyHelper类,toCacheEntry方法在ThrowableProxy类中。
参考toCacheEntry方法的实现,写了一个展示类位置信息的测试程序,代码如下:
package org.example.sourcecode.threadblock;import org.example.myfirstmvn.MathUtil;import java.net.URL;import java.security.CodeSource;public class ClassLocationTest {public static void main(String[] args) {System.out.println(displayClassLocation(String.class));System.out.println(displayClassLocation(ClassLocationTest.class));System.out.println(displayClassLocation(MathUtil.class));}private static String displayClassLocation(final Class<?> callerClass) {String location = "?";String version = "?";if (callerClass != null) {try {final CodeSource source = callerClass.getProtectionDomain().getCodeSource();if (source != null) {final URL locationURL = source.getLocation();if (locationURL != null) {final String str = locationURL.toString().replace('\\', '/');int index = str.lastIndexOf("/");if (index >= 0 && index == str.length() - 1) {index = str.lastIndexOf("/", index - 1);location = str.substring(index + 1);} else {location = str.substring(index + 1);}}}} catch (final Exception ex) {// Ignore the exception.}final Package pkg = callerClass.getPackage();if (pkg != null) {final String ver = pkg.getImplementationVersion();if (ver != null) {version = ver;}}}return "[" + location + " " + version + "]";}}
上面代码执行结果:
[? 1.8.0_261][classes/ ?][myfirstmvn-1.0-SNAPSHOT.jar ?]
想要获得类所在位置的辅助信息,必须拿到Class对象,而堆栈对象StackTraceElement定义的declaringClass是String类型,想要得到相应的Class对象,必须使用Classloader进行类加载,从而触发了Classloader.loadClass方法的执行。
1.Classloader.loadClass锁分析
Classload调用loadClass进行类加载的过程是线程安全的,loadClass的代码如下:
protected Class<?> loadClass(String name, boolean resolve)throws ClassNotFoundException {synchronized (getClassLoadingLock(name)) {// First, check if the class has already been loadedClass<?> c = findLoadedClass(name);if (c == null) {long t0 = System.nanoTime();try {if (parent != null) {c = parent.loadClass(name, false);} else {c = findBootstrapClassOrNull(name);}} catch (ClassNotFoundException e) {// ClassNotFoundException thrown if class not found// from the non-null parent class loader}if (c == null) {// If still not found, then invoke findClass in order// to find the class.long t1 = System.nanoTime();c = findClass(name);// this is the defining class loader; record the statssun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);sun.misc.PerfCounter.getFindClasses().increment();}}if (resolve) {resolveClass(c);}return c;}}
进入loadClass方法,就是synchronized代码块,synchronized的对象是通过getClassLoadingLock返回的,其代码如下:
protected Object getClassLoadingLock(String className) {Object lock = this;if (parallelLockMap != null) {Object newLock = new Object();lock = parallelLockMap.putIfAbsent(className, newLock);if (lock == null) {lock = newLock;}}return lock;}
通过分析getClassLoadingLock可以得出,loadClass的同步工作模式有两种:
1. 串行,synchronized对象是Classloader的this对象,在串行模式下执行,当classloader在加载一个类的时候,其他线程要进行类加载的时候,都必须等待;
2.并行,synchronized对象是从parallelLockMap中根据className查询得到的Object对象,在并行模式下执行,类名完全相同的类还是串行加载,类名不同的类之间是并行加载,加载过程互不影响,这样可以提供加载效率。
2.指定Classloader的加载的串并行模式
可以通过ClassLoader.registerAsParallelCapable()方法将调用次方法的Classloader注册为并行加载模式,如果不指定,默认是串行加载模式。最常用的AppClassLoader就是并行加载模式,其静态代码块如下所示:
static {ClassLoader.registerAsParallelCapable();}
3.Classloader加载无效类效率
由于Classloader的加载机制,当Classloader去加载一个他无法加载的类时,loadClass的效率就会变得非常低下,举例说明:
package org.example.sourcecode.threadblock;import java.io.ByteArrayOutputStream;import java.io.FileInputStream;import java.io.IOException;public class TryLoadClass {private static int COUNTER = 100000;public static void main(String[] args) throws IOException, ClassNotFoundException {ClassLoader loader = ClassLoader.getSystemClassLoader();for (int i = 0; i < 1000; i++) {long time1 = tryLoadClass(loader, "org.example.sourcecode.threadblock.TryLoadClass");long time2 = tryLoadClass(loader, "org.example.sourcecode.threadblock.TryLoadClass2");System.out.println(time1 + " " + time2);}}public static long tryLoadClass(ClassLoader classLoader, String classFullName) {long start = System.currentTimeMillis();for (int i = 0; i < COUNTER; i++) {try {classLoader.loadClass(classFullName);} catch (Exception e) {}}return System.currentTimeMillis() - start;}}
上面代码执行结果:
71 277249 268748 261749 276948 262348 2611
org.example.sourcecode.threadblock.TryLoadClass2这个类不存在,加载效率比加载存在的类慢了几十倍。
加载过程执行的越慢,高并发的条件下,就越可能发生线程Block情况。
四.反射调用优化
在《细读源码之JAVA反射方法调用优化》详细讲解了反射优化过程,最后动态生成的二进制Class文件,交给了ClassDefiner进行处理,ClassDefiner代码如下:
class ClassDefiner {static final Unsafe unsafe = Unsafe.getUnsafe();ClassDefiner() {}static Class<?> defineClass(String var0, byte[] var1, int var2, int var3, final ClassLoader var4) {ClassLoader var5 = (ClassLoader)AccessController.doPrivileged(new PrivilegedAction<ClassLoader>() {public ClassLoader run() {return new DelegatingClassLoader(var4);}});return unsafe.defineClass(var0, var1, var2, var3, var5, (ProtectionDomain)null);}}
最后执行动态二进制文件加载的Classloader是DelegatingClassLoader,其代码如下:
class DelegatingClassLoader extends ClassLoader {DelegatingClassLoader(ClassLoader var1) {super(var1);}}
分析上面的代码,我们可以得出下面的结论:反射优化的动态生成的Class,是不能被其他Classloader(如:AppClassloader)加载的,这个如果无法理解,请参加马老师的《JVM从入门到精通》课程,里面更详细的关于Classloader的内容。
五.问题总结和解决方案
1.使用Log4j的Logger.error去打印异常堆栈的时候,为了打印出堆栈中类的位置信息,需要使用Classloader进行类加载;
2.Classloader加载是线程安全的,虽然并行加载可以提高加载不同类的效率,但是多线程加载相同的类时,还是需要互相同步等待,尤其当不同的线程打印的异常堆栈完全相同时,就会增加线程Block的风险,而Classloader去加载一个无法加载的类时,效率会急剧下降,使线程Block的情况进一步恶化;
3.因为反射调用效率问题,JDK对反射调用进行了优化,动态生成Java类进行方法调用,替换原来的native调用,而生成的动态类是由DelegatingClassLoader进行加载的,不能被其他的Classloader加载,异常堆栈中有反射优化的动态类,在高并发的条件下,就非常容易产生线程Block的情况。
2.解决方案
A.去掉不必要的异常堆栈打印
对于可以确定问题原因的异常,就没有必要打印异常堆栈信息,示例代码如下:
package org.example.sourcecode.threadblock;import org.apache.logging.log4j.LogManager;import org.apache.logging.log4j.Logger;import java.util.Scanner;public class DisplayStackTrace {private static Logger logger = LogManager.getLogger(LogManager.ROOT_LOGGER_NAME);public static void main(String[] args) {Scanner cin = new Scanner(System.in);String number = cin.nextLine();try {System.out.println(Integer.parseInt(number) + 100);} catch (Exception e) {logger.error("parse int error : " + number, e);}}}
如果Integer.parseInt发生异常,导致异常原因肯定是出入的number不合法,在这种情况下,打印异常堆栈完全没有必要,可以去掉堆栈的打印。
B.将堆栈信息转换为字符串再打印
package org.example.sourcecode.threadblock;import org.apache.logging.log4j.LogManager;import org.apache.logging.log4j.Logger;import org.example.myfirstmvn.MathUtil;import java.io.PrintWriter;import java.io.StringWriter;import java.lang.reflect.Method;public class Log2Test {private static Logger logger = LogManager.getLogger(LogManager.ROOT_LOGGER_NAME);public static void main(String[] args) {try {callDiv(1, 0);} catch (Exception e) {logger.error(convertStackTraceToString(e));}}private static String convertStackTraceToString(Throwable throwable) {StringWriter stringWriter = new StringWriter();throwable.printStackTrace(new PrintWriter(stringWriter));return stringWriter.toString();}private static int callDiv(int a, int b) throws Exception {Method method = MathUtil.class.getMethod("div", new Class[]{int.class, int.class});return (int) method.invoke(null, a, b);}}
调用convertStackTraceToString将异常堆栈转换为字符串,但是字符串中就没有了类所在位置的辅助信息,这是一个取舍的过程,没有一个标准的答案,需要根据业务系统的实际情况做最后的决策。
C.禁用反射优化
使用Log4j打印堆栈信息,如果堆栈中有反射优化生成的动态代理类,这个代理类不能被其他的Classloader加载,这个时候打印堆栈,会严重影响执行效率。但是禁用反射优化也会有副作用,导致反射执行的效率降低。
并不是所有问题都有那种完美的解决方案,都是各方博弈得到的相对平衡的结果。对打印日志而言,就是执行效率,磁盘空间和排查问题难度几方博弈的结果,最终选择适合自己业务场景的方案。
好了,本期的《细读源码》就先告一段落了,记得关注我们更新更多干货内容~






