web应用借助于结构:spring mvc + quartz结构,部署到tomcat容器时,shutdown时的error信息:

appears to have started a thread named [schedulerFactoryBean_Worker-1] but has failed to stop it. This is very likely to create a memory leak

quartz的thread终止的确保,加了个servletListener,工作就是显示的使用shutdown方法:

@Override
public void contextDestroyed(ServletContextEvent sce) {
LogAgent.debug(LoggerEnum.SYS_INFO, "QUARTZ.context destroy start.");
WebApplicationContext webApplicationContext = (WebApplicationContext) sce.getServletContext()
.getAttribute(WebApplicationContext.ROOT_WEB_APPLICATION_CONTEXT_ATTRIBUTE); org.springframework.scheduling.quartz.SchedulerFactoryBean fact = (org.springframework.scheduling.quartz.SchedulerFactoryBean) webApplicationContext
.getBean("schedulerFactoryBean");
if (fact != null) {
try {
LogAgent.debug(LoggerEnum.SYS_INFO, "shedule shutdown start.");
fact.getScheduler().shutdown();
LogAgent.debug(LoggerEnum.SYS_INFO, "shedule shutdown end.");
} catch (SchedulerException e) {
LogAgent.error(LoggerEnum.ERROR_INFO, e);
}
}
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
LogAgent.error(LoggerEnum.ERROR_INFO, e);
}
LogAgent.debug(LoggerEnum.SYS_INFO, "QUARTZ.context destroy end.");
}

接着的提示信息:

created a ThreadLocal with key of type [org.apache.commons.lang.builder.ReflectionToStringBuilder$1] (value [org.apache.commons.lang.builder.ReflectionToStringBuilder$1@1a08777c]) and a value of type [java.util.HashSet] (value [[]]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.

检查了代码,发现日志输出的部分,有个类使用了类:org.apache.commons.lang.builder.ToStringBuilder,其内部实现使用了org.apache.commons.lang.builder.ReflectionToStringBuilder,再跟进的时候发现,ToStringBuilder的append方法,在append和toString的时候,向ToStringStyle register或unregister(就是threadlocal的set obj和 remove);跟踪代码时发现明显该threadlocal在每次toString时都有remove,不应该会出现leak memory的倾向才对(若运行一半异常还是有可能,但该方法就是一个日志记录,所以无这方面的担心),最后检查pom时发现,commons-lang有两个包,分别是appache-lang和commons-lang:jar名都是commons-lang,删掉appache-lang的这个warn警告就没出现了(我也不记得了,为什么会引入两个commons-lang)。

结果再运行时,发现另一个错误了:

信息: Illegal access: this web application instance has been stopped already.  Could not load ch.qos.logback.core.status.WarnStatus.  The eventual following stack trace is caused by an error thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access, and has no functional impact.
java.lang.IllegalStateException
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1588)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1547)
at ch.qos.logback.classic.LoggerContext.noAppenderDefinedWarning(LoggerContext.java:175)
at ch.qos.logback.classic.Logger.callAppenders(Logger.java:267)
at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:442)
at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:396)
at ch.qos.logback.classic.Logger.debug(Logger.java:503)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:612)

首先级别上来看,是信息,不是错误;

但是本着看到error stack就不爽的本质,了解了一把,原因大致如此:

  • quartz的SimpleThreadPool在初始化的时候,启动了一定数量的线程:WorkerThread,然后它就让WorkerThread自己滴去干活了。
  • 然后当容器发出shutdown的信号时,pool就告诉workerthread们,要shutdown了,现在问题来了,看代码:
 /**
* <p>
* Signal the thread that it should terminate.
* </p>
*/
void shutdown() {
run.set(false);
}

很明显,这就是它的shutdown方法,它并不是要自己杀掉自己,而是告知自己的当前状态是false了,再来看看该状态作用的地方:

 @Override
public void run() {
boolean ran = false; while (run.get()) {
try {
synchronized(lock) {
while (runnable == null && run.get()) {
lock.wait(500);
} if (runnable != null) {
ran = true;
runnable.run();
}
}
} catch (InterruptedException unblock) {
// do nothing (loop will terminate if shutdown() was called
try {
getLog().error("Worker thread was interrupt()'ed.", unblock);
} catch(Exception e) {
// ignore to help with a tomcat glitch
}
} catch (Throwable exceptionInRunnable) {
try {
getLog().error("Error while executing the Runnable: ",
exceptionInRunnable);
} catch(Exception e) {
// ignore to help with a tomcat glitch
}
} finally {
synchronized(lock) {
runnable = null;
}
// repair the thread in case the runnable mucked it up...
if(getPriority() != tp.getThreadPriority()) {
setPriority(tp.getThreadPriority());
} if (runOnce) {
run.set(false);
clearFromBusyWorkersList(this);
} else if(ran) {
ran = false;
makeAvailable(this);
} }
} //if (log.isDebugEnabled())
try {
getLog().debug("WorkerThread is shut down.");
} catch(Exception e) {
// ignore to help with a tomcat glitch
}
}

其实就是第5行,对run进行判断,如果是false,就直接退出了。

所以pool的shut通知并不是马上就能被执行,如果线程正好走while判断条件,马上就会被退出;

如果该thread无任务可做时,见第8行:lock.wait(500),所以会wait,这个时候再退出,结果去getLog的时候,loggerContextListener.destroy先执行,无logger了,所以会有该信息。

  • 解决办法就是:

  将log的listener放到最后(如果有多个listener的话):

<listener>
<listener-class>com.xxx.listener.QuartzLitener</listener-class>
</listener>
<listener>
<listener-class>com.xxx.web.listener.LogbackListener</listener-class>
</listener>
05-11 22:38