Problem description
We have a webapp that uses 20% CPU on average when idle, with no network traffic or requests of any kind. It runs on Java 11, Tomcat 9, Spring Framework 5.3, and Hibernate 5.4. However, the issues described below also occurred on Java 8, Tomcat 8.5, Spring 4.3, and Hibernate 4. I tried to profile the application using JFR and JMC, and I experimented with a lot of configurations. In the image above, the catalina-utility-1 and catalina-utility-2 threads appear to wake up periodically and use a lot of CPU for a few seconds. These threads also seem to allocate a huge amount of memory: 30+ GB in total over the sampled 5-minute interval.
For this profiling I configured JFR to record everything at the maximum level, with all options enabled.
When I dug deeper into the Method Profiling details, I observed that the activity seems to be related to org.apache.catalina.webresources.Cache.getResource().
So I started to read about Tomcat caching and tried out different parameters to tune it via the context.xml file, like this:
<Context>
<!-- Default set of monitored resources. If one of these changes, the -->
<!-- web application will be reloaded. -->
<WatchedResource>WEB-INF/web.xml</WatchedResource>
<WatchedResource>WEB-INF/tomcat-web.xml</WatchedResource>
<WatchedResource>${catalina.base}/conf/web.xml</WatchedResource>
<!-- Uncomment this to disable session persistence across Tomcat restarts -->
<!--
<Manager pathname="" />
-->
<Resources cachingAllowed="true" cacheMaxSize="3024000" cacheObjectMaxSize="10240" cacheTtl="10000"/>
</Context>
In this particular example, which is the one used for the JFR profiling, I increased the cache size to 3GB and cacheTtl to 10 seconds. I thought that a larger cache and a longer TTL would affect the interval of the CPU spikes, because I suspected Tomcat was checking the cache (originally 1GB in size) every 5 seconds, which is the default TTL. However, whatever values I set for the cache size or TTL, the periodic CPU spikes are identical. The cache is also big enough to hold whatever Tomcat wants to put in it, because I increased the value after we saw warnings in the logs. In any case, 1GB is enough to get rid of the warnings.
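For reference, here is an annotated sketch of the Resources element from the file above. The units in the comments reflect my reading of the Tomcat Resources documentation (cacheMaxSize and cacheObjectMaxSize in kilobytes, cacheTtl in milliseconds) and should be double-checked against the docs for the Tomcat version in use; the attribute values are the ones already shown.

```xml
<Context>
    <!-- cacheMaxSize is in kilobytes: 3024000 KB is about 2.9 GiB (roughly 3 GB). -->
    <!-- cacheObjectMaxSize is in kilobytes: 10240 KB = 10 MB per cached object. -->
    <!-- cacheTtl is in milliseconds: 10000 ms = 10 s before a cached entry is revalidated. -->
    <Resources cachingAllowed="true"
               cacheMaxSize="3024000"
               cacheObjectMaxSize="10240"
               cacheTtl="10000"/>
</Context>
```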
I also experimented with heap sizes ranging from 1 to 5GB; the profiling above was done with a 5GB heap. I can't really go above this value without starting to hit physical memory limits.
We have used G1GC as our garbage collector since the Java 8 days. Tuning its parameters did not affect CPU usage. I also tried ParallelGC and SerialGC, but the CPU usage pattern remained unchanged.
Searching Google for this kind of issue led to no results, and I am totally stuck on what else I could try or look at.
Any suggestions are welcome. Thanks.
Update 1:
It seems I had a formatting issue originally, and the opening <Context> tag was missing from the context.xml when parsed. I have fixed it.
As suggested, I also tried <Context reloadable="false"> so that reloadable is explicitly set to false. It had absolutely no effect.
Is it possible to set the reloadable flag from anywhere else? I am speculating that some other file or setting may apply it even though it is set to false in context.xml.
Recommended answer
The stack trace in your images contains a call to Loader#modified, which is only possible if you set the reloadable property of your context to true:
<Context reloadable="true">
...
</Context>
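As the Tomcat documentation states:
Set to true if you want Catalina to monitor classes in /WEB-INF/classes/ and /WEB-INF/lib for changes, and automatically reload the web application if a change is detected. This feature is very useful during application development, but it requires significant runtime overhead and is not recommended for use on deployed production applications. That's why the default setting for this attribute is false.
(emphasis mine).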
Set reloadable to false (or delete the attribute) to get rid of the overhead.
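A minimal sketch of what that could look like, assuming the context.xml shown in the question is the file Tomcat actually applies to this webapp. The WatchedResource and Resources entries are copied from the question; the reloadable attribute is the only change.

```xml
<!-- reloadable="false" is also the documented default, so omitting the attribute should behave the same. -->
<Context reloadable="false">
    <WatchedResource>WEB-INF/web.xml</WatchedResource>
    <WatchedResource>WEB-INF/tomcat-web.xml</WatchedResource>
    <WatchedResource>${catalina.base}/conf/web.xml</WatchedResource>
    <Resources cachingAllowed="true" cacheMaxSize="3024000" cacheObjectMaxSize="10240" cacheTtl="10000"/>
</Context>
```

If the periodic spikes persist after this change, it may be worth confirming that no other Context definition for the same webapp sets reloadable to true.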