1 BUG现象
系统并发请求,系统停滞无法使用,所有接口都是无法与后端进行交互的状态,系统并没有宕机
2 BUG的业务流程
- 插入分数方法 涉及插入表ABCD 加了声明式事务
- 查询分数方法 涉及表ABCD
controller() {
@Transactional
insertVo();
selectById();
}
3 排查原因
因为代码不是我写的,一开始我就是怀疑是死锁导致的BUG,然后我用Jconsole,去检测一下死锁,并没有发现死锁,接下来我去Mysql看有没有死锁,结果也没有发现,然后我就懵了,jvm没有锁,mysql也没有锁且没有SQL在执行,为什么请求就会全注阻塞?
然后我去开始去看这个代码了,我发现他在控制层调用了两个业务层,通常我们只在控制层去做校验去调用一个service啊,然后我就继续看,insertVo插入了很多查询了很多,耗时3秒钟左右,selectById查询了一条SQL,这两个明面上的代码并没有什么加锁或什么飞天操作,想了半天搞不懂为什么。
然后我开始用排除法,把这些代码一一注释调试一下。我把insertVo注释掉,这个毋庸置疑,那只有一个简单的操作了,就查一表返回,这个绝对是没问题的,然后我把selectById注释掉,居然就好了?,selectById只有一条查询SQL啊也没有加锁,这个能解决但是肯定也不是这个原因。
然后我用druid监控到可使用连接数一直在占用,没有释放,我就去查druid配置,发现配置了
initial-size: 20 #初始大小
min-idle: 20 #最小空闲
max-active: 40 #最大链接
max-wait: 10000 #配置获取连接等待超时的时间
这个配置也没有毛病啊,没办法了我只能去看线程的具体信息了,查出来所有的线程池连接线程都是这样的
"pool-6-thread-10" #244 prio=5 os_prio=31 tid=0x00007fe94235f000 nid=0x22803 waiting on condition [0x00000003150ce000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000006c109e090> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at com.alibaba.druid.pool.DruidDataSource.takeLast(DruidDataSource.java:2315)
at com.alibaba.druid.pool.DruidDataSource.getConnectionInternal(DruidDataSource.java:1781)
at com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:1494)
at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:5058)
at com.alibaba.druid.filter.stat.StatFilter.dataSource_getConnection(StatFilter.java:704)
at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:5054)
at com.alibaba.druid.filter.FilterAdapter.dataSource_getConnection(FilterAdapter.java:2759)
at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:5054)
at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1469)
at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1459)
at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:83)
at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:122)
at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:38)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:104)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:134)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement(LogicalConnectionManagedImpl.java:250)
at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.begin(LogicalConnectionManagedImpl.java:258)
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.begin(JdbcResourceLocalTransactionCoordinatorImpl.java:246)
at org.hibernate.engine.transaction.internal.TransactionImpl.begin(TransactionImpl.java:83)
at org.springframework.orm.jpa.vendor.HibernateJpaDialect.beginTransaction(HibernateJpaDialect.java:184)
at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:402)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:376)
at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:572)
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:360)
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:99)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)
at com.treach.platform.modules.service.impl.SysLogService$$EnhancerBySpringCGLIB$$36a85251.insert(<generated>)
at com.treach.platform.log.factory.LogTaskFactory$2.run(LogTaskFactory.java:56)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Locked ownable synchronizers:
- <0x00000007741f0d68> (a java.util.concurrent.ThreadPoolExecutor$Worker)
是的,连接池连接一直被占用锁住了,为什么会被锁住呢?,也设置了连接等待超时时间啊,然后我怀疑是配置没有生效,写了个代码看看
public static void main(String[] args) throws SQLException {
ConfigurableApplicationContext run = SpringApplication.run(NdCyApplication.class, args);
DruidDataSource bean = run.getBean(DruidDataSource.class);
int maxActive = bean.getMaxActive();
long maxWait = bean.getMaxWait();
log.info("数据库线程池与数据库最大链接数" + String.valueOf(maxActive));
log.info("数据库线程池等待链接数超时时间"+String.valueOf(maxWait));
}
结果:
数据库线程池与数据库最大链接数8
数据库线程池等待链接数超时时间-1
这个和配置的不一样啊,真的没有生效,然后我又去查为什么没有生效,原来配置类里面有个DataSoure
@Bean //声明其为Bean实例
@Primary //在同样的DataSource中,首先使用被标注的DataSource
@ConfigurationProperties(prefix = "spring.datasource")
public DruidDataSource dataSource(){
DruidDataSource datasource = new DruidDataSource();
List<Filter> filters = new ArrayList<>();
filters.add(wallFilter);
filters.add(new StatFilter());
datasource.setProxyFilters(filters);
return datasource;
}
我们applcation.yaml的配置被覆盖了,druid默认等待链接数超时时间-1,难怪长时间占用连接没有超时。
4 解决
把等待连接超时时间等设置上
@Bean //声明其为Bean实例
@Primary //在同样的DataSource中,首先使用被标注的DataSource
@ConfigurationProperties(prefix = "spring.datasource")
public DruidDataSource dataSource(){
DruidDataSource datasource = new DruidDataSource();
datasource.setInitialSize(20);
datasource.setMaxActive(80);
datasource.setMaxWait(5000);
List<Filter> filters = new ArrayList<>();
filters.add(wallFilter);
filters.add(new StatFilter());
datasource.setProxyFilters(filters);
return datasource;
}
成功
5 不懂的点
为什么会出现不回收线程的情况 按理来说现在没有SQL在执行,连接数不是会被回收吗 回到线程池 等待的线程就有连接了 就能不卡死了 为什么呢?