1 BUG现象

系统并发请求,系统停滞无法使用,所有接口都是无法与后端进行交互的状态,系统并没有宕机

2 BUG的业务流程

  1. 插入分数方法 涉及插入表ABCD 加了声明式事务
  2. 查询分数方法 涉及表ABCD
controller() {
	@Transactional
	insertVo();
	selectById();
}

3 排查原因

因为代码不是我写的,一开始我就是怀疑是死锁导致的BUG,然后我用Jconsole,去检测一下死锁,并没有发现死锁,接下来我去Mysql看有没有死锁,结果也没有发现,然后我就懵了,jvm没有锁,mysql也没有锁且没有SQL在执行,为什么请求就会全注阻塞?

然后我去开始去看这个代码了,我发现他在控制层调用了两个业务层,通常我们只在控制层去做校验去调用一个service啊,然后我就继续看,insertVo插入了很多查询了很多,耗时3秒钟左右,selectById查询了一条SQL,这两个明面上的代码并没有什么加锁或什么飞天操作,想了半天搞不懂为什么。

然后我开始用排除法,把这些代码一一注释调试一下。我把insertVo注释掉,这个毋庸置疑,那只有一个简单的操作了,就查一表返回,这个绝对是没问题的,然后我把selectById注释掉,居然就好了?,selectById只有一条查询SQL啊也没有加锁,这个能解决但是肯定也不是这个原因。

然后我用druid监控到可使用连接数一直在占用,没有释放,我就去查druid配置,发现配置了

initial-size: 20 #初始大小
min-idle: 20 #最小空闲
max-active: 40 #最大链接
max-wait: 10000 #配置获取连接等待超时的时间

这个配置也没有毛病啊,没办法了我只能去看线程的具体信息了,查出来所有的线程池连接线程都是这样的

"pool-6-thread-10" #244 prio=5 os_prio=31 tid=0x00007fe94235f000 nid=0x22803 waiting on condition [0x00000003150ce000]
   java.lang.Thread.State: WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000006c109e090> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
	at com.alibaba.druid.pool.DruidDataSource.takeLast(DruidDataSource.java:2315)
	at com.alibaba.druid.pool.DruidDataSource.getConnectionInternal(DruidDataSource.java:1781)
	at com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:1494)
	at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:5058)
	at com.alibaba.druid.filter.stat.StatFilter.dataSource_getConnection(StatFilter.java:704)
	at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:5054)
	at com.alibaba.druid.filter.FilterAdapter.dataSource_getConnection(FilterAdapter.java:2759)
	at com.alibaba.druid.filter.FilterChainImpl.dataSource_connect(FilterChainImpl.java:5054)
	at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1469)
	at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1459)
	at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:83)
	at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:122)
	at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:38)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:104)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:134)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getConnectionForTransactionManagement(LogicalConnectionManagedImpl.java:250)
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.begin(LogicalConnectionManagedImpl.java:258)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.begin(JdbcResourceLocalTransactionCoordinatorImpl.java:246)
	at org.hibernate.engine.transaction.internal.TransactionImpl.begin(TransactionImpl.java:83)
	at org.springframework.orm.jpa.vendor.HibernateJpaDialect.beginTransaction(HibernateJpaDialect.java:184)
	at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:402)
	at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:376)
	at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:572)
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:360)
	at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(TransactionInterceptor.java:99)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)
	at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:95)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
	at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:747)
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:689)
	at com.treach.platform.modules.service.impl.SysLogService$$EnhancerBySpringCGLIB$$36a85251.insert(<generated>)
	at com.treach.platform.log.factory.LogTaskFactory$2.run(LogTaskFactory.java:56)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

   Locked ownable synchronizers:
	- <0x00000007741f0d68> (a java.util.concurrent.ThreadPoolExecutor$Worker)

是的,连接池连接一直被占用锁住了,为什么会被锁住呢?,也设置了连接等待超时时间啊,然后我怀疑是配置没有生效,写了个代码看看

public static void main(String[] args) throws SQLException {
        ConfigurableApplicationContext run = SpringApplication.run(NdCyApplication.class, args);
        DruidDataSource bean = run.getBean(DruidDataSource.class);

        int maxActive = bean.getMaxActive();
        long maxWait = bean.getMaxWait();
        log.info("数据库线程池与数据库最大链接数" + String.valueOf(maxActive));
        log.info("数据库线程池等待链接数超时时间"+String.valueOf(maxWait));
    }

结果:

数据库线程池与数据库最大链接数8
数据库线程池等待链接数超时时间-1

这个和配置的不一样啊,真的没有生效,然后我又去查为什么没有生效,原来配置类里面有个DataSoure

@Bean     //声明其为Bean实例
@Primary  //在同样的DataSource中,首先使用被标注的DataSource
@ConfigurationProperties(prefix = "spring.datasource")
public DruidDataSource dataSource(){
        DruidDataSource datasource = new DruidDataSource();
        List<Filter> filters = new ArrayList<>();
        filters.add(wallFilter);
        filters.add(new StatFilter());
        datasource.setProxyFilters(filters);
        return datasource;
}

我们applcation.yaml的配置被覆盖了,druid默认等待链接数超时时间-1,难怪长时间占用连接没有超时。

4 解决

把等待连接超时时间等设置上

@Bean     //声明其为Bean实例
@Primary  //在同样的DataSource中,首先使用被标注的DataSource
@ConfigurationProperties(prefix = "spring.datasource")
public DruidDataSource dataSource(){
        DruidDataSource datasource = new DruidDataSource();
        datasource.setInitialSize(20);
        datasource.setMaxActive(80);
        datasource.setMaxWait(5000);
        List<Filter> filters = new ArrayList<>();
        filters.add(wallFilter);
        filters.add(new StatFilter());
        datasource.setProxyFilters(filters);
        return datasource;
}

成功

5 不懂的点

为什么会出现不回收线程的情况 按理来说现在没有SQL在执行,连接数不是会被回收吗 回到线程池 等待的线程就有连接了 就能不卡死了 为什么呢?

08-18 23:39