Question
I'm trying to understand why saveAll performs better than save in Spring Data repositories. I'm using CrudRepository, which can be seen here.
To test, I created 10k entities, each with just an id and a random string (for the benchmark I kept the string constant), and added them to a list. Iterating over the list and calling .save on each element took 40 seconds. Calling .saveAll on the same entire list completed in 2 seconds. Calling .saveAll with 30k elements took 4 seconds. I made sure to truncate my table before each test. Even batching the .saveAll calls into sublists of 50 took 10 seconds with 30k elements.
A single .saveAll call with the entire list seems to be the fastest.
I tried to browse the Spring Data source code, but this is the only thing of value I found. Here it seems .saveAll simply iterates over the entire Iterable and calls .save on each element, just as I was doing. So how is it that much faster? Is it doing some transactional batching internally?
Recommended answer
Without your code I have to guess, but I believe it comes down to the overhead of creating a new transaction for each object saved in the case of save, versus opening a single transaction in the case of saveAll.
Notice the definitions of save and saveAll: both are annotated with @Transactional. If your project is configured properly, which seems to be the case since entities are being saved to the database, a transaction is created whenever one of these methods is called. If you call save in a loop, a new transaction is created on every call to save. With saveAll there is one call, and therefore one transaction, regardless of the number of entities being saved.
I'm assuming that the test itself is not running within a transaction. If it were, all calls to save would run within that transaction, since the default transaction propagation is Propagation.REQUIRED: if a transaction is already open, the calls join it instead of starting a new one. If you're planning to use Spring Data, I strongly recommend reading about transaction management in Spring.
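To make the transaction-count argument concrete, here is a minimal plain-Java sketch. It does not use Spring at all: ToyTxManager and ToyRepository are hypothetical stand-ins invented for illustration, and the required method only imitates the join-or-begin behaviour of Propagation.REQUIRED. It counts how many transactions get started in each scenario and says nothing about real timings.

```java
import java.util.ArrayList;
import java.util.List;

public class TransactionCountSketch {

    /** Toy stand-in for a transaction manager (hypothetical, NOT Spring's API). */
    static final class ToyTxManager {
        private int depth = 0;   // greater than 0 while a transaction is open
        private int started = 0; // number of transactions actually begun

        /** REQUIRED-like semantics: join an open transaction, otherwise begin one. */
        void required(Runnable work) {
            if (depth == 0) {
                started++; // no transaction open, so a new one begins here
            }
            depth++;
            try {
                work.run();
            } finally {
                depth--; // when depth returns to 0, a real manager would commit
            }
        }

        int started() {
            return started;
        }
    }

    /** Toy repository whose save and saveAll are both "transactional". */
    static final class ToyRepository {
        private final ToyTxManager tx;
        private final List<String> rows = new ArrayList<>();

        ToyRepository(ToyTxManager tx) {
            this.tx = tx;
        }

        void save(String entity) {
            tx.required(() -> rows.add(entity));
        }

        void saveAll(List<String> entities) {
            tx.required(() -> {
                for (String e : entities) {
                    save(e); // joins the transaction opened by saveAll
                }
            });
        }
    }

    static List<String> entities(int n) {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add("entity-" + i);
        }
        return list;
    }

    /** Scenario 1: save called in a loop with no surrounding transaction. */
    static int loopOfSave(int n) {
        ToyTxManager tx = new ToyTxManager();
        ToyRepository repo = new ToyRepository(tx);
        for (String e : entities(n)) {
            repo.save(e);
        }
        return tx.started();
    }

    /** Scenario 2: one saveAll call for the whole list. */
    static int singleSaveAll(int n) {
        ToyTxManager tx = new ToyTxManager();
        ToyRepository repo = new ToyRepository(tx);
        repo.saveAll(entities(n));
        return tx.started();
    }

    /** Scenario 3: save called in a loop, but inside an outer transaction. */
    static int loopOfSaveInsideOuterTransaction(int n) {
        ToyTxManager tx = new ToyTxManager();
        ToyRepository repo = new ToyRepository(tx);
        tx.required(() -> {
            for (String e : entities(n)) {
                repo.save(e);
            }
        });
        return tx.started();
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println("loop of save:              " + loopOfSave(n));
        System.out.println("single saveAll:            " + singleSaveAll(n));
        System.out.println("loop of save, outer tx:    " + loopOfSaveInsideOuterTransaction(n));
    }
}
```

In this toy model the loop of save starts 10,000 transactions, while the single saveAll and the loop wrapped in an outer transaction each start exactly one. In a real application, the third scenario would correspond to calling save in a loop from a method that is itself transactional, which is an assumption about how the benchmark could be restructured rather than something shown in the question.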