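I am using Spring Batch (3.0.1.RELEASE) with JPA and an HSQLDB server database.
I need to browse a whole table (using paging) and update the items one by one, so I used a JpaPagingItemReader. But when I run the job I can see that some rows are skipped, and the number of skipped rows equals the page size. For example, if my table has 12 rows and jpaPagingItemReader.pagesize = 3, the job reads rows 1, 2, 3, then rows 7, 8, 9 (so rows 4, 5, 6 are skipped)…
Could you tell me what is wrong with my code/configuration, or is this a problem with HSQLDB paging?
Here is my code:
[Edit]: The problem comes from my ItemProcessor, which modifies the entity POJOs. Because the JpaPagingItemReader flushes between reads, the entities are updated (which is what I want). But it seems the paging position is also advanced (as the log shows, the rows with IDs 4, 5 and 6 were skipped). How can I fix this?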
@Configuration
@EnableBatchProcessing(modular=true)
public class AppBatchConfig {
    @Inject
    private InfrastructureConfiguration infrastructureConfiguration;
    @Inject private JobBuilderFactory jobs;
    @Inject private StepBuilderFactory steps;

    @Bean public Job job() {
        return jobs.get("Myjob1").start(step1()).build();
    }

    @Bean public Step step1() {
        return steps.get("step1")
                .<SNUserPerCampaign, SNUserPerCampaign> chunk(0)
                .reader(reader()).processor(processor()).build();
    }

    @Bean(destroyMethod = "")
    @JobScope
    public ItemStreamReader<SNUserPerCampaign> reader() {
        JpaPagingItemReader<SNUserPerCampaign> reader = new JpaPagingItemReader<SNUserPerCampaign>();
        reader.setEntityManagerFactory(infrastructureConfiguration.getEntityManagerFactory());
        // Only active rows are selected; the processor later sets isactive to false.
        reader.setQueryString("select t from SNUserPerCampaign t where t.isactive=true");
        reader.setPageSize(3);
        return reader;
    }

    @Bean @JobScope
    public ItemProcessor<SNUserPerCampaign, SNUserPerCampaign> processor() {
        return new MyItemProcessor();
    }
}

@Configuration
@EnableBatchProcessing
public class StandaloneInfrastructureConfiguration implements InfrastructureConfiguration {
    @Inject private EntityManagerFactory emf;

    @Override
    public EntityManagerFactory getEntityManagerFactory() {
        return emf;
    }
}
From my ItemProcessor:
@Override
public SNUserPerCampaign process(SNUserPerCampaign item) throws Exception {
    // do some stuff …
    // then, if (condition), update the entity POJO:
    item.setModificationDate(new Timestamp(System.currentTimeMillis()));
    item.setIsactive(false);
    return item;
}
From the Spring XML configuration file:
<tx:annotation-driven transaction-manager="transactionManager" />
<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="dataSource" ref="dataSource" />
</bean>
<bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
<property name="driverClassName" value="org.hsqldb.jdbcDriver" />
<property name="url" value="jdbc:hsqldb:hsql://localhost:9001/MYAppDB" />
<property name="username" value="sa" />
<property name="password" value="" />
</bean>
Trace/log excerpt:
11:16:05.728 TRACE MyItemProcessor - item processed: snUserInternalId=1]
11:16:06.038 TRACE MyItemProcessor - item processed: snUserInternalId=2]
11:16:06.350 TRACE MyItemProcessor - item processed: snUserInternalId=3]
11:16:06.674 DEBUG SQL- update SNUSER_CAMPAIGN set ...etc...
11:16:06.677 DEBUG SQL- update SNUSER_CAMPAIGN set ...etc...
11:16:06.679 DEBUG SQL- update SNUSER_CAMPAIGN set ...etc...
11:16:06.681 DEBUG SQL- select ...etc... from SNUSER_CAMPAIGN snuserperc0_
11:16:06.687 TRACE MyItemProcessor - item processed: snUserInternalId=7]
11:16:06.998 TRACE MyItemProcessor - item processed: snUserInternalId=8]
11:16:07.314 TRACE MyItemProcessor - item processed: snUserInternalId=9]
Best answer
org.springframework.batch.item.database.JpaPagingItemReader creates its own EntityManager instance
(from org.springframework.batch.item.database.JpaPagingItemReader#doOpen):
entityManager = entityManagerFactory.createEntityManager(jpaPropertyMap);
If you are inside a transaction (which seems to be the case here), the entities returned by the reader are not detached
(from org.springframework.batch.item.database.JpaPagingItemReader#doReadPage):
if (!transacted) {
List<T> queryResult = query.getResultList();
for (T entity : queryResult) {
entityManager.detach(entity);
results.add(entity);
}//end if
} else {
results.addAll(query.getResultList());
tx.commit();
}
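So when you update an item in the processor or the writer, that item is still managed by the reader's EntityManager.
When the item reader reads the next page of data, it flushes the persistence context to the database.
So, in your case, after the first chunk has been processed, the database contains: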
|id|active
|1 | false
|2 | false
|3 | false
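org.springframework.batch.item.database.JpaPagingItemReader uses limit and offset to retrieve paged data. So the next select issued by the reader looks like this:
select * from table where active = true offset 3 limit 3
The reader misses the items with IDs 4, 5 and 6: now that rows 1, 2 and 3 no longer match the where clause, rows 4, 5 and 6 are the first rows of the result set, and the offset of 3 skips over them.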
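The effect can be reproduced without a database. The following is a minimal sketch in plain Java, not Spring Batch or JPA code: the class OffsetPagingDemo and its methods are hypothetical names made up for illustration, and the "table" is just an array of active flags with IDs starting at 1. It assumes, as in the question, that every processed row is deactivated before the next page is read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OffsetPagingDemo {

    /** Models "select id from table where active = true offset ? limit ?". */
    public static List<Integer> readPage(boolean[] active, int offset, int limit) {
        List<Integer> page = new ArrayList<>();
        int matched = 0; // number of active rows seen so far
        for (int i = 0; i < active.length && page.size() < limit; i++) {
            if (!active[i]) {
                continue; // row no longer matches the where clause
            }
            if (matched++ >= offset) {
                page.add(i + 1); // IDs start at 1
            }
        }
        return page;
    }

    /** Reads page after page, deactivating each row as the processor would. */
    public static List<Integer> processAll(int rowCount, int pageSize) {
        boolean[] active = new boolean[rowCount];
        Arrays.fill(active, true);
        List<Integer> processed = new ArrayList<>();
        int pageIndex = 0;
        while (true) {
            List<Integer> page = readPage(active, pageIndex * pageSize, pageSize);
            if (page.isEmpty()) {
                break;
            }
            for (int id : page) {
                active[id - 1] = false; // the update is flushed before the next read
                processed.add(id);
            }
            pageIndex++; // the offset grows although the result set has shrunk
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(processAll(12, 3)); // prints [1, 2, 3, 7, 8, 9]
    }
}
```

With 12 rows and a page size of 3, this model processes rows 1, 2, 3 and then 7, 8, 9, which matches the log in the question.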
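As a workaround, you can use the JDBC implementation (org.springframework.batch.item.database.JdbcPagingItemReader), because it does not page by row offset. It is based on a sorted column (typically the id column) and fetches each page starting after the last key already read, so you will not miss any data.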
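To see why paging on a sort key is not affected by the updates, here is a minimal sketch in plain Java. It is a simplified in-memory model, not the actual JdbcPagingItemReader: the class KeysetPagingDemo and its methods are hypothetical names, and the model assumes each page is selected with a condition of the form "id > last id read", ordered by id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class KeysetPagingDemo {

    /** Models "select id from table where active = true and id > ? order by id", capped at limit rows. */
    public static List<Integer> readPage(boolean[] active, int lastId, int limit) {
        List<Integer> page = new ArrayList<>();
        for (int id = lastId + 1; id <= active.length && page.size() < limit; id++) {
            if (active[id - 1]) {
                page.add(id);
            }
        }
        return page;
    }

    /** Reads page after page, deactivating each row as the processor would. */
    public static List<Integer> processAll(int rowCount, int pageSize) {
        boolean[] active = new boolean[rowCount];
        Arrays.fill(active, true);
        List<Integer> processed = new ArrayList<>();
        int lastId = 0; // sort key of the last row read so far
        while (true) {
            List<Integer> page = readPage(active, lastId, pageSize);
            if (page.isEmpty()) {
                break;
            }
            for (int id : page) {
                active[id - 1] = false; // updating the row does not move the start of the next page
                processed.add(id);
            }
            lastId = page.get(page.size() - 1);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(processAll(12, 3)); // prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    }
}
```

Because the start of each page depends only on the last key read, rows that drop out of the where clause behind that key cannot shift the rows that are still to be read.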
Of course, you will then have to write the updates in a writer (using JPA or a plain JDBC implementation).
The reader will be more verbose:
@Bean
public ItemReader<? extends Entity> reader() {
    JdbcPagingItemReader<Entity> reader = new JdbcPagingItemReader<Entity>();
    final SqlPagingQueryProviderFactoryBean sqlPagingQueryProviderFactoryBean = new SqlPagingQueryProviderFactoryBean();
    sqlPagingQueryProviderFactoryBean.setDataSource(dataSource);
    sqlPagingQueryProviderFactoryBean.setSelectClause("select *");
    sqlPagingQueryProviderFactoryBean.setFromClause("from <your table name>");
    sqlPagingQueryProviderFactoryBean.setWhereClause("where active = true");
    // Pages are built on this sorted column rather than on a row offset.
    sqlPagingQueryProviderFactoryBean.setSortKey("id");
    try {
        reader.setQueryProvider(sqlPagingQueryProviderFactoryBean.getObject());
    } catch (Exception e) {
        // Fail fast: the reader cannot work without a query provider.
        throw new IllegalStateException("Could not create the paging query provider", e);
    }
    reader.setDataSource(dataSource);
    reader.setPageSize(3);
    reader.setRowMapper(new BeanPropertyRowMapper<Entity>(Entity.class));
    return reader;
}