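I'm using Spring Batch (3.0.1.RELEASE) with JPA and an HSQLDB server database.
I need to walk through an entire table (with paging) and update the items one by one, so I used a JpaPagingItemReader. But when I run the job, some rows are skipped, and the number of skipped rows equals the page size. For example, if my table has 12 rows and jpaPagingItemReader.pagesize = 3, the job reads rows 1, 2, 3, then rows 7, 8, 9 (so rows 4, 5, 6 are skipped)…
Can you tell me what is wrong with my code/configuration, or is this a problem with HSQLDB paging?
Here is my code: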

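[EDIT]: The problem comes from my ItemProcessor, which modifies the POJO entities. Because JpaPagingItemReader flushes between reads, the entities get updated (which is what I want). But it seems the paging position also advances (as seen in the logs: the rows with ids 4, 5 and 6 were skipped). How can I solve this?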

@Configuration
@EnableBatchProcessing(modular=true)
public class AppBatchConfig {
  @Inject
  private InfrastructureConfiguration infrastructureConfiguration;
  @Inject private JobBuilderFactory jobs;
  @Inject private StepBuilderFactory steps;

  @Bean  public Job job() {
     return jobs.get("Myjob1").start(step1()).build();
  }
  @Bean  public Step step1() {
      return steps.get("step1")
                .<SNUserPerCampaign, SNUserPerCampaign> chunk(0)
                .reader(reader()).processor(processor()).build();
  }
  @Bean(destroyMethod = "")
@JobScope
public ItemStreamReader<SNUserPerCampaign> reader() {
    JpaPagingItemReader<SNUserPerCampaign> reader = new JpaPagingItemReader<SNUserPerCampaign>();
    reader.setEntityManagerFactory(infrastructureConfiguration.getEntityManagerFactory());
    reader.setQueryString("select t from SNUserPerCampaign t where t.isactive=true");
    reader.setPageSize(3);
    return reader;
}
 @Bean @JobScope
 public ItemProcessor<SNUserPerCampaign, SNUserPerCampaign> processor() {
     return new MyItemProcessor();
 }
}

@Configuration
@EnableBatchProcessing
public class StandaloneInfrastructureConfiguration implements InfrastructureConfiguration {
 @Inject private EntityManagerFactory emf;
 @Override
public EntityManagerFactory getEntityManagerFactory() {
    return emf;
}
}

From my ItemProcessor:
@Override
public SNUserPerCampaign process(SNUserPerCampaign item) throws Exception {
    // do some stuff …
    // then, if (condition), update the entity POJO:
    item.setModificationDate(new Timestamp(System.currentTimeMillis()));
    item.setIsactive(false);
    return item;
}

From the Spring XML configuration file:
<tx:annotation-driven transaction-manager="transactionManager" />
<bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
    <property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="dataSource" ref="dataSource" />
</bean>

<bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="org.hsqldb.jdbcDriver" />
    <property name="url" value="jdbc:hsqldb:hsql://localhost:9001/MYAppDB" />
    <property name="username" value="sa" />
    <property name="password" value="" />
</bean>

Trace/log excerpt:
11:16:05.728 TRACE MyItemProcessor - item processed: snUserInternalId=1]
11:16:06.038 TRACE MyItemProcessor - item processed: snUserInternalId=2]
11:16:06.350 TRACE MyItemProcessor - item processed: snUserInternalId=3]

11:16:06.674 DEBUG SQL- update SNUSER_CAMPAIGN  set ...etc...
11:16:06.677 DEBUG SQL- update SNUSER_CAMPAIGN  set ...etc...
11:16:06.679 DEBUG SQL- update SNUSER_CAMPAIGN  set ...etc...

11:16:06.681 DEBUG SQL- select ...etc... from  SNUSER_CAMPAIGN snuserperc0_

11:16:06.687 TRACE MyItemProcessor - item processed: snUserInternalId=7]
11:16:06.998 TRACE MyItemProcessor - item processed: snUserInternalId=8]
11:16:07.314 TRACE MyItemProcessor - item processed: snUserInternalId=9]

Best answer

org.springframework.batch.item.database.JpaPagingItemReader creates its own entityManager instance

(from org.springframework.batch.item.database.JpaPagingItemReader#doOpen):

entityManager = entityManagerFactory.createEntityManager(jpaPropertyMap);

If you are inside a transaction (which appears to be the case), the entities read by the reader are not detached
(from org.springframework.batch.item.database.JpaPagingItemReader#doReadPage):
    if (!transacted) {
        List<T> queryResult = query.getResultList();
        for (T entity : queryResult) {
            entityManager.detach(entity);
            results.add(entity);
        }//end if
    } else {
        results.addAll(query.getResultList());
        tx.commit();
    }

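So when you update an item in the processor or the writer, that item is still managed by the reader's entityManager.

When the item reader reads the next page of data, it flushes the persistence context to the database.

So in your case, after the first chunk of data has been processed, the database contains: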
|id|active
|1 | false
|2 | false
|3 | false

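org.springframework.batch.item.database.JpaPagingItemReader uses limit and offset to retrieve paged data. So the next select issued by the reader looks like:
select * from table where active = true limit 3 offset 3

The reader misses the items with ids 4, 5 and 6: rows 1, 2 and 3 no longer match the where clause, so 4, 5 and 6 are now the first rows the query returns, and the offset skips them.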
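
The effect described above can be reproduced without a database. The following is only a minimal in-memory sketch, not Spring Batch code: the class OffsetPagingDemo, its Row type and the page and readAll helpers are hypothetical names invented for illustration, and the list stands in for the table from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the table; this is not Spring Batch code.
class OffsetPagingDemo {

    // One table row: an id plus the flag the query filters on.
    static final class Row {
        final int id;
        boolean active = true;
        Row(int id) { this.id = id; }
    }

    // Emulates "select ... where active = true limit :limit offset :offset" on rows ordered by id.
    static List<Row> page(List<Row> table, int offset, int limit) {
        List<Row> result = new ArrayList<>();
        int skipped = 0;
        for (Row row : table) {
            if (!row.active) continue;                          // where active = true
            if (skipped < offset) { skipped++; continue; }      // offset
            if (result.size() == limit) break;                  // limit
            result.add(row);
        }
        return result;
    }

    // Reads page after page and deactivates every row it processes,
    // mimicking updates that are flushed before the next page is queried.
    static List<Integer> readAll(int rowCount, int pageSize) {
        List<Row> table = new ArrayList<>();
        for (int id = 1; id <= rowCount; id++) table.add(new Row(id));

        List<Integer> processed = new ArrayList<>();
        for (int pageNo = 0; ; pageNo++) {
            List<Row> rows = page(table, pageNo * pageSize, pageSize);
            if (rows.isEmpty()) break;
            for (Row row : rows) {
                processed.add(row.id);
                row.active = false; // the update made by the processor
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(readAll(12, 3)); // prints [1, 2, 3, 7, 8, 9]
    }
}
```

With 12 rows and a page size of 3, the sketch processes the same ids as the log in the question. Rows 4, 5, 6 and 10, 11, 12 are never read, because each page query runs against a result set that has already shrunk.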

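As a workaround, you can use the JDBC implementation (org.springframework.batch.item.database.JdbcPagingItemReader), because it does not use limit and offset. It pages on a sorted column (usually the id column), so you won't miss any data.
Of course, you will then have to update your data in the writer (using either JPA or a pure JDBC implementation).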
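
To illustrate why paging on a sorted key is not affected by the updates, here is a minimal in-memory sketch of that idea. It is not how JdbcPagingItemReader is implemented: the class KeysetPagingDemo and its helpers are hypothetical, and the query in the comment only approximates the kind of statement such a reader would issue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory illustration of paging on a sorted key; not Spring Batch code.
class KeysetPagingDemo {

    static final class Row {
        final int id;
        boolean active = true;
        Row(int id) { this.id = id; }
    }

    // Emulates "select ... where active = true and id > :lastId order by id limit :limit".
    static List<Row> pageAfter(List<Row> table, int lastId, int limit) {
        List<Row> result = new ArrayList<>();
        for (Row row : table) { // the table list is kept sorted by id
            if (!row.active || row.id <= lastId) continue;
            if (result.size() == limit) break;
            result.add(row);
        }
        return result;
    }

    // Reads page after page, remembering the last id seen instead of a row offset.
    static List<Integer> readAll(int rowCount, int pageSize) {
        List<Row> table = new ArrayList<>();
        for (int id = 1; id <= rowCount; id++) table.add(new Row(id));

        List<Integer> processed = new ArrayList<>();
        int lastId = 0; // assumes ids are positive
        while (true) {
            List<Row> rows = pageAfter(table, lastId, pageSize);
            if (rows.isEmpty()) break;
            for (Row row : rows) {
                processed.add(row.id);
                row.active = false; // same update as before
                lastId = row.id;    // the next page starts after this key
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(readAll(12, 3)); // prints [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    }
}
```

Because each page is defined by the last key seen and not by a row count, rows dropping out of the filtered result set do not shift the start of the next page.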

The JdbcPagingItemReader configuration is more verbose:
@Bean
public ItemReader<? extends Entity> reader() {
    JdbcPagingItemReader<Entity> reader = new JdbcPagingItemReader<Entity>();
    final SqlPagingQueryProviderFactoryBean sqlPagingQueryProviderFactoryBean = new SqlPagingQueryProviderFactoryBean();
    sqlPagingQueryProviderFactoryBean.setDataSource(dataSource);
    sqlPagingQueryProviderFactoryBean.setSelectClause("select *");
    sqlPagingQueryProviderFactoryBean.setFromClause("from <your table name>");
    sqlPagingQueryProviderFactoryBean.setWhereClause("where active = true");
    sqlPagingQueryProviderFactoryBean.setSortKey("id");
    try {
        reader.setQueryProvider(sqlPagingQueryProviderFactoryBean.getObject());
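    } catch (Exception e) {
        // fail fast instead of returning a reader without a query provider
        throw new IllegalStateException("Could not create the paging query provider", e);
    }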
    reader.setDataSource(dataSource);
    reader.setPageSize(3);
    reader.setRowMapper(new BeanPropertyRowMapper<Entity>(Entity.class));
    return reader;
}
