Question
I'm trying to read records from a database and write them to a txt file. The database contains 1,800,000 records with 149 columns. The SELECT is defined in jobConfig.xml, in the 'mysqlItemReader' bean, but it appears the query loads every record into JVM memory, which causes an OutOfMemoryError. With a limit of 200000 on randtb.cliente it runs fine, but above roughly 500k records I run out of memory. How can I avoid this error? Thanks!
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:util="http://www.springframework.org/schema/util"
xsi:schemaLocation="http://www.springframework.org/schema/batch
http://www.springframework.org/schema/batch/spring-batch-2.2.xsd
http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">
<import resource="Context.xml" />
<bean id="tutorial" class="extractor.main.Tutorial" scope="prototype" />
<bean id="itemProcessor" class="extractor.main.CustomItemProcessor" />
<batch:job id="helloWorldJob">
<batch:step id="step1">
<batch:tasklet>
<batch:chunk reader="mysqlItemReader" writer="flatFileItemWriter"
processor="itemProcessor" commit-interval="50">
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
<bean id="mysqlItemReader"
class="org.springframework.batch.item.database.JdbcCursorItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="sql" value="select * from randtb.cliente"/>
<property name="rowMapper">
<bean class="extractor.main.TutorialRowMapper"/>
</property>
</bean>
<bean id="flatFileItemWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="file:target/outputfiles/employee_output.txt" />
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.PassThroughLineAggregator" />
</property>
</bean>
Answer
By default, the MySQL JDBC driver returns the entire ResultSet at once, which is what causes your OutOfMemoryError. To prevent that, you need to call JdbcCursorItemReader#setFetchSize(Integer.MIN_VALUE). This tells Spring Batch to set that value on the PreparedStatement, as well as to call PreparedStatement#setFetchDirection(ResultSet.FETCH_FORWARD). Together these tell MySQL to stream the data row by row instead of buffering it, so the heap is never exhausted.
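The same mechanism can be seen in plain JDBC, outside Spring Batch. A minimal sketch, assuming MySQL Connector/J is on the classpath; the connection URL, user, and password are placeholders, not values from the question:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class StreamingReadSketch {
    public static void main(String[] args) throws SQLException {
        // Placeholder connection details for illustration only
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/randtb", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                 "select * from randtb.cliente",
                 ResultSet.TYPE_FORWARD_ONLY,
                 ResultSet.CONCUR_READ_ONLY)) {
            // Integer.MIN_VALUE is Connector/J's signal to stream rows
            // one at a time instead of buffering the whole ResultSet.
            ps.setFetchSize(Integer.MIN_VALUE);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // process one row at a time; memory use stays flat
                }
            }
        }
    }
}
```

JdbcCursorItemReader does this wiring for you; the sketch only illustrates what its fetchSize property maps to at the JDBC level.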
So for your specific example, you need to change your ItemReader configuration to:
<bean id="mysqlItemReader"
class="org.springframework.batch.item.database.JdbcCursorItemReader">
<property name="dataSource" ref="dataSource"/>
<property name="sql" value="select * from randtb.cliente"/>
<property name="fetchSize" value="#{T(java.lang.Integer).MIN_VALUE}"/>
<property name="rowMapper">
<bean class="extractor.main.TutorialRowMapper"/>
</property>
</bean>
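For reference, a roughly equivalent Java-based configuration. This is a sketch assuming a newer Spring Batch (4.x+, where JdbcCursorItemReaderBuilder exists; the XML above targets 2.2) and reusing the Tutorial and TutorialRowMapper classes from the question:

```java
import javax.sql.DataSource;
import org.springframework.batch.item.database.JdbcCursorItemReader;
import org.springframework.batch.item.database.builder.JdbcCursorItemReaderBuilder;
import org.springframework.context.annotation.Bean;

// Sketch only: assumes Spring Batch 4.x+ and the question's own
// Tutorial / TutorialRowMapper classes.
@Bean
public JdbcCursorItemReader<Tutorial> mysqlItemReader(DataSource dataSource) {
    return new JdbcCursorItemReaderBuilder<Tutorial>()
            .name("mysqlItemReader")
            .dataSource(dataSource)
            .sql("select * from randtb.cliente")
            .fetchSize(Integer.MIN_VALUE) // stream rows instead of buffering the full ResultSet
            .rowMapper(new TutorialRowMapper())
            .build();
}
```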
You can read more about how this works in MySQL in the Connector/J documentation: https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html (see the ResultSet section).