initializeEntitiesAndCollections

Problem description

I have a classic Java EE system: a web tier with JSF, EJB 3 for the business logic, and Hibernate 3 doing the data access to a DB2 database. I am struggling with the following scenario: a user initiates a process which involves retrieving a large data set from the database. The retrieval takes some time, so the user does not receive an immediate response, gets impatient, opens a new browser and initiates the retrieval again, sometimes multiple times. The EJB container is obviously unaware that the earlier retrievals are no longer relevant, and when the database returns a result set, Hibernate starts populating a set of POJOs which take up vast amounts of memory, eventually causing an OutOfMemoryError.

A potential solution that I thought of was to use the Hibernate Session's cancelQuery method. However, the cancelQuery method only works before the database returns a result set. Once the database returns a result set and Hibernate begins populating the POJOs, the cancelQuery method no longer has an effect. In this case, the database queries themselves return rather quickly, and the bulk of the performance overhead seems to reside in populating the POJOs, at which point we can no longer call the cancelQuery method.

Solution

The solution implemented ended up looking like this:

The general idea was to maintain a map of all the Hibernate sessions that are currently running queries to the HttpSession of the user who initiated them, so that when the user would close the browser we would be able to kill the running queries.

There were two main challenges to overcome here. One was propagating the HTTP session-id from the web tier to the EJB tier without interfering with all the method calls along the way - i.e. not tampering with existing code in the system. The second challenge was to figure out how to cancel the queries once the database had already started returning results and Hibernate was populating objects with the results.

The first problem was overcome based on our realization that all methods being called along the stack were being handled by the same thread. This makes sense, as our application lives entirely within one container and does not make any remote calls. Given that, we created a Servlet Filter that intercepts every call to the application and sets a ThreadLocal variable holding the current HTTP session-id. This way the HTTP session-id is available to every method called further down the stack.
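As a rough illustration of the ThreadLocal idea, here is a minimal sketch in plain Java. The class name SessionIdHolder and the method runWithSessionId are hypothetical and not from our code base. In a real Servlet Filter the body of runWithSessionId would sit in doFilter(), with chain.doFilter(request, response) in place of work.run(), and the session-id would come from the request's HttpSession.

```java
/** Hypothetical holder for the HTTP session-id of the request served by the current thread. */
final class SessionIdHolder {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private SessionIdHolder() {}

    /** Returns the session-id bound to the calling thread, or null if none is bound. */
    static String get() {
        return CURRENT.get();
    }

    /**
     * Binds sessionId to the calling thread while work runs.
     * Assumption: in a Servlet Filter, work.run() would be chain.doFilter(request, response).
     */
    static void runWithSessionId(String sessionId, Runnable work) {
        CURRENT.set(sessionId);
        try {
            work.run();
        } finally {
            // Container threads are pooled, so clear the value before the thread is reused.
            CURRENT.remove();
        }
    }

    public static void main(String[] args) {
        runWithSessionId("session-42", () ->
                System.out.println("deep in the call stack: " + get()));
        System.out.println("after the request: " + get());
    }
}
```

Clearing the value in the finally block matters because the container reuses its threads, and a stale session-id would otherwise leak into an unrelated request.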

The second challenge was a little more sticky. We discovered that the Hibernate method responsible for running the queries and subsequently populating the POJOs is called doQuery and is located in the org.hibernate.loader.Loader class. (We happen to be using Hibernate 3.5.3, but the same holds true for newer versions of Hibernate.)

private List doQuery(
        final SessionImplementor session,
        final QueryParameters queryParameters,
        final boolean returnProxies) throws SQLException, HibernateException {

    final RowSelection selection = queryParameters.getRowSelection();
    final int maxRows = hasMaxRows( selection ) ?
            selection.getMaxRows().intValue() :
            Integer.MAX_VALUE;

    final int entitySpan = getEntityPersisters().length;

    final ArrayList hydratedObjects = entitySpan == 0 ? null : new ArrayList( entitySpan * 10 );
    final PreparedStatement st = prepareQueryStatement( queryParameters, false, session );
    final ResultSet rs = getResultSet( st, queryParameters.hasAutoDiscoverScalarTypes(), queryParameters.isCallable(), selection, session );

    final EntityKey optionalObjectKey = getOptionalObjectKey( queryParameters, session );
    final LockMode[] lockModesArray = getLockModes( queryParameters.getLockOptions() );
    final boolean createSubselects = isSubselectLoadingEnabled();
    final List subselectResultKeys = createSubselects ? new ArrayList() : null;
    final List results = new ArrayList();

    try {

        handleEmptyCollections( queryParameters.getCollectionKeys(), rs, session );

        EntityKey[] keys = new EntityKey[entitySpan]; //we can reuse it for each row

        if ( log.isTraceEnabled() ) log.trace( "processing result set" );

        int count;
        for ( count = 0; count < maxRows && rs.next(); count++ ) {

            if ( log.isTraceEnabled() ) log.debug("result set row: " + count);

            Object result = getRowFromResultSet(
                    rs,
                    session,
                    queryParameters,
                    lockModesArray,
                    optionalObjectKey,
                    hydratedObjects,
                    keys,
                    returnProxies
            );
            results.add( result );

            if ( createSubselects ) {
                subselectResultKeys.add(keys);
                keys = new EntityKey[entitySpan]; //can't reuse in this case
            }

        }

        if ( log.isTraceEnabled() ) {
            log.trace( "done processing result set (" + count + " rows)" );
        }

    }
    finally {
        session.getBatcher().closeQueryStatement( st, rs );
    }

    initializeEntitiesAndCollections( hydratedObjects, rs, session, queryParameters.isReadOnly( session ) );

    if ( createSubselects ) createSubselects( subselectResultKeys, queryParameters, session );

    return results; //getResultList(results);

}

In this method you can see that the results are first brought from the database in the form of a good old fashioned java.sql.ResultSet, after which the method loops over each row and creates an object from it. Some additional initialization is performed in the initializeEntitiesAndCollections() method called after the loop. After debugging a little, we discovered that the bulk of the performance overhead was in these two sections of the method, and not in the part that gets the java.sql.ResultSet from the database, whereas the cancelQuery method is only effective on that first part. The solution therefore was to add an additional condition to the for loop, to check whether the thread is interrupted, like this:

for ( count = 0; count < maxRows && rs.next() && !Thread.currentThread().isInterrupted(); count++ ) {
    // ...
}

as well as to perform the same check before calling the initializeEntitiesAndCollections() method:

if (!Thread.interrupted()) {

    initializeEntitiesAndCollections(hydratedObjects, rs, session,
                queryParameters.isReadOnly(session));
    if (createSubselects) {

        createSubselects(subselectResultKeys, queryParameters, session);
    }
}
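The two checks use different Thread methods, and the difference is easy to miss. The following is a minimal, Hibernate-free sketch of the same pattern. The class InterruptibleLoad, the iterator of strings and the interruptAfter parameter are stand-ins invented for illustration, and the thread interrupts itself only to make the example deterministic. In the real system the interrupt comes from another thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the patched doQuery loop. */
final class InterruptibleLoad {

    /** Copies rows until they run out or the current thread is interrupted. */
    static List<String> load(Iterator<String> rows, int interruptAfter) {
        List<String> results = new ArrayList<>();
        // isInterrupted() only reads the interrupt flag and leaves it set.
        for (int count = 0; rows.hasNext() && !Thread.currentThread().isInterrupted(); count++) {
            results.add(rows.next());
            if (count + 1 == interruptAfter) {
                Thread.currentThread().interrupt(); // simulates a cancel request
            }
        }
        // Thread.interrupted() reads the flag AND clears it.
        if (!Thread.interrupted()) {
            results.add("initialized"); // stand-in for initializeEntitiesAndCollections()
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        System.out.println(load(rows.iterator(), 2));  // stops early
        System.out.println(load(rows.iterator(), -1)); // never interrupted
        System.out.println("flag still set: " + Thread.currentThread().isInterrupted());
    }
}
```

In this sketch the first call returns only the rows read before the interrupt and skips the initialization step, while the second call reads everything. The flag is no longer set afterwards because Thread.interrupted() cleared it.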

Additionally, because the second check calls Thread.interrupted() rather than isInterrupted(), the interrupt flag is cleared and does not affect the further functioning of the program. Now when a query is to be canceled, the canceling method looks up the Hibernate session and thread stored in a map keyed by the HTTP session-id, calls the cancelQuery method on the session, and calls the interrupt method of the thread.
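The map described above could look roughly like the sketch below. This is not our production code: RunningQueryRegistry, CancellableSession and the method names are hypothetical, and CancellableSession merely stands in for the single Hibernate call involved, Session.cancelQuery(). The sketch also assumes at most one running query per HTTP session.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Hypothetical registry of running queries, keyed by HTTP session-id. */
final class RunningQueryRegistry {

    /** Stand-in for the one Hibernate method used here, Session.cancelQuery(). */
    interface CancellableSession {
        void cancelQuery();
    }

    /** A session together with the thread that is executing its query. */
    private static final class Entry {
        final CancellableSession session;
        final Thread worker;

        Entry(CancellableSession session, Thread worker) {
            this.session = session;
            this.worker = worker;
        }
    }

    private final Map<String, Entry> running = new ConcurrentHashMap<>();

    /** Called on the worker thread just before it runs a query. */
    void register(String httpSessionId, CancellableSession session) {
        running.put(httpSessionId, new Entry(session, Thread.currentThread()));
    }

    /** Called on the worker thread once the query has completed. */
    void unregister(String httpSessionId) {
        running.remove(httpSessionId);
    }

    /** Cancels the query of the given HTTP session; returns false if none is registered. */
    boolean cancel(String httpSessionId) {
        Entry entry = running.remove(httpSessionId);
        if (entry == null) {
            return false;
        }
        entry.session.cancelQuery(); // helps while the database is still executing
        entry.worker.interrupt();    // helps once objects are being populated
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        RunningQueryRegistry registry = new RunningQueryRegistry();
        CountDownLatch registered = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            registry.register("session-42", () -> System.out.println("cancelQuery called"));
            registered.countDown();
            while (!Thread.currentThread().isInterrupted()) {
                Thread.yield(); // pretend to populate objects
            }
            System.out.println("worker stopped");
        });
        worker.start();
        registered.await();            // wait until the worker has registered itself
        registry.cancel("session-42"); // e.g. triggered when the HTTP session ends
        worker.join();
    }
}
```

Calling both cancelQuery and interrupt covers either phase: the first acts while the database is still working, the second once the patched loop is populating objects.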

08-05 22:33