The correct way to stop a custom logback async appender

Problem description

I've created Amazon SQS and SNS logback appenders using Amazon's Java SDK. The basic appenders use the synchronous Java APIs, but I've also created asynchronous versions of both by extending the ch.qos.logback.classic.AsyncAppender class.
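Roughly, each async version attaches the synchronous appender as its delegate and lets AsyncAppender's worker thread drain the event queue into it. A minimal sketch of one way such a wrapper can be put together (class and field names are illustrative, not the actual appender code):

import ch.qos.logback.classic.AsyncAppender;

// Illustrative sketch only: the synchronous appender is attached as the delegate,
// and AsyncAppender's worker thread drains the queued events into it.
public class SqsAsyncAppender extends AsyncAppender {

    // the synchronous appender, like the one shown further down in the answer
    private final SqsAppender delegate = new SqsAppender();

    @Override
    public void start() {
        delegate.setContext(getContext());
        delegate.start();
        addAppender(delegate); // must be attached before super.start()
        super.start();
    }
}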

Stopping the logback logger context with the async appenders does not work as expected, though. When the context is stopped, all async appenders try to flush their remaining events before exiting. The problem originates from the ch.qos.logback.core.AsyncAppenderBase#stop method, which interrupts the worker thread. The interrupt is triggered while the Amazon SDK is still processing the queued events and results in a com.amazonaws.AbortedException. In my tests the AbortedException happened while the SDK was processing a response from the API, so the actual message went through, but this might not always be the case.
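Paraphrased, the relevant part of AsyncAppenderBase#stop looks roughly like this (a simplification from memory, not the exact logback source): the worker is interrupted first and only then joined, so the interrupt can land while the delegate appender is still inside an AWS SDK call.

// Simplified paraphrase of ch.qos.logback.core.AsyncAppenderBase#stop, not the exact source
public void stop() {
    if (!isStarted()) {
        return;
    }
    super.stop();
    worker.interrupt();            // can hit the AWS SDK mid-request
    try {
        worker.join(maxFlushTime); // wait for the remaining queue to be flushed
    } catch (InterruptedException e) {
        addError("Failed to join worker thread", e);
    }
}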

Is it intended that logback interrupts the worker thread even though the worker should still process the remaining event queue? And if so, how can I work around the AbortedException caused by the interrupt? I could override the whole stop method and remove the interrupt, but that would require copy-pasting most of the implementation.

Recommended answer

I finally managed to figure out a solution. I suspect it is not optimal and it is far from simple, but it works.

My first attempt was to use the asynchronous versions of the AWS SDK APIs with the executor that logback provides, because that way the interrupt problem could be avoided. But this didn't work out, because that executor's work queue is shared, and in this case the queue must be appender-specific so that the appender can be stopped correctly. So I needed a dedicated executor for each appender.

First I needed an executor for the AWS clients. The catch with the executor is that the provided thread factory must create daemon threads; otherwise it will block indefinitely when logback's JVM shutdown hook is used.

public static ExecutorService newExecutor(Appender<?> appender, int threadPoolSize) {
    final String name = appender.getName();
    return Executors.newFixedThreadPool(threadPoolSize, new ThreadFactory() {

        private final AtomicInteger idx = new AtomicInteger(1);

        @Override
        public Thread newThread(Runnable r) {
            Thread thread = new Thread(r);
            thread.setName(name + "-" + idx.getAndIncrement());
            thread.setDaemon(true);
            return thread;
        }
    });
}
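This factory is what the bundled appender at the end hands to the AWS async client, so that, per the note above, the client's threads cannot block JVM shutdown indefinitely when logback's shutdown hook is used.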

The next issue was how to stop the appender correctly despite the interrupt. This required handling the InterruptedException with a retry, because otherwise the executor would skip waiting for the queue flush.

public static void shutdown(Appender<?> appender, ExecutorService executor, long waitMillis) {
    executor.shutdown();
    boolean completed = awaitTermination(appender, executor, waitMillis);
    if (!completed) {
        appender.addWarn(format("Executor for %s did not shut down in %d milliseconds, " +
                                "logging events might have been discarded",
                                appender.getName(), waitMillis));
    }
}

private static boolean awaitTermination(Appender<?> appender, ExecutorService executor, long waitMillis) {
    long started = System.currentTimeMillis();
    try {
        return executor.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException ie1) {
        // the worker loop is stopped by interrupt, but the remaining queue should still be handled
        long waited = System.currentTimeMillis() - started;
        if (waited < waitMillis) {
            try {
                return executor.awaitTermination(waitMillis - waited, TimeUnit.MILLISECONDS);
            } catch (InterruptedException ie2) {
                appender.addError(format("Shut down of executor for %s was interrupted",
                                         appender.getName()));
            }
        }
        Thread.currentThread().interrupt();
    }
    return false;
}

Normal logback appenders are expected to work synchronously and therefore shouldn't lose logging events even without a proper shutdown hook. That is a problem with the asynchronous AWS SDK API calls, so I decided to use a countdown latch to give the appender blocking behavior.

public class LoggingEventHandler<REQUEST extends AmazonWebServiceRequest, RESULT> implements AsyncHandler<REQUEST, RESULT> {

    private final ContextAware contextAware;
    private final CountDownLatch latch;
    private final String errorMessage;

    public LoggingEventHandler(ContextAware contextAware, CountDownLatch latch, String errorMessage) {
        this.contextAware = contextAware;
        this.latch = latch;
        this.errorMessage = errorMessage;
    }

    @Override
    public void onError(Exception exception) {
        contextAware.addWarn(errorMessage, exception);
        latch.countDown();
    }

    @Override
    public void onSuccess(REQUEST request, RESULT result) {
        latch.countDown();
    }
}

And the waiting on the latch is handled like this:

public static void awaitLatch(Appender<?> appender, CountDownLatch latch, long waitMillis) {
    if (latch.getCount() > 0) {
        try {
            boolean completed = latch.await(waitMillis, TimeUnit.MILLISECONDS);
            if (!completed) {
                appender.addWarn(format("Appender '%s' did not complete sending event in %d milliseconds, " +
                                        "the event might have been lost",
                                        appender.getName(), waitMillis));
            }
        } catch (InterruptedException ex) {
            appender.addWarn(format("Appender '%s' was interrupted, " +
                                    "a logging event might have been lost or shutdown was initiated",
                                    appender.getName()));
            Thread.currentThread().interrupt();
        }
    }
}

And then everything bundled together. The following example is a simplified version of the real implementation, showing just the parts relevant to this issue.

public class SqsAppender extends UnsynchronizedAppenderBase<ILoggingEvent> {

    private AmazonSQSAsyncClient sqs;

    @Override
    public void start() {
        sqs = new AmazonSQSAsyncClient(
                getCredentials(),
                getClientConfiguration(),
                // daemon-thread executor from above, so shutdown cannot block indefinitely
                AppenderExecutors.newExecutor(this, getThreadPoolSize())
        );
        super.start();
    }

    @Override
    public void stop() {
        super.stop();
        if (sqs != null) {
            AppenderExecutors.shutdown(this, sqs.getExecutorService(), getMaxFlushTime());
            sqs.shutdown();
            sqs = null;
        }
    }

    @Override
    protected void append(final ILoggingEvent eventObject) {
        SendMessageRequest request = ...
        CountDownLatch latch = new CountDownLatch(1);
        sqs.sendMessageAsync(request, new LoggingEventHandler<SendMessageRequest, SendMessageResult>(this, latch, "Error"));
        AppenderExecutors.awaitLatch(this, latch, getMaxFlushTime());
    }
}
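For completeness, a hedged sketch of wiring the appender up programmatically instead of through logback.xml, to show where stopping the logger context triggers the flush-on-shutdown behavior discussed above. The setter call is an illustrative assumption about the appender's configuration properties, not its actual API.

import ch.qos.logback.classic.Logger;
import ch.qos.logback.classic.LoggerContext;
import org.slf4j.LoggerFactory;

public class Wiring {
    public static void main(String[] args) {
        LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();

        SqsAppender sqs = new SqsAppender();
        sqs.setContext(context);
        // sqs.setQueueUrl("...");  // illustrative setter, adjust to the real appender's properties
        sqs.start();

        Logger root = context.getLogger(Logger.ROOT_LOGGER_NAME);
        root.addAppender(sqs);
        root.info("hello from the SQS appender");

        // Stopping the context calls stop() on every attached appender, which is
        // where the executor shutdown and queue flush described above happen.
        context.stop();
    }
}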

All of this was required to handle the following cases properly:

  • Flush the remaining event queue on logback context stop or in the shutdown hook when the async appender wrapper is used
  • Do not block indefinitely when logback's delaying shutdown hook is used
  • Provide blocking behavior when the async appender is not used
  • Survive the interrupt from the async appender stop, which interrupts all in-flight AWS SDK stream handling

The above is used in the open source project Logback extensions, which I maintain.
