本文介绍了尝试升级到Flink 1.3.1时发生异常的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我尝试将群集中的flink版本升级到1.3.1(以及1.3.2),并且在任务管理器中遇到以下异常:

I tried to upgrade my flink version in my cluster to 1.3.1 (and 1.3.2 as well) and I got the following exception in my task managers:

2018-02-28 12:57:27,120 ERROR org.apache.flink.streaming.runtime.tasks.StreamTask           - Error during disposal of stream operator.
org.apache.kafka.common.KafkaException: java.lang.InterruptedException
        at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:424)
        at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerBase.close(FlinkKafkaProducerBase.java:317)
        at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43)
        at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:126)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:429)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:334)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.InterruptedException
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1252)
        at java.lang.Thread.join(Thread.java:1326)
        at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:422)
        ... 7 more

作业管理器显示它无法与任务管理器建立连接.

The job manager showed that it failed to connect with the task managers.

我正在使用 FlinkKafkaProducer08 .有什么想法吗?

I am using FlinkKafkaProducer08.Any ideas?

推荐答案

首先,从上面的堆栈跟踪:它是在操作员清理非优美终止时抛出的(否则,此代码将不会执行).看起来好像应该跟随引起最初问题的实际异常.您可以提供日志的更多部分吗?

First of all, from the stack trace above: it was thrown during operator cleanup of a non-graceful termination (otherwise this code is not executed). It looks as if it should be followed by the real exception that caused the initial problem. Can you provide some more parts of the log?

如果JobManager无法连接到应运行您的作业的任何TaskManager,则整个作业将被取消(并根据您的重试策略重试).在TaskManager端可能会发生相同的情况.这可能是根本原因,需要进一步调查.

If the JobManager failed to connect to any TaskManager that should run your job, the whole job will be cancelled (and retried based on your retry policy). The same may happen on your TaskManager side. That may be the root cause and needs further investigation.

这篇关于尝试升级到Flink 1.3.1时发生异常的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-06 16:45