Problem description
I'm new to Flink and am trying to understand the following:
- job
- task
- subtask
I searched the docs but still don't get it. What's the main difference between them?
Recommended answer
Tasks and sub-tasks are explained here: https://ci.apache.org/projects/flink/flink-docs-release-1.7/concepts/runtime.html#tasks-and-operator-chains
A task is an abstraction representing a chain of operators that can be executed in a single thread. Something like a keyBy (which causes a network shuffle to partition the stream by some key) or a change in the parallelism of the pipeline will break the chaining and force operators into separate tasks. In the diagram in the linked docs, the application has three tasks.
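The chaining rule just described can be sketched in plain Python. This is a simplified model, not Flink code: the `Operator` class, the `shuffled_input` flag, and `chain_into_tasks` are hypothetical names made up for illustration, and the sketch only applies the two chain-breaking conditions mentioned above (a network shuffle or a change in parallelism). Flink itself may apply further conditions.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    parallelism: int
    # Hypothetical flag: True if this operator's input arrives through a
    # network shuffle, for example because it sits downstream of a keyBy.
    shuffled_input: bool = False


def chain_into_tasks(operators):
    """Group a linear pipeline of operators into tasks (operator chains)."""
    tasks = []
    for op in operators:
        can_chain = (
            bool(tasks)
            and not op.shuffled_input
            and tasks[-1][-1].parallelism == op.parallelism
        )
        if can_chain:
            tasks[-1].append(op)  # same thread as the previous operator
        else:
            tasks.append([op])    # chain is broken: start a new task
    return tasks


# The pipeline from the diagram: source/map, keyBy/window/apply, sink.
pipeline = [
    Operator("Source", 2),
    Operator("map", 2),
    Operator("keyBy/window/apply", 2, shuffled_input=True),
    Operator("Sink", 1),
]

tasks = chain_into_tasks(pipeline)
for task in tasks:
    print(" -> ".join(op.name for op in task))
```

Under these assumptions the pipeline splits into three tasks: the keyBy breaks the chain after map, and the drop in parallelism from two to one breaks it again before the sink.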
A subtask is one parallel slice of a task. This is the schedulable, runnable unit of execution. In the diagram in the linked docs, the application is to be run with a parallelism of two for the source/map and keyBy/Window/apply tasks, and a parallelism of one for the sink, resulting in a total of 5 subtasks (2 + 2 + 1).
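To make the count concrete, here is a minimal sketch that expands each task into its parallel slices. It does not use Flink; `task_parallelism` and `expand_subtasks` are hypothetical names, and the "(index/parallelism)" label format is only illustrative.

```python
# The three tasks described above, mapped to their parallelism.
task_parallelism = {
    "Source -> map": 2,
    "keyBy/window/apply": 2,
    "Sink": 1,
}


def expand_subtasks(task_parallelism):
    """Return one label per parallel slice of each task."""
    return [
        f"{task} ({index}/{parallelism})"
        for task, parallelism in task_parallelism.items()
        for index in range(1, parallelism + 1)
    ]


subtasks = expand_subtasks(task_parallelism)
for label in subtasks:
    print(label)
print(len(subtasks))  # 2 + 2 + 1 = 5
```

Each label stands for one schedulable unit, so the number of subtasks is simply the sum of the task parallelisms.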
A job is a running instance of an application. Clients submit jobs to the JobManager, which slices them into subtasks and schedules those subtasks for execution by the TaskManagers.
Update:
The community decided to re-align the definitions of task and sub-task to match how these terms are used in the code, which means that task and sub-task now mean the same thing: exactly one parallel instance of an operator or operator chain. See the Glossary for more details.