Problem description
Can someone help me understand the relation between JVM and containers in YARN?
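- How are JVMs created? Is there one JVM per task? Can multiple tasks run in the same JVM at the same time? (I know about ubertasking, where many tasks (map/reduce) can run one after another in a single JVM.)
- Is each container one JVM? Or are there multiple containers in a single JVM? Or is there no relation between JVMs and containers?
- When the ResourceManager allocates containers for a job, do multiple tasks of the same job that run on the same node share one container? Or does each task get a separate container, depending on availability?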
Pointers to some useful links would also be helpful.
Recommended answer
Yes, there is a relation, and for MapReduce tasks it is one-to-one: for each container that needs to be created, a new Java process (JVM) is spawned.
Now, if you are not running in uber mode, consider the following:
Tasks are scheduled to run on some node in the cluster. The capacity of a container is decided by the task's requirements (memory and CPU). The parameters that control this are described in the links below.
Each task attempt runs in its own JVM.
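To make the sizing step concrete, here is a minimal sketch of how a task's memory request might turn into a container size and a JVM heap size. It is a simplified model, not YARN's actual code. It assumes the scheduler rounds requests up to a multiple of yarn.scheduler.minimum-allocation-mb, rejects requests above yarn.scheduler.maximum-allocation-mb, and that the heap is about 80% of the container. The function names, the default values and the 0.8 ratio are illustrative assumptions, so check them against your own cluster configuration.

```python
import math


def normalize_request_mb(requested_mb, min_alloc_mb=1024, max_alloc_mb=8192):
    """Round a memory request up to a multiple of the scheduler minimum.

    Simplified model (assumption): requests above the maximum are rejected.
    """
    if requested_mb <= 0:
        raise ValueError("requested_mb must be positive")
    rounded = math.ceil(requested_mb / min_alloc_mb) * min_alloc_mb
    if rounded > max_alloc_mb:
        raise ValueError("request exceeds the maximum container size")
    return rounded


def heap_mb(container_mb, heap_ratio=0.8):
    """Heap size (-Xmx) for the one JVM running in the container.

    The 0.8 ratio is an assumed rule of thumb; the rest of the container
    is headroom for non-heap JVM memory.
    """
    return int(container_mb * heap_ratio)


# A map task asking for 1536 MB (e.g. via mapreduce.map.memory.mb)
container = normalize_request_mb(1536)
print(container, heap_mb(container))
```

Under these assumptions a 1536 MB request becomes a 2048 MB container, and the single JVM inside it gets a heap of about 1638 MB.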
A separate container is spawned for each task, subject to resource availability in the cluster.
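As a rough illustration of what "resource availability" means on a single node, the sketch below estimates how many task containers (and therefore task JVMs) could run there at once. It assumes the NodeManager advertises its memory and vcores (yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores) and that the scheduler takes both into account. Some scheduler configurations consider memory only, so treat the vcore limit as an assumption. The function name and the numbers are hypothetical.

```python
def max_concurrent_containers(node_mb, node_vcores, container_mb, container_vcores=1):
    """Upper bound on same-sized containers that fit on one node.

    Each container hosts one task JVM, so this is also the number of
    tasks the node could run at the same time (simplified model).
    """
    if min(node_mb, node_vcores, container_mb, container_vcores) <= 0:
        raise ValueError("all sizes must be positive")
    by_memory = node_mb // container_mb
    by_vcores = node_vcores // container_vcores
    return min(by_memory, by_vcores)


# Hypothetical node: 24 GB and 8 vcores offered to YARN, 2 GB containers
print(max_concurrent_containers(24576, 8, 2048))
```

In this example memory would allow 12 containers, but the 8 vcores cap it at 8. Any further tasks of the job wait until a container on this or another node becomes free.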
Here are some very helpful links:
http://ercoppa.github.io/HadoopInternals/AnatomyMapReduceJob.html
https://blog.cloudera.com/blog/2015/09/untangling-apache-hadoop-yarn-part-1/
http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/