Problem description
I hear there is a way to assign 32 cores, or however many cores you have, to a single container in Hadoop 2.7 YARN.
Is this possible, and does someone have a sample configuration showing what I need to change to achieve it?
The test would be TeraSort, assigning my 40 cores to a single-container job.
For vCores, the following are the relevant configurations:
yarn.scheduler.maximum-allocation-vcores - Specifies the maximum number of vCores that can be allocated for a single container request
You set this value in yarn-site.xml, typically to 32. I think any container request for more vCores than this maximum will be rejected by YARN.
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>32</value>
</property>
If this value is not set, the YARN ResourceManager falls back to the default value, which is 4:
public static final int DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES = 4;
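One related setting is worth checking, though it is not part of the answer above: each NodeManager advertises how many vCores it offers through yarn.nodemanager.resource.cpu-vcores, and a container can only be placed on a node that offers at least as many vCores as the container requests. A minimal sketch for yarn-site.xml, assuming a node with the 40 cores mentioned in the question:

```xml
<!-- yarn-site.xml on each NodeManager.
     40 is an assumed value based on the 40 cores in the question, not a recommendation. -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>40</value>
</property>
```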
If you are running a MapReduce application, you also need to set two more configuration parameters in mapred-site.xml:
- mapreduce.map.cpu.vcores - The number of vCores to request from the scheduler for each map task
- mapreduce.reduce.cpu.vcores - The number of vCores to request from the scheduler for each reduce task
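As a rough sketch, the two MapReduce properties could look like this in mapred-site.xml. The value 32 is only an illustration and is not taken from a tested setup; whatever you choose should not exceed yarn.scheduler.maximum-allocation-vcores.

```xml
<!-- mapred-site.xml: illustrative values only.
     Each map or reduce task will request this many vCores for its container. -->
<property>
  <name>mapreduce.map.cpu.vcores</name>
  <value>32</value>
</property>
<property>
  <name>mapreduce.reduce.cpu.vcores</name>
  <value>32</value>
</property>
```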
The resource calculation for your mapper/reducer requests is done in the scheduler code. If you want your scheduler to take both memory and CPU into account when calculating resources, you need to use DominantResourceCalculator.
For example, if you are using the Capacity Scheduler, you need to specify the following parameter in capacity-scheduler.xml:
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
Please check this link: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_ig_yarn_tuning.html
It gives a detailed description of the various configuration parameters.