Problem description
If the block replication factor is 3 in my Hadoop cluster and every DataNode has 3 ${dfs.data.dir} directories, when a DataNode is chosen to store a block, is the block stored in all 3 directories or in only one of them?
If the answer is the latter, how is the ${dfs.data.dir} directory chosen?
The block is stored in only one of the directories. By default, the directory is chosen in a round-robin manner when the block arrives at the DataNode. You can alter this behavior by setting dfs.datanode.fsdataset.volume.choosing.policy to org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy, in which case the directory is chosen based on the space available in each of them (see the configuration reference: https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml)
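To make the round-robin idea concrete, here is a minimal, self-contained Java sketch. It is not Hadoop's actual implementation: the class names VolumeChooserSketch, Volume, and RoundRobinChooser, the directory paths, and the byte counts are all hypothetical and exist only for illustration. It assumes each data directory reports its free space and that a directory too small for the block is skipped.

```java
import java.util.List;

public class VolumeChooserSketch {

    /** Hypothetical stand-in for one configured data directory on a DataNode. */
    static final class Volume {
        final String path;
        final long availableBytes;

        Volume(String path, long availableBytes) {
            this.path = path;
            this.availableBytes = availableBytes;
        }
    }

    /** Simplified round-robin chooser: each block goes to exactly one directory. */
    static final class RoundRobinChooser {
        private int next = 0; // index of the directory to try first for the next block

        Volume choose(List<Volume> volumes, long blockSize) {
            // Try each directory at most once, starting where the last call left off.
            for (int tried = 0; tried < volumes.size(); tried++) {
                Volume candidate = volumes.get(next);
                next = (next + 1) % volumes.size();
                if (candidate.availableBytes >= blockSize) {
                    return candidate;
                }
            }
            throw new IllegalStateException("no directory has room for the block");
        }
    }

    public static void main(String[] args) {
        // Illustrative directories; the second one is too small for a 100-byte block.
        List<Volume> volumes = List.of(
                new Volume("/data1/dfs", 500L),
                new Volume("/data2/dfs", 50L),
                new Volume("/data3/dfs", 500L));

        RoundRobinChooser chooser = new RoundRobinChooser();
        for (int i = 0; i < 4; i++) {
            System.out.println(chooser.choose(volumes, 100L).path);
        }
    }
}
```

With these made-up numbers the four blocks land in /data1/dfs, /data3/dfs, /data1/dfs, /data3/dfs, in that order: every block is written to a single directory, and the directory that is too small is passed over. A space-based policy would replace the fixed rotation with a choice that takes each directory's free space into account.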