java - How to append to an HDFS file on an extremely small cluster (3 nodes or fewer)

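I am trying to append to a file on HDFS on a single-node cluster. I also tried on a 2-node cluster and got the same exceptions.

In hdfs-site, I have dfs.replication set to 1. If I set dfs.client.block.write.replace-datanode-on-failure.policy to DEFAULT, I get the following exception: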

java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010], original=[10.10.37.16:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.

If I follow the recommendation in the comment in hdfs-default.xml for extremely small clusters (3 nodes or fewer) and set dfs.client.block.write.replace-datanode-on-failure.policy to NEVER, I get the following exception:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot append to file/user/hadoop/test. Name node is in safe mode.
The reported blocks 1277 has reached the threshold 1.0000 of total blocks 1277. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 3 seconds.
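
For reference, the NEVER setting described above would correspond to a property entry along these lines in the client-side hdfs-site.xml. This is only a sketch of the fragment, assuming the standard Hadoop property layout; the property name and value are the ones quoted in the exception message and in the question.

```xml
<!-- Client-side hdfs-site.xml fragment (sketch): never try to replace a failed datanode in the write pipeline -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
```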

Here is how I am trying to append:

Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://MY-MACHINE:8020/user/hadoop");
conf.set("hadoop.job.ugi", "hadoop");

FileSystem fs = FileSystem.get(conf);
OutputStream out = fs.append(new Path("/user/hadoop/test"));

PrintWriter writer = new PrintWriter(out);
writer.print("hello world");
writer.close();

Am I doing something wrong in the code?
Or is something missing in the configuration?
Any help would be appreciated!

EDIT

Even though dfs.replication is set to 1, when I check the status of the file through

FileStatus[] status = fs.listStatus(new Path("/user/hadoop"));

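I find that status[i].block_replication is set to 3. I don't think this is the problem, because when I changed the value of dfs.replication to 0 I got a related exception, so it evidently does honor the value of dfs.replication. Still, to be on the safe side, is there a way to change the block_replication value per file?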

Best answer

As I mentioned in the edit, even though dfs.replication is set to 1, fileStatus.block_replication is set to 3.

One possible solution is to run

hadoop fs -setrep -w 1 -R /user/hadoop/

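This recursively changes the replication factor of every file under the given directory (-w makes the command wait until the new replication factor has been reached). The command is documented in the Hadoop FileSystem shell guide.

What remains is to find out why the value in hdfs-site.xml is being ignored, and how to force 1 as the default.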

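EDIT

It turns out that the dfs.replication property must also be set on the Configuration instance; otherwise the client requests the default replication factor of 3 for the file, regardless of the value set in hdfs-site.xml.

Adding the following statement to the code solves the problem.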

conf.set("dfs.replication", "1");
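
Put together with the code from the question, the fix might look roughly like the sketch below. The class name AppendExample and the helper buildConf are made up for illustration, MY-MACHINE is the placeholder host from the question, and running main assumes the Hadoop client libraries are on the classpath and the target file already exists.

```java
import java.io.IOException;
import java.io.PrintWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical class name; a sketch of the append from the question with the fix applied.
public class AppendExample {

    // Builds the client-side configuration used for the append.
    static Configuration buildConf() {
        Configuration conf = new Configuration();
        // Same placeholder address as in the question.
        conf.set("fs.defaultFS", "hdfs://MY-MACHINE:8020/user/hadoop");
        // Set the replication factor on the Configuration instance itself,
        // so the client does not fall back to the default of 3.
        conf.set("dfs.replication", "1");
        return conf;
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = buildConf();
        // try-with-resources closes the writer (and the underlying HDFS stream),
        // then the FileSystem handle.
        try (FileSystem fs = FileSystem.get(conf);
             PrintWriter writer = new PrintWriter(fs.append(new Path("/user/hadoop/test")))) {
            writer.print("hello world");
            // PrintWriter swallows IOExceptions, so check for errors explicitly.
            if (writer.checkError()) {
                throw new IOException("Write to /user/hadoop/test failed");
            }
        }
    }
}
```

Keeping the configuration in a separate method makes it easy to check which values the client will actually send, independent of what hdfs-site.xml on the cluster says.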

Source: Stack Overflow, https://stackoverflow.com/questions/24548699/