Hadoop: ...could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation

Problem Description

I'm getting the following error when attempting to write to HDFS as part of my multi-threaded application:

could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.

I've tried the top-rated answer here around reformatting, but this doesn't work for me: HDFS error: could only be replicated to 0 nodes, instead of 1

What's going on:

  1. My application consists of 2 threads, each configured with its own Spring Data PartitionTextFileWriter
  2. Thread 1 is the first to process data, and it can successfully write to HDFS
  3. However, once Thread 2 begins processing data, I get this error when it attempts to flush to a file

Thread 1 and 2 will not be writing to the same file, although they do share a parent directory at the root of my directory tree.
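
For context, here is a minimal sketch of the write pattern being described, using the plain HDFS FileSystem API rather than Spring Data's PartitionTextFileWriter (the NameNode URI and file paths are hypothetical placeholders):

import java.net.URI;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ConcurrentHdfsWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode URI; substitute your fs.defaultFS value.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);

        // Two threads, each writing its own file under a shared parent
        // directory, as in the question.
        Thread t1 = new Thread(() -> write(fs, new Path("/metrics/abc/part-1")));
        Thread t2 = new Thread(() -> write(fs, new Path("/metrics/abc/part-2")));
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        fs.close();
    }

    private static void write(FileSystem fs, Path path) {
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("some metrics\n".getBytes(StandardCharsets.UTF_8));
            // The flush is where "could only be replicated to 0 nodes"
            // typically surfaces, matching the failure described above.
            out.hflush();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}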

There are no problems with disk space on my server.

I also see this in my name-node logs, but not sure what it means:

2016-03-15 11:23:12,149 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
2016-03-15 11:23:12,150 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 1 but only 0 storage types can be selected (replication=1, selected=[], unavailable=[DISK], removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2016-03-15 11:23:12,150 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable:  unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2016-03-15 11:23:12,151 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.104.247.78:52004 Call#61 Retry#0
java.io.IOException: File /metrics/abc/myfile could only be replicated to 0 nodes instead of [2016-03-15 13:34:16,663] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 1 milliseconds. (kafka.coordinator.GroupMetadataManager)

What could be the cause of this error?

Thanks

Recommended Answer

This error comes from the block replication system of HDFS: it could not manage to place any copy of a specific block of the file in question. Common reasons for that:

  1. Only a NameNode instance is running, and it is not in safe mode
  2. There are no DataNode instances up and running, or some of them are dead. (Check the servers; the diagnostic sketch after this list prints the live DataNodes)
  3. NameNode and DataNode instances are both running, but they cannot communicate with each other, which means there is a connectivity issue between the DataNode and NameNode instances.
  4. Running DataNode instances are not able to talk to the server because of some networking or Hadoop-based issues (check the logs that include the datanode info)
  5. There is no hard disk space specified in the configured data directories for the DataNode instances, or the DataNode instances have run out of space. (Check dfs.data.dir // delete old files if any)
  6. The reserved space specified for the DataNode instances in dfs.datanode.du.reserved is larger than the free space, which makes the DataNode instances report that there is not enough free space.
  7. There are not enough threads for the DataNode instances (check the datanode logs and the dfs.datanode.handler.count value)
  8. Make sure dfs.data.transfer.protection is not equal to "authentication" and dfs.encrypt.data.transfer is equal to true.
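
To narrow down reasons 2 through 6 from the client side, a minimal diagnostic sketch along the following lines can help; it reports roughly what hdfs dfsadmin -report shows, and the NameNode URI is again a placeholder:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class HdfsHealthCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode URI; substitute your fs.defaultFS value.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:9000"), conf);

        // Cluster-wide capacity; remaining == 0 points at reasons 5 and 6.
        FsStatus status = fs.getStatus();
        System.out.printf("capacity=%d used=%d remaining=%d%n",
                status.getCapacity(), status.getUsed(), status.getRemaining());

        // Per-DataNode view; an empty list points at reasons 2, 3 and 4.
        if (fs instanceof DistributedFileSystem) {
            for (DatanodeInfo dn : ((DistributedFileSystem) fs).getDataNodeStats()) {
                System.out.printf("%s remaining=%d%n",
                        dn.getHostName(), dn.getRemaining());
            }
        }
        fs.close();
    }
}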

Also please:

  • Verify the state of the NameNode and DataNode services and check the related logs
  • Verify that core-site.xml has the correct fs.defaultFS value and that hdfs-site.xml has valid values (the snippet after this list prints the effective client-side values).
  • Verify that hdfs-site.xml has dfs.namenode.http-address.. specified for all NameNode instances in case of a PHD HA configuration.
  • Verify that the permissions on the directories are correct
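
As a quick way to see which configuration values the client actually picks up, the sketch below prints the effective values of the keys mentioned above, assuming core-site.xml and hdfs-site.xml are on the classpath. Note that this is only the client-side view; the DataNode reads its own copies of these files on the server.

import org.apache.hadoop.conf.Configuration;

public class PrintEffectiveConfig {
    public static void main(String[] args) {
        // new Configuration() loads core-site.xml from the classpath;
        // pull in hdfs-site.xml explicitly as well.
        Configuration conf = new Configuration();
        conf.addResource("hdfs-site.xml");

        String[] keys = {
            "fs.defaultFS",
            "dfs.replication",
            "dfs.datanode.data.dir",      // current name for the deprecated dfs.data.dir
            "dfs.datanode.du.reserved",
            "dfs.datanode.handler.count",
            "dfs.data.transfer.protection",
            "dfs.encrypt.data.transfer"
        };
        for (String key : keys) {
            System.out.printf("%s = %s%n", key, conf.get(key));
        }
    }
}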

Reference: https://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo

Reference: https://support.pivotal.io/hc/en-us/articles/201846688-HDFS-reports-Configured-Capacity-0-0-B-for-datanode

Also, please check: Writing to HDFS from Java, getting "could only be replicated to 0 nodes instead of minReplication"
