Problem description
This is a fairly well-documented error and the fix is easy, but does anyone know why Hadoop datanode namespaceIDs get out of sync so easily, or how Hadoop assigns the namespaceIDs when it starts up the datanodes?
The error is as follows:
2010-08-06 12:12:06,900 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /Users/jchen/Data/Hadoop/dfs/data: namenode namespaceID = 773619367; datanode namespaceID = 2049079249
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)
This seems to happen even on single-node instances.
Recommended answer
The namenode generates a new namespaceID every time you format HDFS. I think this is probably done to distinguish the current filesystem instance from any previous one. If something goes wrong you can roll back to a previous version, which might not be possible if the namespaceID were not unique to each formatted instance.
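The namespaceID also ties the namenode and the datanodes together: each datanode binds itself to a namenode through the namespaceID. A datanode records that ID in its storage directory the first time it registers, so reformatting the namenode without clearing the datanode's data directory leaves the two IDs out of sync, which is the mismatch reported in the error above.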
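To see the mismatch for yourself, you can compare the IDs stored on each side. The following is a minimal sketch, not part of Hadoop itself. It assumes the older storage layout matching the log above, where both the namenode directory (`dfs.name.dir`) and the datanode directory (`dfs.data.dir`) contain a `current/VERSION` properties file with a `namespaceID=` line. The function names `read_namespace_id` and `namespace_ids_match` are made up for this example.

```python
from pathlib import Path


def read_namespace_id(version_file):
    """Return the namespaceID stored in a Hadoop VERSION file, or None."""
    for line in Path(version_file).read_text().splitlines():
        line = line.strip()
        # VERSION is a Java properties file: skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        if key.strip() == "namespaceID":
            return int(value.strip())
    return None


def namespace_ids_match(name_dir, data_dir):
    """Compare the IDs recorded under <dir>/current/VERSION on both sides.

    Assumes the layout described above; returns False if either ID is missing.
    """
    nn_id = read_namespace_id(Path(name_dir) / "current" / "VERSION")
    dn_id = read_namespace_id(Path(data_dir) / "current" / "VERSION")
    return nn_id is not None and nn_id == dn_id
```

Calling `namespace_ids_match` with your own `dfs.name.dir` and `dfs.data.dir` paths should return `False` in the situation from the log, where the namenode holds 773619367 and the datanode holds 2049079249.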