


Hope you all had a wonderful vacation. I am trying to set up a Hadoop cluster on Amazon EC2. While copying a data file from the local disk to HDFS with the command hadoop fs -copyFromLocal d.txt /user/ubuntu/data, I get a data replication error. The errors from the log are as follows:

15/01/06 07:40:36 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null


15/01/06 07:40:36 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/ubuntu/data/d.txt" - Aborting...

copyFromLocal: java.io.IOException: File /user/ubuntu/data/d.txt could only be replicated to 0 nodes, instead of 1


15/01/06 07:40:36 ERROR hdfs.DFSClient: Failed to close file /user/ubuntu/data/d.txt


I have been checking Stack Overflow and other forums about this problem, and most answers name the DataNode or TaskTracker not running as the probable cause and give solutions for that. But both are running fine in my setup. Here is a screenshot of the jps output: http://i.imgur.com/vS6kRPP.png


According to the Hadoop wiki, the other possible causes are that the DataNode cannot talk to the server because of networking or Hadoop configuration problems, or that some configuration problem is preventing effective two-way communication.


I have configured hadoop-env.sh, core-site.xml, hdfs-site.xml and mapred-site.xml following the tutorial http://tinyurl.com/l2wv6y9 . Could anyone please tell me where I am going wrong? I would be immensely grateful if anyone could help me resolve the problem.




Well, the problem was in the security groups. When I created the EC2 instances, I created a new security group in which I had not configured any rules to open ports for connections.


When creating a group with the default options, we must add a rule for SSH on port 22. To have TCP and ICMP access, we need to add two additional security rules. Add ‘All TCP’, ‘All ICMP’ and ‘SSH (22)’ under the inbound rules; this should work fine.
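
For anyone who prefers to script this rather than click through the console, here is a minimal sketch of the same three inbound rules using boto3. It is not what I actually ran (I used the console). The group ID, region and source CIDR below are placeholders I made up for illustration, and you will probably want a narrower CIDR than your whole network.

```python
def build_ingress_rules(cidr):
    """Build the three inbound rules named above: SSH (22), All TCP, All ICMP.

    `cidr` is the source range allowed in; the value is an assumption,
    the original setup does not say which range was used.
    """
    source = [{"CidrIp": cidr}]
    return [
        # SSH on port 22
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22, "IpRanges": source},
        # "All TCP": every TCP port
        {"IpProtocol": "tcp", "FromPort": 0, "ToPort": 65535, "IpRanges": source},
        # "All ICMP": -1 means all ICMP types and codes
        {"IpProtocol": "icmp", "FromPort": -1, "ToPort": -1, "IpRanges": source},
    ]


def apply_rules(group_id, cidr, region):
    """Attach the rules to an existing security group (needs boto3 and AWS credentials)."""
    import boto3  # imported here so build_ingress_rules works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=build_ingress_rules(cidr),
    )


if __name__ == "__main__":
    # Hypothetical values; replace with your own group ID, CIDR and region.
    apply_rules("sg-0123456789abcdef0", "10.0.0.0/16", "us-east-1")
```

Keeping the rule list in its own function makes it easy to print and inspect the rules before sending anything to AWS.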


If we are using an existing security group, we should check both its inbound and outbound rules.
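
A quick way to see whether the rules actually let the nodes reach each other is to try opening TCP connections between them. The sketch below uses only the Python standard library. The host address is a placeholder, and the ports 50010 and 50020 are the usual Hadoop 1.x DataNode defaults, which is an assumption on my part; use whatever ports your own configuration files specify.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether it was reachable on `host`."""
    return {port: can_connect(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Hypothetical DataNode address; the ports are assumed Hadoop 1.x defaults.
    datanode_host = "10.0.0.12"
    for port, ok in check_ports(datanode_host, [50010, 50020]).items():
        print(f"{datanode_host}:{port} -> {'open' if ok else 'blocked or closed'}")
```

Run it from the master against each slave and the other way round. A port that shows as blocked while the daemon is running, as jps confirms, points at the security group or another firewall rather than at Hadoop itself.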

