Problem description
I'm running a Hive query against a Hadoop cluster of 3 nodes, and I'm getting an error that says "Too many fetch failures". My Hive query is:

    insert overwrite table tablename1 partition(namep)
    select id, name, substring(name, 5, 2) as namep from tablename2;

That's the query I'm trying to run. All I want to do is transfer data from tablename2 to tablename1. Any help is appreciated.
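For reference, this query relies on Hive dynamic partitioning, because the partition value namep is computed from the data rather than given as a constant. A minimal sketch of a setup under which it runs, with hypothetical table definitions (the original post shows no DDL):

    -- Hypothetical DDL, not from the original post
    CREATE TABLE tablename2 (id INT, name STRING);
    CREATE TABLE tablename1 (id INT, name STRING)
    PARTITIONED BY (namep STRING);

    -- Both settings are needed because namep is a purely dynamic partition
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    INSERT OVERWRITE TABLE tablename1 PARTITION (namep)
    SELECT id, name, substring(name, 5, 2) AS namep
    FROM tablename2;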
Solution

This can be caused by various Hadoop configuration issues. Here are a couple to look for in particular:
- DNS issues: examine your /etc/hosts and make sure every node resolves every hostname consistently (see the sample hosts file after this list)
- Not enough HTTP threads on the mapper side for the reducers to fetch map output
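On the DNS point, a common culprit is a node mapping its own hostname to 127.0.0.1, which leaves reducers on other nodes unable to fetch that node's map output. A sketch of a consistent /etc/hosts for a 3-node cluster, with hostnames and addresses made up for illustration:

    127.0.0.1      localhost
    # All three nodes should carry identical entries for every cluster member;
    # a node's own hostname must NOT resolve to 127.0.0.1.
    192.168.1.10   master.cluster.local   master
    192.168.1.11   slave1.cluster.local   slave1
    192.168.1.12   slave2.cluster.local   slave2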
Some suggested fixes (from Cloudera troubleshooting); a sample mapred-site.xml snippet follows this list:
- set mapred.reduce.slowstart.completed.maps = 0.80
- set tasktracker.http.threads = 80
- set mapred.reduce.parallel.copies = sqrt(node count), but in any case >= 10
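A sketch of how these might look in mapred-site.xml, using the MRv1 property names from the answer; with 3 nodes, sqrt(3) ≈ 1.7, so the >= 10 floor applies to the parallel copies:

    <!-- Illustrative values based on the suggestions above -->
    <property>
      <name>mapred.reduce.slowstart.completed.maps</name>
      <value>0.80</value> <!-- reducers launch only after 80% of maps complete -->
    </property>
    <property>
      <name>tasktracker.http.threads</name>
      <value>80</value> <!-- HTTP threads a TaskTracker uses to serve map output -->
    </property>
    <property>
      <name>mapred.reduce.parallel.copies</name>
      <value>10</value> <!-- parallel fetch transfers per reducer; sqrt(3) < 10 -->
    </property>

Note that tasktracker.http.threads is a daemon-side setting, so the TaskTrackers need a restart to pick it up; the other two can also be set per job.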
Here is a link to the troubleshooting deck for more details:
http://www.slideshare.net/cloudera/hadoop-troubleshooting-101-kate-ting-cloudera