This article covers how to handle the "Too many fetch failures" error when using Hive, and should be a useful reference for readers hitting the same problem.

Problem Description

I'm running a Hive query against a 3-node Hadoop cluster, and I am getting an error that says "Too many fetch failures". My Hive query is:

  insert overwrite table tablename1 partition(namep)
  select id, name, substring(name,5,2) as namep from tablename2;

That's the query I'm trying to run. All I want to do is transfer data from tablename2 to tablename1. Any help is appreciated.
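For context, here is a minimal setup under which that query would run. The table definitions below are assumptions for illustration only, since the original post does not show the DDL:

  -- Hypothetical schemas matching the query above (not from the original post):
  create table tablename2 (id int, name string);
  create table tablename1 (id int, name string) partitioned by (namep string);

  -- A fully dynamic partition spec like partition(namep) also requires:
  set hive.exec.dynamic.partition=true;
  set hive.exec.dynamic.partition.mode=nonstrict;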

Solution

This can be caused by various Hadoop configuration issues. Here are a couple to look for in particular:


  • DNS issues: examine your /etc/hosts (see the sketch after this list)
  • Not enough HTTP threads on the mapper side for the reducers
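
On the DNS point: a classic way to get fetch failures is a node whose own hostname resolves to 127.0.0.1 (or 127.0.1.1), because the TaskTracker then advertises an address that reducers on other nodes cannot reach. Below is a sketch of a working /etc/hosts for a 3-node cluster; the hostnames and addresses are made up:

  127.0.0.1    localhost
  # Every node's hostname must resolve to its LAN address on all hosts,
  # and must NOT also be an alias for the loopback address.
  192.168.1.10 master.example.com  master
  192.168.1.11 slave1.example.com  slave1
  192.168.1.12 slave2.example.com  slave2

The same file should be present, and consistent, on all three nodes.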



  • Some suggested fixes (from Cloudera troubleshooting):

    • set mapred.reduce.slowstart.completed.maps = 0.80
    • tasktracker.http.threads = 80
    • mapred.reduce.parallel.copies = sqrt(node count), but in any case >= 10

    See the Cloudera troubleshooting deck for more details:
    http://www.slideshare.net/cloudera/hadoop-troubleshooting-101-kate-ting-cloudera
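
As for where these settings go: the slowstart and parallel-copies values are job-level and can be set from the Hive session, while tasktracker.http.threads is a TaskTracker daemon setting that belongs in mapred-site.xml and only takes effect after a restart. A sketch, assuming the classic MRv1 property names used in the answer:

  -- In the Hive CLI, before running the query:
  set mapred.reduce.slowstart.completed.maps=0.80;
  -- sqrt(3) is about 1.7 for this 3-node cluster, so the >= 10 floor applies:
  set mapred.reduce.parallel.copies=10;

And in mapred-site.xml on every TaskTracker node:

  <property>
    <name>tasktracker.http.threads</name>
    <value>80</value>
  </property>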





That concludes this article on the "Too many fetch failures" error when using Hive. We hope the answer above is helpful, and thank you for your continued support!
