This article describes how to handle java.lang.OutOfMemoryError: unable to create new native thread on large datasets. It should be a useful reference for anyone hitting the same problem; read on to learn more.

Problem Description


I have a Hive query that runs fine on a small dataset, but when I run it on 250 million records I get the errors below in the logs:

 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:640)
    at org.apache.hadoop.mapred.Task$TaskReporter.startCommunicationThread(Task.java:725)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)



 2013-03-18 14:12:58,907 WARN org.apache.hadoop.mapred.Child: Error running child
 java.io.IOException: Cannot run program "ln": java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at java.lang.Runtime.exec(Runtime.java:593)
    at java.lang.Runtime.exec(Runtime.java:431)
    at java.lang.Runtime.exec(Runtime.java:369)
    at org.apache.hadoop.fs.FileUtil.symLink(FileUtil.java:567)
    at org.apache.hadoop.mapred.TaskRunner.symlink(TaskRunner.java:787)
    at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:752)
    at org.apache.hadoop.mapred.Child.main(Child.java:225)
 Caused by: java.io.IOException: java.io.IOException: error=11, Resource temporarily unavailable
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
    at java.lang.ProcessImpl.start(ProcessImpl.java:65)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
    ... 7 more
2013-03-18 14:12:58,911 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2013-03-18 14:12:58,911 INFO org.apache.hadoop.mapred.Child: Error cleaning up
  java.lang.NullPointerException
    at org.apache.hadoop.mapred.Task.taskCleanup(Task.java:1048)
    at org.apache.hadoop.mapred.Child.main(Child.java:281)

Need help on this.

Solution

Thank you all, you are correct. It was because of file descriptors: my program was generating a lot of files in the target table, due to the multi-level partition structure.
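As a quick way to confirm this kind of diagnosis on a task node, you can inspect a task JVM's effective limits and its current number of open descriptors. A minimal sketch, assuming a Linux cluster running the old mapred.Child task JVMs seen in the logs above; paths and the pgrep pattern are illustrative:

    # Limits of the current shell
    ulimit -n   # max open file descriptors
    ulimit -u   # max user processes (each Java thread counts against this)

    # Limits and open descriptors of a running task JVM
    PID=$(pgrep -f 'org.apache.hadoop.mapred.Child' | head -n 1)
    grep -E 'open files|processes' /proc/$PID/limits
    ls /proc/$PID/fd | wc -l   # descriptors currently open by that process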

I increased the ulimit and also the xceivers property (dfs.datanode.max.xcievers). It did help, but in our situation even the raised limits were eventually exceeded.
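The answer does not say which limits were raised; for this pair of errors ("unable to create new native thread" and fork failing with error=11), the usual suspects are nofile (open files) and nproc (processes/threads) for the user running the Hadoop daemons, plus the DataNode transceiver cap. A sketch, assuming the daemons run as user hadoop; the values are illustrative, not recommendations:

    # /etc/security/limits.conf (re-login or restart the daemons afterwards)
    hadoop  soft  nofile  64000
    hadoop  hard  nofile  64000
    hadoop  soft  nproc   32000
    hadoop  hard  nproc   32000

    <!-- hdfs-site.xml: max simultaneous data-transfer threads per DataNode -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value>
    </property>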

Then we decided to distribute the data by its partition key, so that each partition ends up with only one file.
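A common way to achieve this in Hive is a dynamic-partition insert with DISTRIBUTE BY on the partition column, so that all rows of a partition pass through a single reducer and produce a single output file. A minimal sketch; the table and column names (target, source, dt) are made up for illustration:

    -- Allow dynamic partitioning for the insert
    SET hive.exec.dynamic.partition = true;
    SET hive.exec.dynamic.partition.mode = nonstrict;

    -- Rows with the same dt are routed to one reducer, so each
    -- partition directory receives a single output file.
    INSERT OVERWRITE TABLE target PARTITION (dt)
    SELECT col1, col2, dt
    FROM source
    DISTRIBUTE BY dt;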

It worked for us. We have since scaled the system to 50+ billion records and it still works.

That concludes this article on java.lang.OutOfMemoryError: unable to create new native thread on large datasets. We hope the answer above is helpful, and thank you for your support!
