Problem description
I have a Hive query that runs fine on a small dataset, but when I run it against 250 million records I get the errors below in the logs:
FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:640)
at org.apache.hadoop.mapred.Task$TaskReporter.startCommunicationThread(Task.java:725)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
2013-03-18 14:12:58,907 WARN org.apache.hadoop.mapred.Child: Error running child
java.io.IOException: Cannot run program "ln": java.io.IOException: error=11, Resource temporarily unavailable
at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
at java.lang.Runtime.exec(Runtime.java:593)
at java.lang.Runtime.exec(Runtime.java:431)
at java.lang.Runtime.exec(Runtime.java:369)
at org.apache.hadoop.fs.FileUtil.symLink(FileUtil.java:567)
at org.apache.hadoop.mapred.TaskRunner.symlink(TaskRunner.java:787)
at org.apache.hadoop.mapred.TaskRunner.setupWorkDir(TaskRunner.java:752)
at org.apache.hadoop.mapred.Child.main(Child.java:225)
Caused by: java.io.IOException: java.io.IOException: error=11, Resource temporarily unavailable
at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
at java.lang.ProcessImpl.start(ProcessImpl.java:65)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
... 7 more
2013-03-18 14:12:58,911 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
2013-03-18 14:12:58,911 INFO org.apache.hadoop.mapred.Child: Error cleaning up
java.lang.NullPointerException
at org.apache.hadoop.mapred.Task.taskCleanup(Task.java:1048)
at org.apache.hadoop.mapred.Child.main(Child.java:281)
I need help with this.
Thank you all, you are correct: it was the file descriptor limit, because my program was generating a lot of files in the target table due to the multilevel partition structure.
I increased the ulimit and the xceivers property, and that did help, but in our situation even those raised limits were eventually exceeded.
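For readers who want to try the same thing first, the limits in question are the per-user open-file and process limits on each node plus the DataNode transceiver count. The snippet below is only a rough sketch of where these are usually raised; the user names and values are illustrative, not the exact settings used in this setup.

    # /etc/security/limits.conf -- raise open-file (nofile) and process (nproc)
    # limits for the Hadoop service users (illustrative users and values)
    hdfs    -   nofile   64000
    hdfs    -   nproc    32000
    mapred  -   nofile   64000
    mapred  -   nproc    32000

    # hdfs-site.xml -- the "xceivers" property mentioned above
    # (note the historical spelling of the property name in Hadoop 1.x)
    #   <property>
    #     <name>dfs.datanode.max.xcievers</name>
    #     <value>8192</value>
    #   </property>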
Then we decided to distribute the data according to the partitions, so that we get only one file per partition.
It worked for us. We have since scaled the system to 50+ billion records and it has kept working.
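For anyone hitting the same problem, here is a minimal HiveQL sketch of the "one file per partition" idea described above. The table and column names (target_table, source_table, dt, country) are hypothetical, and it assumes a dynamic-partition insert; DISTRIBUTE BY the partition columns sends all rows of a given partition to a single reducer, so each partition is written as one file instead of one file per reducer.

    -- Illustrative sketch only: table, column and partition names are made up.
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    INSERT OVERWRITE TABLE target_table PARTITION (dt, country)
    SELECT
      id,
      value,
      dt,
      country
    FROM source_table
    -- all rows with the same (dt, country) go to the same reducer,
    -- which then writes exactly one output file for that partition
    DISTRIBUTE BY dt, country;

One reducer per partition can of course become a bottleneck if a single partition is very large, so this is a trade-off rather than a universal fix.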