I am getting the following exception in my reducers:
EMFILE: Too many open files
at org.apache.hadoop.io.nativeio.NativeIO.open(Native Method)
at org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:161)
at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:296)
at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:369)
at org.apache.hadoop.mapred.Child$4.run(Child.java:257)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
Each reducer creates around 10,000 files. Is there a way I can set the ulimit on each box?
I tried using the following command as a bootstrap script: ulimit -n 1000000
I also tried the following in a bootstrap action to replace the ulimit command in /usr/lib/hadoop/hadoop-daemon.sh:
set -e -x
sudo sed -i -e "/^ulimit /s|.*|ulimit -n 134217728|" /usr/lib/hadoop/hadoop-daemon.sh
But even then, when I log into the master node, ulimit -n still returns 32768. I also confirmed that the desired change was made in /usr/lib/hadoop/hadoop-daemon.sh, which now contains: ulimit -n 134217728.
Is there any Hadoop configuration for this? Or is there a workaround?
My main aim is to split records into files according to the ID of each record. There are 1.5 billion records right now, and that number can certainly increase.
Is there any way to edit this file before the daemon is run on each slave?
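A note on the check described above: `ulimit -n` reports the limit of the shell it is run in, and resource limits are per process and inherited from the parent. A fresh login shell on the master node therefore does not necessarily show what an already running daemon or task JVM is using. A minimal sketch for reading the limit of a specific process on Linux, assuming `/proc` is available; the helper name `show_nofile_limit` is hypothetical and the PID of the process of interest has to be supplied:

```shell
#!/usr/bin/env bash
# Print the soft and hard "Max open files" limits of a running process.
# Hypothetical helper; it reads /proc/<pid>/limits, which is Linux-specific.
show_nofile_limit() {
  local pid="${1:-$$}"   # default to the current shell's PID
  awk '/^Max open files/ {print "soft=" $4, "hard=" $5}' "/proc/${pid}/limits"
}

# Limit of this shell (the soft value is what `ulimit -n` reports).
show_nofile_limit

# Limit of another process, e.g. a task JVM: pass its PID instead.
# show_nofile_limit 12345   # 12345 is a placeholder PID
```

Comparing the value for the running process with the value in the login shell would show whether the edited script actually took effect for that process.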
OK, so it seems that the ulimit set by default in Amazon EMR's setup, 32768, is already very high, and if a job needs more than this then its logic should be revisited. Hence, instead of writing every file directly to S3, I wrote them locally and moved them to S3 in batches of 1024 files. This solved the too many open files issue.
Perhaps the file descriptors opened for writing to S3 weren't getting released/closed the way they are when writing to local files. Any better explanation for this is welcome.
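The batching idea can be sketched outside the reducer as a small shell function. This is not the code used in the job, only an illustration under assumptions: the output files have already been written and closed in a local directory, and `upload_batch` is a hypothetical stand-in that copies into a local directory so the sketch runs anywhere. In a real setup that function is where the files would be pushed to S3 with whatever upload tool is available.

```shell
#!/usr/bin/env bash
BATCH_SIZE=1024   # batch size mentioned in the answer

# Hypothetical stand-in for the real upload step: copies the batch into a
# local directory. Replace the body with the actual S3 upload.
upload_batch() {
  local dest="$1"; shift
  mkdir -p "$dest"
  cp -- "$@" "$dest"/
}

# Move every regular file in src_dir to dest in batches of BATCH_SIZE.
# Local copies are deleted only after their batch was uploaded successfully.
move_in_batches() {
  local src_dir="$1" dest="$2"
  local -a batch=()
  local f
  while IFS= read -r -d '' f; do
    batch+=("$f")
    if (( ${#batch[@]} == BATCH_SIZE )); then
      upload_batch "$dest" "${batch[@]}" || return 1
      rm -f -- "${batch[@]}"
      batch=()
    fi
  done < <(find "$src_dir" -maxdepth 1 -type f -print0)
  # Flush the final, possibly smaller, batch.
  if (( ${#batch[@]} > 0 )); then
    upload_batch "$dest" "${batch[@]}" || return 1
    rm -f -- "${batch[@]}"
  fi
}
```

A call would look like `move_in_batches /path/to/local/output /path/to/staging`, where both paths are placeholders. The point of the design is that only one batch is in flight at a time, so the number of files being handled at once stays bounded no matter how many IDs a reducer sees.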