Problem description
I'm running an EMR Activity inside a Data Pipeline that analyzes log files, and when the pipeline fails I get the following error:
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://10.208.42.127:9000/home/hadoop/temp-output-s3copy already exists
    at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:121)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:944)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:905)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:905)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:879)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1316)
    at com.valtira.datapipeline.stream.CloudFrontStreamLogProcessors.main(CloudFrontStreamLogProcessors.java:216)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
How can I delete that folder from Hadoop?
Recommended answer
I contacted AWS Support, and it seems the problem was that the log files I was analyzing were very large, which caused a memory issue. I added "masterInstanceType" : "m1.xlarge" to the EMRCluster section of my pipeline definition, and it worked.
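For illustration, the cluster object in the pipeline definition might then look roughly like the fragment below. This is only a sketch: the "id" value is a hypothetical placeholder, and any other fields your cluster object already has (instance counts, core instance type, and so on) are omitted. Only the "masterInstanceType" setting comes from the fix described above.

```json
{
  "id": "HypotheticalEmrCluster",
  "type": "EmrCluster",
  "masterInstanceType": "m1.xlarge"
}
```

The activity that runs the job would keep referring to this cluster object through its "runsOn" field as before; only the master node's instance size changes.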