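Problem description
When do the outputs of a mapper task get deleted from the local filesystem? Do they persist until the entire job completes, or are they deleted earlier than that?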
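Solution
In addition to the map and reduce tasks, two further tasks are created: a job setup task and a job cleanup task. These are run by tasktrackers and are used to run code to set up the job before any map tasks run, and to clean up after all the reduce tasks are complete.
The OutputCommitter that is configured for the job determines the code to be run, and by default this is a FileOutputCommitter. For the job setup task it creates the final output directory for the job and the temporary working space for the task output, and for the job cleanup task it deletes the temporary working space for the task output.
Have a look at OutputCommitter.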
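To make the setup and cleanup steps described in the answer concrete, here is a minimal sketch that mimics the directory lifecycle on the local filesystem using only java.nio. It is not Hadoop's FileOutputCommitter and does not call any Hadoop API: the class name CommitterLifecycleSketch and the methods setupJob and cleanupJob are hypothetical names chosen for illustration. The "_temporary" directory name reflects, to my understanding, what FileOutputCommitter uses for its working space, so treat it as an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Simplified model of the job setup / job cleanup steps.
// NOT Hadoop code: it only imitates the directory lifecycle on local disk.
public class CommitterLifecycleSketch {

    // Assumed name of the temporary working space under the output directory.
    static final String TEMP_DIR_NAME = "_temporary";

    // Job setup: create the final output directory and the temporary working space.
    static Path setupJob(Path outputDir) throws IOException {
        Path tempDir = outputDir.resolve(TEMP_DIR_NAME);
        Files.createDirectories(tempDir); // also creates outputDir if missing
        return tempDir;
    }

    // Job cleanup: delete the temporary working space, keep the output directory.
    static void cleanupJob(Path outputDir) throws IOException {
        Path tempDir = outputDir.resolve(TEMP_DIR_NAME);
        if (!Files.exists(tempDir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(tempDir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("job").resolve("output");

        Path tempDir = setupJob(outputDir);
        // Stand-in for a task writing in-progress output into the working space.
        Files.write(tempDir.resolve("part-00000"),
                "in-progress task output\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("after setup, temp exists: " + Files.exists(tempDir));

        cleanupJob(outputDir);
        System.out.println("after cleanup, temp exists: " + Files.exists(tempDir));
        System.out.println("after cleanup, output dir exists: " + Files.exists(outputDir));
    }
}
```

In a real job the committer is picked up from the job configuration rather than called by hand; the sketch only shows the order of events: the working space exists from job setup until job cleanup, and the final output directory outlives it.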