Problem Description
I am writing to the Hadoop file system, but every time I append something, it overwrites the data instead of adding it to the existing file. The code doing this is provided below, and it is called again and again with different data. Is opening a new SequenceFile.Writer every time the problem?
Each time I obtain the path as new Path("someDir");
public void writeToHDFS(Path path, long uniqueId, String data) throws IOException {
    FileSystem fs = path.getFileSystem(conf);
    SequenceFile.Writer inputWriter = new SequenceFile.Writer(fs, conf,
            path, LongWritable.class, MyWritable.class);
    inputWriter.append(new LongWritable(uniqueId++), new MyWritable(data));
    inputWriter.close();
}
There is currently no way to append to an existing SequenceFile through the API. When you create a new SequenceFile.Writer object, it will not append to the existing file at that Path, but instead overwrite it. See my earlier question.
As Thomas points out, if you keep the same SequenceFile.Writer object, you will be able to append to the file until you call close().
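A minimal sketch of that pattern, assuming the same conf and the questioner's MyWritable class: the writer is created once (here via the non-deprecated SequenceFile.createWriter with Writer.Option arguments) and reused for every record, and close() is called only after all records have been appended. The class name HdfsRecordWriter and its methods are illustrative, not part of the Hadoop API.

```java
import java.io.Closeable;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;

// Sketch: keep one SequenceFile.Writer open across appends, close once at the end.
public class HdfsRecordWriter implements Closeable {
    private final SequenceFile.Writer writer;
    private long uniqueId = 0;

    public HdfsRecordWriter(Configuration conf, Path path) throws IOException {
        // One writer for the lifetime of this object, instead of one per record,
        // so successive appends go into the same open file.
        this.writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(path),
                SequenceFile.Writer.keyClass(LongWritable.class),
                SequenceFile.Writer.valueClass(MyWritable.class));
    }

    public void write(String data) throws IOException {
        // Each call appends a new key/value pair; nothing is overwritten,
        // because the underlying file was opened exactly once.
        writer.append(new LongWritable(uniqueId++), new MyWritable(data));
    }

    @Override
    public void close() throws IOException {
        writer.close();
    }
}
```

The caller then holds one HdfsRecordWriter for the whole batch and calls write(data) per record, closing it only when finished, rather than invoking a method like the original writeToHDFS that recreates and closes the writer on every call.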