This article covers how to remove duplicate rows from a CSV file without writing a new file.
Problem description
This is my code as of now:
File file1 = new File("file1.csv");
File file2 = new File("file2.csv");
HashSet<String> f1 = new HashSet<>(FileUtils.readLines(file1));
HashSet<String> f2 = new HashSet<>(FileUtils.readLines(file2));
f2.removeAll(f1);
With removeAll() I remove from file2's lines every line that also appears in file1. Now I want to avoid creating a new CSV file, to optimize the process: I just want to delete the duplicate rows from file2 itself.
Is this possible, or do I have to create a new file?
Recommended answer
Well, sure, you can do that... if you don't mind possibly losing the file!
Don't do it.
And since you use Java 7, use java.nio.file. The example below writes the filtered lines to a temporary sibling file and then moves it over file2:
final Path file1 = Paths.get("file1.csv");
final Path file2 = Paths.get("file2.csv");
final Path tmpfile = file2.resolveSibling("file2.csv.new");

// Load every line of file1 into a set for fast lookups
final Set<String> file1Lines
    = new HashSet<>(Files.readAllLines(file1, StandardCharsets.UTF_8));

// Copy to the temporary file only those lines of file2 that are not in file1
try (
    final BufferedReader reader = Files.newBufferedReader(file2,
        StandardCharsets.UTF_8);
    final BufferedWriter writer = Files.newBufferedWriter(tmpfile,
        StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW);
) {
    String line;
    while ((line = reader.readLine()) != null)
        if (!file1Lines.contains(line)) {
            writer.write(line);
            writer.newLine();
        }
}

// Replace file2 with the temporary file, atomically where the filesystem allows it
try {
    Files.move(tmpfile, file2, StandardCopyOption.REPLACE_EXISTING,
        StandardCopyOption.ATOMIC_MOVE);
} catch (AtomicMoveNotSupportedException ignored) {
    Files.move(tmpfile, file2, StandardCopyOption.REPLACE_EXISTING);
}
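For reference, the same approach can be packaged as a small program that runs on its own. This is only a minimal sketch: the class name CsvDedup, the method removeLinesFoundIn and the sample rows are assumptions made for illustration, and the demo works in a temporary directory rather than on real file1.csv and file2.csv files.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; its name and method signature are illustrative only.
class CsvDedup {

    // Removes from `target` every line that also occurs in `reference`.
    static void removeLinesFoundIn(Path reference, Path target) throws IOException {
        Set<String> referenceLines =
            new HashSet<>(Files.readAllLines(reference, StandardCharsets.UTF_8));
        // Temporary sibling file, e.g. "file2.csv.new" next to "file2.csv"
        Path tmp = target.resolveSibling(target.getFileName() + ".new");

        try (BufferedReader reader = Files.newBufferedReader(target, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8,
                 StandardOpenOption.CREATE_NEW)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!referenceLines.contains(line)) {
                    writer.write(line);
                    writer.newLine();
                }
            }
        }

        // Swap the filtered copy into place; fall back to a plain move if needed.
        try {
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Sample data (assumed) written to a scratch directory
        Path dir = Files.createTempDirectory("csv-dedup");
        Path file1 = dir.resolve("file1.csv");
        Path file2 = dir.resolve("file2.csv");
        Files.write(file1, Arrays.asList("a,1", "b,2"), StandardCharsets.UTF_8);
        Files.write(file2, Arrays.asList("b,2", "c,3", "a,1", "d,4"), StandardCharsets.UTF_8);

        removeLinesFoundIn(file1, file2);

        // prints [c,3, d,4]
        System.out.println(Files.readAllLines(file2, StandardCharsets.UTF_8));
    }
}
```

Because the lines are streamed from file2 rather than collected into a set, the surviving rows keep their original order, and file2 is only replaced once the filtered copy has been written completely.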
If you use Java 8, you can use this try-with-resources block instead:
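try (
    final Stream<String> stream = Files.lines(file2, StandardCharsets.UTF_8);
    final BufferedWriter writer = Files.newBufferedWriter(tmpfile,
        StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW);
) {
    // BufferedWriter's methods throw the checked IOException, which a lambda
    // passed to forEach() cannot propagate, so walk the filtered stream instead
    final Iterator<String> it
        = stream.filter(line -> !file1Lines.contains(line)).iterator();
    while (it.hasNext()) {
        writer.write(it.next());
        writer.newLine();
    }
}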
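Since BufferedWriter.write() and newLine() throw the checked IOException, they cannot be called directly from a lambda passed to forEach(). One possible way around this in Java 8, sketched here as an assumption rather than as part of the original answer, is to hand the filtered stream to Files.write() as an Iterable, so that the writing happens outside any lambda. The class name CsvDedupStreams, the method name and the sample rows are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical class and method names, used only for this sketch.
class CsvDedupStreams {

    // Removes from `target` every line that also occurs in `reference`.
    static void removeLinesFoundIn(Path reference, Path target) throws IOException {
        Set<String> referenceLines =
            new HashSet<>(Files.readAllLines(reference, StandardCharsets.UTF_8));
        Path tmp = target.resolveSibling(target.getFileName() + ".new");

        try (Stream<String> lines = Files.lines(target, StandardCharsets.UTF_8)) {
            Stream<String> kept = lines.filter(line -> !referenceLines.contains(line));
            // Iterable has a single abstract method, so kept::iterator can stand in
            // for one. Files.write iterates it once and writes one line per element.
            Files.write(tmp, (Iterable<String>) kept::iterator,
                StandardCharsets.UTF_8, StandardOpenOption.CREATE_NEW);
        }

        // Plain replace for brevity; an ATOMIC_MOVE attempt could be tried first.
        Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Sample data (assumed) written to a scratch directory
        Path dir = Files.createTempDirectory("csv-dedup-streams");
        Path file1 = dir.resolve("file1.csv");
        Path file2 = dir.resolve("file2.csv");
        Files.write(file1, Arrays.asList("a,1", "b,2"), StandardCharsets.UTF_8);
        Files.write(file2, Arrays.asList("b,2", "c,3", "a,1", "d,4"), StandardCharsets.UTF_8);

        removeLinesFoundIn(file1, file2);

        // prints [c,3, d,4]
        System.out.println(Files.readAllLines(file2, StandardCharsets.UTF_8));
    }
}
```

The move happens only after the try block has closed the stream over the target file, which matters on platforms that refuse to replace a file that is still open.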