Problem description
Is there an idiomatic way to process a file one character at a time in Rust?
This seems to be roughly what I'm after:
let mut f = io::BufReader::new(try!(fs::File::open("input.txt")));
for c in f.chars() {
    println!("Character: {}", c.unwrap());
}
But Read::chars is still unstable as of Rust v1.6.0.
I considered using Read::read_to_string, but the file may be large and I don't want to read it all into memory.
Recommended answer
Let's compare 4 approaches.
1. Read::chars
You could copy the Read::chars implementation, but it is marked unstable with the note
"the semantics of a partial read/write of where errors happen is currently unclear and may change"
so some care must be taken. Anyway, this seems to be the best approach.
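As a rough illustration of what such a hand-rolled replacement might look like, here is a minimal sketch that decodes UTF-8 characters one at a time from any Read. It is not the standard library's implementation: the Chars struct, the chars function, and the invalid helper are hypothetical names introduced for this example, and the error handling (InvalidData for malformed input, UnexpectedEof for a truncated sequence) is an assumption about what a caller would want.

```rust
use std::io::{self, Cursor, Read};
use std::str;

/// Hypothetical iterator that decodes UTF-8 characters from a byte stream.
struct Chars<R: Read> {
    bytes: io::Bytes<R>,
}

/// Hypothetical constructor; wrap a file in BufReader before passing it in,
/// because Read::bytes issues one read call per byte.
fn chars<R: Read>(reader: R) -> Chars<R> {
    Chars { bytes: reader.bytes() }
}

fn invalid() -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, "stream is not valid UTF-8")
}

impl<R: Read> Iterator for Chars<R> {
    type Item = io::Result<char>;

    fn next(&mut self) -> Option<Self::Item> {
        // End of stream on a character boundary ends the iteration.
        let first = match self.bytes.next()? {
            Ok(b) => b,
            Err(e) => return Some(Err(e)),
        };
        // The leading byte determines how many bytes the character occupies.
        let width = match first {
            0x00..=0x7F => 1,
            0xC0..=0xDF => 2,
            0xE0..=0xEF => 3,
            0xF0..=0xF7 => 4,
            _ => return Some(Err(invalid())),
        };
        let mut buf = [first, 0, 0, 0];
        for slot in buf[1..width].iter_mut() {
            match self.bytes.next() {
                Some(Ok(b)) => *slot = b,
                Some(Err(e)) => return Some(Err(e)),
                None => return Some(Err(io::ErrorKind::UnexpectedEof.into())),
            }
        }
        // from_utf8 performs the full validation (continuation bytes, overlong forms).
        match str::from_utf8(&buf[..width]) {
            Ok(s) => Some(s.chars().next().map(Ok).unwrap_or_else(|| Err(invalid()))),
            Err(_) => Some(Err(invalid())),
        }
    }
}

fn main() {
    // An in-memory Cursor stands in for BufReader::new(File::open("input.txt")?).
    let input = Cursor::new("héllo".as_bytes());
    for c in chars(input) {
        println!("Character: {}", c.expect("read failed"));
    }
}
```

Only a four-byte scratch buffer is held per character, so memory use does not depend on line length or file size.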
2. flat_map
The flat_map alternative does not compile:
use std::io::{BufRead, BufReader};
use std::fs::File;

pub fn main() {
    let mut f = BufReader::new(File::open("input.txt").expect("open failed"));
    for c in f.lines().flat_map(|l| l.expect("lines failed").chars()) {
        println!("Character: {}", c);
    }
}
The problem is that chars borrows from the string, but l.expect("lines failed") lives only inside the closure, so the compiler gives the error "borrowed value does not live long enough".
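One way to get a flat_map version past the borrow checker, shown here only as a sketch, is to have the closure return an owned collection instead of a borrowing iterator. The function name chars_via_flat_map is hypothetical, and an in-memory Cursor stands in for the file so the example is self-contained. Note that this costs an extra Vec<char> per line on top of the per-line String, so it is not a memory-efficient option.

```rust
use std::io::{BufRead, Cursor};

/// Hypothetical helper: collects every character of every line.
/// lines() strips the line terminators, so no '\n' is yielded.
fn chars_via_flat_map<R: BufRead>(reader: R) -> Vec<char> {
    reader
        .lines()
        // Collecting into an owned Vec<char> means nothing borrows from the
        // temporary String once the closure returns.
        .flat_map(|l| l.expect("lines failed").chars().collect::<Vec<char>>())
        .collect()
}

fn main() {
    let input = Cursor::new("ab\ncd\n".as_bytes());
    for c in chars_via_flat_map(input) {
        println!("Character: {}", c);
    }
}
```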
3. Nested for
This code
use std::io::{BufRead, BufReader};
use std::fs::File;

pub fn main() {
    let mut f = BufReader::new(File::open("input.txt").expect("open failed"));
    for line in f.lines() {
        for c in line.expect("lines failed").chars() {
            println!("Character: {}", c);
        }
    }
}
works, but it allocates a new string for each line. Besides, if the input file contains no line breaks, the whole file will be loaded into memory.
4. BufRead::read_until
A memory-efficient alternative to approach 3 is to use BufRead::read_until and reuse a single buffer to read each line:
use std::io::{BufRead, BufReader};
use std::fs::File;

pub fn main() {
    let mut f = BufReader::new(File::open("input.txt").expect("open failed"));
    let mut buf = Vec::<u8>::new();
    while f.read_until(b'\n', &mut buf).expect("read_until failed") != 0 {
        // this moves the ownership of the read data to s
        // there is no allocation
        let s = String::from_utf8(buf).expect("from_utf8 failed");
        for c in s.chars() {
            println!("Character: {}", c);
        }
        // this returns the ownership of the read data to buf
        // there is no allocation
        buf = s.into_bytes();
        buf.clear();
    }
}
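A possibly simpler variant of the same idea, sketched here as an assumption rather than as part of the original answer, is to reuse a single String with BufRead::read_line, which validates UTF-8 itself and returns an error on invalid input. The helper name for_each_char is hypothetical, and a Cursor stands in for the file. As with read_until, a file without line breaks is still read into memory in one piece.

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical helper: calls `f` for every character, reusing one String.
/// Unlike lines(), read_line keeps the trailing '\n' in the buffer.
fn for_each_char<R: BufRead, F: FnMut(char)>(mut reader: R, mut f: F) -> io::Result<()> {
    let mut line = String::new();
    // read_line appends to the buffer and returns Ok(0) at end of input.
    while reader.read_line(&mut line)? != 0 {
        for c in line.chars() {
            f(c);
        }
        // clear() empties the String but keeps its capacity for the next line.
        line.clear();
    }
    Ok(())
}

fn main() {
    let input = Cursor::new("ab\ncd\n".as_bytes());
    for_each_char(input, |c| println!("Character: {:?}", c)).expect("read_line failed");
}
```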