问题描述
下面的代码涉及到一个非常微妙的借用检查器闪避.代码本身描述了推理.问题:
- 这真的安全吗?
- 这是表示执行的不安全操作的推荐方式吗?我应该改用指针吗?
- 新的 Polonius 借用检查器能否对这样的模式进行推理?
////在给定的键上插入一个新的数据元素.pub fn insert<'a, K: Eq, V>(this: &'a mut Vec <(K, V)>, key: K, val: V) ->&'a mut V {//安全性:如下所示,我们希望在循环中直接返回 val1,//但是 rust 会拒绝这个,要求双重借用,我们改为使用 some//绕过借用检查器的不安全黑客.为了证明这是安全的,请考虑//两种情况.//- 如果执行返回(我们找到了一个元素并提前退出)://- 让 'b 成为 'a(借用 self),//- 让 'c 为空//- 否则(我们没有找到元素,退出循环并正常终止)://- 让 'b 是循环的持续时间,//- 并让 'c 从循环结束直到 'a 结束//在任何一种情况下,'b 和 'c 是不相交的,所以双借"是安全的.//借用检查器的原因是 'b 必须至少是 'a,因为它被返回了,//因此它与 'c 重叠,但它们是互斥的//情况.for (key1, val1) in &/* 'b */mut *this {如果键 == *key1 {//返回 val1;//我们想写这个return unsafe {//安全,见上文.我们知道我们处于第一种情况,所以 'b = 'astd::mem::transmute::<&/* 'b */mut V, &/* 'a */mut V>(val1)}}}让这个 = &/* 'c */mut *this;this.push((key, val));&mut this.last_mut().unwrap().1}
这是我更喜欢写的:
////在给定的键上插入一个新的数据元素.pub fn insert<'a, K: Eq, V>(this: &'a mut Vec <(K, V)>, key: K, val: V) ->&'a mut V {for (key1, val1) 在 &mut *this {如果键 == *key1 {返回 val1;}}让这个 = &mut *this;this.push((key, val));&mut this.last_mut().unwrap().1}
但它失败了:
error[E0499]: 不能一次多次借用 `*this` 作为可变变量-->src/lib.rs:8:16|2 |pub fn insert<'a, K: Eq, V>(this: &'a mut Vec <(K, V)>, key: K, val: V) ->&'a mut V {|-- 此处定义的生命周期 `'a`3 |for (key1, val1) 在 &mut *this {|---------- 第一个可变借用发生在这里4 |如果键 == *key1 {5 |返回 val1;|---- 返回这个值需要`*this`被借用给`'a`...8 |让这个 = &mut *this;|^^^^^^^^^^^ 第二个可变借位发生在这里
如相关问题所述,最简单的方法是使用索引,因为它不需要不安全代码.我可能会这样写:
pub fn insert(this: &'a mut Vec, key: K, val: V) ->&'a mut V {让 idx = 这个.iter().枚举().find_map(|(i, (k, _))| if key == *k { Some(i) } else { None });让 idx = idx.unwrap_or_else(|| {this.push((key, val));this.len() - 1});&mut this[idx].1}
您应该执行基准测试以了解这是否由于某种原因不够快.只有在这种情况下,您才应该选择unsafe
代码以获得最后一点速度.然后,您应该再次进行基准测试,看看代码是否明显更快.
例如,您可以使用 slice::get_unchecked_mut
而不是 &mut this[idx].1
,这是一个更容易合理化的不安全代码.>
在我们的安全代码中使用索引的好处是它们可以直接转换为指针偏移逻辑.我们可以拿这个安全的例子,对它做最小的修改,以获得一个使用 unsafe
代码的版本:
pub fn insert(this: &'a mut Vec, key: K, val: V) ->&'a mut V {//我从 Stack Overflow 复制了这段代码,没有阅读周围的内容//解释为什么此代码安全或不安全的文本.不安全{让发现=这个.iter_mut().find_map(|(k, v)| if key == *k { Some(v as *mut V) } else { None });找到匹配{一些(v) =>&mut *v,无 =>{this.push((key, val));&mut this.last_mut().unwrap().1}}}}
安全的要点围绕着found
中的值的指针展开.它开始是一个可变引用,所以我们知道它在我们迭代时是有效的.我们知道 find_map
在第一个 Some
上停止迭代,并且我们知道使用 iter_mut()
进行迭代无论如何都不应该改变我们的值.由于this
不能在found
的绑定和match
的使用之间改变,我相信这段代码是安全的.
通过 Miri 练习您的代码总是很有价值的.您实际上必须执行代码,因为 Miri 仅标记导致未定义行为的代码,而忽略任何休眠的代码路径.此代码是 Miri-clean:
fn main() {让 mut things = vec![(1, 2), (3, 4)];让 v = insert(&mut things, 1, 2);println!("{} ({:p})", v, v);让 v = insert(&mut things, 1, 2);println!("{} ({:p})", v, v);让 v = insert(&mut things, 5, 6);println!("{} ({:p})", v, v);让 v = insert(&mut things, 5, 6);println!("{} ({:p})", v, v);}
2 (0x2829c)2 (0x2829c)6 (0x41054)6 (0x41054)
[原始实现]真的安全吗?
Miri 报告我上面使用的相同测试代码没有问题,而且我没有发现任何明显的错误.
这是表示执行的不安全操作的推荐方式吗?我应该改用指针吗?
当可以避免 mem::transmute
时,通常应该避免.transmute
是大锤子,可以做很多你可能不打算做的事情(改变类型是一个关键).在这种情况下,使用指针感觉更简单.
我同意使用注释来说明为什么不安全代码是安全的.即使是错误的,它仍然展示了原作者的心态.未来的审阅者可能会说啊,他们没有考虑检查清单第 42 项,让我测试一下!".
特别是对于您评论中的推理,它过于密集/学术对我来说.我不明白为什么会有人谈论多重生命周期或双重借用.
新的 Polonius 借用检查器能否对这样的模式进行推理?
是:
% cargo +nightly rustc --编译示例 v0.1.0 (/private/tmp/example)错误 [E0499]: 不能一次多次借用 `*this` 作为可变的-->src/main.rs:8:16|2 |pub fn insert<'a, K: Eq, V>(this: &'a mut Vec <(K, V)>, key: K, val: V) ->&'a mut V {|-- 此处定义的生命周期 `'a`3 |for (key1, val1) 在 &mut *this {|---------- 第一个可变借用发生在这里4 |如果键 == *key1 {5 |返回 val1;|---- 返回这个值需要`*this`被借用给`'a`...8 |让这个 = &mut *this;|^^^^^^^^^^ 第二个可变借位发生在这里% 货物 + 每晚 rustc ---Zpolonius编译示例 v0.1.0 (/private/tmp/example)在 0.86 秒内完成开发 [未优化 + 调试信息] 目标% ./目标/调试/示例2 (0x7f97ea405b24)2 (0x7f97ea405b24)6 (0x7f97ea405ba4)6 (0x7f97ea405ba4)
另见:
The following code involves a very subtle bit of borrow checker dodging. The code itself describes the reasoning. The questions:
- Is this actually safe?
- Is this the recommended way to express the unsafe operations performed? Should I use pointers instead?
- Will the new Polonius borrow checker be able to reason about patterns like this?
/// Insert a new data element at a given key.
pub fn insert<'a, K: Eq, V>(this: &'a mut Vec<(K, V)>, key: K, val: V) -> &'a mut V {
// Safety: As indicated below, we would like to return val1 directly in the loop,
// but rust will reject this, claiming a double borrow, and we instead use some
// unsafe hacks to circumvent the borrow checker. To show this is safe, consider
// two cases.
// - If the return is exercised (we found an element and early out):
// - let 'b be 'a (the borrow of self),
// - and let 'c be empty
// - Otherwise (we did not find an element, exit the loop and terminate normally):
// - let 'b be the duration of the loop,
// - and let 'c be from the end of the loop until the end of 'a
// In either case, 'b and 'c are disjoint, so the "double borrow" is safe.
// The borrow checker reasons that 'b has to be at least 'a because it is returned,
// and therefore it overlaps with 'c, but these happen in mutually exclusive
// situations.
for (key1, val1) in & /* 'b */ mut *this {
if key == *key1 {
// return val1; // we would like to write this
return unsafe { // safety, see above. We know we are in the first case, so 'b = 'a
std::mem::transmute::<&/* 'b */ mut V, &/* 'a */ mut V>(val1)
}
}
}
let this = & /* 'c */ mut *this;
this.push((key, val));
&mut this.last_mut().unwrap().1
}
This is what I'd prefer to write:
/// Insert a new data element at a given key.
pub fn insert<'a, K: Eq, V>(this: &'a mut Vec<(K, V)>, key: K, val: V) -> &'a mut V {
for (key1, val1) in &mut *this {
if key == *key1 {
return val1;
}
}
let this = &mut *this;
this.push((key, val));
&mut this.last_mut().unwrap().1
}
but it fails:
error[E0499]: cannot borrow `*this` as mutable more than once at a time
--> src/lib.rs:8:16
|
2 | pub fn insert<'a, K: Eq, V>(this: &'a mut Vec<(K, V)>, key: K, val: V) -> &'a mut V {
| -- lifetime `'a` defined here
3 | for (key1, val1) in &mut *this {
| ---------- first mutable borrow occurs here
4 | if key == *key1 {
5 | return val1;
| ---- returning this value requires that `*this` is borrowed for `'a`
...
8 | let this = &mut *this;
| ^^^^^^^^^^ second mutable borrow occurs here
As stated in the related questions, the easiest thing to do is to use an index instead as it requires no unsafe code. I might write it like this:
pub fn insert<'a, K: Eq, V>(this: &'a mut Vec<(K, V)>, key: K, val: V) -> &'a mut V {
let idx = this
.iter()
.enumerate()
.find_map(|(i, (k, _))| if key == *k { Some(i) } else { None });
let idx = idx.unwrap_or_else(|| {
this.push((key, val));
this.len() - 1
});
&mut this[idx].1
}
You should perform benchmarking to know if this is not fast enough for some reason. Only in that case should you opt in to unsafe
code to get the last bit of speed. You should then benchmark again to see if the code is measurably faster.
For example, you might be able to get the speedup by using slice::get_unchecked_mut
instead of &mut this[idx].1
, which is a much easier bit of unsafe code to rationalize.
The nice thing about using indices in our safe code is that they directly translate into pointer offset logic. We can take this safe example and make minimal modifications to it to get a version using unsafe
code:
pub fn insert<'a, K: Eq, V>(this: &'a mut Vec<(K, V)>, key: K, val: V) -> &'a mut V {
// I copied this code from Stack Overflow without reading the surrounding
// text which explained why this code is or is not safe.
unsafe {
let found = this
.iter_mut()
.find_map(|(k, v)| if key == *k { Some(v as *mut V) } else { None });
match found {
Some(v) => &mut *v,
None => {
this.push((key, val));
&mut this.last_mut().unwrap().1
}
}
}
}
The main points of safety revolve around the pointer to the value in found
. It started as a mutable reference, so we know that it was valid when we were iterating. We know that find_map
stops iterating on the first Some
, and we know that iterating using iter_mut()
shouldn't change our values anyway. Since this
cannot change between the binding of found
and the usage of it in the match
, I believe that this piece of code is safe.
It's always valuable to exercise your code through Miri. You must actually exercise the code, as Miri only flags code that causes undefined behavior, ignoring any dormant code paths. This code is Miri-clean:
fn main() {
let mut things = vec![(1, 2), (3, 4)];
let v = insert(&mut things, 1, 2);
println!("{} ({:p})", v, v);
let v = insert(&mut things, 1, 2);
println!("{} ({:p})", v, v);
let v = insert(&mut things, 5, 6);
println!("{} ({:p})", v, v);
let v = insert(&mut things, 5, 6);
println!("{} ({:p})", v, v);
}
2 (0x2829c)
2 (0x2829c)
6 (0x41054)
6 (0x41054)
Miri reports no issues for the same test code I used above, and I don't see anything obviously wrong.
When it's possible to avoid mem::transmute
, it generally should be avoided. transmute
is The Big Hammer and can do quite a lot of things that you might not intend (changing types is a key one). Using pointers feels much simpler in this case.
I agree with the usage of a comment to demonstrate why the unsafe code is safe. Even if it's wrong it still demonstrates the mindset of the original author. A future reviewer may be able to say "ah, they didn't think about checklist item #42, let me test that!".
Specifically for the reasoning in your comment, it's overly dense / academic to me. I don't see why there's talk about multiple lifetimes or double borrows.
Yes:
% cargo +nightly rustc --
Compiling example v0.1.0 (/private/tmp/example)
error[E0499]: cannot borrow `*this` as mutable more than once at a time
--> src/main.rs:8:16
|
2 | pub fn insert<'a, K: Eq, V>(this: &'a mut Vec<(K, V)>, key: K, val: V) -> &'a mut V {
| -- lifetime `'a` defined here
3 | for (key1, val1) in &mut *this {
| ---------- first mutable borrow occurs here
4 | if key == *key1 {
5 | return val1;
| ---- returning this value requires that `*this` is borrowed for `'a`
...
8 | let this = &mut *this;
| ^^^^^^^^^^ second mutable borrow occurs here
% cargo +nightly rustc -- -Zpolonius
Compiling example v0.1.0 (/private/tmp/example)
Finished dev [unoptimized + debuginfo] target(s) in 0.86s
% ./target/debug/example
2 (0x7f97ea405b24)
2 (0x7f97ea405b24)
6 (0x7f97ea405ba4)
6 (0x7f97ea405ba4)
See also:
- How to update-or-insert on a Vec?
- Double mutable borrow error in a loop happens even with NLL on
- Returning a reference from a HashMap or Vec causes a borrow to last beyond the scope it's in?
- When is it necessary to circumvent Rust's borrow checker?
这篇关于逻辑上拆分借用以解决启用 NLL 的借用检查器的限制是否安全?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!