Problem description
I'm trying to convert a `Vec` of `u32`s to a `Vec` of `u8`s, preferably in-place and without too much overhead.
My current solution relies on unsafe code to reconstruct the `Vec`. Is there a better way to do this, and what are the risks associated with my solution?

    use std::mem;
    use std::vec::Vec;

    fn main() {
        let mut vec32 = vec![1u32, 2];
        let vec8;
        unsafe {
            let length = vec32.len() * 4; // size of u32 = 4 * size of u8
            let capacity = vec32.capacity() * 4; // ^
            let mutptr = vec32.as_mut_ptr() as *mut u8;
            mem::forget(vec32); // don't run the destructor for vec32

            // construct new vec
            vec8 = Vec::from_raw_parts(mutptr, length, capacity);
        }
        println!("{:?}", vec8)
    }

Recommended answer
Whenever writing an `unsafe` block, I strongly encourage people to include a comment on the block explaining why you think the code is actually safe. That type of information is useful for the people who read the code in the future.
Instead of adding comments about the "magic number" 4, just use `mem::size_of::<u32>`. I'd even go so far as to use `size_of` for `u8` and perform the division for maximum clarity.
You can return the newly-created `Vec` from the `unsafe` block.
As mentioned in the comments, "dumping" a block of data like this makes the data format platform dependent; you will get different answers on little-endian and big-endian systems. This can lead to massive debugging headaches in the future. File formats either encode the platform endianness into the file (making the reader's job harder) or only write a specific endianness to the file (making the writer's job harder).
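To illustrate the "write a specific endianness" option, here is a minimal sketch that fixes the byte order to little-endian. The helper names `to_le_bytes_vec` and `from_le_bytes_vec` are hypothetical, and note that this approach copies the data rather than converting in place, so it trades some overhead for a portable format.

```rust
/// Serializes each `u32` as four bytes in little-endian order, regardless of
/// the host platform. (Hypothetical helper; allocates a new `Vec<u8>`.)
fn to_le_bytes_vec(values: &[u32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Reads little-endian bytes back into `u32`s.
/// Returns `None` if the input length is not a multiple of four.
fn from_le_bytes_vec(bytes: &[u8]) -> Option<Vec<u32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| u32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

With this, `to_le_bytes_vec(&[1, 2])` yields the same eight bytes on every platform, and `from_le_bytes_vec` round-trips them.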
I'd probably move the whole `unsafe` block to a function and give it a name, just for organization purposes.
You don't need to import `Vec`; it's in the prelude.

    use std::mem;

    fn main() {
        let mut vec32 = vec![1u32, 2];

        // I copy-pasted this code from StackOverflow without reading the answer
        // surrounding it that told me to write a comment explaining why this code
        // is actually safe for my own use case.
        let vec8 = unsafe {
            let ratio = mem::size_of::<u32>() / mem::size_of::<u8>();

            let length = vec32.len() * ratio;
            let capacity = vec32.capacity() * ratio;
            let ptr = vec32.as_mut_ptr() as *mut u8;

            // Don't run the destructor for vec32
            mem::forget(vec32);

            // Construct new Vec
            Vec::from_raw_parts(ptr, length, capacity)
        };

        println!("{:?}", vec8)
    }

My biggest unknown worry about this code lies in the alignment of the memory associated with the `Vec`.
Rust's underlying allocator allocates and deallocates memory with a specific `Layout`. `Layout` contains such information as the size and alignment of the allocated block.
I'd assume that this code needs the `Layout` to match between paired calls to `alloc` and `dealloc`. If that's the case, dropping the `Vec<u8>` constructed from a `Vec<u32>` might tell the allocator the wrong alignment, since that information is based on the element type.
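The mismatch can be made concrete with `std::alloc::Layout`. The sketch below assumes that a `Vec<T>` describes its buffer with something equivalent to `Layout::array::<T>(capacity)`; that is my understanding of the standard library's behavior, not something stated above, and the function names are hypothetical.

```rust
use std::alloc::Layout;
use std::mem;

/// Layout that a buffer of `capacity` `u32` elements would be described with
/// (assumption: equivalent to what `Vec<u32>` uses when allocating).
fn layout_as_u32(capacity: usize) -> Layout {
    Layout::array::<u32>(capacity).expect("capacity overflow")
}

/// Layout that the same number of bytes would be described with if the buffer
/// were owned by a `Vec<u8>` (assumption: what would be used when freeing).
fn layout_as_u8(capacity_in_u32: usize) -> Layout {
    Layout::array::<u8>(capacity_in_u32 * mem::size_of::<u32>()).expect("capacity overflow")
}
```

For the same buffer, both layouts report the same size, but `layout_as_u32(n).align()` is `align_of::<u32>()` while `layout_as_u8(n).align()` is 1, which is exactly the discrepancy the allocator would see.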
Without better knowledge, the "best" thing to do would be to leave the `Vec<u32>` as-is and simply get a `&[u8]` to it. The slice has no interaction with the allocator, avoiding this problem.
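A minimal sketch of that byte-view approach follows. The function name `as_bytes` is hypothetical; the `Vec<u32>` keeps ownership of the allocation, so it is later freed with its original layout. The bytes are still in native byte order, so the endianness caveat applies unchanged.

```rust
use std::mem;
use std::slice;

/// Views a slice of `u32` as raw bytes without copying or reallocating.
fn as_bytes(values: &[u32]) -> &[u8] {
    // SAFETY: the pointer comes from a valid slice, so it is non-null and
    // covers `len * size_of::<u32>()` initialized bytes; `u32` has no padding;
    // `u8` has alignment 1, so any address is suitably aligned; the returned
    // slice borrows `values`, so it cannot outlive the underlying data.
    unsafe {
        slice::from_raw_parts(
            values.as_ptr() as *const u8,
            values.len() * mem::size_of::<u32>(),
        )
    }
}
```

Calling `as_bytes(&vec32)` on the two-element vector from the question gives an eight-byte slice while `vec32` remains usable afterwards.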
Even without interacting with the allocator, you need to be careful about alignment!
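Alignment matters most in the opposite direction, when bytes are reinterpreted as a wider type. As a hedged illustration (the wrapper name `split_aligned` is hypothetical), the standard `slice::align_to` method only places correctly aligned data in the middle slice and returns whatever does not fit as a prefix and suffix:

```rust
/// Splits a byte slice into an unaligned prefix, a properly aligned `&[u32]`
/// middle part, and a leftover suffix.
fn split_aligned(bytes: &[u8]) -> (&[u8], &[u32], &[u8]) {
    // SAFETY: every bit pattern is a valid `u32`, and `align_to` guarantees
    // that the middle slice is correctly aligned and within bounds.
    unsafe { bytes.align_to::<u32>() }
}
```

Code using this should handle a non-empty prefix or suffix, since an arbitrary `&[u8]` carries no guarantee of starting on a `u32` boundary.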
See also:
- How to slice a large Vec<i32> as &[u8]?
- https://stackoverflow.com/a/48309116/155423