我有一个带字符串字段的结构。我想控制如何分配字符串的内存。特别是,我想使用copy_arena
之类的东西来分配它们。
也许我可以定制一个ArenaString
类型,但是我不知道如何在反序列化代码中获得对Arena
的引用,并且假设这是可能的,那么我将不得不处理arena生存期,对吧?
最佳答案
下面是一个可能的实现,它使用serde::de::DeserializeSeed
将arena分配器公开给反序列化代码。
在更详细的用例中,您可能希望编写过程宏来生成这样的impl。
#[macro_use]
extern crate serde_derive;
extern crate copy_arena;
extern crate serde;
extern crate serde_json;
use std::fmt;
use std::marker::PhantomData;
use std::str;
use serde::de::{self, DeserializeSeed, Deserializer, MapAccess, Visitor};
use copy_arena::{Allocator, Arena};
#[derive(Debug)]
struct Jason<'a> {
one: &'a str,
two: &'a str,
}
struct ArenaSeed<'a, T> {
allocator: Allocator<'a>,
marker: PhantomData<fn() -> T>,
}
impl<'a, T> ArenaSeed<'a, T> {
fn new(arena: &'a mut Arena) -> Self {
ArenaSeed {
allocator: arena.allocator(),
marker: PhantomData,
}
}
fn alloc_string(&mut self, owned: String) -> &'a str {
let slice = self.allocator.alloc_slice(owned.as_bytes());
// We know the bytes are valid UTF-8.
str::from_utf8(slice).unwrap()
}
}
impl<'de, 'a> DeserializeSeed<'de> for ArenaSeed<'a, Jason<'a>> {
type Value = Jason<'a>;
fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
where
D: Deserializer<'de>,
{
static FIELDS: &[&str] = &["one", "two"];
deserializer.deserialize_struct("Jason", FIELDS, self)
}
}
impl<'de, 'a> Visitor<'de> for ArenaSeed<'a, Jason<'a>> {
type Value = Jason<'a>;
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
formatter.write_str("struct Jason")
}
fn visit_map<A>(mut self, mut map: A) -> Result<Self::Value, A::Error>
where
A: MapAccess<'de>,
{
#[derive(Deserialize)]
#[serde(field_identifier, rename_all = "lowercase")]
enum Field { One, Two }
let mut one = None;
let mut two = None;
while let Some(key) = map.next_key()? {
match key {
Field::One => {
if one.is_some() {
return Err(de::Error::duplicate_field("one"));
}
one = Some(self.alloc_string(map.next_value()?));
}
Field::Two => {
if two.is_some() {
return Err(de::Error::duplicate_field("two"));
}
two = Some(self.alloc_string(map.next_value()?));
}
}
}
let one = one.ok_or_else(|| de::Error::missing_field("one"))?;
let two = two.ok_or_else(|| de::Error::missing_field("two"))?;
Ok(Jason { one, two })
}
}
fn main() {
let j = r#" {"one": "I", "two": "II"} "#;
let mut arena = Arena::new();
let seed = ArenaSeed::new(&mut arena);
let mut de = serde_json::Deserializer::from_str(j);
let jason: Jason = seed.deserialize(&mut de).unwrap();
println!("{:?}", jason);
}
如果arena分配不是一个严格的要求,您只需要在许多反序列化对象之间分摊字符串分配的成本,
Deserialize::deserialize_in_place
是一个更简洁的选择。// [dependencies]
// serde = "1.0"
// serde_derive = { version = "1.0", features = ["deserialize_in_place"] }
// serde_json = "1.0"
#[macro_use]
extern crate serde_derive;
extern crate serde;
extern crate serde_json;
use serde::Deserialize;
#[derive(Deserialize, Debug)]
struct Jason {
one: String,
two: String,
}
fn main() {
let j = r#" {"one": "I", "two": "II"} "#;
// Allocate some Strings during deserialization.
let mut de = serde_json::Deserializer::from_str(j);
let mut jason = Jason::deserialize(&mut de).unwrap();
println!("{:?} {:p} {:p}", jason, jason.one.as_str(), jason.two.as_str());
// Reuse the same String allocations for some new data.
// As long as the strings in the new datum are at most as long as the
// previous datum, the strings do not need to be reallocated and will
// remain at the same memory address.
let mut de = serde_json::Deserializer::from_str(j);
Jason::deserialize_in_place(&mut de, &mut jason).unwrap();
println!("{:?} {:p} {:p}", jason, jason.one.as_str(), jason.two.as_str());
// Do not reuse the string allocations.
// The strings here will not be at the same address as above.
let mut de = serde_json::Deserializer::from_str(j);
let jason = Jason::deserialize(&mut de).unwrap();
println!("{:?} {:p} {:p}", jason, jason.one.as_str(), jason.two.as_str());
}