Problem description
[Update#1]: I've uploaded my modified and fixed "demo" project to https://github.com/sidshetye/SerializersCompare should anyone else be interested in checking out the benchmark.
[Update#2]: I'm seeing that ProtoBuf takes the order-of-magnitude lead only on subsequent iterations. For a one-time serialization, BinaryFormatter is the one that is an order of magnitude faster. Why? Separate question ...
I'm trying to compare BinaryFormatter, Json.NET and ProtoBuf.NET (got the latter off NuGet today). I'm finding that ProtoBuf outputs no real fields, all nulls and 0s (see below). Plus BinaryFormatter appears to be FAR faster. I basically serialized => deserialized the object and compared:
- the original with the regenerated object
- size in bytes
- time in ms
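Questions
- How do I get ProtoBuf to actually spit out the real values and not just (default?) values?
- What am I doing wrong for speed? I thought ProtoBuf was supposed to be the fastest serializer out there?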
The output I got from my test app is below:
Json: Objects identical
Json in UTF-8: 180 bytes, 249.7054 ms
BinaryFormatter: Objects identical
BinaryFormatter: 512 bytes, 1.7864 ms
ProtoBuf: Original and regenerated objects differ !!
====Regenerated Object====
{
"functionCall": null,
"parameters": null,
"name": null,
"employeeId": 0,
"raiseRate": 0.0,
"addressLine1": null,
"addressLine2": null
}
ProtoBuf: 256 bytes, 117.969 ms
My test was using a simple entity (see below) inside a console application. System: Windows 8 x64, VS2012 Update 1, .NET 4.5. By the way, I get the same result using the [ProtoContract] and [ProtoMember(X)] convention. The documentation isn't clear, but it appears that DataContract is the newer 'uniformly' support convention (right?)
[Serializable]
[DataContract]
class SimpleEntity
{
    [DataMember(Order = 1)]
    public string functionCall { get; set; }
    [DataMember(Order = 2)]
    public string parameters { get; set; }
    [DataMember(Order = 3)]
    public string name { get; set; }
    [DataMember(Order = 4)]
    public int employeeId { get; set; }
    [DataMember(Order = 5)]
    public float raiseRate { get; set; }
    [DataMember(Order = 6)]
    public string addressLine1 { get; set; }
    [DataMember(Order = 7)]
    public string addressLine2 { get; set; }

    public SimpleEntity()
    {
    }

    public void FillDummyData()
    {
        functionCall = "FunctionNameHere";
        parameters = "x=1,y=2,z=3";
        name = "Mickey Mouse";
        employeeId = 1;
        raiseRate = 1.2F;
        addressLine1 = "1 Disney Street";
        addressLine2 = "Disneyland, CA";
    }
}
For those interested, here is the ProtoBuf snippet from my AllSerializers class:
public byte[] SerProtoBuf(object thisObj)
{
    using (MemoryStream ms = new MemoryStream())
    {
        Serializer.Serialize(ms, thisObj);
        return ms.GetBuffer();
    }
}

public T DeserProtoBuf<T>(byte[] bytes)
{
    using (MemoryStream ms = new MemoryStream())
    {
        ms.Read(bytes, 0, bytes.Count());
        return Serializer.Deserialize<T>(ms);
    }
}
Recommended answer
Firstly, your serialize / deserialize methods are both broken; you are over-reporting the result (GetBuffer(), without Length), and you aren't writing anything into the stream for deserialization. Here's a correct implementation (although you could also use GetBuffer() if you were returning ArraySegment<byte>):
public static byte[] SerProtoBuf(object thisObj)
{
    using (MemoryStream ms = new MemoryStream())
    {
        Serializer.NonGeneric.Serialize(ms, thisObj);
        return ms.ToArray();
    }
}

public static T DeserProtoBuf<T>(byte[] bytes)
{
    using (MemoryStream ms = new MemoryStream(bytes))
    {
        return Serializer.Deserialize<T>(ms);
    }
}
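To make the two bugs concrete, here is a minimal sketch that uses only MemoryStream, with no serializer involved. The class and method names are hypothetical, and the payload is an arbitrary byte array standing in for serializer output.

```csharp
using System;
using System.IO;

// Hypothetical helper class, for illustration only.
public static class MemoryStreamPitfalls
{
    // GetBuffer() returns the internal buffer, whose length is the stream's
    // capacity, so it can be larger than the number of bytes written.
    public static int GetBufferLength(byte[] payload)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(payload, 0, payload.Length);
            return ms.GetBuffer().Length;
        }
    }

    // ToArray() copies exactly ms.Length bytes, so the size is reported correctly.
    public static int ToArrayLength(byte[] payload)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(payload, 0, payload.Length);
            return ms.ToArray().Length;
        }
    }

    // GetBuffer() is safe when it is paired with the written length.
    public static ArraySegment<byte> AsSegment(MemoryStream ms)
    {
        return new ArraySegment<byte>(ms.GetBuffer(), 0, (int)ms.Length);
    }

    // Stream.Read copies FROM the stream INTO the array. A new MemoryStream()
    // is empty, so nothing is read and the stream stays empty.
    public static int ReadFromEmptyStream(byte[] bytes)
    {
        using (var ms = new MemoryStream())
        {
            return ms.Read(bytes, 0, bytes.Length);
        }
    }
}
```

ReadFromEmptyStream returns 0 for any input, so the original DeserProtoBuf handed the deserializer a zero-length stream. A zero-length protobuf message is valid and decodes to an object with every field at its default value, which matches the nulls and zeros in the question's output. The 256 and 512 byte sizes reported in the question are also what you would expect if GetBuffer() returned the stream's grown internal capacity rather than the written length, though that is an inference from the numbers.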
That is why you are getting no data back. Secondly, you don't say how you are timing it, so here's some timing code I've written based on your code (which also includes code to show that it is getting all the values back). Results first:
Via BinaryFormatter:
1 Disney Street
Disneyland, CA
1
FunctionNameHere
Mickey Mouse
x=1,y=2,z=3
1.2
Via protobuf-net:
1 Disney Street
Disneyland, CA
1
FunctionNameHere
Mickey Mouse
x=1,y=2,z=3
1.2
Serialize BinaryFormatter: 112 ms, 434 bytes
Deserialize BinaryFormatter: 113 ms
Serialize protobuf-net: 14 ms, 85 bytes
Deserialize protobuf-net: 19 ms
Analysis:
Both serializers stored the same data; protobuf-net was an order of magnitude faster, and a factor of 5 smaller output. I declare: winner.
Code:
static BinaryFormatter bf = new BinaryFormatter();

public static byte[] SerBinaryFormatter(object thisObj)
{
    using (MemoryStream ms = new MemoryStream())
    {
        bf.Serialize(ms, thisObj);
        return ms.ToArray();
    }
}

public static T DeserBinaryFormatter<T>(byte[] bytes)
{
    using (MemoryStream ms = new MemoryStream(bytes))
    {
        return (T)bf.Deserialize(ms);
    }
}

static void Main()
{
    SimpleEntity obj = new SimpleEntity(), clone;
    obj.FillDummyData();

    // test that we get non-zero bytes
    var data = SerBinaryFormatter(obj);
    clone = DeserBinaryFormatter<SimpleEntity>(data);
    Console.WriteLine("Via BinaryFormatter:");
    Console.WriteLine(clone.addressLine1);
    Console.WriteLine(clone.addressLine2);
    Console.WriteLine(clone.employeeId);
    Console.WriteLine(clone.functionCall);
    Console.WriteLine(clone.name);
    Console.WriteLine(clone.parameters);
    Console.WriteLine(clone.raiseRate);
    Console.WriteLine();

    data = SerProtoBuf(obj);
    clone = DeserProtoBuf<SimpleEntity>(data);
    Console.WriteLine("Via protobuf-net:");
    Console.WriteLine(clone.addressLine1);
    Console.WriteLine(clone.addressLine2);
    Console.WriteLine(clone.employeeId);
    Console.WriteLine(clone.functionCall);
    Console.WriteLine(clone.name);
    Console.WriteLine(clone.parameters);
    Console.WriteLine(clone.raiseRate);
    Console.WriteLine();

    Stopwatch watch = new Stopwatch();
    const int LOOP = 10000;

    watch.Reset();
    watch.Start();
    for (int i = 0; i < LOOP; i++)
    {
        data = SerBinaryFormatter(obj);
    }
    watch.Stop();
    Console.WriteLine("Serialize BinaryFormatter: {0} ms, {1} bytes", watch.ElapsedMilliseconds, data.Length);

    watch.Reset();
    watch.Start();
    for (int i = 0; i < LOOP; i++)
    {
        clone = DeserBinaryFormatter<SimpleEntity>(data);
    }
    watch.Stop();
    Console.WriteLine("Deserialize BinaryFormatter: {0} ms", watch.ElapsedMilliseconds, data.Length);

    watch.Reset();
    watch.Start();
    for (int i = 0; i < LOOP; i++)
    {
        data = SerProtoBuf(obj);
    }
    watch.Stop();
    Console.WriteLine("Serialize protobuf-net: {0} ms, {1} bytes", watch.ElapsedMilliseconds, data.Length);

    watch.Reset();
    watch.Start();
    for (int i = 0; i < LOOP; i++)
    {
        clone = DeserProtoBuf<SimpleEntity>(data);
    }
    watch.Stop();
    Console.WriteLine("Deserialize protobuf-net: {0} ms", watch.ElapsedMilliseconds, data.Length);
}
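The timing loops above report only the total over 10000 iterations, which hides the cost of the very first call. Here is a minimal sketch of a helper that times the first (cold) call separately from the warm loop. The Bench class and Measure method are hypothetical names, not part of the original benchmark.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper, for illustration only.
public static class Bench
{
    // Returns two values: [0] is the time of the first (cold) call,
    // [1] is the total time of the following `loops` (warm) calls.
    public static TimeSpan[] Measure(Action action, int loops)
    {
        Stopwatch watch = Stopwatch.StartNew();
        action();                       // first call pays any one-time setup cost
        watch.Stop();
        TimeSpan cold = watch.Elapsed;

        watch.Restart();                // reset to zero and start again
        for (int i = 0; i < loops; i++)
        {
            action();
        }
        watch.Stop();

        return new[] { cold, watch.Elapsed };
    }
}
```

Calling it as Bench.Measure(() => SerProtoBuf(obj), 10000) would show whether a serializer is slow only on its first use. One plausible reason for such a gap, stated here as an assumption rather than something measured above, is that a serializer does per-type setup work the first time it sees a type, and that work is amortized over later calls.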
Lastly, [DataMember(...)] support isn't really the "newer 'uniformly' support convention" - it certainly isn't "newer" - I'm pretty sure it has supported both of those since something like commit #4 (and possibly earlier). These are just options provided for convenience:
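- not all target platforms have DataMemberAttribute
- some people prefer to limit their DTO layer to inbuilt markers
- some types are largely outside of your control, but may already have those markers (data generated from LINQ-to-SQL, for example)
- additionally, note that 2.x allows you to define the model at runtime without having to add attributes (although attributes remain the most convenient way to do it)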
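As a rough illustration of the last point, here is a minimal sketch of configuring a type at runtime with protobuf-net 2.x instead of using attributes. PlainEntity and RuntimeModelDemo are hypothetical names introduced for this example, and the sketch assumes the protobuf-net package is referenced.

```csharp
using System;
using System.IO;
using ProtoBuf;
using ProtoBuf.Meta;

// Hypothetical attribute-free DTO, used only for this illustration.
public class PlainEntity
{
    public string name { get; set; }
    public int employeeId { get; set; }
}

public static class RuntimeModelDemo
{
    // Configure the default model once, before the type is first serialized.
    static RuntimeModelDemo()
    {
        // false = do not infer members from attributes; map them by hand.
        RuntimeTypeModel.Default.Add(typeof(PlainEntity), false)
            .Add(1, "name")
            .Add(2, "employeeId");
    }

    // Serializes and deserializes through a MemoryStream.
    public static PlainEntity RoundTrip(PlainEntity input)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, input);
            ms.Position = 0;            // rewind before reading back
            return Serializer.Deserialize<PlainEntity>(ms);
        }
    }
}
```

The field numbers play the same role as Order in [DataMember(Order = n)] or the number in [ProtoMember(n)], so they should stay stable once data has been written with them.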