Problem description
I ran into a compatibility problem when serializing a protobuf message between Python and Node.js. I have a protobuf message like the one below:
message User {
reserved 2,3;
string user_id = 1;
int32 coin = 4;
int32 exp = 5;
int32 gem = 6;
int32 level = 7;
}
I would like to serialize a message instance like:
"userId": "3562957934"
"coin": 350
"exp": 1
"gem": 30
"level": 1
When I call SerializeToString() on the user_pb2.User instance, I get:
\x0a\x0a\x33\x35\x36\x32\x39\x35\x37\x39\x33\x34\x20\xde\x02\x28\x01\x30\x1e\x38\x01
or in binary
1101 1110 0000 0010 0010 1000 0000 0001 0011 0000 0001 1110 0011 1000 0000 0001
When I try to deserialize this message in Node.js, I get:
"userId": "3562957934"
"coin": 381
"exp": 1
"gem": 30
"level": 1
which has the wrong "coin" value.
Then I tried creating a message instance (with coin value = 350) and serializing it in Node.js. I get different bytes:
\x5c\x0a\x5c\x0a\x33\x35\x36\x32\x39\x35\x37\x39\x33\x34\x20\xc3\x9e\x02\x28\x01\x30\x1e\x38\x01
or in binary:
1100 0011 1001 1110 0000 0010 0010 1000 0000 0001 0011 0000 0001 1110 0011 1000 0000 0001
I found that, besides the strange leading bytes (\x0a\x0a in Python vs \x5c\x0a\x5c\x0a in Node.js), the main difference between the Python and Node.js serializations is the byte 1101 1110 (Python) vs 1100 0011 1001 1110 (Node.js), or in string form 3562957934 �(08 (Python) vs 3562957934 Þ(08 (Node.js).
My protoc commands are:
/usr/local/bin/protoc -I=protos user.proto --python_out=pb (Python)
/usr/local/bin/protoc --js_out=import_style=commonjs,binary:protos user.proto -I=protos (Node.js)
I assume that, given the same message, the Python and Node.js serializations should be identical, shouldn't they? I searched the official Google protobuf documentation but still could not find a solution. Has anyone come across the same problem?
It looks like you have some sort of UTF-8 encoding problem when passing around the serialized blobs. The original serialized bytes (from Python) have a byte 0xDE in them, but the Node.js version you quote has 0xC3 0x9E instead, which is the UTF-8 encoding of the Unicode code point U+00DE.
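The effect can be reproduced in plain Python without protobuf installed. The sketch below takes the bytes quoted in the question, pushes them through a text round trip, and decodes the varint of the coin field by hand. The helper read_varint and the name COIN_OFFSET are made up for this illustration. The last step (invalid UTF-8 replaced by U+FFFD and then truncated to one byte) is only an assumption about how the value 381 could arise; the question does not show the code that moved the bytes around.

```python
def read_varint(buf: bytes, pos: int) -> int:
    """Decode one protobuf base-128 varint starting at buf[pos]."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # high bit clear: last byte of the varint
            return value
        shift += 7
        pos += 1


# Bytes quoted in the question as the Python SerializeToString() output.
original = bytes.fromhex(
    "0a0a" "33353632393537393334"  # field 1: tag, length 10, "3562957934"
    "20de02"                        # field 4 (coin): tag, varint
    "2801" "301e" "3801"            # fields 5, 6, 7
)

# Offset of the coin varint: 2 header bytes + 10 digits + 1 tag byte.
COIN_OFFSET = 13

print(read_varint(original, COIN_OFFSET))  # 350

# Treating every byte as a character and writing that text as UTF-8
# turns the single byte 0xDE into the two bytes 0xC3 0x9E.
mangled = original.decode("latin-1").encode("utf-8")
print(mangled[COIN_OFFSET:COIN_OFFSET + 2].hex())  # c39e

# Assumption: the blob was decoded as UTF-8 (0xDE is invalid here and becomes
# U+FFFD), then each character was cut down to its low byte (0xFD).
text = original.decode("utf-8", errors="replace")
lossy = bytes(ord(ch) & 0xFF for ch in text)
print(read_varint(lossy, COIN_OFFSET))  # 381
```

Under that assumption the bytes 0xFD 0x02 decode to 381, which matches the wrong coin value reported in the question.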
I suggest you use an ASCII-safe encoding such as base64 to pass around the blobs for debugging purposes, just to be on the safe side. Once that works, you can make sure that you open all the relevant files and streams in binary mode.
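A minimal sketch of both suggestions on the Python side, using only the standard library. The payload is the byte string from the question; the temporary file stands in for whatever channel actually carries the blob, which the question does not describe.

```python
import base64
import os
import tempfile

# Serialized message bytes as quoted in the question.
payload = bytes.fromhex("0a0a3335363239353739333420de022801301e3801")

# 1. ASCII-safe transport: base64 text survives any text-mode channel.
encoded = base64.b64encode(payload).decode("ascii")
decoded = base64.b64decode(encoded)

# 2. Binary mode: "wb"/"rb" write and read the bytes with no text encoding.
fd, path = tempfile.mkstemp(suffix=".bin")
os.close(fd)
try:
    with open(path, "wb") as fh:
        fh.write(payload)
    with open(path, "rb") as fh:
        from_file = fh.read()
finally:
    os.remove(path)

print(decoded == payload, from_file == payload)
```

On the Node.js side the counterpart would be along the lines of Buffer.from(text, "base64") before handing the bytes to the generated deserializer.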