Problem description
I have a string which gets serialized to JSON in JavaScript, and then deserialized in Java.
It looks like if the string contains a degree symbol, then I get a problem.
I could use some help in figuring out who to blame:
- Is it Spidermonkey 1.8's implementation? (It has a built-in JSON implementation.)
- Is it Google gson?
- Is it me, for not doing something properly?
Here's what happens in JSDB:
js>s='15\u00f8C'
15°C
js>JSON.stringify(s)
"15°C"
I would have expected "15\u00f8C"
which leads me to believe that Spidermonkey's JSON implementation isn't doing the right thing... except that the JSON homepage's syntax description (is that the spec?) says that a char can be
any-Unicode-character-except-"-or-\-or-control-character
so maybe it passes the string along as-is without encoding it as \u00f8... in which case I would think the problem is with the gson library.
Can anyone help?
I suppose my workaround is to use either a different JSON library, or manually escape strings myself after calling JSON.stringify(), but if this is a bug then I'd like to file a bug report.
Recommended answer
This is not a bug in either implementation. There is no requirement to escape U+00B0. To quote the RFC:
2.5. Strings
The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
Any character may be escaped.
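To make the quoted rule concrete, here is a minimal JavaScript sketch, assuming an engine with the standard JSON object. The helper name escapeNonAscii is hypothetical and not part of any library; it is just one way the manual-escaping workaround from the question might look. The sketch suggests that the raw and the escaped forms are both valid JSON and decode to the same string.

```javascript
// Degree sign, U+00B0, written with an escape so this source stays ASCII.
const s = '15\u00b0C';

// JSON.stringify leaves U+00B0 unescaped, which the RFC permits.
const raw = JSON.stringify(s);

// Hypothetical helper: rewrite every UTF-16 code unit above 0x7F as \uXXXX.
function escapeNonAscii(json) {
  return json.replace(/[\u0080-\uffff]/g, function (ch) {
    return '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0');
  });
}

const escaped = escapeNonAscii(raw);

console.log(raw.length);  // 6: quote, 1, 5, degree sign, C, quote
console.log(escaped);     // "15\u00b0C"
console.log(JSON.parse(raw) === JSON.parse(escaped)); // true
```

Because the helper works on UTF-16 code units, a character outside the Basic Multilingual Plane comes out as two \uXXXX escapes (a surrogate pair), which is the form JSON expects for such characters.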
Escaping everything inflates the size of the data (all code points can be represented in four or fewer bytes in all Unicode transformation formats; whereas encoding them all makes them six or twelve bytes).
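The size difference can be checked with a short JavaScript sketch. It assumes a runtime that provides the standard TextEncoder (modern browsers and Node.js do); the function name utf8Length and the sample character U+1D11E are chosen purely for illustration.

```javascript
const encoder = new TextEncoder(); // TextEncoder always produces UTF-8

// Number of bytes the text occupies when encoded as UTF-8.
function utf8Length(text) {
  return encoder.encode(text).length;
}

const degree = '\u00b0';   // U+00B0, inside the Basic Multilingual Plane
const clef = '\u{1D11E}';  // U+1D11E, outside the BMP

console.log(utf8Length(degree));            // 2
console.log(utf8Length('\\u00b0'));         // 6, the escaped form
console.log(utf8Length(clef));              // 4
console.log(utf8Length('\\ud834\\udd1e'));  // 12, escaped as a surrogate pair
```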
It is more likely that you have a text transcoding bug somewhere in your code, and escaping everything into the ASCII subset merely masks the problem. It is a requirement of the JSON spec that all data use a Unicode encoding.
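As an illustration of what such a transcoding bug can look like (a sketch of one common failure mode, not a diagnosis of the code in the question), the JavaScript below encodes the string as UTF-8 and then decodes the bytes as if each byte were one ISO-8859-1 character. It assumes a runtime with the standard TextEncoder and TextDecoder; all variable names are illustrative.

```javascript
const original = '15\u00b0C';

// Correct path: encode as UTF-8, decode as UTF-8.
const utf8Bytes = new TextEncoder().encode(original); // bytes 31 35 c2 b0 43
const roundTripped = new TextDecoder('utf-8').decode(utf8Bytes);

// Buggy path: treat each UTF-8 byte as one ISO-8859-1 character.
const misdecoded = String.fromCharCode(...utf8Bytes);

console.log(roundTripped === original);         // true
console.log(misdecoded.length);                 // 5, one character too many
console.log(misdecoded === '15\u00c2\u00b0C');  // true: a stray U+00C2 precedes the degree sign

// An ASCII-only escaped document survives the same wrong decoding,
// which is how escaping can hide the underlying bug.
const asciiJson = '"15\\u00b0C"';
const asciiMisdecoded = String.fromCharCode(...new TextEncoder().encode(asciiJson));
console.log(JSON.parse(asciiMisdecoded) === original); // true
```

If the receiving side shows an extra character like this, the fix is to make both ends agree on the encoding rather than to escape the payload.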