Problem description
In Python 2, json.dumps() by default escapes every non-ASCII character as \uxxxx.
But isn't this confusing? \uxxxx denotes a Unicode character, so it looks like something that belongs inside a unicode string.
The output of json.dumps() is a str, which is a byte string in Python 2. Shouldn't it therefore escape characters as \xhh instead?
>>> import json
>>> unicode_string = u"\u00f8"
>>> print unicode_string
ø
>>> print json.dumps(unicode_string)
"\u00f8"
>>> unicode_string.encode("utf8")
'\xc3\xb8'
Recommended answer
That's exactly the point. json.dumps() returns a byte string, not a unicode string, so non-ASCII characters have to be escaped to survive. JSON itself permits \uxxxx escapes, which makes them a safe, ASCII-only way of representing Unicode characters.
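To make this concrete, here is a minimal sketch that should run on both Python 2 and Python 3. It assumes only the standard json module; the variable names are illustrative and not from the original post. It shows that the default output is plain ASCII, that a JSON parser restores the original character from the \uxxxx escape, and that \xhh is not an escape JSON recognises, which is why json.dumps() cannot use it.

```python
from __future__ import print_function
import json

text = u"\u00f8"  # LATIN SMALL LETTER O WITH STROKE

# Default (ensure_ascii=True): the result holds only ASCII characters,
# with the character written as the six-character JSON escape \u00f8.
escaped = json.dumps(text)
print(escaped)       # "\u00f8"
print(len(escaped))  # 8: two quotes plus the six escape characters

# The \uXXXX form is JSON's own escape syntax, so any JSON parser
# turns it back into the original character.
print(json.loads(escaped) == text)  # True

# JSON has no \xhh escape: a document containing one is rejected.
try:
    json.loads('"\\xf8"')
    accepted = True
except ValueError:
    accepted = False
print(accepted)  # False

# With ensure_ascii=False the character is emitted unescaped, so the
# caller must choose an encoding before writing the result as bytes.
raw = json.dumps(text, ensure_ascii=False)
encoded = raw.encode("utf-8")
```

Here encoded holds the UTF-8 bytes of the quoted character, that is, the two bytes \xc3\xb8 from the question wrapped in double quotes. Passing ensure_ascii=False is therefore the route to take when UTF-8 output is wanted rather than ASCII escapes.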