Binary Data in JSON String. Something better than Base64

Question

The JSON format natively doesn't support binary data. The binary data has to be escaped so that it can be placed into a string element (i.e. zero or more Unicode chars in double quotes using backslash escapes) in JSON.

An obvious method to escape binary data is to use Base64. However, Base64 has a high processing overhead. It also expands every 3 bytes into 4 characters, which increases the data size by about 33%.
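As a quick illustration of the 3-to-4 expansion, here is a minimal sketch using Python's standard `base64` module. The helper name `base64_overhead` is made up for this example and is not part of any library.

```python
import base64


def base64_overhead(n_bytes: int) -> float:
    """Return the relative size increase of Base64 for n_bytes of input.

    Hypothetical helper for illustration; n_bytes must be greater than zero.
    """
    raw = bytes(n_bytes)                # n_bytes zero bytes as sample input
    encoded = base64.b64encode(raw)     # 4 output characters per 3 input bytes
    return len(encoded) / len(raw) - 1


# Three input bytes become four ASCII characters.
print(base64.b64encode(b"Man"))         # b'TWFu'

# For larger inputs the overhead settles at about one third.
print(f"{base64_overhead(3000):.1%}")   # 33.3%
```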

One use case for this is the v0.8 draft of the CDMI cloud storage API specification. You create data objects via a REST web service using JSON, e.g.

PUT /MyContainer/BinaryObject HTTP/1.1
Host: cloud.example.com
Accept: application/vnd.org.snia.cdmi.dataobject+json
Content-Type: application/vnd.org.snia.cdmi.dataobject+json
X-CDMI-Specification-Version: 1.0

{
    "mimetype" : "application/octet-stream",
    "metadata" : [ ],
    "value" :   "TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlz
    IHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2Yg
    dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGlu
    dWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRo
    ZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4="
}
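A request body shaped like the one above could be produced and read back roughly as follows. This is only a sketch: the field names are taken from the example, while the function names `build_data_object` and `read_data_object` are invented for illustration and are not part of the CDMI specification or any client library.

```python
import base64
import json


def build_data_object(payload: bytes) -> str:
    """Serialize binary payload into a JSON body with a Base64 "value" field.

    Hypothetical helper; field names mirror the example request above.
    """
    body = {
        "mimetype": "application/octet-stream",
        "metadata": [],
        # b64encode returns bytes; JSON needs a str, and Base64 is pure ASCII.
        "value": base64.b64encode(payload).decode("ascii"),
    }
    return json.dumps(body)


def read_data_object(body_text: str) -> bytes:
    """Recover the original bytes from a body built by build_data_object."""
    return base64.b64decode(json.loads(body_text)["value"])


original = bytes(range(256))            # every possible byte value once
body_text = build_data_object(original)
assert read_data_object(body_text) == original
```

Note that `json.dumps` emits the Base64 string on a single line; the line breaks in the example above are only for display.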

Are there better ways and standard methods to encode binary data into JSON strings?

Solution

There are 94 Unicode characters which can be represented as one byte according to the JSON spec (if your JSON is transmitted as UTF-8). With that in mind, I think the best you can do space-wise is base85 which represents four bytes as five characters. However, this is only a 7% improvement over base64, it's more expensive to compute, and implementations are less common than for base64 so it's probably not a win.
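To make the size comparison concrete, here is a small sketch using Python's standard library. It assumes the `base64.b85encode` variant of base85, whose alphabet contains neither `"` nor `\`, so the result needs no JSON escaping. Other base85 variants such as Ascii85 (`base64.a85encode`) do use those two characters and would grow slightly when escaped.

```python
import base64
import json

# 1200 bytes: a multiple of both 3 and 4, so neither encoding needs padding.
payload = bytes(i % 256 for i in range(1200))

b64 = base64.b64encode(payload)   # 4 characters per 3 bytes
b85 = base64.b85encode(payload)   # 5 characters per 4 bytes

print(len(b64))                   # 1600
print(len(b85))                   # 1500

# Embedding the base85 text in JSON only adds the two surrounding quotes.
as_json = json.dumps(b85.decode("ascii"))
print(len(as_json))               # 1502

# Decoding restores the original bytes.
assert base64.b85decode(json.loads(as_json)) == payload
```

For this input the base85 output is 1500 characters against 1600 for base64, which is the 25% versus 33% expansion described above.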

You could also simply map every input byte to the corresponding character in U+0000-U+00FF, then do the minimum encoding required by the JSON standard to pass those characters; the advantage here is that the required decoding is nil beyond builtin functions, but the space efficiency is bad -- a 105% expansion (if all input bytes are equally likely) vs. 25% for base85 or 33% for base64.
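The 105% figure can be checked with a short sketch. It assumes Python's `json` module with `ensure_ascii=False`, which escapes only the quote, the backslash, and the control characters below U+0020, and then counts the UTF-8 bytes. The function name `latin1_json_size` is made up for this example.

```python
import json


def latin1_json_size(data: bytes) -> int:
    """Return the UTF-8 size of data stored as a JSON string of U+0000-U+00FF.

    Hypothetical helper; the two surrounding quotes are not counted.
    """
    text = data.decode("latin-1")       # byte N maps to code point U+00NN
    encoded = json.dumps(text, ensure_ascii=False).encode("utf-8")
    return len(encoded) - 2


all_bytes = bytes(range(256))           # each byte value once, i.e. equally likely
size = latin1_json_size(all_bytes)
print(size)                                      # 526
print(f"{size / len(all_bytes) - 1:.0%}")        # 105%

# Decoding needs nothing beyond the built-in JSON parser and codec.
text = json.dumps(all_bytes.decode("latin-1"), ensure_ascii=False)
assert json.loads(text).encode("latin-1") == all_bytes
```

The 526 bytes break down as 172 for the 32 control characters (escaped), 98 for U+0020 to U+007F (with `"` and `\` escaped), and 256 for U+0080 to U+00FF, which take two bytes each in UTF-8.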

Final verdict: base64 wins, in my opinion, on the grounds that it's common, easy, and not bad enough to warrant replacement.

07-25 07:00