Problem description
Heyho,
I want to convert byte data, which can be anything, to a String. My question is whether it is "secure" to encode the byte data with UTF-8, for example:
String s1 = new String(data, "UTF-8");
or by using base64:
String s2 = Base64.encodeToString(data, false); //migbase64
I'm just afraid that using the first method has negative side effects. I mean both variants work p̶e̶r̶f̶e̶c̶t̶l̶y̶, but s1 can contain any character of the UTF-8 charset, while s2 only uses "readable" characters. I'm just not sure if it's really necessary to use base64. Basically I just need to create a String, send it over the network and receive it again. (There is no other way in my situation :/)
The question is only about negative side effects, not if it's possible!
You should absolutely use base64 or possibly hex. (Either will work; base64 is more compact but harder for humans to read.)
You claim "both variants work perfectly" but that's actually not true. If you use the first approach and data is not actually a valid UTF-8 sequence, you will lose data. You're not trying to convert UTF-8-encoded text into a String, so don't write code which tries to do so.
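To make the data loss concrete, here is a minimal sketch. The class name Utf8RoundTrip and the sample bytes are made up for illustration; the point is only that bytes which are not valid UTF-8 are replaced during decoding, so encoding the String again does not give the original array back.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
public class Utf8RoundTrip {
    // Decodes arbitrary bytes as UTF-8, then encodes the resulting String again.
    static byte[] roundTrip(byte[] data) {
        String s = new String(data, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE can never appear in valid UTF-8.
        byte[] data = {(byte) 0xFF, (byte) 0xFE, 0x41};
        byte[] back = roundTrip(data);

        // The invalid bytes were replaced with U+FFFD while decoding,
        // so the arrays differ and the original bytes cannot be recovered.
        System.out.println(Arrays.equals(data, back)); // prints false
    }
}
```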
Using ISO-8859-1 as an encoding will preserve all the data - but in very many cases the string that is returned will not be easily transported across other protocols. It may very well contain unprintable control characters, for example.
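A rough sketch of both halves of that statement, using a hypothetical Latin1RoundTrip class: every byte value survives an ISO-8859-1 round trip, but the intermediate String contains control characters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
public class Latin1RoundTrip {
    // Builds an array holding every possible byte value once.
    static byte[] allByteValues() {
        byte[] data = new byte[256];
        for (int i = 0; i < 256; i++) {
            data[i] = (byte) i;
        }
        return data;
    }

    // True if decoding and re-encoding with ISO-8859-1 gives the same bytes.
    static boolean survivesRoundTrip(byte[] data) {
        String s = new String(data, StandardCharsets.ISO_8859_1);
        return Arrays.equals(data, s.getBytes(StandardCharsets.ISO_8859_1));
    }

    // True if the decoded String contains at least one ISO control character.
    static boolean containsControlChars(byte[] data) {
        String s = new String(data, StandardCharsets.ISO_8859_1);
        return s.chars().anyMatch(c -> Character.isISOControl(c));
    }

    public static void main(String[] args) {
        byte[] data = allByteValues();
        System.out.println(survivesRoundTrip(data));    // prints true
        System.out.println(containsControlChars(data)); // prints true
    }
}
```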
Only use the String(byte[], String) constructor when you've got inherently textual data, which you happen to have in an encoded form (where the encoding is specified as the second argument). For anything else - music, video, images, encrypted or compressed data, just for example - you should use an approach which treats the incoming data as "arbitrary binary data" and finds a textual encoding of it... which is precisely what base64 and hex do.
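As a sketch of the two recommended options: the question uses migbase64, but this example swaps in java.util.Base64 from the standard library (Java 8 and later) so that it is self-contained. The class BinaryToText and its hex helpers are hypothetical, written by hand for illustration.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class; uses java.util.Base64 instead of migbase64.
public class BinaryToText {
    // Base64: 4 output characters for every 3 input bytes.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Hex: 2 output characters for every input byte.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // Assumes an even-length string of valid hex digits; no error handling.
    static byte[] fromHex(String text) {
        byte[] out = new byte[text.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(text.charAt(2 * i), 16);
            int lo = Character.digit(text.charAt(2 * i + 1), 16);
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // Arbitrary binary data, including bytes that are invalid in UTF-8.
        byte[] data = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};
        System.out.println(Arrays.equals(data, fromBase64(toBase64(data)))); // prints true
        System.out.println(Arrays.equals(data, fromHex(toHex(data))));       // prints true
    }
}
```

Both encodings produce plain ASCII text, so the resulting String can be sent over text-based protocols and decoded back to exactly the original bytes.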