Problem description
I have the input string UAH;"Ãîëüô 855229-7", which should be displayed as UAH;"Гольф 855229-7". I'm trying to use the Cp1251 encoding, but the output is UAH;"????? 855229-7".
String cyrillic = row[0] + row[1];
String utf8String= new String(cyrillic.getBytes("Cp1251"), "UTF-8");
lbl1.setText(utf8String);
Recommended answer
UTF-8 has nothing to do with this. Each of your Cyrillic characters is represented as a single byte.
Currently, those bytes have been decoded as ISO 8859-1, also known as Latin-1, whose printable characters are a subset of the Windows Western European code page, Cp1252. So you want to encode the string as Cp1252 to get the original bytes back, then decode those bytes as Cp1251:
String corrected8String = new String(cyrillic.getBytes("Cp1252"), "Cp1251");
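For anyone who wants to try the round trip in isolation, here is a minimal self-contained sketch of the same idea. The class name CyrillicFix and the method name fixMojibake are made up for illustration, and the sketch assumes the JVM provides the windows-1251 and windows-1252 charsets (common JDKs do). It also hints at why the original attempt printed question marks: characters such as Ã and î do not exist in Cp1251, so getBytes("Cp1251") substitutes '?' for each of them before any decoding happens.

```java
import java.nio.charset.Charset;

public class CyrillicFix {

    // Hypothetical helper: undo text that was decoded as Cp1252 (Latin-1 style)
    // when the underlying bytes were really Cp1251.
    static String fixMojibake(String garbled) {
        Charset cp1252 = Charset.forName("windows-1252");
        Charset cp1251 = Charset.forName("windows-1251");
        // Step 1: encode as Cp1252 to recover the original raw bytes.
        byte[] rawBytes = garbled.getBytes(cp1252);
        // Step 2: decode those same bytes with the charset they were written in.
        return new String(rawBytes, cp1251);
    }

    public static void main(String[] args) {
        // "Ãîëüô" written with Unicode escapes so the source file encoding does not matter.
        String garbled = "UAH;\u00C3\u00EE\u00EB\u00FC\u00F4 855229-7";
        String fixed = fixMojibake(garbled);
        // What the console shows depends on the console's own encoding,
        // so in a GUI you would pass the string to the label instead.
        System.out.println(fixed);
    }
}
```

In the question's code, the equivalent call would be lbl1.setText(fixMojibake(cyrillic)). The cleaner long-term fix, if you control how the data is read, is to decode the input as Cp1251 at the point where the bytes first become a String, so no repair step is needed.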