This article covers how to fix garbled Cyrillic text in Java.

Problem description

My input string is UAH;"Ãîëüô 855229-7", and it should be displayed as UAH;"Гольф 855229-7". I tried the Cp1251 encoding, but the output is UAH;"????? 855229-7".

String cyrillic = row[0] + row[1];
String utf8String= new String(cyrillic.getBytes("Cp1251"), "UTF-8");
lbl1.setText(utf8String);


Recommended answer

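UTF-8 has nothing to do with this. Every Cyrillic character in your string is represented as a single byte. The attempt in the question fails because the Latin characters in the garbled string (Ã, î, ë, ü, ô) do not exist in Cp1251, so getBytes("Cp1251") replaces each of them with '?'.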

Currently, those bytes have been decoded as ISO 8859-1, also known as Latin-1, whose printable characters are a subset of the Windows Western European code page, Cp1252. So encode the string as Cp1252 to recover the original bytes, then decode those bytes as Cp1251:

String corrected8String = new String(cyrillic.getBytes("Cp1252"), "Cp1251");
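
To make the round trip easier to check, here is a minimal self-contained sketch of the same idea. The class name CyrillicRepair and the helpers repair and brokenAttempt are hypothetical names introduced for illustration only. The sketch assumes the JDK provides the windows-1252 and windows-1251 charsets (the canonical java.nio names for Cp1252 and Cp1251), and it writes the sample text with Unicode escapes so that the source file's own encoding cannot interfere.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CyrillicRepair {
    // Canonical java.nio names for the code pages the answer calls Cp1252 and Cp1251.
    // Assumption: both charsets are available in the running JDK.
    static final Charset CP1252 = Charset.forName("windows-1252");
    static final Charset CP1251 = Charset.forName("windows-1251");

    // Hypothetical helper: undo a "Cp1251 bytes decoded as Latin-1/Cp1252" mix-up.
    static String repair(String garbled) {
        // Step 1: re-encode to get back the original single bytes.
        byte[] originalBytes = garbled.getBytes(CP1252);
        // Step 2: decode those bytes with the code page they were written in.
        return new String(originalBytes, CP1251);
    }

    // The attempt from the question, kept only to show why it produces '?'.
    static String brokenAttempt(String garbled) {
        // Characters missing from Cp1251 are each replaced with the byte for '?'.
        return new String(garbled.getBytes(CP1251), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = "\u00C3\u00EE\u00EB\u00FC\u00F4";   // the garbled word from the question
        String expected = "\u0413\u043E\u043B\u044C\u0444";  // the intended Cyrillic word

        // Compare in code rather than by eye, since console output depends on the terminal encoding.
        System.out.println(repair(garbled).equals(expected));
        System.out.println(brokenAttempt(garbled));
    }
}
```

Passing Charset objects instead of charset names also avoids the checked UnsupportedEncodingException that getBytes(String) declares. Whether Cp1252 or ISO 8859-1 is the right encoding for the first step depends on how the string was originally mis-decoded; for the byte values in this example the two agree.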

