Question
A Java char is 2 bytes (65,536 possible values), but there are 95,221 Unicode characters. Does this mean that you can't handle certain Unicode characters in a Java application?
Does this boil down to what character encoding you are using?
Answer
Java's char is a UTF-16 code unit, not a full Unicode character. A character with a code point > 0xFFFF is encoded as 2 chars (a surrogate pair), so a Java String can still hold every Unicode character.
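To make the surrogate-pair point concrete, here is a minimal sketch. The class name SurrogatePairDemo and its two helper methods are made up for illustration; the String and Character calls are standard JDK methods. It builds a string from U+1D11E (MUSICAL SYMBOL G CLEF), which lies above 0xFFFF, and compares the number of chars with the number of code points.

```java
public class SurrogatePairDemo {

    // Number of UTF-16 code units (Java chars) in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E is above 0xFFFF, so it needs a surrogate pair.
        String clef = new String(Character.toChars(0x1D11E));

        System.out.println(charCount(clef));       // 2 chars
        System.out.println(codePointCount(clef));  // 1 code point

        // The two chars are the high and low halves of the pair.
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(clef.charAt(1)));  // true

        // Reading at index 0 as a code point recombines the pair.
        System.out.println(Integer.toHexString(clef.codePointAt(0)));  // 1d11e
    }
}
```

Note that length() counts chars, so it reports 2 for this single character.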
See http://www.oracle.com/us/technologies/java/supplementary-142654.html for how to handle those characters in Java.
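As a rough illustration of what handling such characters can look like (not taken from the linked article), one common approach is to walk a string by int code points instead of by char. The class name CodePointIteration and the helper codePoints are hypothetical; codePointAt and Character.charCount are standard JDK methods.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointIteration {

    // Collects the code points of a string, stepping over surrogate pairs
    // as single units. (Hypothetical helper, for illustration only.)
    static List<Integer> codePoints(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result.add(cp);
            // Advance by 1 char for BMP characters, 2 for supplementary ones.
            i += Character.charCount(cp);
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "a" + new String(Character.toChars(0x1D11E)) + "b";
        for (int cp : codePoints(text)) {
            System.out.printf("U+%04X%n", cp);
        }
        // Prints U+0061, U+1D11E, U+0062 on separate lines.
    }
}
```

A plain loop over charAt(i) would instead see four chars here, two of them surrogate halves that are not meaningful on their own.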
(BTW, in Unicode 5.2 there are 107,154 assigned characters out of 1,114,112 code points.)