Question
From Core Java, vol. 1, 9th ed., p. 69:
String sentence = "ℤ is the set of integers"; // for clarity; not in book
char ch = sentence.charAt(1)
doesn't return a space but the second code unit of ℤ.
But it seems that sentence.charAt(1) does return a space. For example, the if statement in the following code evaluates to true:
String sentence = "ℤ is the set of integers";
if (sentence.charAt(1) == ' ')
    System.out.println("sentence.charAt(1) returns a space");
Why?
I'm using JDK SE 1.7.0_09 on Ubuntu 12.10, if it's relevant.
Recommended answer
It sounds like the book is saying that 'ℤ' is not a character in the Basic Multilingual Plane (BMP), but in fact it is.
Java uses UTF-16, with surrogate pairs for characters that are not in the BMP. Since 'ℤ' (U+2124) is in the BMP, it is represented by a single code unit. In your example, sentence.charAt(0) will return 'ℤ', and sentence.charAt(1) will return ' '.
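A quick way to check this is to ask the Character class directly. The sketch below is only an illustration: the class name BmpCheck and the helper isSurrogateAt are made up for this example, and it assumes Java 7 or later (Character.isBmpCodePoint and Character.isSurrogate were added in 7). The string uses the escape \u2124 for ℤ so that it does not depend on the source file encoding.

```java
class BmpCheck {
    // Hypothetical helper: true when the char at index is one half of a surrogate pair.
    static boolean isSurrogateAt(String s, int index) {
        return Character.isSurrogate(s.charAt(index));
    }

    public static void main(String[] args) {
        String sentence = "\u2124 is the set of integers"; // \u2124 is ℤ

        // U+2124 lies in the BMP, so it needs exactly one UTF-16 code unit.
        System.out.println(Character.isBmpCodePoint(0x2124)); // true
        System.out.println(Character.charCount(0x2124));      // 1

        // The first char is the whole character, not a surrogate.
        System.out.println(isSurrogateAt(sentence, 0));       // false

        // So index 1 is the space that follows it.
        System.out.println(sentence.charAt(1) == ' ');        // true
    }
}
```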
A character outside the BMP is represented by a surrogate pair: two code units that together make up the character. If the sentence began with such a character, sentence.charAt(0) would return the first code unit (the high surrogate), and sentence.charAt(1) would return the second code unit (the low surrogate).
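To see the behavior the book describes, the string has to start with a character outside the BMP. The sketch below uses U+1D546 (𝕆, mathematical double-struck capital O) purely as an assumed example. It is not taken from the question, and the class name SurrogateDemo and the method sample are hypothetical.

```java
class SurrogateDemo {
    // Builds a sentence whose first character is U+1D546, which lies outside the BMP.
    static String sample() {
        return new String(Character.toChars(0x1D546)) + " is the set of octonions";
    }

    public static void main(String[] args) {
        String s = sample();

        // charAt works on code units, so indexes 0 and 1 are the two surrogates.
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));  // true

        // The space only shows up at index 2.
        System.out.println(s.charAt(2) == ' ');                     // true

        // codePointAt reassembles the pair into the full code point.
        System.out.println(Integer.toHexString(s.codePointAt(0)));  // 1d546

        // length() counts code units; codePointCount counts characters.
        System.out.println(s.length() - s.codePointCount(0, s.length())); // 1
    }
}
```

When a string may contain such characters, code-point-aware methods such as codePointAt and offsetByCodePoints are usually a safer choice than indexing with charAt.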